www.biorxiv.org
-
eLife Assessment
This study offers valuable insights into the conformational dynamics of the nucleic acid recognition lobe of GeoCas9, a thermophilic Cas9 from Geobacillus stearothermophilus. The authors investigate the influence of local dynamics and allosteric regulation on guide RNA binding affinity and DNA cleavage specificity through molecular dynamics simulations, advanced NMR techniques, RNA binding studies, and mutagenesis. While the mutations studied do not lead to significant changes in GeoCas9 cleavage activity, the study provides convincing evidence for the role of allosteric mechanisms and interdomain communication in Cas9 enzymes, and will be of great interest to biochemists and biophysicists exploring these complex systems.
-
Reviewer #1 (Public review):
Summary:
In this study from Belato, Knight and co-workers, the authors investigated the Rec domain of a thermophilic Cas9 from Geobacillus stearothermophilus (GeoCas9). The authors investigated three constructs: two individual subdomains of Rec (Rec1 and Rec2) and the full Rec domain. This domain is involved in binding to the guide RNA of Cas9, as well as the RNA-DNA duplex that is formed upon target binding. The authors performed RNA binding and relaxation experiments using NMR for the wild-type domain as well as two point mutants. They observed differences in RNA binding activities as well as in the flexibility of the domain. The authors also performed molecular dynamics and functional experiments on full-length GeoCas9 to determine whether these biophysical differences affect the RNA binding or cleavage activity. Although the authors observed some changes in the thermal stability of the mutant GeoCas9-gRNA complex, they did not observe substantial differences in the guide RNA binding or cleavage activities of the mutant GeoCas9 variants.
Overall, this manuscript provides a detailed biophysical analysis of the GeoCas9 Rec domain. The NMR assignments for this construct should prove very useful, and can serve as the basis for future similar studies of GeoCas9 Rec domain mutants. While the two mutants tested in the study did not produce significant differences from wild-type GeoCas9, the study rules out the possibility that analogous mutations can be translated between type II-A and II-C Cas9 orthologs. Together, these findings may provide the grounds for future engineering of higher-fidelity variants of GeoCas9.
-
Reviewer #2 (Public review):
The manuscript from Belato et al. used advanced NMR approaches and a mutagenesis campaign to probe the conformational dynamics of the recognition lobe (Rec) of the CRISPR Cas9 enzyme from G. stearothermophilus (GeoCas9). Using truncated and full-length constructs, they assess the impact that two different point mutations have on the redistribution and timescale of these motions, and assess gRNA recognition and specificity. Single point mutations in the Rec domain of a Cas9 from a related species had profound impacts on on- and off-target DNA editing; the authors therefore reasoned that analogous mutations in GeoCas9 would have similar effects. However, despite a redistribution of local motions and changes in global stability, their chosen mutations had little impact on DNA editing in the context of the full-length enzyme.
In their revised manuscript, the authors were highly responsive to the reviewers' comments, incorporating new experimental results, including molecular dynamics simulations and RNA binding data using full-length GeoCas9, as well as reframing their discussion and conclusions in consideration of the new data. They were receptive to suggestions for clarification in both the text and methods section. With these changes, the manuscript has been significantly improved.
Their studies highlight the species-specific complexity of interdomain communication and allosteric mechanisms used by these multi-domain endonucleases. The noted strengths of the article remain, and despite the negative results, their approach will garner interest from investigators interested in understanding how the activity and specificity of these enzymes can be engineered to tune activity and limit off-target cleavage by these enzymes. Generally, the manuscript highlights the challenges of studying the effect of allosteric networks on protein function, particularly in multidomain proteins, and thus will be of broad interest to the community.
-
Reviewer #3 (Public review):
The authors explore the role of Rec domains in a thermophilic Cas9 enzyme. They report on the crystal structure of part of the recognition lobe, its dynamics from NMR spin relaxation and relaxation-dispersion data, its interaction mode with guide RNA, and the effect of two single-point mutations hypothesised to enhance specificity. They find that mutations have small effects on Rec domain structure and stability but lead to significant rearrangement of micro- to milli-second dynamics which does not translate into major changes in guide RNA affinity or DNA cleavage specificity, illustrating the inherent tolerance of GeoCas9. The work can be considered as a first step towards understanding motions in GeoCas9 recognition lobe, although no clear hotspots were discovered with potential for future rational design of enhanced Cas9 variants.
Strengths:
- Detailed biophysical and structural investigation, despite a few technical limitations inherent in working with complex targets, provides converging evidence that molecular dynamics embedded in the recognition lobes allow GeoCas9 to operate on a broad range of substrates.
- Since the authors and others have shown that substrate specificity is dictated by equivalent hotspot mutations in other Cas9 variants, we are one step closer to understanding this phenomenon.
Weaknesses:
- Since the mutations investigated here do not significantly affect substrate binding or enzymatic activity, it is difficult to rationalize anything for enzyme engineering at this point.
- Further investigation of the determinants of the observed dynamic modes, and follow-up with rationally designed mutations, would hopefully allow a real model of the mechanism to be built, but I do understand that this goes beyond the scope of this study.
-
Author response:
The following is the authors’ response to the previous reviews
Responses to final minor critiques following initial revision
Reviewer #1 (Recommendations for the authors):
The authors have generally done an excellent job of addressing my and the other reviewers' concerns. I have a few additional concerns that the authors could consider addressing through changes to the text:
We thank the Reviewer for this assessment and are glad to have addressed the major points.
- Regarding the gRNA used for NMR studies, I thank the authors for adding additional rationale for their design of the RNA used. However, I still believe that it is misleading to term this RNA as a "gRNA", given that it is mainly composed of a sequence that is arbitrary (the spacer) and the sections of the gRNA that are constant between all gRNAs are truncated in a way that removes secondary structure that is likely essential for specific contacts with the Rec domains. I do not believe the authors need to make alterations to any of their experiments. However, I do think their description of the "gRNA" should be updated to properly reflect that this RNA lacks any of the secondary structure present in a typical gRNA, much of which is necessary to confer specificity of binding between GeoCas9 and the gRNA. As mentioned in my previous review, this may be best achieved by adding a cartoon of the secondary structure of the full-length gRNA and highlighting the region that was used in the truncated "gRNA".
We understand the Reviewer’s point. For any experiment in which the gRNA was truncated (i.e. NMR or some MST studies), we have clarified the text and no longer call it a “gRNA.” We state initially that it is a portion of the gRNA and then call it simply an “RNA.”
For experiments using the full-length constructs, we have kept the term “gRNA,” as it remains appropriate.
We have also added a final Supplementary figure (S12) showing the structures of the truncated and full-length RNAs used, based on the _Geo_Cas9 cryo-EM structure and predicted with RNAfold.
- Lines 256-257: "The ~3-fold decrease in Kd...". I believe the authors are discussing the Kd's of the mutants relative to WT, in which case the Kd increased. Also, the fold-change appears closer to 2-fold than to 3-fold.
Yes, the Reviewer makes a good catch. We have corrected this.
- Lines 407-408: "The mutations also diminished the stability of the full-length GeoCas9 RNP complex." This statement seems at odds with the authors' conclusions in the Results section that the full-length GeoCas9 variants had comparable affinities for the gRNAs (lines 376-382)
We agree that this seems contradictory. In the absence of full-length structures for all variants, we can’t definitively state what causes this. It could be that the mutation has an interesting allosteric effect on structure that does not affect RNA binding but induces the Cas9 protein to simply fall apart at lower temperatures, rendering the binding interaction moot. We have added a statement to this section.
- The authors chose to keep "SpCas9" for consistency with their prior work and the work of many others, including Doudna et al. and Zhang et al. However, I will note that in their publications on GeoCas9, the Doudna lab did use SpyCas9 to ensure consistent nomenclature within the publications.
We have made the change to “_Spy_Cas9”
Reviewer #3 (Recommendations for the authors):
The authors clearly answered most of my concerns. I still have some technical questions about the analysis of CPMG-RD data but the numbers provided now seem to make sense. While I still think that crystal structures of the point mutant would make the conclusions more "bullet proof", I do appreciate the work associated with this and consider that the manuscript can be published as is.
We agree that additional magnetic fields could allow for additional models of CPMG data fitting and that additional crystal structures of the mutants could add to the conclusions. We appreciate the Reviewer recognizing the balance of the current results and potential future studies in signing off on publication.
-
-
www.biorxiv.org
-
eLife Assessment
This valuable study investigates neural circuits mediating motor responses to cold in Drosophila larvae. Using a combination of behavioral analysis, genetic manipulations, EM connectomics, and reporters of calcium activity, the authors provide solid evidence that specific sensory and central neurons are required for cold-induced body contraction. This paper may be of interest to neuroscientists interested in how nervous systems sense and respond to cold.
-
Reviewer #1 (Public review):
Summary.
The authors' goal was to map the neural circuitry underlying cold-sensitive contraction in Drosophila. The circuitry underlying most sensory modalities has been characterized, but noxious cold sensory circuitry has not been well studied. Importantly, they analyze all downstream partner neurons connected to the Class III sensory neurons. The authors achieve their goal and map out sensory and post-sensory neurons involved in this behavior.
Strengths.
The manuscript provides compelling evidence for sensory and post sensory neurons involved in noxious cold sensitive behavior. They use both connectivity data and functional data to identify these neurons. This work is a clear advance in our understanding of noxious cold behavior. The experiments are done with a high degree of experimental rigor.
Weaknesses.
I find no major weaknesses in this work. It is a massive amount of data that clearly shows the role of the Class III neurons in cold-induced larval body contraction.
-
Reviewer #2 (Public review):
Patel et al. analyze neurons in a somatosensory network involved in responses to noxious cold in Drosophila larvae. Using a combination of behavioral experiments, calcium imaging, optogenetics, and synaptic connectivity analysis in the Drosophila larva, they assess the function of circuit elements in the somatosensory network downstream of multimodal somatosensory neurons involved in sensing innocuous and noxious stimuli, and probe their function in noxious cold processing. Consistent with their previous findings, they find the multidendritic class III neurons to be the key cold-sensing neurons that are both required and sufficient for the CT behavioral response (shown to be evoked by noxious cold). They further investigate the downstream neurons, identified based on the literature and on EM connectivity, at different stages of sensory processing, characterize the different phenotypes upon activating/silencing those neurons, and monitor their responses to noxious cold. The work reveals diverse phenotypes for the different neurons studied and provides the groundwork for understanding how information is processed in the nervous system from sensory input to motor output and how information from different modalities is processed by neuronal networks. However, at times the writing could be clearer and some interpretations of the results more rigorous.
-
Reviewer #3 (Public review):
Summary:
The authors follow up on prior studies where they have argued for the existence of cold nociception in Drosophila larvae. In the proposed pathway, mechanosensitive Class III multidendritic neurons are the noxious cold responding sensory cells. The current study attempts to explore the potential roles of second and third order neurons, based on information of the Class III neuron synaptic outputs that has been obtained from the larval connectome.
Strengths:
The major strength of the manuscript is the detailed discussion of the second and third order neurons that are downstream of the mechanosensory Class III multidendritic neurons. These will be useful in further studies of gentle touch mechanosensation and mechanonociception both of which rely on sensory input from these cells. Calcium imaging experiments on Class III activation with optogenetics support the wiring diagram.
Weaknesses:
The scientific premise is that a full body contraction in larvae that are exposed to noxious cold is a sensorimotor behavioral pathway. This premise is, to start with, questionable. A common definition of behavior is a set of "orderly movements with recognizable and repeatable patterns of activity produced by members of a species (Baker et al., 2001)." In the case of nociception behaviors, the patterns of movement are typically thought to play a protective role and to protect from potential tissue damage.
Does noxious cold elicit a set of orderly movements with a recognizable and repeatable pattern in larvae? Can the patterns of movement that are stimulated by noxious cold allow the larvae to escape harm? Based on the available evidence, the answer to both questions is seemingly no. In response to noxious cold stimulation many, if not all, of the muscles in the larva, simultaneously contract (Turner et al., 2016) and as a result the larva becomes stationary. In response to cold, the larva is literally "frozen" in place and it is incapable of moving away. This incapacitation by cold is the antithesis of what one might expect from a behavior that protects the animals from harm.
An extensive literature has investigated the physiological responses of insects to cold (reviewed in Overgaard and MacMillan, 2017). In numerous studies of insects across many genera (excluding cold adapted insects such as snow flies), exposure to very cold temperatures quickly incapacitates the animal and induces a state that is known as a chill coma. During a chill coma the insect becomes immobilized by the cold exposure, but if the exposure to cold is very brief the insect can often be revived without apparent damage. Indeed, it is common practice for many laboratories that use adult Drosophila for studies of behavior to use a brief chilling on ice as a form of anesthesia because chilling is less disruptive to subsequent behaviors than the more commonly used carbon dioxide anesthesia. If flies were to perceive cold as a noxious nociceptive stimulus, then this "chill coma" procedure would likely be disruptive to behavioral studies, but is not. Furthermore, there is no evidence to suggest that larval sensation of "noxious cold" is aversive.
The insect chill coma literature has investigated the effects of extreme cold on the physiology of nerves and muscle and the consensus view of the field is that the paralysis that results from cold is due to complex and combined action of direct effects of cold on muscle and on nerves (Overgaard and MacMillan, 2017). Electrophysiological measurements of muscles and neurons find that they are initially depolarized by cold, and after prolonged cold exposure they are unable to maintain potassium homeostasis and this eventually inhibits the firing of action potentials (Overgaard and MacMillan, 2017). The very small thermal capacitance of a Drosophila larva means that its entire neuromuscular system will be quickly exposed to the effect of cold in the behavioral assays under consideration here. It would seem impossible to disentangle the emergent properties of a complex combination of effects on physiology (including neuronal, glial, and muscle homeostasis) on any proposed sensorimotor transformation pathway.
Nevertheless, the manuscript before us makes a courageous attempt at this. A number of GAL4 drivers tested in the paper are found to affect parameters of contraction behavior (CT) in cold-exposed larvae in silencing experiments. However, notably absent from all of the silencing experiments are measurements of larval mobility following cold exposure. Thus, it is not known from the study if these manipulations are truly protecting the larvae from paralysis following cold exposure, or if they are simply reducing the magnitude of the initial muscle contraction that occurs immediately following cold (i.e., reducing CT). The strongest effect of silencing occurs with the 19-12-GAL4 driver, which targets Class III neurons (but is not completely specific to these cells).
Optogenetic experiments for Class III neurons relying on the 19-12-GAL4 driver combined with a very strong optogenetic actuator (ChETA) show the CT behavior that was reported in prior studies. It should be noted that this actuator drives very strong activation, and other studies with milder optogenetic stimulation of Class III neurons have shown that these cells produce behavioral responses that resemble gentle touch responses (Tsubouchi et al 2012 and Yan et al 2013). As well, these neurons express mechanoreceptor ion channels such as NompC and Rpk that are required for gentle touch responses.
A major weakness of the study is that none of the second or third order neurons (that are downstream of CIII neurons) are found to trigger the CT behavioral responses even when strongly activated with the ChETA actuator (Figure 2 Supplement 2). These findings raise major concerns for this and prior studies and do not support the hypothesis that the CIII neurons drive the CT behaviors.
Later experiments in the paper that investigate strong CIII activation (with ChETA) in combination with other second and third order neurons do support the idea that activating those neurons can facilitate the body-wide muscle contractions. But many of the co-activated cells in question are either repeated in each abdominal neuromere or they project to cells that are found all along the ventral nerve cord, so it is unsurprising that their activation would contribute to what appears to be a non-specific body-wide activation of muscles along the AP axis. As well, if these neurons are already downstream of the CIII neurons, the logic of this co-activation approach is not particularly clear. A more convincing experiment would be to silence the different classes of cells in the context of the optogenetic activation of CIII neurons to test for a block of the effects, a set of experiments that is notably absent from the study.
The authors' argument that the co-activation studies support "a population code" for cold nociception is a very optimistic interpretation of a brute-force optogenetics approach that ultimately results in an enhancement of a relatively non-specific body-wide muscle convulsion.
Comments on revisions:
The resubmitted version of this manuscript suffers from the same weaknesses that were raised in the prior round of review. The authors' claim that muscles have been removed from the electrophysiological preparations of prior studies is overstated. A small subset of muscles is removed during their recording procedures, and this does not rule out the possibility that mechanical forces that are generated by the remaining muscles are being sensed by the mechanosensory neurons.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
Summary.
The authors' goal was to map the neural circuitry underlying cold-sensitive contraction in Drosophila. The circuitry underlying most sensory modalities has been characterized, but noxious cold sensory circuitry has not been well studied. The authors achieve their goal and map out sensory and post-sensory neurons involved in this behavior.
Strengths.
The manuscript provides convincing evidence for sensory and post sensory neurons involved in noxious cold sensitive behavior. They use both connectivity data and functional data to identify these neurons. This work is a clear advance in our understanding of noxious cold behavior. The experiments are done with a high degree of experimental rigor.
Positive comments
- CaMPARI is nicely done to map cold-responsive neurons, although it doesn't give data on individual neurons.
- Chrimson and TNT experiments are nicely done.
- Cold temperature activates Basin neurons; this is a solid and convincing result.
Weaknesses.
Among the few weaknesses in this manuscript are the failure to trace the circuit from sensory neuron to motor neuron and the lack of analysis of the muscles driving cold-induced contraction. The authors also need to elaborate more on the novel aspects of their work in the introduction or abstract.
We have performed a more thorough EM connectivity analysis of the CIII md neuron circuit (Figure 1A, Figure 1 – Figure supplement 1, Figure 10A). We now report all premotor neurons that are connected to CIII md neurons, along with two additional projection/command-like neurons. These additional premotor neurons (A01d3, A02e, A02f, A02g, A27k, and A31k), which are primarily implicated in locomotion, were not required for cold nociception (Figure 5 – Figure supplement 2). Collectively, we have tested the requirement in cold nociception for ~94% of the synapses between CIII md neurons and premotor neurons, covering all premotor neurons with available driver lines. The requirement in cold nociception was also assessed for the two projection/command-like neurons, dLIP7 and A02o, which are required for sensory integration and directional avoidance to noxious touch, respectively (Figure 7 – Figure supplement 2) (Hu et al., 2017; Takagi et al., 2017). Silencing dLIP7 neurons resulted in a modest reduction in cold-evoked behaviors, whereas A02o neurons were not required for cold nociception (Figure 7 – Figure supplement 2). To complete the analysis from thermosensation to evoked behavior, we analyzed cold-evoked Ca<sup>2+</sup> responses of larval musculature (Figure 10). Premotor neurons, which are connected to CIII md neurons, target multiple muscle groups (DL, DO, LT, VL, and VO) (Figure 10A). Individual larval segments have unique cold-evoked Ca<sup>2+</sup> responses, with the strongest cold-evoked Ca<sup>2+</sup> response occurring in the central abdominal segments (Figure 10B-D). When motor neuron activity is inhibited or an anesthetic (ethyl ether) is applied, there is a negligible cold-evoked Ca<sup>2+</sup> response compared to controls (Figure 10 – Figure supplement 1). Analysis of cold-evoked Ca<sup>2+</sup> in individual muscles reveals unique Ca<sup>2+</sup> dynamics for individual muscle groups (Figure 10E-H).
Major comments.
- Class III sensory neuron connectivity is known, and their role in the cold response is known (Turner et al., 2016, 2018). The authors need to make clearer what the novelty of the experiments is.
In Figure 1, we aim to orient the reader to the CIII md neuron circuitry and emphasize the necessity and sufficiency of CIII md neurons in cold nociception. Previously, only transient (GCaMP6) cold-evoked Ca<sup>2+</sup> responses were reported (Turner et al., 2016, 2018). Here, however, using CaMPARI, we performed dendritic spatial (Sholl) analysis of cold-evoked Ca<sup>2+</sup> responses (Figure 1B-C). During the revision, we evaluated both CIII- and cold-evoked CT throughout larval development (Figure 1G, H). All in all, the findings from the first figure reiterate and replicate previous findings on the role of CIII md neurons in cold nociception. CIII md connectivity might be known; however, we investigated the functional and physiological roles of individual circuit neurons.
- Why focus on premotor neurons in mechano nociceptive pathways? Why not focus on PMNs innervating longitudinal muscles, likely involved in longitudinal larval contraction? Especially since chosen premotor neurons have only weak effects on cold induced contraction?
We assessed requirements for all premotor neurons that are connected to CIII md neurons and for which there are validated driver lines. Only premotor neurons (DnB, mCSI and Chair-1) that were previously implicated in mechanosensation were also required for cold nociception. Premotor neurons previously implicated in locomotion (A01d3, A02e, A02f, A02g, A27k, and A31k) are not required for cold-evoked behaviors (Figure 5 – Figure supplement 2).
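The dendritic spatial (Sholl-style) analysis mentioned in the response above can be illustrated with a small sketch. This is not the authors' pipeline: the input format (a list of `(x, y, signal)` pixels), the function name `radial_profile`, and the ring width are all assumptions chosen for illustration. The idea is simply to average a per-pixel signal, such as a photoconversion ratio, within concentric rings centered on the soma.

```python
import math
from collections import defaultdict


def radial_profile(points, soma, ring_width):
    """Average a per-pixel signal within concentric rings around the soma.

    points     -- iterable of (x, y, signal) tuples (hypothetical input format)
    soma       -- (x, y) coordinates of the ring center
    ring_width -- radial width of each ring, in the same units as x and y
    Returns a dict mapping ring index (0 = innermost) to mean signal.
    """
    if ring_width <= 0:
        raise ValueError("ring_width must be positive")
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, signal in points:
        # Euclidean distance of this pixel from the soma.
        r = math.hypot(x - soma[0], y - soma[1])
        ring = int(r // ring_width)
        sums[ring] += signal
        counts[ring] += 1
    return {ring: sums[ring] / counts[ring] for ring in sorted(sums)}


# Illustrative values only; these are not data from the manuscript.
example = [(1.0, 0.0, 2.0), (0.0, 1.0, 4.0), (15.0, 0.0, 1.0)]
profile = radial_profile(example, soma=(0.0, 0.0), ring_width=10.0)
```

A classic Sholl analysis counts dendrite crossings per ring rather than averaging a signal; the sketch swaps in signal averaging because the response describes a spatial analysis of Ca<sup>2+</sup> responses.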
Reviewer #2 (Public Review):
Patel et al. analyze neurons in a somatosensory network involved in responses to noxious cold in Drosophila larvae. Using a combination of behavioral experiments, calcium imaging, optogenetics, and synaptic connectivity analysis in the Drosophila larva, they assess the function of circuit elements in the somatosensory network downstream of multimodal somatosensory neurons involved in sensing innocuous and noxious stimuli, and probe their function in noxious cold processing. Consistent with their previous findings, they find the multidendritic class III neurons to be the key cold-sensing neurons that are both required and sufficient for the CT behavioral response (shown to be evoked by noxious cold). They further investigate the downstream neurons, identified based on the literature and on EM connectivity, at different stages of sensory processing, characterize the different phenotypes upon activating/silencing those neurons, and monitor their responses to noxious cold. The work reveals diverse phenotypes for the different neurons studied and provides the groundwork for understanding how information is processed in the nervous system from sensory input to motor output and how information from different modalities is processed by neuronal networks. However, at times the writing could be clearer and some interpretations of the results more rigorous.
Specific comments
(1) In Figure 1 -supplement 6D-F (Cho co-activation)
The authors find that Ch neurons are cold sensitive and required for cold nociceptive behavior but do not facilitate behavioral responses induced by CIII neurons.
The authors show that coactivating mdIII and Cho inhibits the CT (a typically observed cold-induced behavioral response) in the second part of the stimulation period, while Cho was required for cold-induced CT. Different levels of activation of md III and Cho (different light intensities) could bring some insights into the observed phenotypes upon Cho manipulation, as different levels activate different downstream networks that could correspond to different stimuli. Also, it would be interesting to activate chordotonal neurons during exposure to cold to determine how a behavioral response to cold is affected by the activation of chordotonal sensory neurons.
Modulating both CIII md and Ch activation to assess the contribution of each sensory neuron type to thermosensation would certainly yield unique insights. However, we believe that such analyses are beyond the scope of the current manuscript and better suited to future follow-up studies.
(2) Throughout the paper the co-activation experiments investigate whether co-activating the different candidate neurons and md III neurons facilitates the md III-induced CT response. However, the cold noxious stimuli will presumably activate different neurons downstream than optogenetic activation of MdIII and thus can reveal more accurately the role of the different candidate neurons in facilitating cold nociception.
We agree that CIII md neuron activation of the downstream circuitry would differ from cold-evoked activation of neurons downstream of primary sensory neurons. We believe that our current findings lay the foundation for future work to evaluate how multiple sensory neurons work in concert to generate stimulus-specific behavioral responses.
(3) Use of blue lights in behavioral and imaging experiments
Strong Blue and UV have been shown to activate MDIV neurons (Xiang, Y., Yuan, Q., Vogt, N. et al. Light-avoidance-mediating photoreceptors tile the Drosophila larval body wall. Nature 468, 921-926 (2010). https://doi.org/10.1038/nature09576) and some of the neurons tested receive input from MdIV.
In their experiments, the authors used blue light to optogenetically activate CIII neurons and then monitored calcium responses in Basin neurons, premotor neurons, and ascending neurons, and UV light is necessary for photoconversion in the CaMPARI experiments. Therefore, some of the neurons monitored could be activated by blue light and not by CIII activation. Indeed, responses of Basin-4 neurons can be observed in the no-ATR condition (Fig 3H-I), as can quite strong responses of DnB neurons (Figure 6E). How do the authors discern that the effects they see on the different neurons are indeed due to cold nociception and not to the synergy of cold and blue light responses? This could especially be the case for DnB, which could have a role in facilitating the response to cold in a multisensory context (where mdIV are activated by light).
In addition, the silencing of DNB neurons during cold stimulation does not seem to give very robust phenotypes (no significant CT decrease compared to empty GAL4 control).
It would be important to show, for example, that even in the absence of blue light DnB facilitates the mdIII activation or cold-induced CT, by using red light and Chrimson or TrpA activation (for coactivation with md III).
Alternatively, in some other cases, the phenotype upon co-activation could be inhibited by blue light (e.g. chair-1 (Figure 5 H-I)).
More generally, given the multimodal nature of the stimuli activating mdIV, mdIII (and Cho) and their shared downstream circuitry, it is important either to control for the use of blue light in these stimuli or to take the presence of the stimulus into account when interpreting the results, as the coactivation of, for example, Cho and mdIII using blue light could also activate mdIV (and downstream neurons) and alter the state of the network, which could inhibit the md III-induced CT responses.
Assessing the differences in behavioral phenotypes in the different conditions could give an idea of the influence of combining different modalities in these assays. For example, did the authors observe any other behaviors upon co-activation of MDIII and Cho (at the expense of CT in the second part of the stimulation) or did the larvae resume crawling? Blue light typically induces reorientation behavior. What about when co-activating mdIII and Basin-4?
Using Chrimson and red light or TrpA in some key experiments, e.g. with Cho, Basin-4, and DNB, would clarify the implication of these neurons in cold nociception.
We agree that exposure to a bright light source results in avoidance behaviors in Drosophila larvae, which is primarily mediated by CIV md neurons. However, the light intensities used in our assays are much lower than those required to activate sensory neurons. Specifically, based on Xiang et al., 470 nm light does not evoke any electrical response at the lowest tested light intensity (0.74 mW mm<sup>-2</sup>), whereas the light intensity used in our behavioral experiments was much lower, at 0.15 mW mm<sup>-2</sup>. Additionally, we assessed larval mobility and turning for control conditions ±ATR and also for sensory neuron activation. As expected, there is an increase in larval immobility upon CIII md neuron activation (Author response image 1). Only activation of CIV md neurons resulted in light-evoked turning, whereas the remaining conditions did not show a stimulus-time-locked turning response (Author response image 1). Furthermore, we tested whether the intensity of 470 nm light used in our behavior experiments was enough to produce a light-evoked Ca<sup>2+</sup> response in CIII md and CIV md neurons. We expressed RCaMP in sensory neurons using a pan-neural driver (GMR51C10<sup>GAL4</sup>). There was no detectable increase in light-evoked Ca<sup>2+</sup> response in either CIII md or CIV md neurons (Author response image 1).
Furthermore, we also tested multiple optogenetic actuators (ChR2, ChR2-H134R, and CsChrimson) and two CIII md driver lines (19-12<sup>Gal4</sup> and R83B04<sup>Gal4</sup>). Regardless of the optogenetic actuator or the wavelength of light used, we observe light-evoked CT responses (Figure 1–Figure supplement 6). We found that using CsChrimson raises several procedural challenges with our current experimental setup. In our hands, CsChrimson showed extreme sensitivity to any amount of ambient white light, whereas others have used infrared imaging to counteract ambient light sensitivity. Our imaging setup is equipped for visible spectrum imaging and cannot be retrofitted to record infrared light sources. Thus, we have limited the use of CsChrimson to optogenetic-Ca<sup>2+</sup> imaging experiments, where we are not recording larval behavior.
The use of TrpA1 would require heat stimulation for activating the channels, which in turn would impact downstream circuit neurons that are shared amongst sensory neurons.
For CaMPARI experiments, the PC light was delivered using a custom filter cube similar to the one used in the original CaMPARI paper (Fosque et al., 2015). This filter cube delivers 440 nm light as the PC light. PC light exposure in the absence of cold stimulus does not result in differential CaMPARI conversion between CIII md and CIV md (F<sub>red/green</sub> = 0.086 and 0.097, respectively). For the same condition, Ch neurons have a high CaMPARI signal, but this is expected as they function in proprioception. Therefore, the chances of downstream neurons being solely activated by PC light remain low. The differential baseline CaMPARI F<sub>red/green</sub> ratios of individual circuit neurons could be a result of varying resting-state cytosolic Ca<sup>2+</sup> concentrations.
Lastly, for optogenetic-GCaMP experiments, we use CIII md>CsChrimson and Basin-2/-4 or DnB>GCaMP to visualize CIII md-evoked Ca<sup>2+</sup> responses in downstream neurons. Xiang et al. reported that confocal laser excitation for GCaMP does not activate CIV md neurons, which is consistent with what we have observed as well.
Author response image 1.
(A) For optogenetic experiments, percent turning was assessed in control conditions and upon sensory neuron activation. Only CIV md neuron activation results in an increase in bending response. Other conditions do not show blue light-evoked turning. (A’) We assessed larval turning based on ellipse fitting using FIJI; the aspect ratio of the radii is indicative of the larval bending state. We empirically determined that a radii ratio of <2.5 represents larval turning/bending. This method of ellipse fitting has previously been used to identify C. elegans postures using WrMTrck in FIJI (Nussbaum-Krammer et al., 2015). (B) Percent immobility for all control conditions plus sensory activation driver lines. Only CIII md neuron activation leads to a sustained stimulus-locked increase in immobility. There are also no blue light-evoked reductions in immobility, indicating that there was no increase in larval movement due to blue light. (C) We assessed the responses of CIII md (ddaF) and CIV md (ddaC) neurons to blue light of similar intensity to that used in behavioral optogenetic experiments. There is no blue light-evoked increase in RCaMP fluorescence.
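The turning criterion described in panel (A’) can be illustrated with a minimal sketch. It assumes that the major and minor radii of the fitted ellipse have already been exported for each frame (the ellipse fitting itself is done in FIJI and is not reproduced here); the function names `is_turning` and `percent_turning` are hypothetical and are not part of the authors' pipeline. Only the 2.5 threshold comes from the caption.

```python
# Illustrative sketch only: classifies larval posture from fitted-ellipse radii.
# The 2.5 cutoff is taken from the figure caption; everything else is assumed.

TURN_RATIO_THRESHOLD = 2.5  # radii ratio below this counts as turning/bending


def is_turning(major_radius, minor_radius, threshold=TURN_RATIO_THRESHOLD):
    """Return True if the ellipse aspect ratio indicates a bent larva."""
    if major_radius <= 0 or minor_radius <= 0:
        raise ValueError("radii must be positive")
    # Use long axis / short axis regardless of argument order.
    ratio = max(major_radius, minor_radius) / min(major_radius, minor_radius)
    return ratio < threshold


def percent_turning(radii_pairs, threshold=TURN_RATIO_THRESHOLD):
    """Percentage of (major, minor) observations classified as turning."""
    if not radii_pairs:
        return 0.0
    turning = sum(is_turning(a, b, threshold) for a, b in radii_pairs)
    return 100.0 * turning / len(radii_pairs)
```

A straight, elongated larva yields a large ratio and is scored as not turning, while a bent larva fits a rounder ellipse and falls below the cutoff.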
(4) Basins
- Page 17 line 442-3 "Neural silencing of all Basin (1-4) neurons, using two independent driver lines (R72F11<sup>GAL4</sup> and R57F07<sup>GAL4</sup>)."
Did the authors check the expression profile of the R57F07 line that they use to probe "all basins"? The expression profile published previously (Ohyama et al., 2015, extended data) shows one basin neuron (identified as Basin-4) and some neurons in the brain lobes. Also, the split GAL4 that labels Basin-4 (SS00740) is the intersection between R72F11 and R57F07 neurons. Thus R57F07 likely labels Basin-4, and if that is the case, the data in Figure 2 (and supplement) and Figure 3 related to this driver line should be annotated as Basin-4, and the results and their interpretation modified to take into account the different phenotypes for all basins and Basin-4 neurons.
Due to the non-specific nature of R57F07<sup>GAL4</sup> in labeling Basin-4 and additional neuron types, we have decided to remove the driver line from our current analysis. We would need to perform further independent investigations to identify the other cell types and validate their role in cold nociception.
Page 19 l. 521-525 I am confused by these sentences as the authors claim that Basin-4 showed reduced Calcium responses upon repetitive activation of CDIII md neurons but then they say they exhibit sensitization. Looking at the plots in FIG 3 F-I the Basin-4 responses upon repeated activation seem indeed to decrease on the second repetition compared to the first. What is the sensitization the authors refer to?
We have rephrased this section.
On Page 47: In this section of the discussion, the authors put forward an interesting hypothesis that the Basin-1 neuron could modulate the gain of behavioral responses. While this is an interesting idea, I wonder what would be the explanation for the finding that co-activation of Cho and MDIII does not facilitate cold nociceptive responses. Would activation of Basin-1 facilitate the cold response in different contexts (in addition to Cho-mediated stimuli)?
Page 48 Thus the implication of the inhibitory network in cold processing should be better contextualized.
The authors explain the lower Basin-2 Ca<sup>2+</sup> response to cold/mdIII activation (compared to Basin-4), despite stronger connectivity, as being due to stronger inputs from inhibitory neurons to Basin-2 (compared to Basin-4). The previously described inhibitory neurons that synapse onto Basin-2 receive only a rather small fraction of their inputs from the class III sensory neurons. The differences in response to cold could potentially be attributed to the activation of the inhibitory neurons by the cold-sensing Cho neurons. However, that cannot explain the differences in responses induced by class III neurons. Do the authors refer to additional inhibitory neurons that would receive significant input from mdIII?
Alternative explanations could exist for this difference in activation: electrical synapses from mdIII onto Basin-4, or stronger inputs from mdIV (compared to Basin-2) in the case of responses to cold stimulus (cold induces responses in mdIV sensory neurons). Different subtypes of CD III may respond differentially to cold, and the cold-sensing ones could synapse preferentially onto Basin-4, etc.
A possible explanation for the lack of CT facilitation when Ch and CIII md neurons are both activated is the competing sensory inputs going into Basins and the as yet unknown role of the inhibitory network between sensory and Basin neurons in cold nociception (Jovanic et al., 2016). Mechanical activation of Ch leads to several behavioral responses (hunch, back-up, pause, crawl, and/or bend) and transitions between behaviors (Kernan et al., 1994; Tsubouchi et al., 2012; Zhang et al., 2015; Turner et al., 2016, 2018; Jovanic et al., 2019; Masson et al., 2020).
Meanwhile, the primary CIII md-/cold-evoked behavior is CT (Turner et al., 2016, 2018; Patel et al., 2022; Himmel et al., 2023). Certain touch- versus cold-evoked behaviors are mutually exclusive, so co-activation of Ch and CIII md likely produces competing neural impulses that result in a lack of any single behavioral enhancement. Furthermore, the mini circuit motif between Ch and Basins, consisting of feedforward, feedback and lateral inhibitory neurons that play a role in behavioral selection and transitions, might impact the overall output of Basin neurons. Upon Ch and CIII md neuron co-activation, the cumulative Basin neuronal output may be biased towards increased behavioral transitions instead of a sustained singular behavioral response.
We posited one possible mechanism explaining the differences between cold- or CIII md-evoked Ca<sup>2+</sup> responses in Basin 2 and 4 neurons: the differences in evoked Ca<sup>2+</sup> responses may arise from differential connectivity of TePns and inhibitory network neurons to Basin 2 and/or 4. Furthermore, ascending A00c neurons are connected to a descending feedback SEZ neuron, SeIN128, which has connectivity to Basins (1-3, strongest with Basin 2), A02o, DnB, Chair-1 and A02m/n (Ohyama et al., 2015; Zhu et al., 2024). However, how the 5 different subtypes of CIII md neurons respond to cold is unknown. Electrical recordings of the dorsal CIII md neurons revealed that within and between neuron subtypes there is variability in the temperature sensitivity of individual neurons, where population coding results in fine-tuned central temperature representation (Maksymchuk et al., 2022). Evaluating how individual CIII md subtypes contribute to Basin activation could reveal important insights into the precise relationship between CIII md and the multisensory integration Basin neurons. However, as yet there are no known driver lines that mark a subset of CIII md neurons, which limits further clarification of how primary sensory information is transduced to integration neurons.
(5) A00c
Page 26 Figure 4F-I line While Goro may not be involved in cold nociception the A00c (and A05q) seems to be.
A00c could convey information to other neurons other than Goro and thus be part of a pathway for cold-induced CT.
A deeper look into A00c connectivity reveals that there is a reciprocal relationship between A00c and the SEZ descending neuron SeIN128 (Ohyama et al., 2015; Zhu et al., 2024). Additionally, this feedback SEZ descending neuron synapses onto A02o, A05q, Basins (highest connectivity to Basin 2 and weak connectivity to Basin 1 and 3), and select premotor neurons (Chair-1, DnB, and A02m/n) (Ohyama et al., 2015; Zhu et al., 2024). Interestingly, the SEZ feedback neuron likely plays a role in the observed cold-/CIII md neuron-evoked differential calcium activity and behavioral requirement amongst Basin-2 and -4 in cold nociception. We have added this to our discussion section.
(6) Page 31 766-768 the conclusion that "premotor function is required for and can facilitate cold nociception" seems odd to stress, as one would assume that some premotor neurons would be involved in controlling the behavioral responses to a stimulus. It would be more pertinent in the summary to specify which premotor neurons are involved and what their function is.
We have updated the section regarding premotor neurons’ role in cold nociception and now there’s a more specific concluding statement.
(7) There are several Split GAL4 lines used in the study (with transgenes inserted in the attP40 and attP2 sites). A recent study points to a mutation related to attP40 that can have an effect on muscle function: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9750024/. The controls used in behavioral experiments do not contain the attP40 site. It would be important to check a control genotype bearing an attP40 site, characterize the different parameters of the CT behavior to cold, and take this into account in interpreting the results of the experiments using the Split GAL4 lines.
We have performed control experiments with larvae bearing empty attP40;attP2 sites in our neural silencing experiments. The reported muscle phenotypes were present in larvae bearing homozygous copies of attP40 (attP40/attP40) (van der Graaf et al., 2022). However, in our experiments, none of the larvae that we tested behaviorally had homozygous attP40;attP2 insertions. We have updated Table 1 to now include insertion sites.
Reviewer #3 (Public Review):
Summary:
The authors follow up on prior studies where they have argued for the existence of cold nociception in Drosophila larvae. In the proposed pathway, mechanosensitive Class III multidendritic neurons are the noxious cold responding sensory cells. The current study attempts to explore the potential roles of second and third order neurons, based on information of the Class III neuron synaptic outputs that have been obtained from the larval connectome.
Strengths:
The major strength of the manuscript is the detailed discussion of the second and third order neurons that are downstream of the mechanosensory Class III multidendritic neurons. These will be useful in further studies of gentle touch mechanosensation and mechanonociception, both of which rely on sensory input from these cells. Calcium imaging experiments on Class III activation with optogenetics support the wiring diagram.
Weaknesses:
The scientific premise is that a full body contraction in larvae that are exposed to noxious cold is a sensorimotor behavioral pathway. This premise is, to start with, questionable. A common definition of behavior is a set of "orderly movements with recognizable and repeatable patterns of activity produced by members of a species (Baker et al., 2001)." In the case of nociception behaviors, the patterns of movement are typically thought to play a protective role and to protect from potential tissue damage.
Does noxious cold elicit a set of orderly movements with a recognizable and repeatable pattern in larvae? Can the patterns of movement that are stimulated by noxious cold allow the larvae to escape harm? Based on the available evidence, the answer to both questions is seemingly no. In response to noxious cold stimulation many, if not all, of the muscles in the larva, simultaneously contract (Turner et al., 2016), and as a result the larva becomes stationary. In response to cold, the larva is literally "frozen" in place and it is incapable of moving away. This incapacitation by cold is the antithesis of what one might expect from a behavior that protects the animals from harm.
Extensive literature has investigated the physiological responses of insects to cold (reviewed in Overgaard and MacMillan, 2017). In numerous studies of insects across many genera (excluding cold adapted insects such as snow flies), exposure to very cold temperatures quickly incapacitates the animal and induces a state that is known as a chill coma. During a chill coma, the insect becomes immobilized by the cold exposure, but if the exposure to cold is very brief the insect can often be revived without apparent damage. Indeed, it is common practice for many laboratories that use adult Drosophila for studies of behavior to use a brief chilling on ice as a form of anesthesia because chilling is less disruptive to subsequent behaviors than the more commonly used carbon dioxide anesthesia. If flies were to perceive cold as a noxious nociceptive stimulus, then this "chill coma" procedure would likely be disruptive to behavioral studies but is not. Furthermore, there is no evidence to suggest that larval sensation of "noxious cold" is aversive.
The insect chill coma literature has investigated the effects of extreme cold on the physiology of nerves and muscles and the consensus view of the field is that the paralysis that results from cold is due to complex and combined action of direct effects of cold on muscle and on nerves (Overgaard and MacMillan, 2017). Electrophysiological measurements of muscles and neurons find that they are initially depolarized by cold, and after prolonged cold exposure they are unable to maintain potassium homeostasis and this eventually inhibits the firing of action potentials (Overgaard and MacMillan, 2017). The very small thermal capacitance of a Drosophila larva means that its entire neuromuscular system will be quickly exposed to the effect of cold in the behavioral assays under consideration here. It would seem impossible to disentangle the emergent properties of a complex combination of effects on physiology (including neuronal, glial, and muscle homeostasis) on any proposed sensorimotor transformation pathway.
Nevertheless, the manuscript before us makes a courageous attempt at this. A number of GAL4 drivers tested in the paper are found to affect parameters of contraction behavior (CT) in cold-exposed larvae in silencing experiments. However, notably absent from all of the silencing experiments are measurements of larval mobility following cold exposure. Thus, it is not known from the study if these manipulations are truly protecting the larvae from paralysis following cold exposure, or if they are simply reducing the magnitude of the initial muscle contraction that occurs immediately following cold (i.e. reducing CT). The strongest effect of silencing occurs with the 19-12-GAL4 driver, which targets Class III neurons (but is not completely specific to these cells).
Optogenetic experiments for Class III neurons relying on the 19-12-GAL4 driver combined with a very strong optogenetic actuator (ChETA) show the CT behavior that was reported in prior studies. It should be noted that this actuator drives very strong activation, and other studies with milder optogenetic stimulation of Class III neurons have shown that these cells produce behavioral responses that resemble gentle touch responses (Tsubouchi et al., 2012 and Yan et al., 2013). As well, these neurons express mechanoreceptor ion channels such as NompC and Rpk that are required for gentle touch responses. The latter makes the reported calcium responses to cold difficult to interpret, given that the strong muscle contractions driven by cold may actually be driving mechanosensory responses in these cells (i.e. through deformation of the mechanosensitive dendrites). Are the cIII calcium signals still observed in a preparation where cold-induced muscle contractions are prevented?
A major weakness of the study is that none of the second or third order neurons (that are downstream of CIII neurons) are found to trigger the CT behavioral responses even when strongly activated with the ChETA actuator (Figure 2 Supplement 2). These findings raise major concerns for this and prior studies, and they do not support the hypothesis that the CIII neurons drive the CT behaviors.
Later experiments in the paper that investigate strong CIII activation (with ChETA) in combination with other second and third order neurons do support the idea that activating those neurons can facilitate body-wide muscle contractions. But many of the co-activated cells in question are either repeated in each abdominal neuromere or project to cells that are found all along the ventral nerve cord, so it is unsurprising that their activation would contribute to what appears to be a non-specific body-wide activation of muscles along the AP axis. Also, if these neurons are already downstream of the CIII neurons, the logic of this coactivation approach is not particularly clear. A more convincing experiment would be to silence the different classes of cells in the context of the optogenetic activation of CIII neurons to test for a block of the effects, a set of experiments that is notably absent from the study.
The authors' argument that the co-activation studies support "a population code" for cold nociception is a very optimistic interpretation of a brute force optogenetics approach that ultimately results in an enhancement of a relatively non-specific body-wide muscle convulsion.
We have responded extensively to reviewer 3’s comments in our provisional response to address the critiques regarding conceptual merit of this paper.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study identifies a role for YAP in regulating tumor cell growth and drug response with differential effects noted based upon growth conditions in monolayer vs spheroid culture. This work has the potential to define more biologically relevant cell culture model systems for drug resistance and define targetable pathways to overcome drug resistance. The findings described are important to the cancer biology field and the evidence supporting the key findings is convincing.
-
Reviewer #1 (Public review):
Summary:
In this study, Nakagawa and colleagues report the observation that YAP is differentially localized, and thus differentially transcriptionally active, in spheroid cultures versus monolayer cultures. YAP is known to play a critical role in the survival of drug-tolerant cancer cells, and as such, the higher levels of basally activated YAP in monolayer cultures lead to higher fractions of surviving drug-tolerant cells relative to spheroid culture (or in vivo culture). The findings of this study, revealed through convincing experiments, are elegantly simple and straightforward, yet they add significantly to the literature in this field by revealing that monolayer cultures may actually be a preferential system for studying residual cell biology simply because the abundance of residual cells in this format is much greater than in spheroid or xenograft models. The potential linkage between matrix density and stiffness and YAP activation, while only speculated upon in this manuscript, is intriguing and a rich starting point for future studies.
Although this work, like any important study, inspires many interesting follow-on questions, I am limiting my questions to only a few minor ones, which may potentially be explored either in the context of the current study or in separate, follow-on studies.
Strengths:
The major strengths of the work are described above.
Weaknesses:
Rather than considering the following points as weaknesses, I instead prefer to think of them as areas for future study:
(1) Given the field's intense interest in the biology and therapeutic vulnerabilities of residual disease cells, I suspect that one major practical implication of this work could be that it inspires scientists interested in working in the residual disease space to model it in monolayer culture. However, this relies upon the assumption that drug-tolerant cells isolated in monolayer culture are at least reasonably similar in nature to drug-tolerant cells isolated from spheroid or xenograft systems. Is this true? An intriguing experiment that could help answer this question would be to perform gene expression profiling on a cell line model in the following conditions: monolayer growth, drug tolerant cells isolated from monolayer growth conditions, spheroid growth, drug tolerant cells isolated from spheroid growth conditions, xenograft tumors, and drug tolerant cells isolated from xenograft tumors. What are the genes and programs shared between drug-tolerant cells cultured in the three conditions above? Which genes and programs differ between these conditions? Data from this exercise could help provide additional, useful context with which to understand the benefits and pitfalls of modeling residual tumor cell growth in monolayer culture.
(2) In relation to the point above, there is an interesting and established connection between mesenchymal gene expression and YAP/TAZ signaling. For example, analyses of gene expression data from human tumors and cell lines demonstrate an extremely strong correlation between these two gene expression programs. Further, residual persister cancer cells have often been characterized as having undergone an EMT-like transition. From the analysis above, is there evidence that residual tumor cells with increased YAP signaling also exhibit increased mesenchymal gene expression?
-
Reviewer #2 (Public review):
The manuscript by Nakagawa R, et al describes a mechanism of how NSCLC cells become resistant to EGFR and KRAS G12C inhibition. Here, the authors focus on the initial cellular changes that occur to confer resistance and identify YAP activation as a non-genetic mechanism of acute resistance.
The authors performed an initial xenograft study to identify YAP nuclear localization as a potential mechanism of resistance to EGFRi. The increase in the stromal component of the tumors upon Afatinib treatment leads the authors to explore the response to these inhibitors in both 2D and 3D culture. The authors extend their findings to both KRAS G12C and BRAF inhibitors, suggesting that the mechanism of resistance may be shared along this pathway.
The paper would benefit from additional cell lines to determine the generalizability of the findings they presented. While the change in the localization of YAP upon Afatinib treatment was identified in a xenograft model, the authors do not return to animal models to test their potential mechanism, and the effects of the hyperactivated S127A YAP protein on Afatinib sensitivity in culture are modest. Also, combination studies of YAP inhibitors and EGFR/RAS/RAF inhibitors would have strengthened the studies.
-
Author response:
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this study, Nakagawa and colleagues report the observation that YAP is differentially localized, and thus differentially transcriptionally active, in spheroid cultures versus monolayer cultures. YAP is known to play a critical role in the survival of drug-tolerant cancer cells, and as such, the higher levels of basally activated YAP in monolayer cultures lead to higher fractions of surviving drug-tolerant cells relative to spheroid culture (or in vivo culture). The findings of this study, revealed through convincing experiments, are elegantly simple and straightforward, yet they add significantly to the literature in this field by revealing that monolayer cultures may actually be a preferential system for studying residual cell biology simply because the abundance of residual cells in this format is much greater than in spheroid or xenograft models. The potential linkage between matrix density and stiffness and YAP activation, while only speculated upon in this manuscript, is intriguing and a rich starting point for future studies.
Although this work, like any important study, inspires many interesting follow-on questions, I am limiting my questions to only a few minor ones, which may potentially be explored either in the context of the current study or in separate, follow-on studies.
We appreciate Reviewer #1's comments that our work is of importance to the field and particularly that it will "...add significantly to the literature in this field by revealing that monolayer cultures may actually be a preferential system for studying residual cell biology..." We have sought to highlight the importance of how our findings could be applied to study resistance mechanisms at various points in the manuscript.
Strengths:
The major strengths of the work are described above.
Weaknesses:
Rather than considering the following points as weaknesses, I instead prefer to think of them as areas for future study:
(1) Given the field's intense interest in the biology and therapeutic vulnerabilities of residual disease cells, I suspect that one major practical implication of this work could be that it inspires scientists interested in working in the residual disease space to model it in monolayer culture. However, this relies upon the assumption that drug-tolerant cells isolated in monolayer culture are at least reasonably similar in nature to drug-tolerant cells isolated from spheroid or xenograft systems. Is this true? An intriguing experiment that could help answer this question would be to perform gene expression profiling on a cell line model in the following conditions: monolayer growth, drug tolerant cells isolated from monolayer growth conditions, spheroid growth, drug tolerant cells isolated from spheroid growth conditions, xenograft tumors, and drug tolerant cells isolated from xenograft tumors. What are the genes and programs shared between drug-tolerant cells cultured in the three conditions above? Which genes and programs differ between these conditions? Data from this exercise could help provide additional, useful context with which to understand the benefits and pitfalls of modeling residual tumor cell growth in monolayer culture.
We thank the reviewer for suggesting valuable future studies. We agree that the proposed experiments represent important next steps in understanding the role of YAP and other pathways in primary resistance. We believe, however, these experiments are both beyond the scope of the current manuscript and beyond what can reasonably be addressed in a revision. The distinct challenges associated with comparing in vivo and in vitro conditions would require significant optimization of single-cell approaches, especially given the robust cell death driven by afatinib treatment in vivo. Given the complexity of in vivo experimentation, we are concerned that such studies may not guarantee biologically meaningful insights. Nonetheless, we agree that this is a compelling direction for future research. If common gene expression patterns could be identified despite these challenges, such studies could help validate monolayer culture as a relevant model for investigating residual disease.
(2) In relation to the point above, there is an interesting and established connection between mesenchymal gene expression and YAP/TAZ signaling. For example, analyses of gene expression data from human tumors and cell lines demonstrate an extremely strong correlation between these two gene expression programs. Further, residual persister cancer cells have often been characterized as having undergone an EMT-like transition. From the analysis above, is there evidence that residual tumor cells with increased YAP signaling also exhibit increased mesenchymal gene expression?
We agree with the reviewer that a connection between YAP/TAZ activity and EMT is likely, given prior studies exploring correlations between these two gene signatures. We believe, however, exploring EMT represents a distinct research direction from the primary focus of the current manuscript. We are concerned exploration of EMT, especially in the absence of corresponding preclinical models or mechanistic data directly linking EMT to therapy resistance in our models, could distract from the main conclusions of the manuscript. While we plan to stain for EMT-associated markers in the residual cancer tissue from the in vivo studies, it remains unclear whether such data would meaningfully contribute to the revised manuscript, regardless of the outcome.
Reviewer #2 (Public review):
The manuscript by Nakagawa R, et al describes a mechanism of how NSCLC cells become resistant to EGFR and KRAS G12C inhibition. Here, the authors focus on the initial cellular changes that occur to confer resistance and identify YAP activation as a non-genetic mechanism of acute resistance.
The authors performed an initial xenograft study to identify YAP nuclear localization as a potential mechanism of resistance to EGFRi. The increase in the stromal component of the tumors upon Afatinib treatment leads the authors to explore the response to these inhibitors in both 2D and 3D culture. The authors extend their findings to both KRAS G12C and BRAF inhibitors, suggesting that the mechanism of resistance may be shared along this pathway.
The paper would benefit from additional cell lines to determine the generalizability of the findings they presented. While the change in the localization of YAP upon Afatinib treatment was identified in a xenograft model, the authors do not return to animal models to test their potential mechanism, and the effects of the hyperactivated S127A YAP protein on Afatinib sensitivity in culture are modest. Also, combination studies of YAP inhibitors and EGFR/RAS/RAF inhibitors would have strengthened the studies.
We thank the reviewer for their insightful comments. In this manuscript, we present data from 5 cell lines representing the EGFR/BRAF/KRAS pathway, demonstrating the generalizability of YAP-driven decreased cancer cell sensitivity to targeted inhibitors when cultured in 2D compared to spheroid counterparts. While expanding this analysis to a larger panel of cell lines is beyond the scope of the current study, we believe our findings provide a strong rationale for future investigations, including high-throughput screens conducted by other research groups and pharmaceutical companies, that recognize the value of screening spheroid cell cultures. We hope this work helps shift the field of cancer therapeutics toward incorporating screening approaches that better reflect tumor biology into drug discovery pipelines, and we believe this could be one of the most impactful and enduring contributions of our study.
Reviewer #2 also mentions that "...combination studies of YAP inhibitors and EGFR/RAS/RAF inhibitors would have strengthened the studies..." The concept that YAP/TAZ inhibitors (i.e. TEAD inhibitors) could be additive or synergistic in 2D culture is one that is being actively tested across several groups and in pharma. Several recent examples include a publication by Hagenbeek, et al., Nat. Cancer, 2023 (PMID: 37277530) showing that a TEAD inhibitor overcomes KRAS G12C inhibitor resistance. Additionally, recent work by Pfeifer, et al., Comm. Biol., 2024 (PMID: 38658677) suggests a similar effect between EGFR inhibitors and a different TEAD inhibitor. While neither of these studies probes cell death pathways as extensively as our studies do, they nevertheless provide strong evidence that combined TEAD and targeted EGFR/RAF/RAS inhibition in 2D has additive, if not synergistic, effects. We feel that these recently published studies affirm our findings and that repeating such experiments is unlikely to add much new information. We thus feel they are beyond the scope of our present studies.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides valuable insights into the role of sensory stimulation in neurogenesis in the mammalian olfactory epithelium, where new olfactory sensory neurons are continually born throughout an animal's lifespan. The authors show that exposure to two different musk-related odors specifically increases the birth rates of those neurons that respond to these odors. This potentially results in adaptive changes in the subtype composition of the olfactory sensory neuron population. Solid evidence, well supported by control experiments, is presented to support these findings, though further work is needed to confirm that this phenomenon generalizes to olfactory sensory neurons expressing other types of odorant receptor and to explore the mechanisms underlying the stimulus specificity of neurogenesis.
-
Reviewer #1 (Public review):
Summary
Olfactory sensory neurons (OSNs) in the olfactory epithelium detect myriads of environmental odors that signal essential cues for survival. OSNs are born throughout life and thus represent one of the few neurons that undergo life-long neurogenesis. Until recently, it was assumed that OSN neurogenesis is strictly stochastic with respect to subtype (i.e. the receptor the OSN chooses to express). However, a recent study showed that olfactory deprivation via naris occlusion selectively reduced birthrates of only a fraction of OSN subtypes and indicated that these subtypes appear to have a special capacity to undergo changes in birthrates in accordance with the level of olfactory stimulation. These previous findings raised the interesting question of what type of stimulation influences neurogenesis, since naris occlusion does not only reduce the exposure to potentially thousands of odors, but also to more generalized mechanical stimuli via preventing airflow.
In this study, the authors set out to identify the stimuli that are required to promote the neurogenesis of specific OSN subtypes. Specifically, they aim to test the hypothesis that discrete odorants selectively stimulate the same OSN subtypes whose birthrates are affected. This would imply a highly specific mechanism in which exposure to certain odors can "amplify" OSN subtypes responsive to those odors, suggesting that OE neurogenesis serves, in part, an adaptive function.
To address this question, the authors focused on a family of OSN subtypes that had previously been identified to respond to musk-related odors and that exhibit higher transcript levels in the olfactory epithelium of mice exposed to males compared to mice isolated from males. First, the authors confirm via a previously established cell birth dating assay in unilateral naris occluded mice that this increase in transcript levels actually reflects a stimulus-dependent birthrate acceleration of this OSN subtype family. In a series of experiments (in unilateral occluded and non-occluded mice) using the same birth dating assay, they show that several subtypes of this OSN family, but not other "control" subtypes exhibit increased birthrates in response to adolescent male exposure, but not to female exposure.
In the core experiment of the study, they expose unilaterally naris occluded and non-occluded mice to two musk-related odors and two "control" odors (that do not activate musk-responsive OSN subtypes) to test if these odors specifically accelerate the birth rates of OSN types that are responsive to these odors. This experiment reveals that (for the tested odors and OSN subtypes) birthrates are indeed only affected by discrete odorants that stimulate these OSN subtypes (with a complex relationship between birth rate acceleration and odor concentrations), suggesting that OE neurogenesis may serve, in part, an adaptive function.
Strength:
The scientific question is valid and opens an interesting direction. The previously established cell birth dating assay in naris occluded and non-occluded mice is well performed and accompanied by several control experiments addressing potential other interpretations of the data.
In this revised version, the authors added several new experiments addressing the previous concern that only the effect of one specific odor (muscone) on musk-responsive OSN subtypes had been tested to make the general claim that discrete odors specifically accelerate the birth rate of OSN subtypes they stimulate. Now the authors demonstrate that another musk-related odor (ambretone) also induces this effect and that other non-musk odors do not. In addition, they show that two other OSN subtypes that do not respond to musk-related odors are not affected. These experiments further substantiate the above claim.
Weakness:
(1) The main research question of this study was to test if discrete odors specifically accelerate the birth rate of OSN subtypes they stimulate, i.e. does muscone only accelerate the birth rate of OSNs that express muscone-responsive ORs, or vice versa is the birthrate of muscone-responsive OSNs only accelerated by odors they respond to?
As mentioned under "strength" the authors added several experiments to further substantiate their claim. While these controls are very important to show that the observed effect is indeed specific for musk-related odors on musk-responsive OSN subtypes, these experiments still only focus on one closely related family of musk-responsive OSN subtypes. To understand if this phenomenon is a more generalized mechanism and plays a role for other OSN subtypes beyond this small family of related receptors, further experiments showing this effect for other OSN subtypes are critical.
(2) Previous concerns (#2, #4, #5 and #6) about a lack of increase in UNO effect size for olfr1440 under any muscone concentrations, strong fluctuations of newborn neurons on the closed side as well as a seemingly contradicting statement that overstimulation possibly reflects reduced survival have been addressed by adding potential explanations to the text.
In addition, the previous remark (#3) that certain phrases gave the misleading impression that musk-related odors are indeed excreted into male mouse urine at certain concentrations was addressed not only by re-phrasing, but by performing additional experiments. Although these did not deliver clear results (because of technical difficulties), interesting possibilities are discussed.
-
Reviewer #3 (Public review):
Summary
Neurogenesis in the mammalian olfactory epithelium continues throughout the animal's lifespan, replacing damaged or dying olfactory sensory neurons. It has been tacitly assumed that the replacement of olfactory receptor (OR) subtypes occurs randomly. However, anecdotal evidence has suggested otherwise. In this study, Santoro and colleagues challenge this assumption by systematically exploring three key questions: Is there enrichment of specific OR subtypes during neurogenesis? Is this enrichment influenced by sensory stimuli? Does enrichment result from differential generation of OR types or differential cell death regulated by neural activity? The authors present convincing evidence that muscone stimulus selectively enhances the neurogenesis of OSNs expressing muscone receptors, suggesting that the selection of ORs in regenerating neurons is not random.
Strengths
This study is the first to formulate and systematically test the selective promotion hypothesis. It is a comprehensive and systematic examination of multiple musk receptors under conditions of unilateral naris occlusion and various stimuli. The controls are properly done. The quality of in situ hybridization and immunofluorescent staining is high, allowing the authors to effectively estimate the number of OR types. The data convincingly demonstrate increased expression of musk receptors in response to male odor or muscone stimulation.
Weaknesses
The revised version has addressed most initial weaknesses. However, some inconsistencies remain, raising questions about the proposed model. For instance, in the unilateral naris occlusion experiment, although average expression of non-musk receptors shows no significant change, receptors seem to fall into two groups with one subset increasing and another decreasing (Fig. 1E). This suggests naris occlusion may regulate OR expression independently of odor-induced activity, a possibility that remains unaddressed.
There is curiosity regarding receptors for the male odor SBT, olfr912, and olfr1295, which exhibit increased expression on the occluded side. The explanation that SBT exposure shortens these neurons' lifespan lacks substantiation and is inconsistent with later data. For example, Figure 3-figure supplement 3 I does not show olfr912 changes on the UNO side, and Figure 4-figure supplement 4 shows no significant decrease in olfr912 expression in SBT-exposed mice. Additionally, single-cell RNASeq experiments did not examine olfr912 and olfr1295 expression.
While the study convincingly demonstrates muscone's selective stimulation of olfr235-expressing OSNs, it lacks exploration of the mechanisms underlying this specificity. The discussion of signaling pathways remains too generic.
In summary, while the study offers significant insights into neurogenesis and OR subtype enrichment, further investigation into underlying mechanisms and addressing existing inconsistencies would strengthen its conclusions.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer #1 (Public Review):
Summary:
Olfactory sensory neurons (OSNs) in the olfactory epithelium detect myriads of environmental odors that signal essential cues for survival. OSNs are born throughout life and thus represent one of the few neurons that undergo life-long neurogenesis. Until recently, it was assumed that OSN neurogenesis is strictly stochastic with respect to subtype (i.e. the receptor the OSN chooses to express).
However, a recent study showed that olfactory deprivation via naris occlusion selectively reduced birthrates of only a fraction of OSN subtypes and indicated that these subtypes appear to have a special capacity to undergo changes in birthrates in accordance with the level of olfactory stimulation. These previous findings raised the interesting question of what type of stimulation influences neurogenesis, since naris occlusion does not only reduce the exposure to potentially thousands of odors but also to more generalized mechanical stimuli via preventing airflow.
In this study, the authors set out to identify the stimuli that are required to promote the neurogenesis of specific OSN subtypes. Specifically, they aim to test the hypothesis that discrete odorants selectively stimulate the same OSN subtypes whose birthrates are affected. This would imply a highly specific mechanism in which exposure to certain odors can "amplify" OSN subtypes responsive to those odors suggesting that OE neurogenesis serves, in part, an adaptive function.
To address this question, the authors focused on a family of OSN subtypes that had previously been identified to respond to musk-related odors and that exhibit higher transcript levels in the olfactory epithelium of mice exposed to males compared to mice isolated from males. First, the authors confirm via a previously established cell birth dating assay in unilateral naris occluded mice that this increase in transcript levels actually reflects a stimulus-dependent birthrate acceleration of this OSN subtype family. In a series of experiments using the same assay, they show that one specific subtype of this OSN family exhibits increased birthrates in response to juvenile male exposure while a different subtype shows increased birthrates to adult mouse exposure. In the core experiment of the study, they finally exposed naris occluded mice to a discrete odor (muscone) to test if this odor specifically accelerates the birth rates of OSN types that are responsive to this odor. This experiment reveals a complex relationship between birth rate acceleration and odor concentrations showing that some muscone concentrations affect birth rates of some members of this family and do not affect two unrelated OSN subtypes.
In addition to the results nicely summarized by the reviewer, which focus on experiments to examine the effects of odor stimulation on unilateral naris occluded (UNO) mice, an important part of the present study is a set of experiments on non-occluded (i.e., non-UNO-treated) mice. These experiments show that: 1) the exposure of non-occluded mice to odors from adolescent male mice selectively increases quantities of newborn OSNs of the musk-responsive subtype Olfr235 (Figure 3G, H; previously Figure 6), 2) the exposure of non-occluded female mice to 2 different musk odorants (muscone, ambretone) selectively increases quantities of newborn OSNs of 3 musk-responsive subtypes: Olfr235, Olfr1440 and Olfr1431 (Figure 4D-F; previously Figure 6), and 3) the exposure of non-occluded adult female mice to a musk odorant selectively increases quantities of newborn OSNs of musk-responsive subtypes (Figure 5; previously Fig. S7). We have reorganized the revised manuscript to more prominently and clearly present the experimental design and findings of these experiments. We have also made changes to clarify (via schematics) the experimental conditions used (i.e., UNO, non-UNO, odor exposure) in each experiment.
Strengths:
The scientific question is valid and opens an interesting direction. The previously established cell birth dating assay in naris occluded mice is well performed and accompanied by several control experiments addressing potential other interpretations of the data.
Weaknesses:
(1) The main research question of this study was to test if discrete odors specifically accelerate the birth rate of OSN subtypes they stimulate, i.e. does muscone only accelerate the birth rate of OSNs that express muscone-responsive ORs, or vice versa is the birthrate of muscone-responsive OSNs only accelerated by odors they respond to?
This question is only addressed in Figure 5 of the manuscript and the results only partially support the above claim. The authors test one specific odor (muscone) and find that this odor (only at certain concentrations) accelerates the birth rate of some musk-responsive OSN subtypes, but not two other unrelated control OSN subtypes. This does not at all show that musk-responsive OSN subtypes are only affected by odors that stimulate them and that muscone only affects the birthrate of musk-responsive OSNs, since first, only the odor muscone was tested and second, only two other OSN subtypes were tested as controls, that, importantly, are shown to be generally stimulus-independent OSN subtypes (see Figure 2 and S2).
As a minimum, the authors should have a) tested if additional odors that do not activate the three musk-responsive subtypes affect their birthrate and b) chosen 2-3 additional control subtypes that are known to be stimulus-dependent (from their own 2020 study) and tested if muscone affects their birthrates.
We appreciate these suggestions. Within the revised manuscript, we have described and included the results from several new experiments:
(1) As noted by the reviewer, we had previously tested the effects of exposure to only one exogenous musk odorant, muscone, on quantities of newborn OSNs of the musk-responsive subtypes Olfr235, Olfr1440, and Olfr1431. To test whether the effects observed with muscone exposure occur with other musk odorants, we assessed the effects of exposure to ambretone (5-cyclohexadecenone), a musk odorant previously found to robustly activate musk-responsive OSNs (Sato-Akuhara et al., 2016; Shirasu et al., 2014), on quantities of newborn OSNs of 3 musk-responsive subtypes Olfr235, Olfr1440, and Olfr1431, as well as the SBT-responsive subtype Olfr912, in the OEs of non-occluded female mice. Exposure to ambretone was found to significantly increase quantities of newborn OSNs of all 3 musk-responsive subtypes (Figure 4D-F) but not the SBT-responsive subtype (Figure 4–figure supplement 4C-left), indicating that a variety of musk odorants can accelerate the birthrates of musk responsive subtypes.
(2) To verify that exogenous non-musk odors do not increase quantities of newborn OSNs of musk responsive OSN subtypes (point a, above), we quantified newborn OSNs of 3 musk-responsive subtypes, Olfr235, Olfr1440, and Olfr1431, in non-occluded female mice that were exposed to the non-musk odorants SBT or IAA. As expected, neither of these odorants significantly affected the birthrates of the subtypes tested (Figure 4D-F).
(3) To confirm that exogenous musk odors do not accelerate the birthrates of non-musk responsive OSN subtypes that were previously found to undergo stimulation-dependent neurogenesis (point b, above), we quantified newborn OSNs of 2 such subtypes, Olfr827 and Olfr1325, in non-occluded female mice that were exposed to muscone. As expected, exposure to muscone did not significantly affect the birthrates of either of these subtypes (Figure 4–figure supplement 4C-middle, right).
(4) To provide additional confirmation that only some OSN subtypes have a capacity to exhibit increases in newborn OSN quantities in the presence of odors that activate them, we compared quantities of newborn OSNs of the SBT-responsive subtype Olfr912 in non-occluded females that were exposed to 0.1% SBT versus unexposed controls. As expected, exposure to SBT caused no significant increase in quantities of newborn Olfr912 OSNs (Figure 4–figure supplement 4C-left).
(2) The finding that Olfr1440 expressing OSNs do not show any increase in UNO effect size under any muscone concentration (Figure 5D, no significance in line graph for UNO effect sizes, middle) seems to contradict the main claim of this study that certain odors specifically increase birthrates of OSN subtypes they stimulate. It was shown in several studies that olfr1440 is seemingly the most sensitive OR for muscone, yet, in this study, muscone does not further increase birthrates of OSNs expressing olfr1440. The effect size on birthrate under muscone exposure is the same as without muscone exposure (0%).
In contrast, the supposedly second most sensitive muscone-responsive OR olfr235 shows a significant increase in UNO effect size between no muscone exposure (0%) and 0.1% as well as 1% muscone.
The finding that quantities of newborn Olfr1440 OSNs do not show a significantly greater UNO effect size in the OEs from mice exposed to muscone compared to control mice was also somewhat surprising to us. We think that there are two potential explanations for this result: 1) Unlike subtype Olfr235, subtype Olfr1440 exhibits a significant open-side bias in newborn OSN quantities in UNO-treated adolescent females even in the absence of exposure to muscone. We speculate that this subtype (as well as subtype Olfr1431) is stimulated by odors that are emitted by female mice at the adolescent stage, and/or by another environmental source. This may limit the influence of muscone exposure on the UNO effect size. 2) There is compelling evidence that odors within the environment can enter the closed side of the OE transnasally [via the nasopharyngeal canal (Kelemen, 1947)] and/or retronasally (via the nasopharynx) in UNO-treated mice [reviewed in (Coppola, 2012)]. Thus, it is conceivable that chronic exposure of UNO-treated mice to muscone results in the eventual entry of muscone into the closed side of the OE at concentrations sufficient to promote neurogenesis. If Olfr1440 is more sensitive to muscone than Olfr235 [e.g., (Sato-Akuhara et al., 2016; Shirasu et al., 2014)], OSNs of this subtype may be especially sensitive to small amounts of odors that enter the closed side of the OE transnasally and/or retronasally. These explanations are supported by the following results:
- UNO-treated females exposed to 0.1% muscone show higher quantities of newborn Olfr1440 OSNs on both the open and closed sides of the OE compared to their unexposed counterparts (Figure 4–figure supplement 1A-middle). Similar results were also observed for newborn Olfr235 OSNs (Figure 4C-middle), albeit to a lesser extent, perhaps due to the lower sensitivity of this subtype to muscone.
- In non-occluded female mice, exposure to 0.1% muscone was found to significantly increase quantities of newborn Olfr1440 OSNs, as well as newborn Olfr235 and Olfr1431 OSNs (Figure 4D-F in revised manuscript; Figure 6 in original version). Similar results were also observed upon exposure to ambretone, another musk odor (Figure 4D-F). These experiments strongly support the hypothesis that musk odors selectively increase birthrates of OSN subtypes that they stimulate.
We have addressed these points within the results section of the revised manuscript.
(3) The authors introduce their choice to study this particular family of OSN subtypes with first, the previous finding that transcripts for one of these musk-responsive subtypes (olfr235) are downregulated in mice that are deprived of male odors. Second, musk-related odors are found in the urine of different species. This gives the misleading impression that it is known that musk-related odors are indeed excreted into male mouse urine at certain concentrations. This should be stated more clearly in the introduction (or cited, if indeed data exist that show musk-related odors in male mouse urine) because this would be a very important point from an ethological and mechanistic point of view.
In addition, this would also be important information to assess if the chosen muscone concentrations fall at all into the natural range.
These are important points, which we have addressed within the revised manuscript:
(1) Within the introduction, we have now stated that the emission of musk odors by mice has not been documented. We have also added extensive discussions of what is known about the emission of musk odors by mice in a new subsection within Results, as well as within the Discussion section. Most prominently, we have cited one study (Sato-Akuhara et al., 2016) that noted unpublished evidence for the emission of Olfr1440-activating compounds from male preputial glands: “Indeed, our preliminary experiments suggest that there are unidentified compounds that activate MOR215-1 in mouse preputial gland extracts.” Another study, which used histomorphology, metabolomic and transcriptomic analyses to compare the mouse preputial glands to muskrat scent glands, found that the two glands are similar in many ways, including molecular composition (Han et al., 2022). However, the study did not identify known musk compounds within mouse preputial glands.
(2) Based on the reviewer’s feedback and our own curiosity, we used GC-MS to analyze both mouse urine and preputial gland extracts for the presence of known musk odorants, particularly those known to activate Olfr235 and Olfr1440 (Sato-Akuhara et al., 2016). Although we were unable to find evidence for known musk odorants in mouse urine extracts (possibly due to insufficient sensitivity of the assay employed), we found that preputial gland extracts contain GC-MS signals that are structurally consistent with known musk odorants. A limitation of this approach, however, is that the conclusive identification of specific musk odorants in extracts derived from mouse urine and tissues requires comparisons to pure standards, many of which we could not readily obtain. For example, we were unable to obtain a pure sample of cycloheptadecanol, a musk molecule with a predicted potential match to a signal identified within preputial gland extracts. Another limitation is that although several known musk odorants have been found to activate Olfr235 and Olfr1440 OSNs, it is conceivable that structurally distinct odorants that have not yet been identified might also activate them. The findings from these experiments have been included in a new figure within the revised manuscript (Appendix 2–figure 1).
Related: If these are male-specific cues, it is interesting that changes in OR transcripts (Figure 1) can already be seen at the age of P28 where other male-specific cues are just starting to get expressed. This should be discussed.
We agree that the observed changes in quantities of newborn OSNs of musk-responsive subtypes in mice exposed to juvenile male odors deserve additional discussion. We have included a more extensive discussion of this observation in both the Results and Discussion sections of the revised manuscript.
(4) Figure 5: Under muscone exposure the number of newborn neurons on the closed sides fluctuates considerably. This doesn't seem to be the case in other experiments and raises some concerns about how reliable the naris occlusion works for strong exposure to monomolecular odors or what other potential mechanisms are at play.
We agree that the variability in quantities of newborn OSNs of musk-responsive subtypes on the closed side of the OE of UNO-treated mice deserves further discussion. As noted above, we suspect that these fluctuations are due, at least in part, to transnasal and/or retronasal odor transfer via the nasopharyngeal canal (Kelemen, 1947) and nasopharynx, respectively [reviewed in (Coppola, 2012)], which would be expected to result in exposure of the closed OE to odor concentrations that rise with increasing environmental concentrations. In support of this, quantities of newborn Olfr235 and Olfr1440 OSNs increase on both the open and closed sides with increasing muscone concentration (except at the highest concentration, 10%, in the case of Olfr1440) (Figure 4C-middle, Figure 4–figure supplement 1A-middle). It is conceivable that reductions in newborn Olfr1440 OSN quantities observed in the presence of 10% muscone reflect overstimulation-dependent reductions in survival. Our findings from UNO-based experiments are consistent with expectations that naris occlusion does not completely block exposure to odorants on the closed side, particularly at high concentrations. However, they also appear consistent with the hypothesis that exposure to musk odors promotes the neurogenesis of musk-responsive OSN subtypes.
Considering the limitations of the UNO procedure, it is important to note that the present study also includes experimental exposure of non-occluded animals to both male odors (Figure 3G, H) and exogenous musk odorants (Figure 4D-F). Findings from the latter experiments provide strong evidence that exposure to multiple musk odorants (muscone, ambretone) causes selective increases in the birthrates of multiple musk-responsive OSN subtypes (Olfr235, Olfr1440, Olfr1431).
We have included within the Results section of the revised manuscript a discussion of how observed effects of muscone exposure of UNO-treated mice may be influenced by transnasal/ retronasal odor transfer to the closed side of the OE.
(5) In contrast to all other musk-responsive OSN types, the number of newborn OSNs expressing olfr1437 increases on the closed side of the OE relative to the open in UNO-treated male mice (Figure 1). This seems to contradict the presented theory and also does not align with the bulk RNAseq data (Figure S1).
Subtype Olfr1437 is indeed an outlier among musk-responsive subtypes that were previously found to be more highly represented in the OSN population in 6-month-old sex-separated males compared to females (Appendix 1–figure 1) (C. van der Linden et al., 2018; Vihani et al., 2020). Somewhat unexpectedly, our findings from scRNA-seq experiments show slightly greater quantities of immature Olfr1437 OSNs on the closed side of the OE in juvenile males (Figure 1D, E of the revised manuscript, which now includes data from a second OE). Perhaps more informatively, considering the small number of iOSNs of specific subtypes in the scRNA-seq datasets, EdU birthdating experiments show no difference in newborn Olfr1437 OSN quantities on the 2 sides of the OE from UNO-treated juvenile males (Figure 2G). It is unclear to us why subtype Olfr1437 does not show open-side biases in newborn OSN quantities in juvenile male mice, but potential explanations include:
- Age: Findings based on bulk RNA-seq that musk responsive OSN subtypes are more highly represented in mice exposed to male odors analyzed mice that were 6 months old (C. van der Linden et al., 2018) or > 9 months old (Vihani et al., 2020) at the time of analysis. By contrast, the present study primarily analyzed mice that were juveniles (PD 28) at the time of scRNA-seq analysis (Figure 1) or EdU labeling (Figure 2G). It is conceivable that different musk-responsive subtypes are selectively responsive to distinct odors that are emitted at different ages. In this scenario, odors that increase the birthrates of Olfr235, Olfr1440, and Olfr1431 OSNs may be emitted starting at the juvenile stage, while those that increase the birthrate of Olfr1437 OSNs may be emitted in adulthood. In potential support of this, juvenile males exposed to their adult parents at the time of EdU labeling showed a slightly greater (although not statistically significantly different) UNO effect size in quantities of newborn Olfr1437 OSNs compared to controls (Figure 3–figure supplement 3).
- Capacity for stimulation-dependent neurogenesis: It is also conceivable that, unlike other musk-responsive OSN subtypes, Olfr1437 OSNs lack the capacity for stimulation-dependent neurogenesis (like the SBT-responsive subtype Olfr912, for example). If so, this would imply that increased representations of Olfr1437 OSNs observed in mice exposed to male odors for long periods (C. van der Linden et al., 2018; Vihani et al., 2020) may be due to male odor-dependent increases in the lifespans of Olfr1437 OSNs.
Within the Discussion section of the revised manuscript, we have discussed the findings concerning Olfr1437.
(6) The authors hypothesize in relation to the accelerated birthrate of musk-responsive OSN subtypes that "the acceleration of the birthrates of specific OSN subtypes could selectively enhance sensitivity to odors detected by those subtypes by increasing their representation within the OE". However, for two other OSN subtypes that detect male-specific odors, they hypothesize the opposite "By contrast, Olfr912 (Or8b48) and Olfr1295 (Or4k45), which detect the male-specific non-musk odors 2-sec-butyl-4,5-dihydrothiazole (SBT) and (methylthio)methanethiol (MTMT), respectively, exhibited lower representation and/or transcript levels in mice exposed to male odors, possibly reflecting reduced survival due to overstimulation."
Without any further explanation, it is hard to comprehend why exposure to male-derived odors should, on the one hand, accelerate birthrates in some OSN subtypes to potentially increase sensitivity to male odors but, on the other hand, lower transcript levels and fail to accelerate the birthrates of other OSN subtypes due to overstimulation.
We agree that this point deserves further explanation. Within the revised manuscript, we have expanded the Introduction and Results to describe evidence from previous studies that exposure to stimulating odors causes two categories of changes to specific OSN subtypes: elevated representations or reduced representations within the OSN population. In one study (C. J. van der Linden et al., 2020), UNO treatment was found to cause a fraction of OSN subtypes to exhibit lower birthrates and representations on the closed side of the OE relative to the open. By contrast, another fraction of OSN subtypes exhibited higher representations on the closed side of the OEs of UNO-treated mice, but no difference in birthrates between the two sides. The latter subtypes were found to be distinguished by their receipt of extremely high levels of odor stimulation, suggesting that reduced odor stimulation via naris occlusion may lengthen their lifespans. In support of the possibility that Olfr912 and Olfr1295 belong to this latter category, these subtypes detect SBT and MTMT, respectively (Vihani et al., 2020), odorants that are emitted specifically by male mice (Lin et al., 2005; Schwende et al., 1986), and UNO treatment was previously found to increase total Olfr912 OSN quantities on the closed side compared to the open side in sex-separated males (C. van der Linden et al., 2018), a finding confirmed in the present study (Figure 3–figure supplement 1H).
Taken together, findings from previous studies as well as the current one indicate that olfactory stimulation can accelerate the birthrates and/or reduce the lifespans of OSNs, depending on the specific subtypes and odors within the environment. As we have now indicated in the Discussion, we do not yet know what distinguishes subtypes that undergo stimulation-dependent neurogenesis, but it is conceivable that they detect odors with a particular salience to mice. Thus, observations that some odorants (e.g., musks) cause stimulation-dependent neurogenesis while others do not (e.g., SBT) might reflect an animal’s specific need to adapt its sensitivity to the former. Alternatively, it is conceivable that stimulation-dependent reductions in representations of subtypes such as Olfr912 and Olfr1295 reflect a fundamentally different mode of plasticity that is also adaptive, as has been hypothesized (C. van der Linden et al., 2018; Vihani et al., 2020).
Reviewer #1 (Recommendations For The Authors):
To support the main claim, several controls are necessary as mentioned under point 1 of the public review.
As outlined in our responses to the public review, new experiments within the revised manuscript indicate the following:
(1) Accelerated birthrates of 3 different musk responsive OSN subtypes (Olfr235, Olfr1440, Olfr1431) are observed in non-occluded mice following exposure to multiple exogenous musk odorants (muscone, ambretone) (Figure 4D-F).
(2) Exposure of non-occluded mice to non-musk odors (SBT, IAA) does not accelerate the birthrates of musk responsive OSN subtypes (Olfr235, Olfr1440, Olfr1431) (Figure 4D-F).
(3) Exposure of mice to exogenous musk odors (muscone, ambretone) does not accelerate the birthrates of non-musk responsive OSN subtypes (e.g., Olfr912), including those previously found to undergo stimulation-dependent neurogenesis (Olfr827, Olfr1325) (Figure 4–figure supplement 4C).
(4) Only a fraction of OSN subtypes have a capacity to undergo accelerated neurogenesis in the presence of odors that activate them (e.g., Olfr912 birthrates are not accelerated by SBT exposure) (Figure 4–figure supplement 4C-left).
In addition, this study could be considerably improved by showing that the proposed mechanism applies beyond a single OSN subtype (Olfr235), especially since the most sensitive OR subtype (expressing Olfr1440) does not align with the main claim. The introduction states that this is difficult because the ligands for many ORs are unknown, including all subtypes previously found to undergo stimulation-dependent neurogenesis, referring to your 2020 study. While this reviewer agrees that the lack of deorphanization is a significant hurdle in the field, the 2020 study states that about 4% of all ORs (which should equal >40 ORs) show a stimulus-dependent down-regulation on the closed side, not only the 7 ORs that are examined more closely (Figure 1). It would tremendously improve the impact of the current study to show that the proposed effect also applies to one of these other >40 ORs.
We appreciate this question, as it alerted us to some shortcomings in how our findings were presented within the original manuscript. We respectfully disagree that only findings regarding subtype Olfr235 align with the main hypothesis of this study, which is that discrete odors can selectively promote the neurogenesis of sensory neuron subtypes that they stimulate. Specifically, we would like to draw attention to experiments on non-occluded female mice exposed to exogenous musk odorants (muscone, ambretone; revised Figures 4D-F; previously, Figure 6). Findings from these experiments provide compelling evidence that exposure to musk odorants causes selective increases in the birthrates of three different musk-responsive OSN subtypes: Olfr235, Olfr1440, and Olfr1431. Thus, we would suggest that results from the present study already show that the proposed mechanism applies to more than just the Olfr235 subtype. However, we agree with what we think is the essence of the reviewer’s point: that it is important to determine the extent to which this mechanism applies to OSN subtypes that are responsive to other (i.e., non-musk) odorants. While, as noted by the reviewer, our previous study identified several OSN subtypes that undergo stimulation-dependent neurogenesis (as well as many others that were predicted to do so) (C. J. van der Linden et al., 2020), we are not aware of ligands that have been identified with high confidence for those subtypes. Although we are in the process of conducting experiments to identify additional odor/subtype pairs to which the mechanism described in this study applies, the early-stage nature of these experiments precludes their inclusion in the present manuscript.
The ethological and mechanistic relevance of the current study could be significantly improved by showing that musk-related odors that activate olfr235 are actually found in male mouse urine (and additionally are not found in female mouse urine). Otherwise, the implicated link between the acceleration of OSN birthrates by exposure to male odors and acceleration by specific monomolecular odors does not hold, raising the question of any natural relevance (e.g. the proposed adaptive function to increase sensitivity to certain odors).
As noted in our responses to the public review, we have addressed this important point within the revised manuscript as follows:
(1) We have included an extensive discussion of what is known about the emission of musk-like odors by mice.
(2) We have used GC-MS to analyze both mouse urine and preputial gland extracts for the presence of known musk compounds. Although inconclusive, we report that preputial glands contain signals that are structurally consistent with known musk compounds. The findings of these experiments have been included in the revised manuscript (new Appendix 2–figure 1), along with a discussion of their limitations.
Reviewer #2 (Public Review):
In their paper entitled "In mice, discrete odors can selectively promote the neurogenesis of sensory neuron subtypes that they stimulate" Hossain et al. address lifelong neurogenesis in the mouse main olfactory epithelium. The authors hypothesize that specific odorants act as neurogenic stimuli that selectively promote biased OR gene choice (and thus olfactory sensory neuron (OSN) identity). Hossain et al. employ RNA-seq and scRNA-seq analyses for subtype-specific OSN birthdating. The authors find that exposure to male and musk odors accelerates the birthrates of the respective responsive OSNs. Therefore, Hossain et al. suggest that odor experience promotes selective neurogenesis and, accordingly, OSN neurogenesis may act as a mechanism for long-term olfactory adaptation.
We appreciate this summary but would like to underscore that a mechanism involving biased OR gene choice is just one of two possibilities proposed in the Discussion section to explain how odorant stimulation of specific subtypes accelerates the birthrates of those subtypes.
The authors follow a clear experimental logic, based on sensory deprivation by unilateral naris occlusion, EdU labeling of newborn neurons, and histological analysis via OR-specific RNA-FISH. The results reveal robust effects of deprivation on newborn OSN identity. However, the major weakness of the approach is that the results could, in (possibly large) parts, depend on "downregulation" of OR subtype-specific neurogenesis, rather than (only) "upregulation" based on odor exposure. While, in Figure 6, the authors show that the observed effects are, in part, mediated by odor stimulation, it remains unclear whether deprivation plays an "active" role as well. Moreover, as shown in Figure 1C, unilateral naris occlusion has both positive and negative effects in a random subtype sample.
In our view, the present study involves two distinct and complementary experimental designs: 1) odor exposure of UNO-treated animals and 2) odor exposure of non-occluded animals. Here we address this comment with respect to each of these designs:
(1) For experiments performed on UNO-treated animals, we agree that observed differences in birthrates on the open and closed sides of the OE reflect, largely, a deceleration (i.e., downregulation) of the birthrates of these subtypes on the closed side relative to the open (as opposed to an acceleration of birthrates on the open side). Our objective in using this design was to test the extent to which specific OSN subtypes undergo stimulation-dependent neurogenesis under various odor exposure conditions. According to the main hypothesis of this study, a lower birthrate of a specific OSN subtype on the closed side of the OE compared to the open is predicted to reflect a lower level of odor stimulation on the closed side received by OSNs of that subtype. However (and as described in our responses to reviewer #1), a limitation of this design is that environmental odorants, especially at high concentrations, are likely to stimulate responsive OSNs on the closed side of the OE in addition to the open side due to transnasal and/or retronasal air flow.
(2) Experiments performed on non-occluded animals were designed to provide critical complementary evidence that specific OSN subtypes undergo accelerated neurogenesis in the presence of specific odors. Using this design, we have found compelling evidence that:
- Exposure of non-occluded mice to male odors causes the selective acceleration of the birthrate of Olfr235 OSNs (Figure 3G, H).
- Exposure of non-occluded female mice to two different musk odorants (muscone and ambretone) selectively accelerates the birthrates of three different musk-responsive subtypes: Olfr235, Olfr1440, and Olfr1431 (Figure 4D-F and Figure 4–figure supplement 4C).
We have reorganized the revised manuscript to more clearly present the most important experimental findings using these two experimental designs. We have also highlighted (via schematics) the experimental conditions (e.g., UNO, non-occlusion, odor exposure) used for each experiment.
Another weakness is that the authors build their model (Figure 8), specifically the concept of selectivity, on a receptor-ligand pair (Olfr912 that has been shown to respond, among other odors, to the male-specific non-musk odors 2-sec-butyl-4,5-dihydrothiazole (SBT)) that would require at least some independent experimental corroboration. At least, a control experiment that uses SBT instead of muscone exposure should be performed.
We agree that this important concern deserves additional control experiments and discussion. We have addressed this concern within the revised manuscript as follows:
- Within the Results section, we have added multiple new control experiments (detailed in response to Reviewer #1), including the one recommended above. As suggested, we quantified newborn OSNs of the SBT-responsive subtype Olfr912 in non-occluded females that were either exposed to 0.1% SBT or unexposed controls. Exposure to SBT was found to cause no significant increase in quantities of newborn Olfr912 OSNs (newly added Figure 4–figure supplement 4C-left). These findings further support the model in Figure 7 (previously Figure 8) that only a fraction of OSN subtypes have a capacity to undergo accelerated neurogenesis in the presence of odors that activate them.
- Also within the Results section, we have made efforts to better highlight relevant control experiments that were included in the original version, particularly those showing that quantities of newborn Olfr912 OSNs are not affected by UNO in mice exposed to male odors (Figure 2H and Figure 3–figure supplement 1G; previously Figure 2F and Figure 3H) or by exposure of non-occluded females to male odors (Figure 3H; previously Figure 6E). Since Olfr912 is responsive to component(s) of male odors (C. van der Linden et al., 2018; Vihani et al., 2020), these results indicate that this subtype does not have the capacity for stimulation-dependent neurogenesis, which is consistent with our previous findings that only a fraction of subtypes have this capacity (C. J. van der Linden et al., 2020).
In this context, it is somewhat concerning that some results, which appear counterintuitive (e.g., lower representation and/or transcript levels of Olfr912 and Olfr1295 in mice exposed to male odors) are brushed off as "reflecting reduced survival due to overstimulation." The notion of "reduced survival" could be tested by, for example, a caspase3 assay.
This is a point that we agree deserves further discussion. Please see the explanation that we have outlined above in response to Reviewer #1.
Within the revised manuscript, we have expanded the Introduction to describe evidence from previous studies that exposure to stimulating odors causes two categories of changes to specific OSN subtypes: elevated representations or reduced representations within the OSN population. We outline evidence from previous studies that Olfr912 and Olfr1295 belong to the latter category, and that the representations of these subtypes are likely reduced by male odor overstimulation-dependent shortening of OSN lifespan.
Important analyses that need to be done to better be able to interpret the findings are to present (i) the OR+/EdU+ population of olfactory sensory neurons not just as a count per hemisection, but rather as the ratio of OR+/EdU+ cells among all EdU+ cells; and (ii) to the ratio of EdU+ cells among all nuclei (UNO versus open naris). This way, data would be normalized to (i) the overall rate of neurogenesis and (ii) any broad deprivation-dependent epithelial degeneration.
We have addressed this concern in two ways within the revised manuscript:
(1) We have noted within the Methods section that the approach of using half-sections for normalization has been used in multiple previous studies for quantifying newborn (OR+/EdU+) and total (OR+) OSN abundances (Hossain et al., 2023; Ibarra-Soria et al., 2017; C. van der Linden et al., 2018; C. J. van der Linden et al., 2020). Additionally, within the figure legends and Methods, we have more thoroughly described the approach used, including that it relies on averaging the quantifications from at least 5 high-quality coronal OE tissue sections that are evenly distributed throughout the anterior-posterior length of each OE and thereby mitigates the effects of section size and cell number variation among sections. In the case of UNO-treated mice, the open and closed sides within the same section are paired, which further reduces the effects of section-to-section variation. We have found that this approach yields reproducible quantities of newborn and total OSNs among biological replicate mice and enables accurate assessment of how quantities of OSNs of specific subtypes change as a result of altered olfactory experience, a key objective of this study.
(2) To assess whether the use of alternative approaches for normalizing newborn OSN quantities suggested by the reviewers would affect the present study’s findings, we compared three methods for normalizing the effects of exposure to male odors or muscone on quantities of newborn Olfr235 OSNs in the OEs of both UNO-treated and non-occluded mice: 1) OR+/EdU+ OSNs per half-section (used in this study), 2) OR+/EdU+ OSNs per total number of EdU+ cells (reviewer suggestion (i)), and 3) OR+/EdU+ OSNs per unit of DAPI+ area (an approximate measure of nuclei number; reviewer suggestion (ii)). The three normalization methods yielded statistically indistinguishable differences in assessing the effects of exposure of either UNO-treated or non-occluded mice to male odors (newly added Figure 2–figure supplement 2 and Figure 3–figure supplement 2), or of exposure of non-occluded mice to muscone (newly added Figure 4–figure supplement 3). Based on these findings, and the considerable time that would be required to renormalize all data in the manuscript, we have chosen to maintain the use of normalization per half-section.
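The three normalizations compared above can be illustrated with a minimal sketch. The `HalfSection` record, its field names, and the pooling across sections (a ratio of sums rather than a mean of per-section ratios) are assumptions made for illustration, and the numbers in the usage lines are invented, not data from the study.

```python
from dataclasses import dataclass


@dataclass
class HalfSection:
    """Counts from one OE half-section (hypothetical field names)."""
    or_edu_cells: int      # OR+/EdU+ cells (newborn OSNs of one subtype)
    edu_cells: int         # all EdU+ cells in the half-section
    dapi_area_mm2: float   # DAPI+ area, an approximate measure of nuclei number


def normalize(sections):
    """Return newborn-OSN quantities under the three normalizations."""
    if not sections:
        raise ValueError("at least one half-section is required")
    total_or_edu = sum(s.or_edu_cells for s in sections)
    total_edu = sum(s.edu_cells for s in sections)
    total_area = sum(s.dapi_area_mm2 for s in sections)
    return {
        # 1) per half-section (the method retained in the study)
        "per_half_section": total_or_edu / len(sections),
        # 2) per EdU+ cell (reviewer suggestion (i))
        "per_edu_cell": total_or_edu / total_edu,
        # 3) per unit DAPI+ area (reviewer suggestion (ii))
        "per_dapi_area": total_or_edu / total_area,
    }


# Illustrative values only.
example = [HalfSection(4, 200, 0.5), HalfSection(6, 300, 0.5)]
print(normalize(example))
```

Applying the same function to the open and closed sides of each mouse would give three parallel estimates of the same effect, which is the kind of comparison described in the response.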
Finally, the paper will benefit from improved data presentation and adequate statistical testing. Images in Figures 2 - 7, showing both EdU labeling of newborn neurons and OR-specific RNA-FISH, are hard to interpret. Moreover, t-tests should not be employed when data is not normally distributed (as is the case for most of their samples).
We have made extensive changes within the revised manuscript to increase the clarity and interpretability of the figures, including:
(1) Addition of a split-channel, high-magnification view of a representative image that shows the overlap of FISH and EdU signals (Figure 2D).
(2) Addition of experimental schematics and timelines corresponding to each set of experiments.
In the revised manuscript, several changes to the statistical tests have been made, as follows:
(1) To assess deviation from normality of the histological quantifications of newborn and total OSNs of specific subtypes in this study, all datasets were tested using the Shapiro-Wilk test for non-normality and the P values obtained are included in Supplementary file 1 (figure source data). Of the 274 datasets tested, 253 were found to have Shapiro-Wilk P values > 0.05, indicating that the vast majority (92%) do not show evidence of significant deviation from a normal distribution.
(2) A general lack of deviation of the datasets in this study from a normal distribution is further supported by quantile-quantile (QQ) plots, which compare actual data to a theoretically normal distribution (Appendix 4–figure 1). The datasets analyzed were separated into the following categories:
a. Quantities of newborn OSNs in UNO treated mice (Appendix 4-figure 1A)
b. Quantities of total OSNs in UNO treated mice (Appendix 4-figure 1B)
c. Quantities of newborn OSNs in non-occluded mice (Appendix 4-figure 1C)
d. UNO effect sizes for newborn or total OSNs (Appendix 4-figure 1D)
(3) Results of both parametric and non-parametric statistical tests of comparisons in this study have been included in Supplementary file 2 (statistical analyses). In general, the results from parametric and non-parametric tests are in good agreement.
(4) Statistical analyses of differences in OSN quantities in the OEs of non-occluded mice or UNO effect sizes in UNO-treated mice subjected to more than two different experimental conditions have now been performed using one-way ANOVA tests, FDR-adjusted using the 2-stage linear step-up procedure of Benjamini, Krieger and Yekutieli.
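For readers unfamiliar with QQ plots, the following is a minimal sketch of how the plotted points can be computed using only the Python standard library. The plotting positions `(i - 0.5) / n` and the fitting of the reference normal distribution to the sample mean and standard deviation are common conventions assumed here, not details taken from the manuscript. The Shapiro-Wilk test itself is not in the standard library; a statistics package (for example, `scipy.stats.shapiro`) would typically be used for it.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted observation with its theoretical normal quantile.

    Returns a list of (theoretical_quantile, observed_value) tuples.
    Points lying close to the identity line suggest approximate normality.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    # Reference normal fitted to the sample (an assumed convention).
    fitted = NormalDist(mean(xs), stdev(xs))
    # Plotting positions (i - 0.5) / n keep probabilities strictly in (0, 1).
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


# Share of datasets with Shapiro-Wilk P > 0.05, using the counts reported above.
print(f"{253 / 274:.0%}")
```

The returned pairs can be passed to any plotting tool; systematic curvature away from the diagonal would indicate skew or heavy tails.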
Reviewer #2 (Recommendations for the Authors):
The manuscript by Hossain et al. would benefit from a thorough revision. Here, we outline several points that should be addressed:
Figure 3E - I & Figure 4E&F: Red lines that connect mean values are misleading.
Within the revised manuscript, the UNO effect size graphs have been modified for clarity, including removal of the lines between mean values except for those comparing changes over time post EdU injection (Figure 6 and Figure 6-figure supplement 1). For these latter graphs, we think that lines help to illustrate changes in effect sizes over time.
Figure 3E - I: UNO effect sizes (right) should be tested via ANOVA.
In the revised manuscript, statistical analyses of UNO effect sizes in UNO-treated mice subjected to more than two different experimental conditions were done using one-way ANOVA tests, FDR-adjusted using the 2-stage linear step-up procedure of Benjamini, Krieger and Yekutieli (Figure 2-figure supplement 2; Figure 3; Figure 3-figure supplement 1; Figure 4; Figure 4-figure supplements 1, 2). The same tests were used for analysis of differences in OSN quantities in the OEs of non-occluded mice subjected to more than two different experimental conditions (Figure 3; Figure 3-figure supplement 2; Figure 4; Figure 4-figure supplements 3, 4). For comparisons of differences in quantities of newborn OSNs of musk-responsive subtypes at 4 and 7 days post-EdU between non-occluded mice exposed and unexposed to muscone, a two sample ANOVA - fixed-test, using F distribution (right-tailed) was used (Figure 6; Figure 6-figure supplement 1).
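As a rough illustration of the two-stage linear step-up procedure named above, the sketch below applies it to a list of p-values. It assumes the inputs are the p-values of the individual comparisons and returns reject/accept decisions at a chosen FDR level `q`, rather than the adjusted q-values that statistics software may report. The function names are hypothetical, and this is not the authors' analysis code.

```python
def bh_reject_count(sorted_p, level):
    """Benjamini-Hochberg step-up: number of smallest p-values rejected."""
    m = len(sorted_p)
    k = 0
    for i, p in enumerate(sorted_p, start=1):
        if p <= i * level / m:
            k = i  # keep the largest rank that satisfies the bound
    return k


def bky_two_stage(pvalues, q=0.05):
    """Two-stage linear step-up (Benjamini, Krieger and Yekutieli).

    Returns a list of booleans (True = discovery) in the input order.
    """
    m = len(pvalues)
    if m == 0:
        return []
    order = sorted(range(m), key=lambda i: pvalues[i])
    sorted_p = [pvalues[i] for i in order]

    q1 = q / (1 + q)                      # stage 1 runs BH at a reduced level
    r1 = bh_reject_count(sorted_p, q1)
    if r1 == 0 or r1 == m:
        k = r1                            # reject none, or reject all
    else:
        m0_hat = m - r1                   # estimated number of true nulls
        k = bh_reject_count(sorted_p, q1 * m / m0_hat)  # stage 2

    rejected = [False] * m
    for rank in range(k):
        rejected[order[rank]] = True
    return rejected


# Illustrative p-values only.
print(bky_two_stage([0.001, 0.008, 0.039, 0.041, 0.6]))
```

The second stage uses the first-stage rejections to estimate how many null hypotheses are true, which can make the procedure less conservative than a single Benjamini-Hochberg pass when many comparisons show real effects.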
Images in Figures 2 - 7, showing both EdU labeling of newborn neurons and OR-specific RNA-FISH: Colabeling is hard / often impossible to discern. Show zoom-ins and better explain the criteria for "colabeling" in the methods.
In the revised manuscript an enlarged and split-channel view of an image showing multiple newborn Olfr235 OSNs (OR+/EdU+) has been added (Figure 2D). A detailed description of the criteria for OR+/EdU+ OSNs is provided in Methods under the section “Histological quantification of newborn and total OSNs of specific subtypes.”
Figure 1C: add Olfr912.
As a control group for iOSN quantities of musk-responsive subtypes in Figure 1, we selected random subtypes that are expressed in the same zones: 2 and 3. Olfr912 OSNs were not included because this subtype was not randomly chosen, nor is it expressed in the same zones (Olfr912 is expressed in zone 4). We also note that the scRNA-seq analysis was done to allow an initial exploration of the hypothesis that some OSN subtypes that are more highly represented in mice exposed to male odors show stimulation-dependent neurogenesis. Considering that the scRNA-seq datasets contain only small numbers of iOSNs of specific subtypes, we think they are more useful for analyzing changes in birthrates within groups of subtypes (e.g., musk responsive, random) rather than individual subtypes.
The time of OE dissection is different for data shown in Figure 1 (P28) as compared to other figures (P35). Please comment/discuss.
Within the Results section of the revised manuscript, we have now clarified that the PD 28 timepoint chosen for EdU birthdating in the histological quantification of newborn OSNs of specific subtypes is analogous to the PD 28 timepoint chosen for identification of immature (Gap43-expressing) OSNs in the scRNA-seq samples. In the case of EdU birthdating, it is necessary to provide a chase period of sufficient length to enable robust and stable expression of an OR, which defines the subtype. A chase period of 7 days was chosen based on a previous study (C. J. van der Linden et al., 2020). Hence, a dissection date of PD 35 was chosen.
Figure 3F&G: please discuss the female → female effects
In the Results and Discussion sections of the revised manuscript, we discuss our observation that the Olfr1440 and Olfr1431 subtypes show significantly higher quantities of newborn OSNs on the open side compared to closed sides in UNO-treated females. We speculate that these subtypes may receive some odor stimulation in juvenile females, perhaps via musk or related odors emitted by females themselves or from elsewhere within the environment.
Figure 4E (and other examples): male → male displays two populations (no effect versus effect); please explain/speculate.
For some UNO effect sizes, there appears to be a high degree of variation among mice, and, in some cases, this diversity appears to cause the data to separate into groups. We assessed whether this diversity might reflect mice that came from different litters, but this is not the case. Rather, we speculate that the observed diversity most likely reflects low representations of newborn OSNs of some subtypes and/or under specific conditions. The data referred to by the reviewer (now Figure 3–figure supplement 3D), for example, shows UNO effect sizes for quantities of newborn Olfr1431 OSNs, which has the lowest representation among the musk-responsive subtypes analyzed in this study.
Figure 5C-E: It is unclear why strong muscone concentrations (10%) have no effect, whereas no muscone sometimes (D&E) has an effect.
As discussed in response to comments from Reviewer #1, we speculate that fluctuations in UNO effect sizes in muscone-exposed mice, particularly at high muscone concentrations, may be due, at least in part, to transnasal and/or retronasal air flow [reviewed in (Coppola, 2012)], which would be expected to result in exposure of the closed side of the OE to muscone concentrations that increase with increasing environmental concentrations. In support of this, quantities of newborn Olfr235 (Figure 4C-middle) and Olfr1440 (Figure 4–figure supplement 1A-middle) OSNs increase on both the open and closed sides with increasing muscone concentration (except at the highest concentration, 10%, in the case of Olfr1440). We speculate that reductions in newborn Olfr1440 OSN quantities observed in the presence of 10% muscone may reflect overstimulation-dependent reductions in survival.
As emphasized above, our study also includes experiments on non-occluded animals (Figures 3, 4, 5). Findings from these experiments provide additional evidence that exposure to multiple musk odorants (muscone, ambretone) causes selective increases in the birthrates of multiple musk-responsive OSN subtypes (Olfr235, Olfr1440, Olfr1431).
We have included an extensive interpretation of UNO-based experiments, including their limitations, within the Results section of the revised manuscript.
Figure S1: please explain the large error bars regarding "Transcript level".
We have clarified that the error bars in this figure, which is now Appendix 1–figure 1, correspond to 95% confidence intervals.
The figure captions could be improved for ease of reading.
Figure captions have been revised for increased clarity.
Figure 4: Include Olfr235 data for consistency.
All OSN subtypes analyzed for the effects of exposure to adult mice on UNO-induced open-side biases in quantities of newborn OSNs have been included in a single figure, which is now Figure 3–figure supplement 3.
Figure S6F&G: Do not run statistics on n = 2 (G) or 3 (F) samples.
We have removed statistical test results from comparisons involving fewer than 4 observations.
Reviewer #3 (Public Review):
Summary:
Neurogenesis in the mammalian olfactory epithelium persists throughout the life of the animal. The process replaces damaged or dying olfactory sensory neurons. It has been tacitly assumed that replacement of the OR subtypes is stochastic, although anecdotal evidence has suggested that this may not be the case. In this study, Santoro and colleagues systematically test this hypothesis by answering three questions: is there enrichment of specific OR subtypes associated with neurogenesis? Is the enrichment dependent on sensory stimulus? Is the enrichment the result of differential generation of the OR type or from differential cell death regulated by neural activity? The authors provide some solid evidence indicating that musk odor stimulus selectively promotes the OR types expressing the musk receptors. The evidence argues against a random selection of ORs in the regenerating neurons.
Strengths:
The strength of the study is a thorough and systematic investigation of the expression of multiple musk receptors with unilateral naris occlusion or under different stimulus conditions. The controls are properly performed. This study is the first to formulate the selective promotion hypothesis and the first systematic investigation to test it. The bulk of the study uses in situ hybridization and immunofluorescent staining to estimate the number of OR types. These results convincingly demonstrate the increased expression of musk receptors in response to male odor or muscone stimulation.
Weaknesses:
A major weakness of the current study is the single-cell RNASeq result. The authors use this piece of data as a broad survey of receptor expression in response to unilateral nasal occlusion. However, several issues with this data raise serious concerns about the quality of the experiment and the conclusions. First, the proportion of OSNs, including both the immature and mature types, constitutes only a small fraction of the total cells. In previous studies of the OSNs using the scRNASeq approach, OSNs constitute the largest cell population. It is curious why this is the case. Second, the authors did not annotate the cell types, making it difficult to assess the potential cause of this discrepancy. Third, given the small number of OSNs, it is surprising to have multiple musk receptors detected in the open side of the olfactory epithelium whereas almost none in the closed side. Since each OR type only constitutes ~0.1% of OSNs on average, the number of detected musk receptors is too high to be consistent with our current understanding and the rest of the data in the manuscript. Finally, unlike the other experiments, the authors did not describe any method details, nor was there any description of quality controls associated with the experiment. The concerns over the scRNASeq data do not diminish the value of the data presented in the bulk of the study but could be used for further analysis.
We are grateful to the reviewer for raising these important questions.
In the revised manuscript, we have clarified that the scRNA-seq dataset presented in the original version of the manuscript (now called dataset OE 1) was published and described in detail in a previous study (C. J. van der Linden et al., 2020). The reviewer is correct that the proportion of OSNs within that dataset was lower than in other datasets that have been published more recently (using updated methods). We think this is likely because of the way that the cells were processed (e.g., from cryopreserved single cells followed by live/dead selection). However, because the open and closed sides were processed identically, we do not expect the ratios of OSNs of specific subtypes to be greatly affected. Hence, the differences observed for specific OSN subtypes on the open versus closed sides are expected to be valid.
As the reviewer notes, there is a surprisingly large difference between the number of OSNs of musk-responsive subtypes on the open and closed sides within the OE 1 dataset. This difference is a key piece of information that led us to formulate the hypothesis in the study: that musk responsive subtypes are born at a higher rate in the presence of male/musk odor stimulation. And while it is true that, on average, each subtype represents ~0.1% of the population, it is known that there is wide variance in representations among different subtypes [e.g., (Ibarra-Soria et al., 2017)]. The frequencies of the musk responsive subtypes among all OSNs on the open side of OE 1 (0.3% for Olfr235, 0.4% for olfr1440, 0.06% for Olfr1434, 0% for olfr1431, and 1% for Olfr1437) are in line with previous findings.
To confirm that the scRNA-seq findings from dataset OE 1 are not an artifact of the cell preparation methods used, we generated a second scRNA-seq dataset, OE 2, which has been added to the revised manuscript (Figure 1). The OE 2 dataset was prepared according to the same experimental timeline as OE 1, but the cells were captured immediately after dissociation and live/dead sorting via FACS. As expected, most cells within OE 2 dataset are OSNs (77% on the open side, 66% on the closed). Importantly, like the OE 1 dataset, the OE 2 dataset shows higher quantities of iOSNs of musk responsive subtypes on the open side of the OE compared to the closed (normalized for either total cells or total OSNs) (Figure 1–figure supplement 1D, E).
A weakness of the experiment assessing musk receptor expression is that the authors do not distinguish immature from mature OSNs. Immature OSNs express multiple receptor types before they commit to the expression of a single type. The experiments do not reveal whether mature OSNs maintain an elevated expression level of musk receptors.
While it is established that multiple ORs are coexpressed at a low level during OSN differentiation (Bashkirova et al., 2023; Fletcher et al., 2017; Hanchate et al., 2015; Pourmorady et al., 2024; Saraiva et al., 2015; Scholz et al., 2016; Tan et al., 2015), this has been found to occur primarily at the immediate neuronal precursor 3 (INP3) stage (Bashkirova et al., 2023; Fletcher et al., 2017), which is characterized by expression of Tex15 (Fletcher et al., 2017; Pourmorady et al., 2024) and precedes the immature OSN (iOSN) stage, which is characterized by expression of Gap43 (Fletcher et al., 2017; McIntyre et al., 2010; Verhaagen et al., 1989). Within the scRNA-seq datasets in the present study, iOSNs of specific subtypes are identified based on robust expression of Gap43 (log2 UMI > 1) and a specific OR gene (log2 UMI > 2), as described in the figures and methods. Thus, the cells defined as iOSNs are expected to express a single OR gene and this expression should be maintained as iOSNs transition to mOSNs. To confirm these predictions, we carried out a detailed analysis of OR expression at three different stages of OSN differentiation: INP3, iOSN, and mOSN (Figure 1–figure supplement 2). The cells chosen for analysis express the musk-responsive ORs Olfr235 or Olfr1440 or a randomly chosen OR Olfr701, in addition to markers that define INP3, iOSN, or mOSN cells. As expected, individual iOSNs and mOSNs of musk-responsive subtypes were found to exhibit robust and singular OR expression on the open and closed sides of OEs from UNO-treated mice. Moreover, and as observed previously, INP3 cells coexpress multiple OR transcripts at low levels. A detailed description of how the analysis was performed is included in the Methods section under Quantification and statistical analysis.
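The thresholding rule described above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names, the dictionary layout, and the example values are hypothetical, and the inputs are assumed to be already log2-transformed UMI values, since the response does not spell out the exact transform.

```python
from typing import Dict, List

# Thresholds quoted in the response; everything else here is illustrative.
GAP43_MIN_LOG2_UMI = 1.0  # Gap43 must exceed this to count as an iOSN marker
OR_MIN_LOG2_UMI = 2.0     # an OR gene must exceed this to count as robust


def robust_ors(cell: Dict[str, float], or_genes: List[str]) -> List[str]:
    """Return the OR genes whose log2 UMI value exceeds the OR threshold."""
    return [g for g in or_genes if cell.get(g, 0.0) > OR_MIN_LOG2_UMI]


def is_iosn_of_subtype(cell: Dict[str, float], or_gene: str) -> bool:
    """True if the cell passes both the Gap43 and the named-OR thresholds."""
    return (cell.get("Gap43", 0.0) > GAP43_MIN_LOG2_UMI
            and cell.get(or_gene, 0.0) > OR_MIN_LOG2_UMI)


# Hypothetical cells with made-up, already log2-transformed values.
cells = {
    "cell_a": {"Gap43": 3.2, "Olfr235": 5.1, "Olfr701": 0.0},
    "cell_b": {"Gap43": 0.4, "Olfr235": 1.0, "Olfr1440": 0.9},
}
or_genes = ["Olfr235", "Olfr1440", "Olfr701"]

for name, cell in cells.items():
    print(name, is_iosn_of_subtype(cell, "Olfr235"), robust_ors(cell, or_genes))
```

In this toy input, a cell with a single OR above threshold corresponds to the singular expression reported for iOSNs, while a cell with several ORs all below threshold resembles the low-level coexpression described for INP3 cells.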
Within the histology-based quantifications, newborn OSNs are identified based on their robust RNA-FISH signals corresponding to a specific OR transcript and an EdU label. Considering the EdU chase time of 7 days, most EdU-positive cells are expected to have passed the INP3 stage and be iOSNs or mOSNs. Moreover, considering the low level of OR expression within INP3 cells, it is unlikely OR transcripts are expressed at a high enough level to be detectable and/or counted at this stage and thereby affect newborn OSN quantifications.
There are also two conceptual issues that are of concern. The first is the concept of selective neurogenesis. The data show an increased expression of musk receptors in response to male odor stimulation. The authors argue that this indicates selective neurogenesis of the musk receptor types. However, it is not clear what the distinction is between elevated receptor expression and a commitment to a specific fate at an early stage of development. As immature OSNs express multiple receptors, a likely scenario is that some newly differentiated immature OSNs have elevated expression of not only the musk receptors but also other receptors. The current experiments do not distinguish the two alternatives. Moreover, as pointed out above, it is not clear whether mature OSNs maintain the increased expression. Although a scRNASeq experiment can clarify it, the authors, unfortunately, did not perform an in-depth analysis to determine at which point of neurogenesis the cells commit to a specific musk receptor type. The quality of the scRNASeq data unfortunately also does not lend confidence for this type of analysis.
The addition of a second scRNA-seq dataset within the revised manuscript (Figure 1), combined with the new scRNA-seq-based analyses of OR expression in INP3, iOSN, and mOSN cells (Figure 1-figure supplement 2), provides strong evidence that iOSNs and mOSNs robustly express a single OR gene and that cellular expression is stable from the iOSN to the mOSN stage. These analyses do not support a scenario in which odor stimulation causes upregulated expression of multiple ORs and thereby causes apparent increases in quantities of newly generated OSNs that express musk-responsive ORs. Rather, the data firmly support a mechanism in which odor stimulation increases quantities of newly generated OSNs that have stably committed to the robust expression of a single musk-responsive OR.
A second conceptual issue, the idea of homeostasis in regeneration, which the authors presented in the Introduction, needs clarification. In its current form, it is confusing. It could mean maintenance of the distribution of receptor types, or it could mean the proper replacement of a specific OR type upon the loss of this type. The authors seem to refer to the latter and should define it properly.
We have revised the Introduction section to clarify our use of the term homeostatic in one instance (paragraph 4) and replace it with more specific language in a second instance (paragraph 5).
Reviewer #3 (Recommendations For The Authors):
Concerns over scRNASeq data. It appears that the samples may have included non-OE tissues, which reduced the representation of the OSNs. This experiment may need to be repeated to increase the number of OSNs.
As outlined in the response to the public comments, we think that the low proportion of OSNs in the OE 1 data set reflects how the cells were prepared and processed. We have now included a second scRNA-seq dataset to address this concern.
Cell types should be identified in the scRNASeq analysis, and the number of cells documented for each cell type, at least for the OSNs. The data should be made available for general access.
We have now clarified that the OE 1 dataset was published as part of a previous study (C. J. van der Linden et al., 2020) and was made publicly available as part of that study (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE157119). All cell types in the newly generated OE 2 dataset have been annotated (Figure 1) and this dataset has also been made publicly available (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE278693). The numbers and percentages of OSNs within OE 1 and OE 2 datasets have been added to the legend of Figure 1-figure supplement 1.
The specific OR types should be segregated for mature and immature OSNs. The percentage of a specific OR type should be normalized to the total number of OSNs, rather than the total cells. The current quantification is misleading because it gives the false sense that the muscone receptors represent ~0.1% of cells when the proportion is much higher if only OSNs are considered.
In the revised manuscript, quantities of iOSNs (Gap43+ cells) of specific subtypes within the OE 1 and OE 2 scRNA-seq datasets are graphed as percentages of both all OSNs (Figure 1E, Figure 1–figure supplement 1D) and all cells (Figure 1–figure supplement 1E). As a percentage of all OSNs, average quantities of iOSNs of musk responsive subtypes on the open side of the OE range from 0.005% (for Olfr1431) to 0.14% (for Olfr1440) (Figure 1E).
Within the feature plots for the two datasets, the differentiation stages of indicated OSNs have been clearly defined within the figures and figure legends. For the OE 1 dataset, iOSNs are differentiated from mOSNs by arrows (Figure 1–figure supplement 1C). For the OE 2 dataset (Figure 1D), only immature OSNs are shown for simplicity.
Technical details of the scRNASeq should be documented. In the feature plot of musk-response receptors (Figure 1D), it is better to use the actual quantity of expression rather than binarized representation (with or without an OR). If one needs to use on/off to determine the number of cells for a given OR type, then the criteria of selection should be given.
Technical details of generation of the scRNA-seq datasets have been documented in the “Method details” section (for the OE 2 dataset) and in the method section of our previous publication of the OE 1 dataset (C. J. van der Linden et al., 2020). Details of the scRNA-seq analyses, including the criteria used to define immature OSNs of specific subtypes, are documented within the “Quantification and statistical analysis” section.
Within the feature plots, we have decided to show OSNs of a given subtype in a binary fashion using specific colors for the sake of simplicity (Figure 1D, Figure 1-figure supplement 1C). To address the reviewer’s concern, we have added a new figure that provides detailed information about OR transcript expression (levels and genes) within iOSNs and mOSNs of two different musk-responsive subtypes and a randomly chosen subtype (Figure 1-figure supplement 2).
An in-depth analysis of the onset of OR expression in the GBC, INP, immature, and mature OSNs should be performed. It is also important to determine how many other receptors are detected in the cells that express the musk receptors. The current scRNASeq data may not be of sufficiently high quality and the experiment needs to be repeated. It is also important for the authors to take measures to eliminate ambient RNA contamination.
The revised manuscript includes a second scRNA-seq dataset (OE 2; Figure 1). Details of how both the original (OE 1) and new datasets were generated have been documented within the Methods sections of the corresponding publications [(C. J. van der Linden et al., 2020); present study]. For both datasets, live/dead selection of cells was performed, which was expected to reduce ambient RNA.
The revised manuscript also includes a new figure that provides detailed information about OR transcript expression within INP3, iOSN and mOSN cells that express one of two different musk responsive ORs or a randomly chosen OR (Figure 1-figure supplement 2). These data reveal, as reported previously (Bashkirova et al., 2023; Fletcher et al., 2017; Pourmorady et al., 2024), that low levels of multiple OR transcripts are detected in INP3 (Tex15+) cells. By contrast, iOSN (Gap43+) and mOSN (Omp+) cells robustly express a single OR, with little or no expression of other ORs.
Quantification of cells for Figures 2-7 should be changed. Instead of using cell number per 1/2 section, the data should be calculated using density (using the area of the epithelium) or normalized to the total number of cells (based on DAPI staining). This is because multiple sections are taken from the same mouse along the A-P axis. These sections have different sizes and numbers of cells.
As noted in response to a similar concern of Reviewer #2, this has been addressed in two ways within the revised manuscript:
(1) We have noted within the Methods section that the approach of using half-sections for normalization has been used in multiple previous studies for quantifying newborn (OR+/EdU+) and total (OR+) OSN abundances (Hossain et al., 2023; Ibarra-Soria et al., 2017; C. van der Linden et al., 2018; C. J. van der Linden et al., 2020). Additionally, within the figure legends and Methods, we have more thoroughly described the approach used, including that it relies on averaging the quantifications from at least 5 high-quality coronal OE tissue sections that are evenly distributed throughout the anterior-posterior length of each OE and thereby mitigates the effects of section size and cell number variation among sections. In the case of UNO-treated mice, the open and closed sides within the same section are paired, which further reduces the effects of section-to-section variation. We have found that this approach yields reproducible quantities of newborn and total OSNs among biological replicate mice and enables accurate assessment of how quantities of OSNs of specific subtypes change as a result of altered olfactory experience, a key objective of this study.
(2) To assess whether the use of alternative approaches for normalizing newborn OSN quantities suggested by the reviewers would affect the present study’s findings, we compared three methods for normalizing the effects of exposure to male odors or muscone on quantities of newborn Olfr235 OSNs in the OEs of both UNO-treated and non-occluded mice: 1) OR+/EdU+ OSNs per half-section (used in this study), 2) OR+/EdU+ OSNs per total number of EdU+ cells (reviewer suggestion (i)), and 3) OR+/EdU+ OSNs per unit of DAPI+ area (an approximate measure of nuclei number; reviewer suggestion (ii)). The three normalization methods yielded statistically indistinguishable differences in assessing the effects of exposure of either UNO-treated or non-occluded mice to male odors (newly added Figure 2–figure supplement 2 and Figure 3–figure supplement 2), or of exposure of non-occluded mice to muscone (newly added Figure 4–figure supplement 3). Based on these findings, and the considerable time that would be required to renormalize all data in the manuscript, we have chosen to maintain the use of normalization per half-section.
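The three normalizations compared in point (2) can be sketched as follows. This is an illustrative example only, not the authors' quantification pipeline: the class, the function names, and the counts are hypothetical, and the open-to-closed ratio is just one simple way to summarize a paired UNO comparison.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HalfSectionCounts:
    or_edu_double_positive: int  # OR+/EdU+ cells counted in one half-section
    edu_positive: int            # all EdU+ cells in the same half-section
    dapi_area: float             # DAPI+ area of the half-section (arbitrary units)


def normalizations(c: HalfSectionCounts) -> Dict[str, float]:
    """Express the same newborn-OSN count under the three schemes."""
    return {
        "per_half_section": float(c.or_edu_double_positive),
        "per_edu_cell": c.or_edu_double_positive / c.edu_positive,
        "per_dapi_area": c.or_edu_double_positive / c.dapi_area,
    }


def open_to_closed_ratio(open_side: HalfSectionCounts,
                         closed_side: HalfSectionCounts) -> Dict[str, float]:
    """Ratio of open to closed side under each scheme (closed count must be > 0)."""
    o, c = normalizations(open_side), normalizations(closed_side)
    return {key: o[key] / c[key] for key in o}


# Made-up counts for one paired section from a hypothetical UNO-treated mouse.
open_side = HalfSectionCounts(or_edu_double_positive=12, edu_positive=600, dapi_area=2.0)
closed_side = HalfSectionCounts(or_edu_double_positive=4, edu_positive=500, dapi_area=1.6)

for scheme, ratio in open_to_closed_ratio(open_side, closed_side).items():
    print(f"{scheme}: open/closed = {ratio:.2f}")
```

When the two sides of a section have similar EdU+ counts and DAPI+ areas, the three ratios stay close to one another, which is the situation in which the choice of normalization would be expected to matter little.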
References
Bashkirova, E. V., Klimpert, N., Monahan, K., Campbell, C. E., Osinski, J., Tan, L., Schieren, I., Pourmorady, A., Stecky, B., Barnea, G., Xie, X. S., Abdus-Saboor, I., Shykind, B. M., Marlin, B. J., Gronostajski, R. M., Fleischmann, A., & Lomvardas, S. (2023). Opposing, spatially-determined epigenetic forces impose restrictions on stochastic olfactory receptor choice. eLife, 12, RP87445. https://doi.org/10.7554/eLife.87445
Coppola, D. M. (2012). Studies of olfactory system neural plasticity: The contribution of the unilateral naris occlusion technique. Neural Plasticity, 2012, 351752. https://doi.org/10.1155/2012/351752
Fletcher, R. B., Das, D., Gadye, L., Street, K. N., Baudhuin, A., Wagner, A., Cole, M. B., Flores, Q., Choi, Y. G., Yosef, N., Purdom, E., Dudoit, S., Risso, D., & Ngai, J. (2017). Deconstructing Olfactory Stem Cell Trajectories at Single-Cell Resolution. Cell Stem Cell, 20(6), 817-830.e8. https://doi.org/10.1016/j.stem.2017.04.003
Han, X., Jiang, Y., Feng, N., Yang, P., Zhang, M., Jin, W., Zhang, T., Huang, Z., Zhao, H., Zhang, K., Liu, S., & Hu, D. (2022). Comparison of the Homology Between Muskrat Scented Gland and Mouse Preputial Gland. Journal of Mammalian Evolution, 29(2), 435–446. https://doi.org/10.1007/s10914-022-09604-w
Hanchate, N. K., Kondoh, K., Lu, Z., Kuang, D., Ye, X., Qiu, X., Pachter, L., Trapnell, C., & Buck, L. B. (2015). Single-cell transcriptomics reveals receptor transformations during olfactory neurogenesis. Science (New York, N.Y.), 350(6265), 1251–1255. https://doi.org/10.1126/science.aad2456
Hossain, K., Smith, M., & Santoro, S. W. (2023). A histological protocol for quantifying the birthrates of specific subtypes of olfactory sensory neurons in mice. STAR Protocols, 4(3), 102432. https://doi.org/10.1016/j.xpro.2023.102432
Ibarra-Soria, X., Nakahara, T. S., Lilue, J., Jiang, Y., Trimmer, C., Souza, M. A., Netto, P. H., Ikegami, K., Murphy, N. R., Kusma, M., Kirton, A., Saraiva, L. R., Keane, T. M., Matsunami, H., Mainland, J., Papes, F., & Logan, D. W. (2017). Variation in olfactory neuron repertoires is genetically controlled and environmentally modulated. eLife, 6. https://doi.org/10.7554/eLife.21476
Kelemen, G. (1947). The junction of the nasal cavity and the pharyngeal tube in the rat. Archives of Otolaryngology, 45(2), 159–168. https://doi.org/10.1001/archotol.1947.00690010168002
Lin, D. Y., Zhang, S.-Z., Block, E., & Katz, L. C. (2005). Encoding social signals in the mouse main olfactory bulb. Nature, 434(7032), 470–477. https://doi.org/10.1038/nature03414
McIntyre, J. C., Titlow, W. B., & McClintock, T. S. (2010). Axon growth and guidance genes identify nascent, immature, and mature olfactory sensory neurons. Journal of Neuroscience Research, 88(15), 3243–3256. https://doi.org/10.1002/jnr.22497
Pourmorady, A. D., Bashkirova, E. V., Chiariello, A. M., Belagzhal, H., Kodra, A., Duffié, R., Kahiapo, J., Monahan, K., Pulupa, J., Schieren, I., Osterhoudt, A., Dekker, J., Nicodemi, M., & Lomvardas, S. (2024). RNA-mediated symmetry breaking enables singular olfactory receptor choice. Nature, 625(7993), 181–188. https://doi.org/10.1038/s41586-023-06845-4
Saraiva, L. R., Ibarra-Soria, X., Khan, M., Omura, M., Scialdone, A., Mombaerts, P., Marioni, J. C., & Logan, D. W. (2015). Hierarchical deconstruction of mouse olfactory sensory neurons: From whole mucosa to single-cell RNA-seq. Scientific Reports, 5, 18178. https://doi.org/10.1038/srep18178
Sato-Akuhara, N., Horio, N., Kato-Namba, A., Yoshikawa, K., Niimura, Y., Ihara, S., Shirasu, M., & Touhara, K. (2016). Ligand Specificity and Evolution of Mammalian Musk Odor Receptors: Effect of Single Receptor Deletion on Odor Detection. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 36(16), 4482–4491. https://doi.org/10.1523/JNEUROSCI.3259-15.2016
Scholz, P., Kalbe, B., Jansen, F., Altmueller, J., Becker, C., Mohrhardt, J., Schreiner, B., Gisselmann, G., Hatt, H., & Osterloh, S. (2016). Transcriptome Analysis of Murine Olfactory Sensory Neurons during Development Using Single Cell RNA-Seq. Chemical Senses, 41(4), 313–323. https://doi.org/10.1093/chemse/bjw003
Schwende, F. J., Wiesler, D., Jorgenson, J. W., Carmack, M., & Novotny, M. (1986). Urinary volatile constituents of the house mouse, Mus musculus, and their endocrine dependency. Journal of Chemical Ecology, 12(1), 277–296. https://doi.org/10.1007/BF01045611
Shirasu, M., Yoshikawa, K., Takai, Y., Nakashima, A., Takeuchi, H., Sakano, H., & Touhara, K. (2014). Olfactory receptor and neural pathway responsible for highly selective sensing of musk odors. Neuron, 81(1), 165–178. https://doi.org/10.1016/j.neuron.2013.10.021
Tan, L., Li, Q., & Xie, X. S. (2015). Olfactory sensory neurons transiently express multiple olfactory receptors during development. Molecular Systems Biology, 11(12), 844. https://doi.org/10.15252/msb.20156639
van der Linden, C. J., Gupta, P., Bhuiya, A. I., Riddick, K. R., Hossain, K., & Santoro, S. W. (2020). Olfactory Stimulation Regulates the Birth of Neurons That Express Specific Odorant Receptors. Cell Reports, 33(1), 108210. https://doi.org/10.1016/j.celrep.2020.108210
van der Linden, C., Jakob, S., Gupta, P., Dulac, C., & Santoro, S. W. (2018). Sex separation induces differences in the olfactory sensory receptor repertoires of male and female mice. Nature Communications, 9(1), 5081. https://doi.org/10.1038/s41467-018-07120-1
Verhaagen, J., Oestreicher, A. B., Gispen, W. H., & Margolis, F. L. (1989). The expression of the growth associated protein B50/GAP43 in the olfactory system of neonatal and adult rats. The Journal of Neuroscience: The Official Journal of the Society for Neuroscience, 9(2), 683–691.
Vihani, A., Hu, X. S., Gundala, S., Koyama, S., Block, E., & Matsunami, H. (2020). Semiochemical responsive olfactory sensory neurons are sexually dimorphic and plastic. eLife, 9, e54501. https://doi.org/10.7554/eLife.54501
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript demonstrates that Oct4 overexpression synergizes with Notch inhibition (Rbpj knockout) to promote the conversion of adult murine Müller glia (MG) into bipolar cells. These findings are important as the authors used rigorous genetic lineage tracing (GLAST-CreER; Sun-GFP) to confirm that neurogenesis indeed originates from MGs, addressing a key issue in the field. The single-cell multiomic analyses are compelling, and while functional studies of MG-derived bipolar cells would strengthen the conclusions, they are beyond the scope of this study.
-
Reviewer #1 (Public review):
Summary:
In this study, Le et al. aimed to explore whether AAV-mediated overexpression of Oct4 could induce neurogenic competence in adult murine Müller glia, a cell type that, unlike its counterparts in cold-blooded vertebrates, lacks regenerative potential in mammals. The primary goal was to determine whether Oct4 alone, or in combination with Notch signaling inhibition, could drive Müller glia to transdifferentiate into bipolar neurons, offering a potential strategy for retinal regeneration.
The authors demonstrated that Oct4 overexpression alone resulted in the conversion of 5.1% of Müller glia into Otx2+ bipolar-like neurons by five weeks post-injury, compared to 1.1% at two weeks. To further enhance the efficiency of this conversion, they investigated the synergistic effect of Notch signaling inhibition by genetically disrupting Rbpj, a key Notch effector. Under these conditions, the percentage of Müller glia-derived bipolar cells increased significantly to 24.3%, compared to 4.5% in Rbpj-deficient controls without Oct4 overexpression. Similarly, in Notch1/2 double-knockout Müller glia, Oct4 overexpression increased the proportion of GFP+ bipolar cells from 6.6% to 15.8%.
To elucidate the molecular mechanisms driving this reprogramming, the authors performed single-cell RNA sequencing (scRNA-seq) and ATAC-seq, revealing that Oct4 overexpression significantly altered gene regulatory networks. They identified Rfx4, Sox2, and Klf4 as potential mediators of Oct4-induced neurogenic competence, suggesting that Oct4 cooperates with endogenously expressed neurogenic factors to reshape Müller glia identity.
Overall, this study aimed to establish Oct4 overexpression as a novel and efficient strategy to reprogram mammalian Müller glia into retinal neurons, demonstrating both its independent and synergistic effects with Notch pathway inhibition. The findings have important implications for regenerative therapies as they suggest that manipulating pluripotency factors in vivo could unlock the neurogenic potential of Müller glia for treating retinal degenerative diseases.
Strengths:
(1) Novelty: The study provides compelling evidence that Oct4 overexpression alone can induce Müller glia-to-bipolar neuron conversion, challenging the conventional view that mammalian Müller glia lacks neurogenic potential.
(2) Technological Advances: The combination of Muller glia-specific labeling and modifying mouse line, AAV-GFAP promoter-mediated gene expression, single-cell RNA-seq, and ATAC-seq provides a comprehensive mechanistic dissection of glial reprogramming.
(3) Synergistic Effects: The finding that Oct4 overexpression enhances neurogenesis in the absence of Notch signaling introduces a new avenue for retinal repair strategies.
Weaknesses:
(1) In this study, the authors did not perform a comprehensive functional assessment of the bipolar cells derived from Müller glia to confirm their neuronal identity and functionality.
(2) Demonstrating visual recovery in a bipolar cell-deficiency disease model would significantly enhance the translational impact of this work and further validate its therapeutic potential.
Comments on revisions:
The author answered all my questions and corrected the minor comments, so I have no more comments on the manuscript.
-
Reviewer #2 (Public review):
Summary:
The authors harness single cell RNAseq data from zebrafish and mice to identify Oct4 as a candidate driver of neurogenesis. They then use adeno-associated virus vectors to show that while Oct4 overexpression alone converts rare adult Müller glia (MG) to bipolar cells, it synergizes with Notch pathway inhibition to cause this neurogenesis (achieved by Cre-mediated knockout of Rbpj floxed allele). Importantly, they genetically lineage-mark adult MG using a GLAST-CreER transgene and a Sun-GFP reporter, so that any non-MG cells that convert can be identified unambiguously. This is crucial because several high-profile papers made erroneous claims using short promoters in the viral delivery vector itself to mark MG, but those promoters are leaky and mark other non-MG cell types, making it impossible to definitively state whether manipulations studied were actually causing neurogenesis, or were merely the result of expression in pre-existing neurons. Once the authors establish Oct4 + RbpjKO synergy they use snRNAseq/ATACseq to identify known and novel transcription factors that could play a role in driving neurogenesis.
Strengths:
The system to mark MG is stringent, so the authors are studying transdifferentiation, not artifactual effects due to leaky viral promoters. The synergy between Oct4 and Notch pathway blockade is notable. The single cell results add the potential involvement of new players such as Rfx4 in adult-MG-neurogenesis.
Weaknesses:
The revised version is clear and there are no major weaknesses.
Overall, the authors achieved what they set out to do, and have made new insights into how neurogenesis can be stimulated in MG. Ultimately, a major long-term goal in the field is to replace lost photoreceptors as this is most relevant to many human visual disorders, and while this paper (like all others before it) does not generate rods or cones, it opens new strategies to coax MG to form a related neuronal cell type. Their approach underscores the benefits of using a gold standard approach for lineage tracing.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
Summary:
In this study, Le et al. aimed to explore whether AAV-mediated overexpression of Oct4 could induce neurogenic competence in adult murine Müller glia, a cell type that, unlike its counterparts in cold-blooded vertebrates, lacks regenerative potential in mammals. The primary goal was to determine whether Oct4 alone, or in combination with Notch signaling inhibition, could drive Müller glia to transdifferentiate into bipolar neurons, offering a potential strategy for retinal regeneration.
The authors demonstrated that Oct4 overexpression alone resulted in the conversion of 5.1% of Müller glia into Otx2+ bipolar-like neurons by five weeks post-injury, compared to 1.1% at two weeks. To further enhance the efficiency of this conversion, they investigated the synergistic effect of Notch signaling inhibition by genetically disrupting Rbpj, a key Notch effector. Under these conditions, the percentage of Müller glia-derived bipolar cells increased significantly to 24.3%, compared to 4.5% in Rbpj-deficient controls without Oct4 overexpression. Similarly, in Notch1/2 double-knockout Müller glia, Oct4 overexpression increased the proportion of GFP+ bipolar cells from 6.6% to 15.8%.
To elucidate the molecular mechanisms driving this reprogramming, the authors performed single-cell RNA sequencing (scRNA-seq) and ATAC-seq, revealing that Oct4 overexpression significantly altered gene regulatory networks. They identified Rfx4, Sox2, and Klf4 as potential mediators of Oct4-induced neurogenic competence, suggesting that Oct4 cooperates with endogenously expressed neurogenic factors to reshape Müller glia identity.
Overall, this study aimed to establish Oct4 overexpression as a novel and efficient strategy to reprogram mammalian Müller glia into retinal neurons, demonstrating both its independent and synergistic effects with Notch pathway inhibition. The findings have important implications for regenerative therapies as they suggest that manipulating pluripotency factors in vivo could unlock the neurogenic potential of Müller glia for treating retinal degenerative diseases.
Strengths:
(1) Novelty: The study provides compelling evidence that Oct4 overexpression alone can induce Müller glia-to-bipolar neuron conversion, challenging the conventional view that mammalian Müller glia lacks neurogenic potential.
(2) Technological Advances: The combination of Muller glia-specific labeling and modifying mouse line, AAV-GFAP promoter-mediated gene expression, single-cell RNA-seq, and ATAC-seq provides a comprehensive mechanistic dissection of glial reprogramming.
(3) Synergistic Effects: The finding that Oct4 overexpression enhances neurogenesis in the absence of Notch signaling introduces a new avenue for retinal repair strategies.
Weaknesses:
(1) In this study, the authors did not perform a comprehensive functional assessment of the bipolar cells derived from Müller glia to confirm their neuronal identity and functionality.
(2) Demonstrating visual recovery in a bipolar cell-deficiency disease model would significantly enhance the translational impact of this work and further validate its therapeutic potential.
Response: We thank the Reviewer for their evaluation. We agree that functional analysis of Müller glia-derived bipolar cells is indeed important, but is beyond the current scope of the manuscript.
Reviewer #2 (Public review):
Summary:
The authors harness single-cell RNAseq data from zebrafish and mice to identify Oct4 as a candidate driver of neurogenesis. They then use adeno-associated virus vectors to show that while Oct4 overexpression alone converts rare adult Müller glia (MG) to bipolar cells, it synergizes with Notch pathway inhibition to cause this neurogenesis (achieved by Cre-mediated knockout of Rbpj floxed allele). Importantly, they genetically lineage-mark adult MG using a GLAST-CreER transgene and a Sun-GFP reporter, so that any non-MG cells that convert can be identified unambiguously. This is crucial because several high-profile papers made erroneous claims using short promoters in the viral delivery vector itself to mark MG, but those promoters are leaky and mark other non-MG cell types, making it impossible to definitively state whether manipulations studied were actually causing neurogenesis, or were merely the result of expression in pre-existing neurons. Once the authors establish Oct4 + RbpjKO synergy they use snRNAseq/ATACseq to identify known and novel transcription factors that could play a role in driving neurogenesis.
Strengths:
The system to mark MG is stringent, so the authors are studying transdifferentiation, not artifactual effects due to leaky viral promoters. The synergy between Oct4 and Notch pathway blockade is notable. The single-cell results add the potential involvement of new players such as Rfx4 in adult-MG-neurogenesis.
Weaknesses:
The existing version is difficult to read due to an unusually high number of text errors (e.g. references to the wrong figure panels etc.). A fuller explanation for the fraction of non-MG cells seen in control scRNAseq assays is required, particularly because the neurogenic trajectory which is enhanced in the Oct4/Rbpj-KO context is also evident in the control retina. Claims regarding the involvement of transcription factors in adult neurogenesis (such as Rfx4) need to be toned down unless they are backed up with functional data. It is possible that such factors are important, but equally, they may have no role or a redundant role, and without functional tests, it's impossible to say one way or the other.
Overall, the authors achieved what they set out to do, and have made new insights into how neurogenesis can be stimulated in MG. Ultimately, a major long-term goal in the field is to replace lost photoreceptors as this is most relevant to many human visual disorders, and while this paper (like all others before it) does not generate rods or cones, it opens new strategies to coax MG to form a related neuronal cell type. Their approach underscores the benefits of using a gold-standard approach for lineage tracing.
We thank the Reviewer for their evaluation. We have made extensive changes to the manuscript to correct errors and modify discussion as recommended. These are detailed below in our point-by-point responses to specific recommendations to the authors.
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor corrections:
(1) In Figure 1C top GFAP-mCherry panel, two dim GFP + cells have colocalized with Otx2, is it caused by optic imaging thickness or some muller glia cells having the Otx2 expression?
This indeed reflects the effects of optic imaging thickness. Colocalization of Sun1-GFP and Otx2 is not observed when Z-stack images are examined in GlastCreER;Sun1-GFP retinas. This can also be appreciated by the fact that, in cases of apparent overlap of nuclear envelope-targeted Sun1 and Otx2, the sizes of the labeled areas differ. In cases of true expression overlap, such as is seen following Oct4 overexpression, the labeled areas are the same size, or very nearly so.
To determine whether the Glast-CreERT2 x Rosa26-LSL-Sun1-GFP mouse line cross-labels Otx2+ bipolar cells, the authors should image the mCherry control sample as a Z-stack with thin optical sections and a small pinhole, to verify whether GFP and Otx2 are co-labeled in that sample.
Please see above. Since we first described this line (de Melo, et al. 2012), we have examined thousands of sections of GlastCreER;Sun1-GFP retinas, and have yet to see a single GFP-positive neuron. To avoid confusion, however, we have replaced these images with an additional image from a control mCherry-infected GlastCreER;Sun1-GFP retina processed for the same study.
In the middle upper panel (Oct4-mCherry group), the white arrows indicate GFP colocalized with Otx2 signal, but these cells do not appear to be mCherry-positive; by contrast, the neighboring cells show significant mCherry expression but no colocalization with Otx2. The GFAP promoter-driven Oct4-mCherry may have stopped being expressed after the Müller glia were converted into Otx2+ bipolar cells, but is there an intermediate stage in which Oct4-mCherry and Otx2 are co-expressed? And after Müller glia-to-bipolar conversion, why is Glast-CreERT2-driven GFP expression not suppressed in the same way as GFAP promoter-driven Oct4-mCherry? Could the authors discuss this point?
We observed a significant number of Muller glia-derived cells expressing both Otx2 and weak mCherry signal. GFP expression is driven by the ubiquitous CAG promoter following Cre-dependent excision of a transcriptional stop cassette. We have modified the text to make this point explicit.
(2) In Figure S2b, the mouse is labeled with wild type; I assume it should be the same mouse line as Fig.1. Otherwise, the author should describe the source of the GFP signal.
“Wildtype” in this case refers to GlastCreER;Sun1-GFP controls, which as the Reviewer correctly points out, are not truly wildtype. The genotype of these animals is specified in all figure legends, and is now referred to as “control” rather than “wildtype” in the figures and main text throughout.
In Figure S2k and l, mCherry ctrl panel, the GFP+ cells appear to co-label with Otx2, so again, is this caused by a thicker optical imaging layer producing vertical overlap, or by low specificity of the mouse line for Müller glia? Please describe the meaning of the stars in Figure S2i,j in the figure legend. There are 2 figures labeled "n" in the quantification data.
This is, again, an example of the thicker optical imaging layer causing apparent overlap. We have previously demonstrated that the Sun1-GFP+ cells do not co-label with Otx2 in GFAP-mCherry AAV-injected control retinas (Le et al., 2022; Fig. 2C). The asterisks (*) indicate mouse-on-mouse vascular staining, which is now clarified in the figure legend. The 2 figures labeled ‘n’ have been relabeled as ‘m’ and ‘n’.
(3) In Figure 2c in the top panel, the Otx2 image was wrong; please replace it with the correct one.
We thank the Reviewer for spotting this error. This is an inadvertent duplication of the single-channel Otx2 staining for mCherry control sample. We have replaced this with the correct image.
(4) In Figure 3a, the Rbpj-cKO mouse line was used, but where was the GFP signal from? Please verify the mouse line you used in your work. The same question is also asked in Figure S3, S4b.
GlastCreER;Rbpj<sup>lox/lox</sup>;Sun1-GFP mice were used in Figure 3a. As now specified in the Methods and all figure legends, all mice used in this study carry both the GlastCreER and Sun1-GFP transgenes.
(5) In Figure S4c,d, at the 5-week time point, quantifying the change in GFP+/Sox2- cells would make it easier to understand the percentage of Müller glia converting to bipolar cells relative to Figure 2D, and would supplement the conclusion that this reflects Müller glia-to-bipolar conversion rather than Müller glia proliferation.
Sox2-/GFP+ cells are a measure of Müller glia to bipolar cell conversion that complements that of GFP+/Otx2+ cells. This is now clarified in the text. We also include quantification of Sox2-/GFP+ neurons at 5 weeks post-injury in Fig. S5b.
(6) In Figure S1b,c, there is a large portion of cells that are activated Müller glia after NMDA injury. Did the activated Müller glial cells lose their Müller glial identity? Between the loss of Müller glial identity and neuronal reprogramming, are there any markers that can be used to assess whether Müller glial cells are truly transdifferentiating into neurons rather than remaining in a reactive glial state or an intermediate phase?
Wildtype Müller glia progressively revert to resting state, and by 72 hours post-injury have already lost expression of Klf4 and Myc (Hoang, et al. 2020), a point which is now specifically mentioned in the text. In GlastCreER;Sun1-GFP;Nfia/b/x<sup>lox/lox</sup>;Rbpj<sup>lox/lox</sup> Müller glia, reactive MG appear to largely convert to bipolar and amacrine-like cells, and it remains unclear if they eventually revert to a resting state (Le, et al. 2024).
Reviewer #2 (Recommendations for the authors):
This work demonstrates that Oct4 (Pou5f3) can induce neurogenesis in murine Müller glia (MG). Le et al. start by showing that murine and zebrafish MG lack expression of Oct4 (Pou5f3) and its target Nanog. To assess the effect of Oct4 they first label adult MG with Sun1-GFP using tamoxifen-treated GlastCreER;Sun1-GFP mice, then later transduce in vivo with AAV vectors expressing mCherry alone or Oct4 + mCherry. Subsequently, they damage the retina with NMDA and assess the effects several weeks later. In Oct4+ cells at 2 weeks there is rare induction of the neural determinant Ascl1, down-regulation of the MG marker Sox2, and induction of bipolar markers (Otx2, Scgn, Cabp5) but not amacrine (HuC/D) or rod (Nrl) markers. Combining Oct4 with Notch inhibition (deleting floxed Rbpj) synergistically increases bipolar cell induction, with Otx2 staining rising to >20% of GFP-marked cells, and cells losing MG identity (loss of Sox2/9). EdU labeling was negligible, suggesting direct trans-differentiation. Similar synergy was seen upon combining Oct4 expression with Notch1/2 double gene knockout. Attempts to combine Oct4 with Nfia, Nfib, and Nfix loss were unsuccessful as the GFAP promoter driving Oct4 in MG seems to require these three related transcription factors. scRNAseq confirmed the Oct4-overexpression/Rbpj-KO-driven increase in bipolar cells and decrease in MG cells and revealed that these manipulations may enhance bipolar cell genesis by repressing genes that define quiescent MG and enhancing expression of genes that define reactive MG and neurogenic cells. Finally, multiomic snRNA/scATAC-seq analysis was performed to assess the effect of Oct4 in wt or Rbpj null MG. This approach revealed that, as anticipated, more genes were up- and down-regulated in the context of both manipulations vs Oct4 OE alone. Moreover, Oct4 and Rbpj KO reduced chromatin accessibility at target motifs for transcription factors involved in MG identity/quiescence, while MGPCs showed elevated accessibility for neurogenic factors. The combination of Oct4 OE and Rbpj KO induces accessibility at various interesting TF sites that may contribute to the synergistic neurogenesis, including Rfx4, Klf4, Insm1, and others.
This is an interesting paper that adds to the growing literature on how neurogenesis can be induced in mammalian MG. The focus on Oct4 is interesting and the synergistic effects are striking and analyzed in some detail with scRNAseq and multiomic snRNA/scATACseq. The latter results provide useful new insight into transcriptional programs that may be critical in driving neurogenesis. Functional insight into these new candidates is not explored in this manuscript, but that's beyond the scope of the current work and forms the basis for new studies. There are some overreaching statements in the Discussion that need to be toned down, but apart from that and a long list of textual errors that need to be fixed, this paper is a valuable contribution to the field.
Major comments
There are numerous textual errors (some, but not all, examples are detailed in minor comments). It was difficult to follow this paper given the unusually high number of textual errors and the abbreviated legends. Greater attention should be paid to harmonizing the text with the figures and ensuring that the legends are correct and complete.
The manuscript has been proofread carefully and errors corrected.
The opening section of the scRNAseq data should outline briefly why sorting for GFP-labeled cells purifies a significant fraction of non-MG cell types, despite the earlier claim (which agrees with other publications) that GLAST-CreER transgene expression is highly specific to MG. Presumably, it mainly/totally reflects the co-purification of cells, cell fragments, and/or cell-free mRNA from other lineages. Is it also possible that a fraction (however small) of these cells reflects low-level spurious/temporary activation of GLAST-CreER expression in non-MG? The "contamination" is present despite the addition of the GFP sequence to the reference genome (as explained in Methods). They mention: "a clear differentiation trajectory connecting Muller glia, neurogenic Muller glia-derived progenitor cells (MGPCs), and differentiating amacrine and bipolar cells (Fig. 3b)". However, the same trajectory is evident in control mCherry samples, so one could argue that this trajectory is active in normal retina at some low rate, but that would/should equate to rare sun-GFP+ non-MG in controls. Are there any such cells, even extremely rarely, or is it truly 0%? At any rate, the authors need to raise these concerns and offer some explanation(s) at the start of their scRNAseq Results section. If there are really no such sun-GFP+ cells, the authors should comment on the presence of the apparent inactive trajectory in the Discussion.
Since we first described this line (de Melo, et al. 2012), we have examined thousands of sections of GlastCreER;Sun1-GFP retinas, and have yet to see a single GFP-positive neuron. We have also previously shown (Hoang, et al. 2020) that FACS-based isolation of GFP-positive cells from GlastCreER;Sun1-GFP yields a roughly thirty-fold enrichment of Muller glia, implying the presence of small numbers of contaminating neurons. We therefore conclude that the presence of small numbers of neurons (rods, cones, bipolar, and amacrine cells) in the control GlastCreER;Sun1-GFP represents contamination rather than low levels of glia-to-neuron conversion, particularly since we are unable to detect the expression of genes such as neurogenic bHLH factors or immature photoreceptor precursor-specific factors such as Prdm1 that indicate the presence of intermediate cell states. This is now addressed in the Results section related to both Figures 3 and 4.
Discussion:
In reference to other strategies to induce neurogenesis the authors make the claim that Oct4 is fundamentally different: "In these cases, Müller glia broadly upregulate proneural genes and/or downregulate Notch signaling. Oct4 instead induces expression of the neurogenic transcription factor Rfx4, which is not expressed in developing retina. It is likely that activation of this parallel pathway to neurogenic competence in part accounts for synergistic induction of neurogenesis seen in Rbpj-deficient Müller glia". First, all these strategies, including Oct4, seem to activate bHLH factors, so they have that in common and the authors should note that overlap. More seriously, without functional tests (e.g. KO Rfx4) the authors need to dial back the over-reaching statement that Rfx4 is the fundamental mechanism driving the Oct4 effect. They can certainly suggest that this is one possibility, but equally, Rfx4 may have very little or no effect on neurogenesis, or it could act redundantly with some of the other factors the authors uncovered. It's impossible to know without functional data, so they either need to add the functional data, or hold back on the strong one-sided and overreaching claim.
Since both Rfx4 expression and motif accessibility are selectively observed following Oct4 overexpression, and Rfx4 also has known neurogenic activity, we stand by our conclusion that it is a particularly strong candidate for mediating the neurogenic effects of Oct4 overexpression. However, the Reviewer is correct that in the absence of functional data, speculation about its function should be qualified. We have done this in the revised manuscript.
Minor comments
This sentence in the Results is confusing: "While expression of neurogenic bHLH factors driven by the Gfap promoter was rapidly silenced in Muller glia and activated in amacrine and retinal ganglion cells, Gfap-Oct4-mCherry remained selectively expressed in Muller glia but did not induce detectable levels of Muller glia-derived neurogenesis in the uninjured retina (Le et al., 2022)". The cited reference is at the end so it sounds like the Oct4 assay was performed in Le et al 2022, and there is no reference to a Figure for the Oct4 data in the current paper.
As stated here, in Le, et al. 2022, we did not observe any conversion of Sun1-GFP-positive Muller glia to neurons in the absence of injury. In the current study, we instead test whether NMDA-induced excitotoxicity induced glia to neuron conversion in Muller glia overexpressing Oct4. This is now made clear in the revised text.
There are many errors and omissions regarding Figure S2:
Figure S2a, b legend, and panels do not match. 2a should be a schematic of the strategy to label MG with Sun1-GFP using GLAST-Cre and a floxed Sun1-GFP allele, but that's missing and instead, the current 2a is a schematic of AAV vectors. It seems that the current 2b legend may describe the combination of the current 2a and 2b panels.
This has been corrected.
Figure S2: Asterisks label certain stained elements in the Oct4 labeled panels, but there is no explanation in the legend. Are these meant to indicate non-specific staining? If so, what is the evidence that the signal is non-specific?
These asterisks represent non-specific mouse-on-mouse vascular staining observed with the mouse monoclonal anti-Oct4 used in this study. This is now indicated in the figure legend.
The text refers to Ascl1 staining in Figure S2e,f, but it's S2g,h.
This has been corrected.
Re this: "While Sun1-GFP-positive cells infected with Oct4-mCherry mostly express the Muller glial marker Sox2 (Fig. S2a,b), from 2 weeks post-injury onwards a subset of GFP positive cells did not show detectable Sox2 expression (Fig. S2b, yellow arrows)". Figure S2a, b are schematic diagrams, not immunofluorescence. They probably mean Figure S2c, d.
This has been corrected.
Fig S2m is mislabeled "n".
This has been corrected.
There are probably other errors with this figure, but I mostly gave up at this point. The authors should go through the paper to find and correct any additional mistakes/omissions in the text and legends.
The manuscript has been carefully proofread and errors corrected.
The figure panels are not always mentioned in the order that they appear. There are many examples.
Figure panels are now mentioned in the order that they appear.
Several schematics use "d-18-14" to indicate "day -18 to -14". The former is at first uninterpretable or at best unclear (could mean day -18 to day 14), perhaps d -18 to -14, or d -18:-14 would be clearer.
This has been corrected.
Re: "AAV-infected wildtype Muller glia could be readily identified by selective expression of Oct4 (Fig. 4e). Wildtype Oct4-expressing Muller glia give rise to both small numbers of neurogenic MGPCs (Fig. 4b),". Figure 4E is labeled Pou5f1, but it would be helpful to avoid confusion by also indicating on the figure that Pou5f1 = Oct4; and Fig 4b does not indicate neurogenic MGPCs (perhaps they mean 4c).
This has been corrected.
Some parts of the Results are written in the present tense and should be in the past tense (for guidance: https://www.nature.com/scitable/topicpage/effective-writing13815989/).
Past tense is now used throughout.
Pit1 (Pou1f1) is referred to as a "close variant" of Oct4/Pou4f5, but this is unclear (e.g. variant could mean a splice variant from the same locus) and the term "paralogue" should be used.
“Paralogue” is now used in this context.
Re: "Infection with Oct4-mCherry vector induced both Oct4 (Fig. S5e) and Ascl1 (Fig. S5d) expression in Notch1/2-deficient Müller glia." Supplementary image 5d is the one depicting Oct4 and 5e is the one showing Ascl1. However, the reference is reversed.
This has been corrected.
www.biorxiv.org
eLife Assessment
This study presents a valuable finding on the role of secretory leukocyte protease inhibitors (SLPI) in developing Lyme disease in mice infected with Borrelia burgdorferi. The evidence supporting the authors' claims is solid. However, several concerns raised by the reviewers remain unaddressed. This paper would be of interest to scientists in the infectious inflammatory disease field.
Reviewer #2 (Public review):
This study by Yu and coworkers investigates the potential role of Secretory leukocyte protease inhibitor (SLPI) in Lyme arthritis. They show that, after needle inoculation of the Lyme disease agent, B. burgdorferi, compared to wild type mice, a SLPI-deficient mouse suffers elevated bacterial burden, joint swelling and inflammation, pro-inflammatory cytokines in the joint, and levels of serum neutrophil elastase (NE). They suggest that SLPI levels of Lyme disease patients are diminished relative to healthy controls. Finally, using a powerful screen of secreted mammalian proteins, they find that SLPI interacts directly with B. burgdorferi.
The known role of SLPI in dampening inflammation and inflammatory damage by inhibition of NE makes the enhanced inflammation in the joint of B. burgdorferi-infected mice a predicted result but it has not previously been demonstrated and could spur further study. A limitation that is unaddressed experimentally is potential contribution of the greater bacterial burden to the enhanced inflammation, leaving open the question of whether greater immunologic stimulus or a defect in the regulation of inflammation is responsible for the observed enhanced disease. Answering this question would better justify the statement in the abstract that "These data demonstrate the importance of SLPI in suppressing periarticular joint inflammation in Lyme disease."
Although the finding of SLPI binding to bacteria is potentially quite interesting the biological relevance of this interaction is not addressed. Readers of only the abstract, which describes the direct interaction of SLPI with bacteria, may mistakenly conclude that the authors demonstrate that recruitment of this immunoregulatory factor to the bacterial surface enhances inflammation of infected tissues. This attractive possibility has not been demonstrated in this study; such assertion would require comparison of bacteria that either bind or do not bind SLPI in a mouse infection model.
Finally, the investigators take advantage of clinical samples to ask if serum SLPI levels are diminished in Lyme disease patients relative to healthy controls. The assessment of human samples is interesting and generally to be lauded, but here the comparison is limited by: (a) a small sample number, with only 5 healthy control samples (which should not be difficult to obtain); and (b) the inclusion of samples from 4 patients with erythema migrans rather than Lyme arthritis, which was the manifestation tracked in the mouse studies. Moreover, of the 3 Lyme arthritis patients, serum samples from multiple blood draws were included, resulting in 5 data points; similarly, of the 4 erythema migrans patients, 13 separate samples were included. The multiple samplings from some but not all subjects could result in differential "weighting" of samples. Therefore, although the investigators provide a statistical analysis of these data, it is difficult to evaluate the validity of this apparent difference.
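The "weighting" concern can be illustrated with a minimal sketch. The subject labels and serum values below are hypothetical and are not data from the study; the helper `per_subject_means` is likewise introduced here only for illustration. The sketch contrasts pooling every blood draw with first collapsing repeated draws to one value per subject, which is one common way (among several) to keep heavily sampled subjects from dominating a group mean.

```python
from collections import defaultdict
from statistics import mean


def per_subject_means(samples):
    """Collapse repeated draws to one mean value per subject.

    `samples` is a list of (subject_id, value) pairs.
    """
    by_subject = defaultdict(list)
    for subject_id, value in samples:
        by_subject[subject_id].append(value)
    return {subject_id: mean(values) for subject_id, values in by_subject.items()}


# Hypothetical values for illustration only; not data from the study.
samples = [
    ("P1", 10.0), ("P1", 12.0), ("P1", 14.0),  # one subject sampled three times
    ("P2", 30.0),                              # one subject sampled once
]

# Every draw weighted equally: P1 contributes three times as much as P2.
pooled_mean = mean(value for _, value in samples)

# Every subject weighted equally: repeated draws are averaged first.
subject_mean = mean(per_subject_means(samples).values())

print(pooled_mean, subject_mean)
```

In this toy case the pooled mean is pulled toward the repeatedly sampled subject, whereas the per-subject mean is not. A mixed-effects model would be another option, but that is beyond this sketch.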
In summary, this is an interesting study that provides new information regarding infection in a host deficient in SLPI and, using a state-of-the-art screen of the mammalian secretome, shows that B. burgdorferi binds SLPI, raising the attractive possibility that this pathogen utilizes a host immune regulator to enhance inflammation. The conclusions that SLPI enhances inflammation directly due to its immunoregulatory activity and that SLPI levels are diminished in human Lyme disease patients, as well as the implication that SLPI binding by the bacterium has pathogenic significance, each require further study.
Author response:
The following is the authors’ response to the current reviews.
We deeply appreciate the reviewer’s careful review and critiques. These are excellent critiques that we are working on and probably require a few more years of work. Published together, we believe these critiques add value to our manuscript.
The following is the authors’ response to the original reviews.
Reviewer #2 (Public review):
Summary:
This manuscript by Yu and coworkers investigates the potential role of Secretory leukocyte protease inhibitor (SLPI) in Lyme arthritis. They show that, after needle inoculation of the Lyme disease (LD) agent, B. burgdorferi, compared to wild type mice, a SLPI-deficient mouse suffers elevated bacterial burden, joint swelling and inflammation, pro-inflammatory cytokines in the joint, and levels of serum neutrophil elastase (NE). They suggest that SLPI levels of Lyme disease patients are diminished relative to healthy controls. Finally, they find that SLPI may interact directly with B. burgdorferi.
Strengths:
Many of these observations are interesting and the use of SLPI-deficient mice is useful (and has not previously been done).
Weaknesses:
(a) The known role of SLPI in dampening inflammation and inflammatory damage by inhibition of NE makes the enhanced inflammation in the joint of B. burgdorferi-infected mice a predicted result; (b) The potential contribution of the greater bacterial burden to the enhanced inflammation is acknowledged but not experimentally addressed; (c) The relationship of SLPI binding by B. burgdorferi to the enhanced disease of SLPI-deficient mice is not addressed in this study, making the inclusion of this observation in this manuscript incomplete; and (d) assessment of SLPI levels in healthy controls vs. Lyme disease patients is inadequate.
We greatly appreciate the critiques, and we do agree. Even though the observation of NE level is predictable, we believe that it is important to actually demonstrate it in the context of murine Lyme arthritis. The function of SLPI goes beyond inhibiting NE. As this is an ongoing project in our lab, we believe that the current study serves as a good starting point to explore the pleiotropic effects of SLPI in the pathogenesis of murine Lyme arthritis and in patients. The critiques here are of great value to our research.
Comments on revised version:
Several of the points were addressed in the revised manuscript, but the following issues remain:
Previous point that the relationship of SLPI binding to B. burgdorferi to the enhanced disease of SLPI-deficient mice is not investigated: The authors indicate that such investigations are ongoing. In the absence of any findings, I recommend that their interesting BASEHIT and subsequent studies be presented in a future study, which would have high impact.
We thank the reviewer for the critique. We do agree that this part of the story is not complete. However, we would like to keep the BASEHIT and binding data in the paper, as we believe that it is an important finding. We confirmed the binding using ELISA, flow cytometry, and immunofluorescent microscopy. We showed that the binding is specific to infectious strain of B. burgdorferi, thus likely to contribute to the pathogenesis of murine Lyme arthritis. Our data suggest that SLPI can directly interact with a B. burgdorferi protein. We are exploring the biological significance of the binding. And this finding can be further explored by other labs too.
Previous recommendation 1: (The authors added lines 267-68, not 287-68). This ambiguity is acknowledged but remains. In addition, in the revised manuscript, the authors state "However, these data also emphasize the importance of SLPI in controlling the development of inflammation in periarticular tissues of B. burgdorferi-infected mice." Given acknowledged limitations of interpretation, "suggest" would be more appropriate than "emphasize".
We thank the reviewer for the careful reading, and we apologize for the mistake. The change has been made accordingly (line 268).
Previous recommendation 5: The lack of clinical samples can be a challenge. Nevertheless, 4 of the 7 samples from LD patients are from individuals suffering from EM rather than arthritis (i.e., the manifestation that is the topic of the study) and some who are sampled multiple times, make an objective statistical comparison difficult. I don't have a suggestion as to how to address the difference in number of samples from a given subject. However, the authors could consider segregating EM vs. LA in their analysis (although it appears that limiting the comparison between HC and LA patients would not reveal a statistical difference).
We thank the reviewer for the critique, and we agree that the patient data presented are not ideal. We believe that at this point the combination of the samples is most logical, as the number of samples we have from patients with Lyme arthritis is fairly limited. We stated the limitation in the discussion. We do believe that the finding of the correlation is important. It suggests a potential function of SLPI in patients, beyond murine infection. Moreover, other groups with larger numbers of samples can elucidate the relationship further.
Previous recommendation 6: Given that binding of SLPI to the bacterial surface is an essential aspect of the authors' model, and that the ELISA assay to indicate SLPI binding used cell lysates rather than intact bacteria, a control PI staining to validate the integrity of bacteria seems reasonable.
We appreciate the suggestion and have provided the propidium iodide staining in Supplemental Figure 5 (lines 539-542, 568-569, 718-722).
Previous recommendation 8: The inclusion of a no serum control (that presumably shows 100% viability) would validate the authors' assertion that 20% serum has bactericidal activity.
We appreciate the suggestion. As stated in the manuscript (line 583-584), the percent viability was normalized to the control spirochetes culture without any treatment. Thus, the control spirochetes culture, without serum and SLPI treatment, showed 100% viability. We have revised Supplemental Figure 3 accordingly.
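The normalization described in this response can be written out as a short sketch. The function name `percent_viability` and the counts are hypothetical, introduced only to show the arithmetic; they are not taken from the manuscript.

```python
def percent_viability(treated_count, control_count):
    """Viability of a treated culture as a percentage of the untreated control."""
    if control_count <= 0:
        raise ValueError("control_count must be positive")
    return 100.0 * treated_count / control_count


# Hypothetical spirochete counts for illustration only; not data from the study.
control = 200  # culture without serum or SLPI treatment
treated = 50   # e.g. a serum-treated culture

# The untreated control normalizes to 100% by construction.
print(percent_viability(control, control))
print(percent_viability(treated, control))
```

Because the control is the denominator, it always reads as 100%, so plotting it mainly serves as a visual reference for the treated conditions.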
www.biorxiv.org
eLife Assessment
This important work uses an innovative approach to understand similarities between haemodynamic and electrophysiological activity of the human brain, and how the brain might carry out multiple functions concurrently across different brain regions by using multiple timescales. The study provides convincing evidence to indicate that while spatially similar functional brain networks are found in both modalities, there is a tendency for these to occur asynchronously. This work will be of interest to neurophysiological and brain imaging researchers.
Reviewer #1 (Public review):
The paper proposes an interesting perspective on the spatio-temporal relationship between FC in fMRI and electrophysiology. The study found that while similar network configurations are found in both modalities, there is a tendency for the networks to spatially converge more commonly at synchronous than asynchronous timepoints.
My confidence in the findings and their interpretation has been improved by the addition of some basic simulations. It helps give confidence in the measure being used to distinguish between scenarios.
Of course, there may be other scenarios that are problematic that are not covered by the current simulations - this highlights the difficulty of making a claim based on a heuristic measure.
That said, with the simulations included and if the caveat above is acknowledged, then I think the paper is in good shape.
Reviewer #2 (Public review):
Summary:
The study investigates the brain's functional connectivity (FC) dynamics across different timescales using simultaneous recordings of intracranial EEG/source-localized EEG and fMRI. The primary research goal was to determine which of three convergence/divergence scenarios is the most likely to occur.
The results indicate that despite similar FC patterns found in different data modalities, the timepoints were not aligned, indicating spatial convergence but temporal divergence.
The researchers also found that FC patterns in different frequencies do not overlap significantly, emphasizing the multi-frequency nature of brain connectivity. Such asynchronous activity across frequency bands supports the idea of multiple connectivity states that operate independently and are organized into a multiplex system.
Strengths:
The data supporting the authors' claims are convincing and come from simultaneous recordings of fMRI and iEEG/EEG, which has been recently developed and adapted.
The analysis methods are solid and involved a novel approach to analyzing the co-occurrence of FC patterns across modalities (cross-modal recurrence plot, CRP) and robust statistics, including replication of the main results using multiple operationalizations of the functional connectome (e.g., amplitude, orthogonalized, and phase-based coupling).
In addition, the authors provided a detailed interpretation of the results, placing them in the context of recent advances and understanding of the relationships between functional connectivity and cognitive states.
The authors also did a control analysis and verified the effect of temporal window size and of different functional connectivity operationalizations. I also applaud their effort to make the analysis code open-source.
Comments on revisions:
The authors addressed all my concerns in the previous round of review.
Author response:
The following is the authors’ response to the previous reviews.
Public Reviews:
Reviewer #1 (Public review):
The paper proposes an interesting perspective on the spatio-temporal relationship between FC in fMRI and electrophysiology. The study found that while similar network configurations are found in both modalities, there is a tendency for the networks to spatially converge more commonly at synchronous than asynchronous timepoints. However, my confidence in the findings and their interpretation is undermined by an incomplete justification for the expected outcomes for each of the proposed scenarios.
As detailed below, the reviewer’s comment motivated us to conduct simulations to establish the relationship between the scenarios that we seek to adjudicate and the empirical outcomes.
Main Concern
Fig 1 makes sense to me conceptually, including the schematics of the trajectories, i.e.:
- Scenario1. Temporally convergent, same trajectories through connectome state space
- Scenario2. Temporally divergent, different trajectories through connectome state space
However, based on my understanding (and apologies if I am mistaken), I am concerned that these scenarios do not necessarily translate into the schematic CRP plots shown in fig 2C, or the statements in the main text, i.e.:
- For scenario1, "epochs of cross-modal spatial similarity should occur more frequently at on-diagonal (synchronous) than off-diagonal (asynchronous) entries, resulting in an on-/off-diagonal ratio larger than unity"
- For scenario2, "epochs of spatial similarity could occur equally likely at on-diagonal and off-diagonal entries (ratio≈1)"
Where do the authors get these statements and the schematics in fig2C from? They do not seem to be fully justified via previous literature, theory, or simulations?
In particular, I am not convinced based on the evidence currently in the paper, that the ratio of off- to on-diagonal entries (and under what assumptions) is a definitive way to discriminate between scenarios 1 and 2.
For example, what about the case where the same network configuration reoccurs in both modalities at multiple time points? It seems to me that you would get a CRP with entries occurring equally on the on-diagonal as on the off-diagonal, regardless of whether the dynamics are matched between the two modalities or not (i.e. regardless of scenario 1 or 2 being true).
This thought experiment example might have a flaw in it, and the authors might ultimately be correct, but nonetheless a systematic justification needs to be provided for using the ratio of off- to on-diagonal entries to discriminate between scenario 1 and 2 (and under what assumptions it is valid).
Thank you for raising this important point. In response, we have now included simulation results to complement our earlier authors’ response, which provided literature references and a theoretical explanation of the on-/off-diagonal ratio metric.
In the absence of theory, the authors could use surrogate data for scenario 1 and 2. For example:
a. For scenario 1, run the CRP using a single modality. E.g. feed in the EEG into the analysis as both modality 1 AND modality 2. This should provide at least one example of CRP under scenario 1 (although it does not ensure that all CRPs under this scenario will look like this, it is at least a useful sanity check).
Note: This simulation was included in the previous round of authors’ responses.
b. For scenario 2, run the CRP using a single modality plus a shuffled version. E.g. feed in the EEG into the analysis as both modality 1 AND a temporally shuffled version of the EEG as modality 2. The temporal shuffling of the EEG could be done by simply splitting the data into blocks of say ~10s and then shuffling them into a new order. This should provide a version of the CRP under scenario 2 (although it does not ensure that all CRPs under this scenario will look like this, it is at least a useful sanity check).
The authors have provided CRP plots for option a. It shows a CRP, as expected, consistent with scenario 1. This is a useful sanity check. However, as mentioned above, it does not ensure that all CRPs under this scenario will look like this.
However, the authors have not shown a CRP as per option b. As such, there is an incomplete justification for the expected outcomes of the scenarios.
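The block-shuffled surrogate described in option b can be sketched as follows. This is a minimal illustration under stated assumptions, not code from the authors or the reviewer: the function name `block_shuffle`, the toy signal, and the block length are hypothetical. For real recordings the block length would be roughly 10 s multiplied by the sampling rate.

```python
import numpy as np

def block_shuffle(x, block_len, rng):
    """Cut x into consecutive blocks along axis 0 and permute the block order.

    Samples inside each block keep their original order, so short-range
    temporal structure survives while alignment with the original is lost.
    """
    x = np.asarray(x)
    cut_points = np.arange(block_len, len(x), block_len)
    blocks = np.split(x, cut_points)          # last block may be shorter
    order = rng.permutation(len(blocks))
    return np.concatenate([blocks[i] for i in order])

rng = np.random.default_rng(0)
signal = np.arange(100)                       # toy stand-in for one EEG time series
surrogate = block_shuffle(signal, block_len=10, rng=rng)
```

Feeding `signal` as modality 1 and `surrogate` as modality 2 into the CRP analysis would then give one example of a temporally divergent pair with identical state content.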
Note that another option, which has not been carried out, is to use full simulations, with clearly specified assumptions, for scenarios 1 and 2. One way of doing this is to use a simplified (state-space) setup where you randomly simulate N spatially fixed networks that are independently switching on and off over time (i.e. "activation" is 0 or 1). Note that this would result in an N-dimensional connectome state space.
Using this, you can simulate and compute the CRPs for the two scenarios:
a. Scenario 1: where the simulated activation timecourses are set to be the same between both modalities
b. Scenario 2: where the simulated activation timecourses are simulated separately for each of the modalities
We followed the reviewer’s suggestion and have now included full simulations to address the concerns regarding the theory of the on-/off-diagonal ratio metric. As recommended, we defined a random quantized signal with N levels to represent the recurrent manifestation of N fixed connectome states. This setup was used to demonstrate the relationship between the two scenarios and the CRP observations used to adjudicate between the scenarios in our paper.
The CRP matrices in Fig. S10 provide an example illustration of this simulation. In the case where the two state timeseries are identical, there are more co-occurrences of the same state (white entries) on the diagonal than off the diagonal (left subplot). This is in line with Scenario 1, where both spatial and temporal convergence are present. Conversely, in Scenario 2, where state time courses are shuffled, co-occurrences of the same states are more dispersed, and the diagonal prominence vanishes (right subplot). This difference illustrates how the CRP reflects the presence or absence of temporal alignment, dissociating scenarios 1 and 2.
To quantitatively validate this observation, we calculated the on-/off-diagonal ratio across simulations with varying N values. For Scenario 2 (shuffled version), the ratio consistently remained close to 1, indicating the absence of temporal synchronization. In contrast, Scenario 1 (non-shuffled version) produced significantly higher ratios, exceeding 1, confirming the metric's ability to capture meaningful synchrony. These results demonstrate that the simulations successfully replicate the expected relationship between the two scenarios and the CRPs, and validate the theoretical foundation of the ratio metric under the defined assumptions.
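The logic of this simulation can be illustrated with a short sketch. It is a simplified stand-in, not the authors' code: the helper names `crp` and `on_off_ratio`, the sequence length `T`, and the number of states `N` are assumptions. States are drawn independently at each time point, and the "on-diagonal" is taken as the exact main diagonal with no lag window.

```python
import numpy as np

def crp(states_a, states_b):
    """Binary cross-recurrence matrix: entry (i, j) is True when modality A
    at time i and modality B at time j are in the same connectome state."""
    return states_a[:, None] == states_b[None, :]

def on_off_ratio(crp_matrix):
    """Recurrence rate on the main diagonal divided by the rate off it."""
    on_mask = np.eye(crp_matrix.shape[0], dtype=bool)
    on_rate = crp_matrix[on_mask].mean()
    off_rate = crp_matrix[~on_mask].mean()
    return on_rate / off_rate

rng = np.random.default_rng(0)
T, N = 2000, 5                                # time points, number of states
states = rng.integers(0, N, size=T)           # random quantized state sequence

# Scenario 1: both modalities follow the same state time course.
ratio_same = on_off_ratio(crp(states, states))

# Scenario 2: same states, but the second time course is shuffled.
ratio_shuffled = on_off_ratio(crp(states, rng.permutation(states)))

print(f"scenario 1 ratio: {ratio_same:.2f}")
print(f"scenario 2 ratio: {ratio_shuffled:.2f}")
```

In this toy setting, states recur off the diagonal in both scenarios at a rate of about 1/N, which is the situation raised in the reviewer's thought experiment. Only the on-diagonal rate differs between scenarios, so the ratio is well above 1 for identical time courses and near 1 for shuffled ones.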
Minor Concern
Leakage correction. The paper states: "To mitigate this issue, we provide results from source-localized data both with and without leakage correction (supplementary and main text, respectively)." It is great that the authors provide both. However, given that FC in EEG is almost totally dominated by spatial leakage (see Hipp paper), the main results/figures for the scalp EEG should be done using spatial leakage corrected EEG data.
Thank you. We agree that source leakage is an important consideration, which is why the current work investigated the intracranial EEG-fMRI data as a primary approach and subsequently added the scalp EEG-fMRI approach. While source leakage correction is essential for addressing spurious connectivity, it can also risk removing genuine functional connectivity that includes zero-lag relationships. We are reassured by the observation that the scalp data both without and with leakage correction confirmed the findings of the intracranial data, i.e., the presence of spatial and a lack of temporal cross-modal convergence. As such we do not believe that source leakage had a considerable impact on the specific question at hand.
Reviewer #2 (Public review):
Summary:
The study investigates the brain's functional connectivity (FC) dynamics across different timescales using simultaneous recordings of intracranial EEG/source-localized EEG and fMRI. The primary research goal was to determine which of three convergence/divergence scenarios is the most likely to occur.
The results indicate that despite similar FC patterns found in different data modalities, the timepoints were not aligned, indicating spatial convergence but temporal divergence.
The researchers also found that FC patterns in different frequencies do not overlap significantly, emphasizing the multi-frequency nature of brain connectivity. Such asynchronous activity across frequency bands supports the idea of multiple connectivity states that operate independently and are organized into a multiplex system.
Strengths:
The data supporting the authors' claims are convincing and come from simultaneous recordings of fMRI and iEEG/EEG, which has been recently developed and adapted.
The analysis methods are solid and involved a novel approach to analyzing the co-occurrence of FC patterns across modalities (cross-modal recurrence plot, CRP) and robust statistics, including replication of the main results using multiple operationalizations of the functional connectome (e.g., amplitude, orthogonalized, and phase-based coupling).
In addition, the authors provided a detailed interpretation of the results, placing them in the context of recent advances and understanding of the relationships between functional connectivity and cognitive states.
The authors also ran a control analysis and verified the effect of temporal window size and of different functional connectivity operationalizations. I also applaud their effort to make the analysis code open source.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors have answered my concerns, which are now resolved.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study presents an important finding on the alterations in the autophagic-lysosomal pathway in a Huntington's disease model. The evidence supporting the claims of the authors is convincing. The original reviewers have found most of the issues previously raised have been addressed although further suggestions are given for consideration. These comments are listed below. The work will be of interest to neuroscientists working on HD.
-
Reviewer #1 (Public review):
Summary:
Huntington's disease (HD) is characterized by the expansion of polyglutamine repeats in huntingtin protein (HTT), leading to the formation of aggresomes composed of mutant huntingtin (mHTT). This study investigates the potential therapeutic strategy of enhancing autophagy to clear mHTT. The authors' evaluation of the autophagic-lysosomal pathway (ALP) in human HD brains shows that, in early stages, there is upregulated lysosomal biogenesis and relatively normal autophagy flux, while late-stage brains exhibit impaired autolysosome clearance, suggesting that early intervention may be beneficial. The authors cross the Q175 HD knock-in model with the TRGL autophagy reporter mouse to investigate ALP dynamics in vivo. In these models, mHTT is detected in autophagic vacuoles and colocalizes with autophagy receptors p62/SQSTM1 and ubiquitin. Although ALP alterations in the Q175 model are milder and later onset compared to human HD, they do show lysosome depletion and impaired autophagic flux. Treatment with an mTOR inhibitor in 6-month-old TRGL/Q175 mice normalized lysosome numbers, alleviated aggresome pathology, and reduced mHTT, p62, and ubiquitin levels. These findings suggest that autophagy modulation during the early stages of disease progression may offer potential therapeutic interventions for HD pathology.
Strengths:
Provides supportive animal evidence that mTOR inhibition enhances autophagy and reduces toxicity in HD animal models.
Weaknesses:
Lacks animal behavior and survival rate data, particularly regarding whether the extent of motor dysfunction in TRGL/Q175 mice is comparable to that in Q175 mice and whether the administration of mTORi INK improves these symptoms.
-
Reviewer #2 (Public review):
Summary:
In this manuscript the authors have explored the beneficial effect of autophagy upregulation in the context of HD pathology in a disease stage-specific manner. The authors have observed a functional autophagy lysosomal pathway (ALP) and its machineries at the early stage in the HD mouse model, whereas impairment of ALP has been documented at the later stages of disease progression. The authors then took advantage of the operational ALP at the early stage of HD pathology to upregulate ALP and autophagy flux by inhibiting mTORC1 in vivo, which ultimately reversed multiple ALP-related abnormalities and phenotypes. Therefore, this manuscript is a promising effort to shed light on the therapeutic interventions with which HD pathology can be treated at the patient level in future.
Strengths:
The study has shown alteration of ALP in HD mouse model in a very detailed manner. Such stage dependent in vivo study will be informative and has not been done before. Also, this research provides possible therapeutic intervention in patients in future.
Weaknesses:
In this revised version of the manuscript, the authors have satisfactorily addressed all the concerns raised by the reviewers. They have also provided futuristic viewpoints towards tackling neurodegenerative disorder, especially Huntington Disease (HD).
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
This study investigates alterations in the autophagic-lysosomal pathway in the Q175 HD knock-in model crossed with the TRGL autophagy reporter mouse. The findings provide valuable insights into autophagy dynamics in HD and the potential therapeutic benefits of modulating this pathway. The study suggests that autophagy stimulation may offer therapeutic benefits in the early stages of HD progression, with mTOR inhibition showing promise in ameliorating lysosomal pathology and reducing mutant huntingtin accumulation.
However, the data raises concerns regarding the strength of the evidence. The observed changes in autophagic markers, such as autolysosome and lysosome numbers, are relatively modest, and the Western blot results do not fully match the quantitative results. These discrepancies highlight the need for further validation and more pronounced effects to strengthen the conclusions. While the study suggests the potential of autophagy regulation as a long-term therapeutic strategy, additional experiments and more reliable data are necessary to confirm the broader applicability of the TRGL/Q175 mouse model.
Furthermore, the 2004 publication by Ravikumar et al. demonstrated that inhibition of mTOR by rapamycin or the rapamycin ester CCI-779 induces autophagy and reduces the toxicity of polyglutamine expansions in fly and mouse models of Huntington's disease. mTOR is a key regulator of autophagy, and its inhibition has been explored as a therapeutic strategy for various neurodegenerative diseases, including HD. Studies suggest that inhibiting mTOR enhances autophagy, leading to the clearance of mHTT aggregates. Given that dysfunction of the autophagic-lysosomal pathway and lysosomal function in HD is already well-established, and that mTOR inhibition as a therapeutic approach for HD is also known, this study does not present entirely novel findings.
Major Concerns:
(1) In Figure 3A1 and A2, delayed and/or deficient acidification of AL causes deficits in the reformation of LY to replenish the LY pool. However, in Figure S2D, there is no difference in AL formation or substrate degradation, as shown by the Western blotting results for CTSD and CTSB. How can these discrepancies be explained?
We appreciate the reviewer raising this point, and we agree with the concern. Please note that the material used for our immunoblotting was hemibrain homogenates, containing not only neurons but also glial cells, so the results for any protein, e.g., CTSD or CTSB in Fig. S2D, represented combined signals from neurons and glial cells. Our longstanding experience with western blot analysis of autophagy pathway markers is that signals from glial cells significantly interfere with/dilute the signals from neurons. By contrast, the immunofluorescence (IF) results in Fig. 3A, obtained with the assistance of tfLC3 probe and hue angle-based AV/LY subtype analysis, revealed the in situ conditions of the AL and LY within neurons selectively, which reflects the advantage of using the in vivo neuron-specific expression of the LC3 probe combined with IF with a LY marker in this study and our other related studies (Lee, Rao et al. 2019, Lee, Yang et al. 2022) as explained in the Introduction of this paper. Please also refer to a similar discussion regarding the WB-detected protein levels of p-ATG14 in L542-547.
(2) The results demonstrate that in the brain sections of 17-month-old TRGL/Q175 mice, there was an increase in the number of acidic autolysosomes (AL), including poorly acidified autolysosomes (pa-AL), alongside a decrease in lysosome (LY) numbers. These AL/pa-AL changes were not significant in 2-month-old or 7-month-old TRGL/Q175 mice, where only a reduction in lysosome numbers was observed. This indicates that these changes, representing damage to the autophagy-lysosome pathway (ALP), manifest only at later stages of the disease. Considering that the ALP is affected predominantly in the advanced stages of the disease (e.g., at 17 months), why were 6-month-old TRGL/Q175 mice selected for oral mTORi INK treatment, and why was the treatment duration restricted to just 3 weeks?
We thank the reviewer for the comment. A key outcome measure in our evaluation of mTORi treatment was amelioration of mHTT pathology, i.e., mHTT aggregates/IBs. Before conducting the mTORi treatment experiments, we had learned from our assessments of age-associated progression of mHTT aggresomes/IBs in mice of different ages (e.g., 2-, 6-, 10- and 17-mo) that there were already severe mHTT accumulations in Q175 mice at 10 months of age (e.g., Fig. 2A). This is consistent with a previous report (Carty, Berson et al. 2015) showing that striatal mHTT inclusions dynamically increase from 4 to 8 months. From a therapeutic point of view, more aggregates in the mouse brain would make it more difficult for the autophagy machinery to clear these aggregates. Thus, the high aggregate load at 10 or 17 months may not be modifiable by the mTORi and/or may prevent reliable/sensitive measurement of mTORi-induced phenotype changes. We therefore chose to apply the treatment to younger (i.e., 6-mo-old) mice, in which the mHTT pathology was less severe and the ALP abnormality was detectable, albeit mild. Additionally, due to the 2-year funding limit for this project, there was insufficient time to generate a large set of old mice (e.g., ~18-mo) for another drug treatment experiment. In future studies, it might be worthwhile to conduct the treatment “in the advanced stages of the disease (e.g., ~18-mo)” to further examine the modification potential of the mTORi on the ALP as well as on HTT aggregation. As for the treatment duration, we were interested in an acute treatment schedule given that, in our dosing tests, we observed rapid responses to the treatment (e.g., target engagement) within a few days even with one dose, and that the 14-15-day treatments produced consistent responses (e.g., Fig. S3A).
Long-term treatment, however, would be worth testing in the future, although our current study informs a therapeutic approach suggested by others that involves intermittent/pulsatile administration of mTOR inhibitors to minimize the side effects of chronic long-term administration.
(3) Is the extent of motor dysfunction in TRGL/Q175 mice comparable to that in Q175 mice? Does the administration of mTORi INK improve these symptoms?
Unfortunately, we were unable to investigate motor function experimentally with specific assays such as open field or rotarod tests in this study (in part because the funded research period coincided with the peak of the COVID-19 pandemic in 2020). Based on our experience in handling the mice, we did not notice any obvious differences between Q175 and TRGL/Q175 mice, nor any improvements after the acute mTORi INK treatment.
(4) Why is eGFP expression not visible in Fig. 6A in TRGL-Veh mice? Additionally, why do normal (non-poly-Q) mice have fewer lysosomes (LY) than TRGL/Q175-INK mice? IHC results also show that CTSD levels are lower in TRGL mice compared to TRGL/Q175-INK mice. Does this suggest lysosome dysfunction in TRGL-Veh mice?
We appreciate the reviewer raising this point. The issue has been corrected by slightly increasing the eGFP signal in the green channel and the merged channels equally for all genotypes, and the revised Fig. 6A now shows clearer eGFP signals. The higher LY numbers/CTSD levels in TRGL/Q175-INK compared to the control TRGL-Veh mice do not necessarily imply LY dysfunction in TRGL mice; rather, they likely reflect mTORi treatment inducing LY biogenesis. Our original characterization of TRGL mice of varying ages showed that the low expression of the tgLC3 construct produces only a very small increment of total LC3, resulting in no discernible functional changes in the autophagy pathway (Lee, Rao et al. 2019). The mechanism underlying the increased LY biogenesis, e.g., TFEB activation following mTOR inhibition, remains to be investigated in future studies.
(5) In Figure 5A, the phosphorylation of ATG14 (S29) shows minimal differences in Western blotting, which appears inconsistent with the quantitative results. A similar issue is observed in the quantification of Endo-LC3.
We welcome the reviewer’s point, and therefore bands showing bigger differences of p-ATG14 (S29) have been used in the revised Fig. 5A, making the images and the quantitative results more consistent and representative. Similar changes have also been made to the Endo-LC3 data at the bottom of Fig. 5A.
(6) In Figure S2A and Figure S2B, 17-month-old TRGL/Q175 mice show a decrease in pp70S6K and the p-ULK1/ULK1 ratio, but no changes are observed in autophagy-related markers. Do these results indicate only a slight change in autophagy at this stage in TRGL/Q175 mice? Since the mTOR pathway regulates multiple cellular mechanisms, could mTOR also influence other processes? Is it possible that additional mechanisms are involved?
We completely agree with the reviewer. As mentioned in the text at multiple locations, ALP alterations in Q175 and TRGL/Q175 mice are mild even at a relatively old age (e.g., 17-mo), especially at the protein levels detected by immunoblotting. We agree that even if the mild alterations in the levels of pp70S6K (T389) and the p-ULK1/ULK1 ratio may indicate “a slight change in autophagy”, they may also imply that other cell processes are involved, given that mTOR signaling regulates multiple cellular functions. In particular, p70S6K/p-p70S6K, an mTOR substrate used as a readout for mTOR activity in this study, is a key component of the protein synthesis pathway (Wang and Proud 2006, Magnuson, Ekim et al. 2012), so its changes may serve as readouts for alterations in not only the autophagy pathway, but also the protein synthesis pathway. [A related discussion about mTOR/protein synthesis pathways, in response to a comment from Reviewer 2, has been incorporated into the text under Discussion, L633-640]
Reviewer #2 (Public review):
Summary:
In this manuscript, the authors have explored the beneficial effect of autophagy upregulation in the context of HD pathology in a disease stage-specific manner. The authors have observed a functional autophagy lysosomal pathway (ALP) and its machineries at the early stage in the HD mouse model, whereas impairment of ALP has been documented at the later stages of disease progression. The authors then took advantage of the operational ALP at the early stage of HD pathology to upregulate ALP and autophagy flux by inhibiting mTORC1 in vivo, which ultimately reversed multiple ALP-related abnormalities and phenotypes. Therefore, this manuscript is a promising effort to shed light on the therapeutic interventions with which HD pathology can be treated at the patient level in the future.
Strengths:
The study has shown the alteration of ALP in the HD mouse model in a very detailed manner. Such stage-dependent in vivo study will be informative and has not been done before. Also, this research provides possible therapeutic interventions for patients in the future.
Weaknesses:
Some constructive comments and suggestions in order to better reflect the key aspects and concepts in the manuscript:
(1) The authors have observed lysosome number alteration in a temporally regulated disease stage-specific manner. In this scenario investigation of regulation, localization, and level of TFEB, the transcription factor required for lysosome biogenesis, would be interesting and informative.
We thank the reviewer for this point and completely agree that exploring TFEB-related aspects would be interesting; these will be investigated in future studies.
(2) For the general scientific community better clarification of the short forms will be useful. For example, in line 97, page 4, AP full form would be useful. Also 'metabolized via autophagy' can be replaced by 'degraded via autophagy'.
We appreciate the reviewer for raising this point. We introduced each abbreviation at the location where the full term first appears and, for the case of “AP”, it was introduced in (previous) Line 69 when “autophagosome” first appears. We agree with the reviewer about easy reading for the general scientific community and thus we have added an Abbreviation section after the Key Words section, listing abbreviations used in this manuscript.
Also, the word “metabolized” has been replaced with “degraded” as suggested.
(3) The nuclear vs cytosolic localization of HTT aggregates shown in Figure 2, are very interesting. The increase in cytosolic HTT aggregate formation at 10 months compared to 6 months probably suggests spatio-temporal regulation of aggregate formation. The authors could comment in a more elaborate manner, on the reason and impact of this kind of regulation of aggregate formation in the context of HD pathology.
We value the reviewer’s important point. Previous studies have well documented that mHTT aggregates exist in both intranuclear and extranuclear locations in the brains of both human HD and mouse models (DiFiglia, Sapp et al. 1997, Li, Li et al. 1999, Carty, Berson et al. 2015, Peng, Wu et al. 2016, Berg, Veeranna et al. 2024). HTT can travel between the nucleus and cytoplasm and the default location for HTT is cytoplasmic, and thus the occurrence of nuclear mHTT aggregates is considered as a result of dysfunction in the nuclear exporting system for proteins (DiFiglia, Sapp et al. 1995, Gutekunst, Levey et al. 1995, Sharp, Loev et al. 1995, Cornett, Cao et al. 2005) while other factors such as phosphorylation of HTT may also affect nuclear targeting (DeGuire, Ruggeri et al. 2018). Extranuclear aggregates of mHTT usually appear later than nuclear aggregates and develop more aggressively in terms of numbers and pace after their appearance (Li, Li et al. 1999, Carty, Berson et al. 2015, Landles, Milton et al. 2020). The fact that there are neurons containing extranuclear aggregates without having nuclear aggregates within the same cells (Carty, Berson et al. 2015) does not support a nuclear-cytoplasmic sequence for aggregate formation, implying different mechanisms controlling the formation of these two types of aggregates. It was reported that there were no significant differences in toxicity associated with the presence of nuclear compared with extranuclear aggregates (Hackam, Singaraja et al. 1999), while other studies have proposed that nuclear aggregates correlate with transcriptional dysfunction while extranuclear aggregates may impair neuronal communication and can track disease progression (Li, Li et al. 1999, Benn, Landles et al. 2005, Landles, Milton et al. 2020). Thus, the observation of a higher level of extranuclear mHTT aggregates at 10-mo compared to 6-mo from the present study is consistent with previous findings mentioned above. 
In addition, our EM observations of homogenous granular/short fine fibril ultrastructure of both nuclear and extranuclear aggregates are consistent with findings from mouse model studies (Davies, Turmaine et al. 1997, Scherzinger, Lurz et al. 1997), which, interestingly, is different from in vitro studies where nuclear aggregates exhibited a core and shell structure but extranuclear aggregates did not possess the shell (Riguet, Mahul-Mellier et al. 2021), reflecting differences between in vivo and in vitro conditions. Taken together, even if efforts have been made in this and previous studies in trying to understand the differences between nuclear and extranuclear aggregates, the mechanisms regarding the spatial-temporal regulation of aggregate formation have so far not been fully revealed which will require additional investigations.
(4) In this manuscript, the authors have convincingly shown that mTOR inhibition is inducing autophagy in the HD mouse model in vivo. On the other hand, mTOR inhibition would also reduce overall cellular protein translation. This aspect of mTOR inhibition can also potentially contribute to the alleviation of disease phenotype and disease symptoms by reducing protein overload in HD pathology. The authors' comments regarding this aspect would be appreciated.
We recognize the value of the reviewer’s point, with which we completely agree. Lowering mHTT by interfering with protein translation (e.g., through RNAi or antisense oligonucleotides) has been an attractive strategy in HD therapeutic development (Kordasiewicz, Stanek et al. 2012, Tabrizi, Ghosh et al. 2019). As mentioned above, mTOR regulates multiple cellular pathways including protein synthesis, and inhibition of mTOR as done in the present study could potentially affect protein synthesis as well. While the decreases in mHTT signals that we observed (Fig. 7) can be interpreted as a result of autophagy-mediated clearance of mHTT, we cannot exclude the possibility that mTOR inhibition also reduces HTT production, which may contribute to the observed results. Future studies should determine how significant such a contribution is. [The above description has been incorporated into the text under Discussion, L633-640]
(5) The authors have shown nuclear inclusion formation and aggregation of mHTT and also commented on its potential removal with the UPS system (proteasomal degradation) in vivo. As there is also a reciprocal relationship present between autophagy and proteasomal machineries, upon upregulation of autophagy machinery by mTOR inhibition proteasomal activity may decrease. How nuclear proteasomal activity increases to tackle nuclear mHTT IBs, would be interesting to understand in the context of HD pathology. Comments from the authors in this aspect would clarify the role of multiple degradation pathways in handling mutant HTT protein in HD pathology.
We appreciate the reviewer raising this point. We agree that there are reciprocal relationships between autophagy and the UPS (Korolchuk, Menzies et al. 2010, Park and Cuervo 2013). In general, failure in one pathway would lead to compensatory upregulation of the other pathway, and vice versa (Lee, Park et al. 2019). So, as the reviewer pointed out, “upon upregulation of autophagy machinery by mTOR inhibition proteasomal activity may decrease”. We nevertheless proposed in the Discussion that “It is possible that stimulation of autophagy is reducing the mHTT in the cytoplasm and thereby partially relieves the burden of the proteasome both in the cytoplasm and in the nucleus so that the nuclear proteasome operates more effectively”, which is inconsistent with the general expectation of decreased UPS activity. Please note, however, that there are also instances where the two pathways act in the same direction, e.g., autophagy inhibition disturbs UPS degradative function (Korolchuk, Mansilla et al. 2009, Park and Cuervo 2013). In any case, our statement is speculative and requires verification with additional experiments in the future. One observation reported here that may support this speculation is the reduction of the AV-non-associated form of mHTT/p62/Ub (Fig. 7B3): some of these species might exist within the nucleus, and their reduced levels may reflect increased intranuclear UPS activity, besides the other possibility, already discussed in the text, that they travel from the nucleus to the cytosol for clearance. [The last sentence has been incorporated into the text under Discussion, L628-632]
(6) For the treatment of neurodegenerative disorders taking the temporal regulation into consideration is extremely important, as that will determine the success rate of the treatments in patients. The authors in this manuscript have clearly discussed this scenario. However, for neurodegenerative disordered patients, in most cases, the symptom manifestation is a late onset scenario. In that case, it will be complicated to initiate an early treatment regime in HD patients. If the authors can comment on and discuss the practicality of the early treatment regime for therapeutic purposes that would be impactful.
We appreciate the reviewer raising this point and we agree with the main concern that “for neurodegenerative disordered patients, in most cases, the symptom manifestation is a late onset scenario.” This is a common challenge in developing therapies for neurodegenerative diseases. It should first be noted that the current study is an experimental therapeutic attempt in a mouse model which, consistent with previous reports (Ravikumar, Vacher et al. 2004), serves as a proof of concept for manipulating autophagy (i.e., via inhibiting mTOR in the current setting) as a potential therapy; its clinical practicality requires further verification. Moreover, in our opinion, early diagnosis (e.g., genetic testing in individuals at higher risk for HD) may be key to overcoming the above challenge, i.e., if early diagnosis is enabled, earlier intervention becomes possible. [The above description has been incorporated into the text under Discussion, L654-659]
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Minor concerns:
(1) Figures 1 and 2 should indicate the number of sections and mice/genotypes.
Thanks for the suggestion; the information has been added to the figure legends.
(2) Figure 3A2 should explain how AP, AL, pa-AL, and LY are quantified.
Thanks for raising this point. Please note that AP, AL, pa-AL and LY were quantified by the hue angle-based analysis described under “Confocal image collection and hue angle-based quantitative analysis for AV/LY subtypes” in the Materials and Methods. The phrase “(see the Materials and Methods)” has been added after the existing description “Hue angle-based analysis was performed for AV/LY subtype determination using the methods described in Lee et al., 2019” in the figure legend.
References
Benn, C. L., C. Landles, H. Li, A. D. Strand, B. Woodman, K. Sathasivam, S. H. Li, S. Ghazi-Noori, E. Hockly, S. M. Faruque, J. H. Cha, P. T. Sharpe, J. M. Olson, X. J. Li and G. P. Bates (2005). "Contribution of nuclear and extranuclear polyQ to neurological phenotypes in mouse models of Huntington's disease." Hum Mol Genet 14(20): 3065-3078.
Berg, M. J., Veeranna, C. M. Rosa, A. Kumar, P. S. Mohan, P. Stavrides, D. M. Marchionini, D. S. Yang and R. A. Nixon (2024). "Pathobiology of the autophagy-lysosomal pathway in the Huntington’s disease brain." bioRxiv: 2024.05.29.596470.
Carty, N., N. Berson, K. Tillack, C. Thiede, D. Scholz, K. Kottig, Y. Sedaghat, C. Gabrysiak, G. Yohrling, H. von der Kammer, A. Ebneth, V. Mack, I. Munoz-Sanjuan and S. Kwak (2015). "Characterization of HTT inclusion size, location, and timing in the zQ175 mouse model of Huntington's disease: an in vivo high-content imaging study." PLoS One 10(4): e0123527.
Cornett, J., F. Cao, C. E. Wang, C. A. Ross, G. P. Bates, S. H. Li and X. J. Li (2005). "Polyglutamine expansion of huntingtin impairs its nuclear export." Nat Genet 37(2): 198-204.
Davies, S. W., M. Turmaine, B. A. Cozens, M. DiFiglia, A. H. Sharp, C. A. Ross, E. Scherzinger, E. E. Wanker, L. Mangiarini and G. P. Bates (1997). "Formation of neuronal intranuclear inclusions underlies the neurological dysfunction in mice transgenic for the HD mutation." Cell 90(3): 537-548.
DeGuire, S. M., F. S. Ruggeri, M. B. Fares, A. Chiki, U. Cendrowska, G. Dietler and H. A. Lashuel (2018). "N-terminal Huntingtin (Htt) phosphorylation is a molecular switch regulating Htt aggregation, helical conformation, internalization, and nuclear targeting." J Biol Chem 293(48): 18540-18558.
DiFiglia, M., E. Sapp, K. Chase, C. Schwarz, A. Meloni, C. Young, E. Martin, J. P. Vonsattel, R. Carraway, S. A. Reeves et al. (1995). "Huntingtin is a cytoplasmic protein associated with vesicles in human and rat brain neurons." Neuron 14(5): 1075-1081.
DiFiglia, M., E. Sapp, K. O. Chase, S. W. Davies, G. P. Bates, J. P. Vonsattel and N. Aronin (1997). "Aggregation of huntingtin in neuronal intranuclear inclusions and dystrophic neurites in brain." Science 277(5334): 1990-1993.
Gutekunst, C. A., A. I. Levey, C. J. Heilman, W. L. Whaley, H. Yi, N. R. Nash, H. D. Rees, J. J. Madden and S. M. Hersch (1995). "Identification and localization of huntingtin in brain and human lymphoblastoid cell lines with anti-fusion protein antibodies." Proc Natl Acad Sci U S A 92(19): 8710-8714.
Hackam, A. S., R. Singaraja, T. Zhang, L. Gan and M. R. Hayden (1999). "In vitro evidence for both the nucleus and cytoplasm as subcellular sites of pathogenesis in Huntington's disease." Hum Mol Genet 8(1): 25-33.
Kordasiewicz, H. B., L. M. Stanek, E. V. Wancewicz, C. Mazur, M. M. McAlonis, K. A. Pytel, J. W. Artates, A. Weiss, S. H. Cheng, L. S. Shihabuddin, G. Hung, C. F. Bennett and D. W. Cleveland (2012). "Sustained therapeutic reversal of Huntington's disease by transient repression of huntingtin synthesis." Neuron 74(6): 1031-1044.
Korolchuk, V. I., A. Mansilla, F. M. Menzies and D. C. Rubinsztein (2009). "Autophagy inhibition compromises degradation of ubiquitin-proteasome pathway substrates." Mol Cell 33(4): 517-527.
Korolchuk, V. I., F. M. Menzies and D. C. Rubinsztein (2010). "Mechanisms of cross-talk between the ubiquitin-proteasome and autophagy-lysosome systems." FEBS Lett 584(7): 1393-1398.
Landles, C., R. E. Milton, N. Ali, R. Flomen, M. Flower, F. Schindler, C. Gomez-Paredes, M. K. Bondulich, G. F. Osborne, D. Goodwin, G. Salsbury, C. L. Benn, K. Sathasivam, E. J. Smith, S. J. Tabrizi, E. E. Wanker and G. P. Bates (2020). "Subcellular localization and formation of huntingtin aggregates correlates with symptom onset and progression in a Huntington's disease model." Brain Commun 2(2): fcaa066.
Lee, J. H., S. Park, E. Kim and M. J. Lee (2019). "Negative-feedback coordination between proteasomal activity and autophagic flux." Autophagy 15(4): 726-728.
Lee, J. H., M. V. Rao, D. S. Yang, P. Stavrides, E. Im, A. Pensalfini, C. Huo, P. Sarkar, T. Yoshimori and R. A. Nixon (2019). "Transgenic expression of a ratiometric autophagy probe specifically in neurons enables the interrogation of brain autophagy in vivo." Autophagy 15(3): 543-557.
Lee, J. H., D. S. Yang, C. N. Goulbourne, E. Im, P. Stavrides, A. Pensalfini, H. Chan, C. Bouchet-Marquis, C. Bleiwas, M. J. Berg, C. Huo, J. Peddy, M. Pawlik, E. Levy, M. Rao, M. Staufenbiel and R. A. Nixon (2022). "Faulty autolysosome acidification in Alzheimer's disease mouse models induces autophagic build-up of Abeta in neurons, yielding senile plaques." Nat Neurosci 25(6): 688-701.
Li, H., S. H. Li, A. L. Cheng, L. Mangiarini, G. P. Bates and X. J. Li (1999). "Ultrastructural localization and progressive formation of neuropil aggregates in Huntington's disease transgenic mice." Hum Mol Genet 8(7): 1227-1236.
Magnuson, B., B. Ekim and D. C. Fingar (2012). "Regulation and function of ribosomal protein S6 kinase (S6K) within mTOR signalling networks." Biochem J 441(1): 1-21.
Park, C. and A. M. Cuervo (2013). "Selective autophagy: talking with the UPS." Cell Biochem Biophys 67(1): 3-13.
Peng, Q., B. Wu, M. Jiang, J. Jin, Z. Hou, J. Zheng, J. Zhang and W. Duan (2016). "Characterization of Behavioral, Neuropathological, Brain Metabolic and Key Molecular Changes in zQ175 Knock-In Mouse Model of Huntington's Disease." PLoS One 11(2): e0148839.
Ravikumar, B., C. Vacher, Z. Berger, J. E. Davies, S. Luo, L. G. Oroz, F. Scaravilli, D. F. Easton, R. Duden, C. J. O'Kane and D. C. Rubinsztein (2004). "Inhibition of mTOR induces autophagy and reduces toxicity of polyglutamine expansions in fly and mouse models of Huntington disease." Nat Genet 36(6): 585-595.
Riguet, N., A. L. Mahul-Mellier, N. Maharjan, J. Burtscher, M. Croisier, G. Knott, J. Hastings, A. Patin, V. Reiterer, H. Farhan, S. Nasarov and H. A. Lashuel (2021). "Nuclear and cytoplasmic huntingtin inclusions exhibit distinct biochemical composition, interactome and ultrastructural properties." Nat Commun 12(1): 6579.
Scherzinger, E., R. Lurz, M. Turmaine, L. Mangiarini, B. Hollenbach, R. Hasenbank, G. P. Bates, S. W. Davies, H. Lehrach and E. E. Wanker (1997). "Huntingtin-encoded polyglutamine expansions form amyloid-like protein aggregates in vitro and in vivo." Cell 90(3): 549-558.
Sharp, A. H., S. J. Loev, G. Schilling, S. H. Li, X. J. Li, J. Bao, M. V. Wagster, J. A. Kotzuk, J. P. Steiner, A. Lo et al. (1995). "Widespread expression of Huntington's disease gene (IT15) protein product." Neuron 14(5): 1065-1074.
Tabrizi, S. J., R. Ghosh and B. R. Leavitt (2019). "Huntingtin Lowering Strategies for Disease Modification in Huntington's Disease." Neuron 101(5): 801-819.
Wang, X. and C. G. Proud (2006). "The mTOR pathway in the control of protein synthesis." Physiology (Bethesda) 21: 362-369.
-
-
www.biorxiv.org
-
eLife Assessment
This is an important study demonstrating that cholecystokinin is a key modulator of auditory thalamocortical plasticity during development and in young adult but not aged mice, though cortical application of this neuropeptide in older animals appears to go some way to restoring this age-dependent loss in plasticity. A strength of this work is the use of multiple experimental approaches, which together provide convincing support for the proposed involvement of cholecystokinin. While there are some limitations in the electrophysiological recordings and areas where the manuscript would benefit from further clarification, this work is likely to be influential in opening up a new avenue of investigation into the roles of neuropeptides in sensory plasticity.
-
Reviewer #1 (Public review):
This report addresses a compelling topic. However, I have significant concerns, which necessitate a reassessment of the report's overall value.
Anatomical Specificity and Stimulation Site:
While the authors clarify that the ventral MGB (MGv) was the intended stimulation target, the electrode track (Fig. 1A) and viral spread (Fig. 2E) suggest possible involvement of the dorsal MGB (MGd) and a broader area. Given that the MGv-AI and MGd-AC pathways have distinct, and sometimes opposing, effects on plasticity, the reported LTP values (with unusually small standard deviations) raise concerns about the specificity of the findings. Additional anatomical verification would help resolve this issue.
Statistical Rigor and Data Variability:
The remarkably low standard deviations in LTP measurements are unexpected based on established variability in thalamocortical plasticity. The authors' response confirms these values are accurate, but further justification, such as methodological controls or replication, would bolster confidence in these results. Additionally, the comparison of in vivo vs. in vitro LTP variability requires more substantive support.
Viral Targeting and Specificity:
The manuscript does not clearly address whether cortical neurons were inadvertently infected by AAV9. Given the potential for off-target effects, explicit confirmation (e.g., a photomicrograph of the stimulation site) would strengthen the study's conclusions.
Integration of Prior Literature:
The discussion of existing work is adequate but could be more comprehensive. A deeper engagement with contrasting findings would provide better context for the study's contributions.
Therapeutic Implications:
The authors' discussion of therapeutic potential is now appropriately cautious and well-reasoned.
Conclusion:
While the study presents intriguing findings, the concerns outlined above must be addressed to fully establish the validity and impact of the results. I appreciate the authors' efforts thus far and hope they can provide additional data or clarification to resolve these issues. With these revisions, the manuscript could make a valuable contribution to the field.
-
Reviewer #2 (Public review):
Summary:
This work used multiple approaches to show that CCK is critical for long-term potentiation (LTP) in the auditory thalamocortical pathway. They also showed that the CCK mediation of LTP is age-dependent and supports frequency discrimination. This work is important because it opens up a new avenue of investigation of the roles of neuropeptides in sensory plasticity.
Strengths:
The main strength is the multiple approaches used to comprehensively examine the role of CCK in auditory thalamocortical LTP. Thus, the authors do provide a compelling set of data that CCK mediates thalamocortical LTP in an age-dependent manner.
Weaknesses:
There are some details that should be addressed, primarily regarding potential baseline differences in comparison groups. The behavioral assessment is relatively limited, but may be fleshed out in future work.
-
Reviewer #3 (Public review):
Summary:
Cholecystokinin (CCK) is highly expressed in auditory thalamocortical (MGB) neurons and CCK has been found to shape cortical plasticity dynamics. In order to understand how CCK shapes synaptic plasticity in the auditory thalamocortical pathway, the authors assessed the role of CCK signaling across multiple mechanisms of LTP induction in the auditory thalamocortical (MGB - layer IV Auditory Cortex) circuit in mice. In these physiology experiments that leverage multiple mechanisms of LTP induction and a rigorous manipulation of CCK and CCK-dependent signaling, they establish an essential dependence of auditory thalamocortical LTP on the co-release of CCK from auditory thalamic neurons. By carefully assessing the development of this plasticity over time and CCK expression, they go on to identify a window of time during which CCK is produced throughout early and middle adulthood in auditory thalamocortical neurons, establishing a window for plasticity from 3 weeks to 1.5 years in mice, with limited LTP occurring outside of this window. The authors go on to show that CCK signaling and its effect on LTP in the auditory cortex is also capable of modifying frequency discrimination accuracy in an auditory PPI task. In evaluating the impact of CCK on modulating PPI task performance, it also seems that in mice <1.5 years old CCK-dependent effects on cortical plasticity are almost saturated. While exogenous CCK can modestly improve discrimination of only very similar tones, exogenous focal delivery of CCK in older mice can significantly improve learning in a PPI task to bring their discrimination ability in line with that of young adult mice.
Strengths:
(1) The clarity of the results, along with the rigorous multi-angled approach, provides significant support for the claim that CCK is essential for auditory thalamocortical synaptic LTP. This approach uses a combination of electrical, acoustic, and optogenetic pathway stimulation alongside conditional expression approaches, germline knockout, viral RNA downregulation and pharmacological blockade. Through the combination of these experimental configurations the authors demonstrate that high-frequency stimulation-induced LTP is reliant on co-release of CCK from glutamatergic MGB terminals projecting to the auditory cortex.
(2) The careful analysis of the CCK, CCKB receptor, and LTP expression is also a strength that puts the finding into the context of mechanistic causes and potential therapies for age-dependent sensory/auditory processing changes. Similarly, not only do these data identify a fundamental biological mechanism, but they also provide support for the idea that exogenous asynchronous stimulation of the CCKBR is capable of restoring an age-dependent loss in plasticity.
(3) Although experiments to simultaneously relate LTP and behavioral change or identify a causal relationship between LTP and frequency discrimination are not made, there is still convincing evidence that CCK signaling in the auditory cortex (known to determine synaptic LTP) is important for auditory processing/frequency discrimination. These experiments are key for establishing the relevance of this mechanism.
Weaknesses:
(1) Given the magnitude of the evoked responses, one expects that pyramidal neurons in layer IV are primarily those that undergo CCK-dependent plasticity, but the degree to which PV-interneurons and pyramidal neurons participate in this process differently is unclear.
(2) While these data support an important role for CCK in synaptic LTP in the auditory thalamocortical pathway, perhaps temporal processing of acoustic stimuli is as or more important than frequency discrimination. Given the enhanced responsivity of the system, it is unclear whether this mechanism would improve or reduce the fidelity of temporal processing in this circuit. Understanding this dynamic may also require consideration of cell type as raised in weakness #1.
(3) In Figure 1, an example of increased spontaneous and evoked firing activity of single neurons after HFS is provided. Yet it is surprising that the group data are analyzed only for the fEPSP. It seems that single neuron data would also be useful at this point to provide insight into how CCK and HFS affect temporal processing and spontaneous activity/excitability, especially given the example in 1F.
(4) The circuitry that determines PPI requires multiple brain areas, including the auditory cortex. Given the complicated dynamics of this process, it may be helpful to consider what, if anything, is known specifically about how layer IV synaptic plasticity in the auditory cortex may shape this behavior.
Comments on revisions:
The manuscript is much improved and many of the issues or questions have been addressed. Ideally, evidence for the degree of transsynaptic spread for AAV9-Syn-ChrimsonR-tdTomato would also be provided in some form, since in the authors' response it sounds like some was observed, as expected.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
This study offers a valuable investigation into the role of cholecystokinin (CCK) in thalamocortical plasticity during early development and adulthood, employing a range of experimental techniques. The authors demonstrate that tetanic stimulation of the auditory thalamus induces cortical long-term potentiation (LTP), which can be evoked through either electrical or optical stimulation of the thalamus or by noise bursts. They further show that thalamocortical LTP is abolished when thalamic CCK is knocked down or when cortical CCK receptors are blocked. Interestingly, in 18-month-old mice, thalamocortical LTP was largely absent but could be restored through the cortical application of CCK. The authors conclude that CCK contributes to thalamocortical plasticity and may enhance thalamocortical plasticity in aged subjects.
While the study presents compelling evidence, I would like to offer several suggestions for the authors' consideration:
(1) Thalamocortical LTP and NMDA-Dependence:
It is well established that thalamocortical LTP is NMDA receptor-dependent, and blocking cortical NMDA receptors can abolish LTP. This raises the question of why thalamocortical LTP is eliminated when thalamic CCK is knocked down or when cortical CCK receptors are blocked. If I correctly understand the authors' hypothesis, that CCK promotes LTP through a CCKR-intracellular Ca2+-AMPAR pathway, then this pathway should not directly interfere with the NMDA-dependent mechanism. A clearer explanation of this interaction would be beneficial.
Thank you for your question regarding the role of CCK and NMDA receptors (NMDARs) in thalamocortical LTP. We propose that CCK receptor (CCKR) activation enhances intracellular calcium levels, which are crucial for thalamocortical LTP induction. Calcium influx through NMDARs is also essential to reach the threshold required for activating downstream signaling pathways that promote LTP (Heynen and Bear, 2001). Thus, CCKRs and NMDARs may function in a complementary manner to facilitate LTP, with both contributing to the elevation of intracellular calcium.
However, it is important to note that the postsynaptic mechanisms of thalamocortical LTP in the auditory cortex (ACx) differ from those in other sensory cortices. Studies have shown that thalamocortical LTP in the ACx appears to be less dependent on NMDARs (Chun et al., 2013) than in the somatosensory or visual cortices. Our previous studies also found that while NMDAR antagonists can block HFS-induced LTP in the inner ACx, LTP can still be induced in the presence of CCK even after NMDAR blockade (Chen et al. 2019). These findings suggest that CCK may act through an alternative mechanism involving CCKR-mediated calcium signaling and AMPAR modulation, which partially compensates for the loss of NMDAR signaling. This distinction may reflect functional differences between the ACx and other sensory cortices, as highlighted in previous studies (King and Nelken, 2009).
While our current study focuses on the role of CCKR-mediated plasticity in the auditory system, further investigations are needed to elucidate how CCKRs and NMDARs interact within the broader framework of thalamocortical neuroplasticity across different cortical regions. Understanding whether similar mechanisms operate in other sensory systems, such as the visual cortex, will be an important direction for future research.
Heynen, A.J., and Bear, M.F. (2001). Long-term potentiation of thalamocortical transmission in the adult visual cortex in vivo. J Neurosci 21, 9801-9813. 10.1523/jneurosci.21-24-09801.2001.
Chun, S., Bayazitov, I.T., Blundon, J.A., and Zakharenko, S.S. (2013). Thalamocortical Long-Term Potentiation Becomes Gated after the Early Critical Period in the Auditory Cortex. The Journal of Neuroscience 33, 7345-7357. 10.1523/jneurosci.4500-12.2013.
Chen, X., Li, X., Wong, Y.T., Zheng, X., Wang, H., Peng, Y., Feng, H., Feng, J., Baibado, J.T., Jesky, R., et al. (2019). Cholecystokinin release triggered by NMDA receptors produces LTP and sound-sound associative memory. Proc Natl Acad Sci U S A 116, 6397-6406. 10.1073/pnas.1816833116.
King, A.J., and Nelken, I. (2009). Unraveling the principles of auditory cortical processing: can we learn from the visual system? Nat Neurosci 12, 698-701.
(2) Complexity of the Thalamocortical System:
The thalamocortical system is intricate, with different cortical and thalamic subdivisions serving distinct functions. In this study, it is not fully clear which subdivisions were targeted for stimulation and recording, which could significantly influence the interpretation of the findings. Clarifying this aspect would enhance the study's robustness.
Thank you for your valuable feedback. We would like to clarify that stimulation was conducted in the ventral division of the medial geniculate nucleus (MGv), and recording was performed in layer IV of the ACx. Targeting the MGv allows us to investigate the influence of thalamic inputs on auditory cortical responses. Layer IV of the ACx is known to receive direct thalamic projections, making it an ideal site for assessing how thalamic activity influences cortical processing. We will incorporate this clarification into the revised manuscript to enhance the robustness of our study.
Results section:
“Stimulation electrodes were placed in the MGB (specifically in the medial geniculate nucleus ventral subdivision, MGv), and recording electrodes were inserted into layer IV of ACx”
“The recording electrodes were lowered into layer IV of ACx, while the stimulation electrodes were lowered into MGB (MGv subdivision). The final stimulating and recording positions were determined by maximizing the cortical fEPSP amplitude triggered by the ES in the MGB. The accuracy of electrode placement was verified through post-hoc histological examination and electrophysiological responses.”
(3) Statistical Variability:
Biological data, including field excitatory postsynaptic potentials (fEPSPs) and LTP, often exhibit significant variability between samples, sometimes resulting in a standard deviation that exceeds 50% of the mean value. The reported standard deviation of LTP in this study, however, appears unusually small, particularly given the relatively limited sample size. Further discussion of this observation might be warranted.
Thank you for your question. In our experiments, the sample size N represents the number of animals used, while n refers to the number of recordings, with each recording corresponding to a distinct pair of stimulation and recording sites. To adhere to ethical guidelines and minimize animal usage, we often perform multiple recordings within a single animal, such as from different hemispheres of the brain. Although N may appear small, our statistical analyses are based on n, ensuring sufficient data points for reliable conclusions.
Furthermore, as our experiments are conducted in vivo, we observe lower variability in the increase of fEPSP slopes following LTP induction compared to brain slice preparations, where standard deviations exceeding 50% of the mean are common. This reduced variability likely reflects the robustness of the physiologically intact conditions in the in vivo setup.
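The distinction between N (animals) and n (recordings) in the response above can be made concrete with a short sketch. It computes the coefficient of variation (SD as a percentage of the mean) for post-induction fEPSP slopes, once over all n recordings and once over per-animal means. The `recordings` mapping, the function names, and every number in it are invented for illustration; none of them are data or code from the study.

```python
from statistics import mean, stdev

# Hypothetical post-HFS fEPSP slopes (% of baseline), keyed by animal ID.
# These values are made up for illustration; they are NOT from the study.
recordings = {
    "mouse_1": [150.0, 160.0],
    "mouse_2": [140.0, 155.0],
    "mouse_3": [165.0, 170.0],
}

def coefficient_of_variation(values):
    """Return the sample SD expressed as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)

def cv_per_recording(data):
    """CV computed over every recording (sample size n)."""
    pooled = [v for slopes in data.values() for v in slopes]
    return coefficient_of_variation(pooled)

def cv_per_animal(data):
    """CV computed over per-animal means (sample size N)."""
    animal_means = [mean(slopes) for slopes in data.values()]
    return coefficient_of_variation(animal_means)

n = sum(len(slopes) for slopes in recordings.values())
N = len(recordings)
print(f"n = {n} recordings, CV = {cv_per_recording(recordings):.1f}%")
print(f"N = {N} animals,    CV = {cv_per_animal(recordings):.1f}%")
```

Reporting both values alongside N and n would let a reader compare the observed spread with the "SD above 50% of the mean" benchmark raised by the reviewer, under whichever unit of analysis they consider appropriate.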
(4) EYFP Expression and Virus Targeting:
The authors indicate that AAV9-EFIa-ChETA-EYFP was injected into the medial geniculate body (MGB) and subsequently expressed in both the MGB and cortex. If I understand correctly, the authors assume that cortical expression represents thalamocortical terminals rather than cortical neurons. However, co-expression of CCK receptors does not necessarily imply that the virus selectively infected thalamocortical terminals. The physiological data regarding cortical activation of thalamocortical terminals could be questioned if the cortical expression represents cortical neurons or both cortical neurons and thalamocortical terminals.
Thank you for your question. In Figure 2A, EYFP expression indicates thalamocortical projections, while the co-expression of EYFP with PSD95 confirms the identity of thalamocortical terminals. The CCK-B receptors (CCKBR) are located on postsynaptic cortical neurons. The observed co-labeling of thalamocortical terminals and postsynaptic CCKBR suggests that CCK-expressing neurons in the medial geniculate body (MGB) can release CCK, which subsequently acts on the postsynaptic CCKBR. This evidence supports our interpretation of the functional role of CCK in modulating neural plasticity between thalamocortical inputs and cortical neurons. With Figure 2A, we aim to demonstrate that a substantial proportion of thalamocortical terminals are co-labeled with CCK receptors. We will ensure that this clarification is emphasized in the revised manuscript to address your concerns.
Results section:
“Cre-dependent AAV9-EFIa-DIO-ChETA-EYFP was injected into the MGB of CCK-Cre mice. EYFP labeling marked CCK-positive neurons in the MGB. The co-expression of EYFP thalamocortical projections with PSD95 confirms the identity of thalamocortical terminals (yellow), which primarily targeted layer IV of the ACx (Figure 2A, upper panel). Immunohistochemistry revealed that a substantial proportion (15 out of 19, Figure 2A lower right panel) of thalamocortical terminals (arrows) colocalize with CCK receptors (CCKBR) on postsynaptic cortical neurons in the ACx (Figure 2A lower panel), supporting the functional role of CCK in modulating thalamocortical plasticity.”
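As a rough aid to reading the "15 out of 19" figure quoted above, the count can be expressed as a percentage with an approximate confidence interval. The sketch below uses the Wilson score interval, which is our choice for illustration and not an analysis reported in the manuscript; the function name `wilson_interval` and the 95% level (z = 1.96) are likewise assumptions.

```python
from math import sqrt

def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half_width, center + half_width

# Counts quoted in the text: 15 of 19 terminals colocalized with CCKBR.
colocalized, counted = 15, 19
low, high = wilson_interval(colocalized, counted)
print(f"{colocalized}/{counted} = {100 * colocalized / counted:.1f}% "
      f"(approx. 95% CI {100 * low:.1f}% to {100 * high:.1f}%)")
```

With only 19 terminals counted, the interval is wide, which is worth keeping in mind when interpreting "a substantial proportion".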
(5) Consideration of Previous Literature:
A number of studies have thoroughly characterized auditory thalamocortical LTP during early development and adulthood. It may be beneficial for the authors to integrate insights from this body of work, as reliance on data from the somatosensory thalamocortical system might not fully capture the nuances of the auditory pathway. A more comprehensive discussion of the relevant literature could enhance the study's context and impact.
Thank you for your valuable feedback. We will enhance our discussion on auditory thalamocortical LTP during early development and adulthood to provide a more comprehensive context for our study.
(6) Therapeutic Implications:
While the authors suggest potential therapeutic applications of their findings, it may be somewhat premature to draw such conclusions based on the current evidence. Although speculative discussion is not harmful, it may not significantly add to the study's conclusions at this stage.
Thank you for your thoughtful feedback. We agree that the therapeutic applications mentioned in our study are speculative at this stage and should be regarded as a forward-looking perspective rather than definitive conclusions. Our intention was to highlight the broader potential of our findings to inspire further research, rather than to propose immediate clinical applications.
In light of your feedback, we have adjusted the language in the manuscript to reflect a more cautious interpretation. Speculative discussions are now explicitly framed as hypotheses or possibilities for future exploration. We emphasize that our findings provide a foundation for further investigations into CCK-based plasticity and its implications.
We believe that appropriately framed forward-thinking discussions are valuable in guiding the direction of future research. We sincerely hope that our current and future work will contribute to a deeper understanding of thalamocortical plasticity and, over time, potentially lead to advancements in human health.
Reviewer #2 (Public review):
Summary:
This work used multiple approaches to show that CCK is critical for long-term potentiation (LTP) in the auditory thalamocortical pathway. They also showed that the CCK mediation of LTP is age-dependent and supports frequency discrimination. This work is important because it opens up a new avenue of investigation of the roles of neuropeptides in sensory plasticity.
Strengths:
The main strength is the multiple approaches used to comprehensively examine the role of CCK in auditory thalamocortical LTP. Thus, the authors do provide a compelling set of data that CCK mediates thalamocortical LTP in an age-dependent manner.
Weaknesses:
The behavioral assessment is relatively limited but may be fleshed out in future work.
Reviewer #3 (Public review):
Summary:
Cholecystokinin (CCK) is highly expressed in auditory thalamocortical (MGB) neurons and CCK has been found to shape cortical plasticity dynamics. In order to understand how CCK shapes synaptic plasticity in the auditory thalamocortical pathway, the authors assessed the role of CCK signaling across multiple mechanisms of LTP induction in the auditory thalamocortical (MGB - layer IV Auditory Cortex) circuit in mice. In these physiology experiments that leverage multiple mechanisms of LTP induction and a rigorous manipulation of CCK and CCK-dependent signaling, they establish an essential dependence of auditory thalamocortical LTP on the co-release of CCK from auditory thalamic neurons. By carefully assessing the development of this plasticity over time and CCK expression, they go on to identify a window of time during which CCK is produced throughout early and middle adulthood in auditory thalamocortical neurons, establishing a window for plasticity from 3 weeks to 1.5 years in mice, with limited LTP occurring outside of this window. The authors go on to show that CCK signaling and its effect on LTP in the auditory cortex is also capable of modifying frequency discrimination accuracy in an auditory PPI task. In evaluating the impact of CCK on modulating PPI task performance, it also seems that in mice <1.5 years old CCK-dependent effects on cortical plasticity are almost saturated. While exogenous CCK can modestly improve discrimination of only very similar tones, exogenous focal delivery of CCK in older mice can significantly improve learning in a PPI task to bring their discrimination ability in line with that of young adult mice.
Strengths:
(1) The clarity of the results along with the rigorous multi-angled approach provide significant support for the claim that CCK is essential for auditory thalamocortical synaptic LTP. This approach uses a combination of electrical, acoustic, and optogenetic pathway stimulation alongside conditional expression approaches, germline knockout, viral RNA downregulation, and pharmacological blockade. Through the combination of these experimental configurations, the authors demonstrate that high-frequency stimulation-induced LTP is reliant on co-release of CCK from glutamatergic MGB terminals projecting to the auditory cortex.
(2) The careful analysis of the CCK, CCKB receptor, and LTP expression is also a strength that puts the finding into the context of mechanistic causes and potential therapies for age-dependent sensory/auditory processing changes. Similarly, not only do these data identify a fundamental biological mechanism, but they also provide support for the idea that exogenous asynchronous stimulation of the CCKBR is capable of restoring an age-dependent loss in plasticity.
(3) Although experiments to simultaneously relate LTP and behavioral change or identify a causal relationship between LTP and frequency discrimination are not made, there is still convincing evidence that CCK signaling in the auditory cortex (known to determine synaptic LTP) is important for auditory processing/frequency discrimination. These experiments are key for establishing the relevance of this mechanism.
Weaknesses:
(1) Given the magnitude of the evoked responses, one expects that pyramidal neurons in layer IV are primarily those that undergo CCK-dependent plasticity, but the degree to which PV-interneurons and pyramidal neurons participate in this process differently is unclear.
Thank you for this insightful comment. We agree that the differential roles of PV-interneurons and pyramidal neurons in CCK-dependent thalamocortical plasticity remain unclear and acknowledge this as an important limitation of our study. Our primary focus was on pyramidal neurons, as our in vivo electrophysiological recordings measured the fEPSP slope in layer IV of the auditory cortex, which primarily reflects excitatory synaptic activity. However, we recognize the critical role of the excitatory-inhibitory balance in cortical function and the potential contribution of PV-interneurons to this process. In future studies, we plan to utilize techniques such as optogenetics, two-photon calcium imaging and cell-type-specific recordings to investigate the distinct contributions of PV-interneurons and pyramidal neurons to CCK-dependent thalamocortical plasticity, thereby providing a more comprehensive understanding of how CCK modulates thalamocortical circuits.
(2) While these data support an important role for CCK in synaptic LTP in the auditory thalamocortical pathway, perhaps temporal processing of acoustic stimuli is as or more important than frequency discrimination. Given the enhanced responsivity of the system, it is unclear whether this mechanism would improve or reduce the fidelity of temporal processing in this circuit. Understanding this dynamic may also require consideration of cell type as raised in weakness #1.
Thank you for this thoughtful comment. We acknowledge that our study did not directly address the fidelity of temporal processing, which is indeed a critical aspect of auditory function. Our behavioral experiments primarily focused on linking frequency discrimination to the role of CCK in synaptic strengthening within the auditory thalamocortical pathway. However, we agree that enhanced responsivity of the system could also impact temporal processing dynamics, such as the precise timing of auditory responses. Whether this modulation improves or reduces the fidelity of temporal processing remains an open and important question.
As you noted, understanding these dynamics will require a deeper investigation into the interactions between different cell types, particularly the balance between excitatory and inhibitory neurons. Exploring how CCK modulation affects both the circuit and cellular levels in temporal processing is an important direction for future research, which we plan to pursue. Thank you again for raising this important point.
Discussion section:
“While we focused on homosynaptic plasticity at thalamocortical synapses by recording only fEPSPs in layer IV of ACx, it is essential to further explore heterosynaptic effects of CCK released from thalamocortical synapses on intracortical circuits, particularly its role in modulating the excitatory-inhibitory balance. PV-interneurons, as key regulators of cortical inhibition, may contribute to the temporal fidelity of sensory processing, which is critical for auditory perception (Nocon et al., 2023; Cai et al., 2018). Additionally, CCK may facilitate cross-modal plasticity by modulating heterosynaptic plasticity in interconnected cortical areas. Future studies would provide valuable insights into the broader role of CCK in shaping sensory processing and cortical network dynamics.”
Nocon, J.C., Gritton, H.J., James, N.M., Mount, R.A., Qu, Z., Han, X., and Sen, K. (2023). Parvalbumin neurons enhance temporal coding and reduce cortical noise in complex auditory scenes. Communications Biology 6, 751. 10.1038/s42003-023-05126-0.
Cai, D., Han, R., Liu, M., Xie, F., You, L., Zheng, Y., Zhao, L., Yao, J., Wang, Y., Yue, Y., et al. (2018). A Critical Role of Inhibition in Temporal Processing Maturation in the Primary Auditory Cortex. Cereb Cortex 28, 1610-1624. 10.1093/cercor/bhx057.
(3) In Figure 1, an example of increased spontaneous and evoked firing activity of single neurons after HFS is provided. Yet it is surprising that the group data are analyzed only for the fEPSP. It seems that single-neuron data would also be useful at this point to provide insight into how CCK and HFS affect temporal processing and spontaneous activity/excitability, especially given the example in 1F.
Thank you for your insightful comment. In our in vivo electrophysiological experiments on LTP induction, we recorded neural activity for over 1.5 hours to assess changes in neuronal responses over time, both prior to and following the induction. While single neuron firing data can provide valuable insights, such measurements are inherently more variable due to factors like cortical state fluctuations and the condition of nearby neurons, which makes them less reliable for long-term analysis. For this reason, we focused on fEPSP, as it offers a more stable and robust readout of synaptic activity over extended periods.
We appreciate your suggestion and recognize the value of single-neuron data in understanding how CCK and HFS affect temporal processing and excitability. In future studies, we will consider incorporating single-neuron analyses to complement our synaptic-level findings and provide a more comprehensive understanding of these mechanisms.
(4) The authors mention that CCK mRNA was absent in CCK-KO mice, but the data are not provided.
Thank you for your comment. Data from the CCK-KO mice are presented in Figure 3A (far right) and in the upper panel of Figure 3B (far right). In the lower panel of Figure 3B, data from the CCK-KO group are not shown because the normalized values for this group were essentially zero, as expected due to the absence of CCK mRNA.
(5) The circuitry that determines PPI requires multiple brain areas, including the auditory cortex. Given the complicated dynamics of this process, it may be helpful to consider what, if anything, is known specifically about how layer IV synaptic plasticity in the auditory cortex may shape this behavior.
Thank you for raising this important point. Pre-pulse inhibition (PPI) of the acoustic startle response indeed involves multiple brain regions, with the ascending auditory pathway playing a key role (Gómez-Nieto et al., 2020). Within the auditory cortex, layer IV neurons receive tonotopically organized inputs from the medial geniculate nucleus and are critical for integrating thalamic inputs and shaping auditory processing.
In our behavioral experiments, mice were required to discriminate pre-pulses of varying frequencies against a continuous background sound. Given the role of auditory cortical neurons in integrating thalamic inputs and shaping auditory processing, it is likely that synaptic plasticity in these neurons contributes to the enhanced discrimination of pre-pulses. Supporting this idea, our previous work demonstrated that local infusion of CCK, paired with weak acoustic stimuli, significantly increased auditory responses in the auditory cortex (Li et al., 2014). In the current study, we further showed that CCK release during high-frequency stimulation of the thalamocortical pathway induced LTP in layer IV of the auditory cortex. Together, these findings suggest that CCK-dependent synaptic plasticity in layer IV may amplify the cortical representation of weak auditory inputs, thereby improving pre-pulse detection and enhancing PPI performance.
It is also worth noting that aged mice with hearing loss typically exhibit PPI deficits due to impaired auditory processing (Ouagazzal et al., 2006 and Young et al., 2010). We propose that enhanced plasticity in the thalamocortical pathway, mediated by CCK, might partially compensate for these deficits by amplifying residual auditory signals in aged mice. However, the precise mechanisms by which layer IV synaptic plasticity modulates PPI behavior remain to be fully understood. Given the complex dynamics of sensory processing, future studies could explore how layer IV neurons interact with other cortical and subcortical circuits involved in PPI, as well as the specific contributions of excitatory and inhibitory cell types. These investigations will help provide a more comprehensive understanding of the role of CCK in modulating sensory gating and auditory processing.
Gómez-Nieto, R., Hormigo, S., & López, D. E. (2020). Prepulse inhibition of the auditory startle reflex assessment as a hallmark of brainstem sensorimotor gating mechanisms. Brain sciences, 10(9), 639.
Li, X., Yu, K., Zhang, Z., Sun, W., Yang, Z., Feng, J., Chen, X., Liu, C.-H., Wang, H., Guo, Y.P., and He, J. (2014). Cholecystokinin from the entorhinal cortex enables neural plasticity in the auditory cortex. Cell Research 24, 307-330. 10.1038/cr.2013.164.
Ouagazzal, A. M., Reiss, D., & Romand, R. (2006). Effects of age-related hearing loss on startle reflex and prepulse inhibition in mice on pure and mixed C57BL and 129 genetic background. Behavioural brain research, 172(2), 307-315.
Young, J. W., Wallace, C. K., Geyer, M. A., & Risbrough, V. B. (2010). Age-associated improvements in cross-modal prepulse inhibition in mice. Behavioral neuroscience, 124(1), 133.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
Major concerns:
(1) In Figure 1, the authors used different metrics for fEPSP strength. In Figure 1D, the authors used the slope, while they used the amplitude in Figure 1G. It is known that the two metrics are different from each other. While the slope is calculated from the linear regression between the voltage change per time of the rising phase of the fEPSP, the amplitude represents the voltage value of the fEPSP's peak. Please clarify here and in the method what metric you used, because the two terms are not interchangeable.
Thank you for pointing out this oversight in our manuscript. We confirm that we used the slope of the fEPSP as the metric for assessing synaptic strength throughout the study, including both Figure 1D and Figure 1G. We will make the necessary corrections to ensure clarity and consistency. Thank you for bringing this to our attention.
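The distinction the reviewer draws between the two metrics can be illustrated with a minimal sketch. It assumes a negative-going fEPSP measured from a zero baseline, and it fits the rising phase between 20% and 80% of the peak; that fitting window, the function names, and the synthetic trace are illustrative assumptions, not the analysis pipeline used in the manuscript.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def fepsp_slope_and_amplitude(t_ms, v_mv, lo=0.2, hi=0.8):
    """Return (slope in mV/ms, peak amplitude in mV) for one fEPSP trace.

    lo and hi are hypothetical fractions of the peak that bound the
    portion of the rising phase used for the linear regression.
    """
    # Amplitude: voltage at the peak of the negative-going deflection.
    peak_idx = min(range(len(v_mv)), key=lambda i: v_mv[i])
    amplitude = v_mv[peak_idx]
    # Slope: regression over samples before the peak that lie
    # between lo and hi of the peak voltage.
    window = [
        (t, v)
        for t, v in zip(t_ms[: peak_idx + 1], v_mv[: peak_idx + 1])
        if hi * amplitude <= v <= lo * amplitude
    ]
    ts, vs = zip(*window)
    return least_squares_slope(ts, vs), amplitude


# Synthetic trace for illustration only (not data from the study):
# a linear fall to -1 mV over 2 ms, then a slower return to baseline.
t = [i * 0.1 for i in range(61)]
v = [-0.5 * x if x <= 2.0 else -1.0 + 0.25 * (x - 2.0) for x in t]
slope, amplitude = fepsp_slope_and_amplitude(t, v)
```

Two traces can share the same peak amplitude while differing in slope (for example, if one reaches the peak in half the time), which is why the two terms should not be used interchangeably.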
(2) The methods do not provide details about the CCK-KO mice. Please give such details. Although the authors used the CCK-KO mouse model as a control, I think that it is not a good choice to test the hypothesis mentioned in lines 165 and 166. The experiment was supposed to monitor the CCK-BR activity after HFS of the MGB and answer whether the CCK-BR will get activated by thalamic stimulation, but the CCK-KO mouse does not have CCK to be released after the optogenetic activation of the Chrimson probe. Therefore, it is expected to give nothing, as if the experimenter runs an experiment without intervention. I think that the appropriate way to examine the hypothesis is to compare mice that were either injected with AAV9-Syn-FLEX-ChrimsonR-tdTomato or AAV9-Syn-FLEX-tdTomato. However, CCK-KO would be a perfect model to confirm that LTP can only be generated in a CCK-dependent manner, by simply running the HFS of the MGB together with cortical recording of the fEPSP. This would also rule out the assumption that the authors mentioned in lines 191 and 192.
Thank you for your valuable feedback. The rationale behind our experimental design was to validate the newly developed CCK sensor and confirm its specificity. We aimed to verify CCK release post-HFS by comparing the responses of the CCK sensor in CCK-KO mice and CCK-Cre mice. This comparison allowed us to determine that the observed increase in fluorescence intensity post-HFS was specifically due to CCK release, rather than other neurotransmitters induced by HFS.
We appreciate your suggestion to compare mice injected with AAV9-Syn-FLEX-ChrimsonR-tdTomato and AAV9-Syn-FLEX-tdTomato, as it is indeed a valuable approach for directly testing the hypothesis regarding CCK-BR activation. However, we prioritized using the CCK-KO model to validate the CCK sensor's efficacy and specificity. The validation can be inferred by comparing the CCK sensor activity before and after HFS.
Regarding concerns mentioned in lines 191 and 192 about potential CCK release from other projections via indirect polysynaptic activation, CCK-KO mice were not suitable for this aspect due to their global knockout of CCK. To address this limitation, we utilized shRNA to specifically down-regulate Cck expression in MGB neurons. This approach focused on the necessity of CCK released from thalamocortical projections for the observed LTP and effectively ruled out the possibility of indirect polysynaptic activation.
We also acknowledge that the methods section lacked sufficient details about the CCK-KO mice, which may have caused confusion. In the revised methods section, we will add the following details:
(1) The genotype of the CCK-KO mice used in this study (CCK-ires-CreERT2, Jax#012710).
(2) A brief description of the CCK-KO validation, emphasizing the absence of CCK mRNA in these mice (as shown in Figure 3A and 3B).
(3) The experimental purpose of using CCK-KO mice to validate the specificity of the CCK sensor.
We believe these additions will clarify the rationale for using CCK-KO mice and their role in this study. Thank you again for highlighting these important points.
(3) Figure 3C: The authors should examine if there is a difference in the baseline of fEPSPs across different age groups as the dependence on the normalization in the analysis within each group would hide if there were any difference of the baseline slope of fEPSP between groups which could be related to any misleading difference after HFS. Also, I wonder about the absence of LTP in P20, which is a closer age to the critical period. Could the authors discuss that, please?
Thank you for your insightful feedback. To address your concern regarding baseline differences in fEPSP slopes across age groups, we conducted additional analysis. Baseline fEPSP slopes across the three groups (P20, 8w, 18m), normalized to the 8w group, were 64.8 ± 13.1%, 100.0 ± 20.4%, and 58.8 ± 10.3%, respectively. While there was a trend suggesting smaller fEPSP slopes in the P20 and 18m groups compared to the young adult group, these differences were not statistically significant due to data variability (P20 vs. 8w, P = 0.319; 8w vs. 18m, P = 0.147; P20 vs. 18m, P = 1.0, one-way ANOVA). These results suggest that baseline variability is unlikely to confound the observed differences in LTP after HFS. Furthermore, we ensured that normalization minimized any potential baseline effects.
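As a rough sketch of the normalization described above, each group's baseline can be expressed as a percentage of the reference (8w) group mean. The response does not state whether the reported spread is SEM or SD, so the use of SEM here is an assumption, and the slope values below are made-up placeholders rather than data from the study.

```python
import math
import statistics


def percent_of_reference(groups, reference):
    """Express each group's mean and SEM as a percentage of the reference mean.

    groups maps a group label to a list of baseline fEPSP slope magnitudes.
    SEM is an assumed choice of spread; the source does not specify it.
    """
    ref_mean = statistics.mean(groups[reference])
    result = {}
    for name, values in groups.items():
        mean = statistics.mean(values)
        sem = statistics.stdev(values) / math.sqrt(len(values))
        result[name] = (100.0 * mean / ref_mean, 100.0 * sem / ref_mean)
    return result


# Placeholder slope magnitudes (hypothetical, for illustration only).
baseline = {
    "P20": [0.2, 0.4],
    "8w": [0.4, 0.6],
    "18m": [0.1, 0.3],
}
normalized = percent_of_reference(baseline, reference="8w")
```

By construction the reference group always comes out at 100%, so the comparison between groups rests on the raw slopes and their variability rather than on the normalization step itself.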
Regarding the absence of LTP in P20, this likely reflects developmental regulation of CCKBR expression in the auditory cortex (ACx). The HFS-induced thalamocortical LTP observed in our study is CCK-dependent and mechanistically distinct from the NMDA-dependent thalamocortical LTP during the critical period. Specifically, correlated pre- and postsynaptic activity can induce NMDA-dependent thalamocortical LTP only during an early critical period corresponding to the first several postnatal days, after which this pairing becomes ineffective starting from the second postnatal week (Crair and Malenka, 1995; Isaac et al., 1997; Chun et al., 2013). In contrast, the CCK-dependent thalamocortical LTP induced by HFS is robust in adult mice but appears absent at P20, likely due to the lack of postsynaptic CCKBR expression in the ACx at this developmental stage.
We will include these clarifications in the revised manuscript, particularly in the Discussion section, to provide a more comprehensive explanation of our findings. Thank you for your valuable comments and suggestions.
Crair, M.C., and Malenka, R.C. (1995). A critical period for long-term potentiation at thalamocortical synapses. Nature 375, 325-328. 10.1038/375325a0.
Isaac, J.T.R., Crair, M.C., Nicoll, R.A., and Malenka, R.C. (1997). Silent Synapses during Development of Thalamocortical Inputs. Neuron 18, 269-280. https://doi.org/10.1016/S0896-6273(00)80267-6.
Chun, S., Bayazitov, I.T., Blundon, J.A., and Zakharenko, S.S. (2013). Thalamocortical Long-Term Potentiation Becomes Gated after the Early Critical Period in the Auditory Cortex. The Journal of Neuroscience 33, 7345-7357. 10.1523/jneurosci.4500-12.2013.
(4) Figure 4F: It is noticed that the baseline fEPSP of the CCK group and ACSF groups were different, which raises a concern about the baseline differences between treatment groups.
Thank you for your valuable feedback and for pointing out this important detail. We apologize for any confusion caused by the presentation of the data. As noted in the figure legend, the scale bars for the fEPSPs were different between the left (0.1 mV) and right panels (20 µV). This difference in scale may have created the perception of baseline differences between the CCK and ACSF groups. To enhance clarity and avoid potential misunderstanding, we will unify the scale bar values in the revised figure. This adjustment will provide a clearer and more accurate comparison of fEPSPs between groups. Thank you again for bringing this issue to our attention.
(5) From Figure S2D, it seems that different animals were injected with the drug and ACSF. Therefore, how did the authors validate the position of the recording electrode in the cortical area of a given CF and relative EF? Also, there is not enough information about the basis of the selection of the EF. Should it be lower than the CF by a certain value? Was the EF determined after the initial tuning curve in each case? To mitigate this difference, it would be appropriate if the authors examined whether there is a significant difference in the tuning width and CFs between animals exposed to ACSF and CCK-4. This will give some validation of a balanced experiment between ACSF and CCK-4. I also wonder why the authors used rats here rather than mice, as it would be easier to interpret the results if they came from the same species.
Thank you for your thoughtful comments. The effective frequency (EF) was determined after measuring the initial tuning curve for each case. The EF was selected to elicit a clear sound response while maintaining a sufficient distance from the characteristic frequency (CF) to allow measurable increases in response intensity. Specifically, EF was selected based on the starting point of the tuning peak, which corresponds to the onset of its fastest rising phase. From this point, EF was determined by moving 0.2 or 0.4 octaves toward the CF. While there were individual differences in EF selection among animals, the methodology for determining EF was standardized and applied consistently across both the ACSF and CCK-4 groups.
Regarding the use of rats in these experiments, these studies were conducted prior to our current work with mice. The findings in rats provide valuable insights that support our current results in mice. Since the rat data are supplementary to the primary findings, we included them as supplementary material to provide additional context and validation. Furthermore, in consideration of animal welfare, we chose not to replicate these experiments in mice, as the findings from rats were sufficient to support our conclusions.
Methods section:
“The tuning curve was determined by plotting the lowest intensity at which the neuron responded to different tones. The characteristic frequency (CF) is defined as the frequency corresponding to the lowest point on this curve. The effective frequency (EF) was determined to elicit a clear sound response while maintaining a sufficient distance from the CF to allow measurable increases in response intensity. Specifically, EF was selected based on the starting point of the tuning peak, which corresponds to the onset of its fastest rising phase. From this point, EF was determined by moving 0.2 or 0.4 octaves toward the CF.”
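The CF and EF definitions quoted above can be sketched in a few lines. This is a hedged illustration only: the onset of the tuning peak (the "starting point") is treated as a given input rather than detected from the curve, the function names are hypothetical, and the frequencies and thresholds are invented example values.

```python
import math


def characteristic_frequency(freqs_khz, thresholds_db):
    """CF: the frequency at the lowest point of the threshold tuning curve."""
    idx = min(range(len(freqs_khz)), key=lambda i: thresholds_db[i])
    return freqs_khz[idx]


def shift_toward_cf(start_khz, cf_khz, octaves):
    """Move a given number of octaves from start_khz toward cf_khz.

    One octave is a doubling (or halving) of frequency, so the shift
    is multiplicative: f * 2**(+/- octaves).
    """
    direction = 1.0 if cf_khz >= start_khz else -1.0
    return start_khz * 2.0 ** (direction * octaves)


def octave_distance(f1_khz, f2_khz):
    """Absolute distance between two frequencies in octaves."""
    return abs(math.log2(f2_khz / f1_khz))


# Invented tuning curve for illustration (not data from the study).
freqs = [4.0, 8.0, 16.0, 32.0]        # tone frequencies in kHz
thresholds = [60.0, 40.0, 20.0, 50.0]  # lowest responsive intensity in dB

cf = characteristic_frequency(freqs, thresholds)
peak_onset = 8.0  # assumed starting point of the tuning peak, in kHz
ef = shift_toward_cf(peak_onset, cf, octaves=0.4)
remaining = octave_distance(ef, cf)
```

In this example the EF lands 0.4 octaves above the 8 kHz onset and stays 0.6 octaves below the 16 kHz CF, which leaves room for the response at the EF to increase. The sketch does not guard against a shift that would overshoot the CF.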
(6) Lines 384-386: There are no figures named 5H and I.
Thank you for pointing this out. The references to Figures 5H and 5I were incorrect and should have referred to Figures 5C and 5D. We sincerely apologize for this oversight and will correct these errors in the revised manuscript to ensure clarity and accuracy. Thank you again for bringing this to our attention.
(7) The authors should mention the sex of the animals used.
Thank you for your comment and for highlighting this important detail. The sex of the animals used in this study is specified in the Animals section of the Methods: "In the present study, male mice and rats were used to investigate thalamocortical LTP." We appreciate your careful attention to this point and will ensure that this detail remains clearly stated in the manuscript.
(8) Lines 534 and 648: These coordinates are difficult to understand. Since the experiment was done on both mice and rats, we need a clear description of the coordinates in both. Also, I think that you should mention the lateral distance from the sagittal suture as the ventral coordinates should be calculated from the surface of the skull above the AC and not from the sagittal suture.
Thank you for your valuable feedback and for pointing out this important issue. We apologize for any confusion caused by our description of the coordinates. The term “ventral” was deliberately used because the auditory cortex is located on the lateral side of the skull, which may have caused some misunderstanding.
To provide a clearer and more accurate description of the coordinates, we will revise the text in the manuscript as follows: “A craniotomy was performed at the temporal bone (-2 to -4 mm posterior and -1.5 to -3 mm ventral to bregma for mice; -3.0 to -5.0 mm posterior and -2.5 to -6.5 mm ventral to bregma for rats) to access the auditory cortex.”
We appreciate your attention to these details and will ensure that the revised manuscript includes this clarification to improve accuracy and eliminate potential confusion. Thank you again for bringing this to our attention.
(9) Line 536: The author should specify that these coordinates are for the experiment done on mice.
Thank you for your valuable feedback. We will revise the manuscript to explicitly specify that these coordinates refer to the experiments conducted on mice. This clarification will help improve the clarity and precision of the manuscript. We greatly appreciate your attention to this point and your effort to enhance the quality of our work.
Methods section:
“and a hole was drilled in the skull according to the coordinates of the ventral division of the MGB (MGv, AP: -3.2 mm, ML: 2.1 mm, DV: 3.0 mm) for experiments conducted on mice.”
(10) Line 590: Please add the specifications of the stimulating electrode. Is it unipolar or bipolar? What is the cat.# provided by FHC?
Thank you for your valuable feedback. The electrodes used in the experiments are unipolar. We will include the catalog number provided by FHC in the revised manuscript for clarity. The revised text will be updated as follows:
“In HFS-induced thalamocortical LTP experiments, two customized microelectrode arrays with four tungsten unipolar electrodes each, impedance: 0.5-1.0 MΩ (recording: CAT.# UEWSFGSECNND, FHC, U.S.), and 200-500 kΩ (stimulating: CAT.# UEWSDGSEBNND, FHC, U.S.), were used for the auditory cortical neuronal activity recording and MGB ES, respectively.”
We appreciate your attention to this detail, and we will ensure that the revised manuscript reflects this clarification accurately.
(11) Lines 612-614: There are no details of how the optic fiber was inserted or post-examined. If there is a word limitation, the authors may reference another study showing these procedures.
Thank you for your insightful comment and for highlighting this important aspect of the methodology. To address this, we will reference the study by Sun et al. (2024) in the revised manuscript, which provides detailed procedures for optic fiber insertion and post-examination. We believe that this reference will help enhance the clarity and completeness of the methods section.
Sun, W., Wu, H., Peng, Y., Zheng, X., Li, J., Zeng, D., Tang, P., Zhao, M., Feng, H., Li, H., et al. (2024). Heterosynaptic plasticity of the visuo-auditory projection requires cholecystokinin released from entorhinal cortex afferents. eLife 13, e83356. 10.7554/eLife.83356.
We appreciate your valuable suggestion, which will contribute to improving the quality of the manuscript.
Minor concerns:
(1) The definition of HFS was repeated many times throughout the manuscript. Please mention the defined name for the first time in the manuscript only followed by its abbreviation (HFS).
Thank you for your suggestion and for pointing out this important detail. We will revise the manuscript to ensure that all abbreviations are defined only upon their first mention in the manuscript, with subsequent mentions using the abbreviations consistently. We appreciate your careful attention to detail and your effort to help improve the manuscript.
(2) Line 173: There is a difference between here and the methods section (620 nm here and 635 nm there) please correct which wavelength the authors used.
Thank you for your careful review and for bringing this discrepancy to our attention. We have corrected the inconsistency, and the wavelength has been unified throughout the manuscript to ensure accuracy and clarity. The revised text now reads as follows:
“The fluorescent signal was monitored for 25s before and 60s after the HFLS (5~10 mW, 620 nm) or HFS application.”
We appreciate your valuable feedback, which has helped us improve the precision and consistency of the manuscript.
(3) Line 185: I think the authors should refer to Figure 2G before mentioning the statistical results.
Thank you for your careful review and for pointing out this oversight. We have now added a reference to Figure 2G at the appropriate location to ensure clarity and logical flow in the manuscript, as recommended.
(4) Line 202: I think the authors should refer to Figure 2J before mentioning the statistical results.
Thank you again for your careful review and for highlighting this point. We have revised the manuscript to include a reference to Figure 2J before mentioning the statistical results.
We appreciate your valuable feedback, which has helped us improve the accuracy and presentation of the results.
(5) Line 260: Please add appropriate references at the end of the sentence to support the argument.
Thank you for your valuable suggestion. To address this, we have added appropriate references to support the statement regarding the multiple steps involved between mRNA expression and neuropeptide release. Additionally, we have revised the statement to adopt a more cautious interpretation. The revised text is as follows:
“It is widely recognized that mRNA levels do not always directly correlate with peptide levels due to multiple steps involved in peptide synthesis and processing, including translation, post-translational modifications, packaging, transportation, and proteolytic cleavage, all of which require various enzymes and regulatory mechanisms (38-41). A disruption at any stage in this process could lead to impaired CCK release, even when Cck mRNA is present.”
We have included the following references to support this statement:
38. Mierke, C.T. (2020). Translation and Post-translational Modifications in Protein Biosynthesis. In Cellular Mechanics and Biophysics: Structure and Function of Basic Cellular Components Regulating Cell Mechanics, C.T. Mierke, ed. (Springer International Publishing), pp. 595-665. 10.1007/978-3-030-58532-7_14.
39. Gualillo, O., Lago, F., Casanueva, F.F., and Dieguez, C. (2006). One ancestor, several peptides post-translational modifications of preproghrelin generate several peptides with antithetical effects. Mol Cell Endocrinol 256, 1-8. 10.1016/j.mce.2006.05.007.
40. Sossin, W.S., Fisher, J.M., and Scheller, R.H. (1989). Cellular and molecular biology of neuropeptide processing and packaging. Neuron 2, 1407-1417. https://doi.org/10.1016/0896-6273(89)90186-4.
41. Hook, V., Funkelstein, L., Lu, D., Bark, S., Wegrzyn, J., and Hwang, S.R. (2008). Proteases for processing proneuropeptides into peptide neurotransmitters and hormones. Annu Rev Pharmacol Toxicol 48, 393-423. 10.1146/annurev.pharmtox.48.113006.094812.
We greatly appreciate your helpful feedback, which has allowed us to improve both the accuracy and the depth of discussion in the manuscript.
(6) Line 278: The authors mentioned "due to the absence of CCK in aged animals", which was not an appropriate description. It should be a reduction of CCK gene expression or a possible deficient CCK release.
Thank you for your careful review and for pointing out the inaccuracy in our description. We agree with your suggestion and have revised the statement to more appropriately reflect the findings.
“Our findings revealed that thalamocortical LTP cannot be induced in aged mice, likely due to insufficient CCK release, despite intact CCKBR expression.”
This revision ensures a more accurate and precise description of the potential mechanisms underlying the observed phenomenon. We greatly appreciate your valuable feedback, which has helped us improve the clarity and accuracy of the manuscript.
(7) Line 291: The authors mentioned that "without MGB stimulation", which is confusing. The MGB was stimulated with a single electrical pulse to evoke cortical fEPSPs. Therefore it should be "without HFS of MGB".
Thank you for pointing this out and for highlighting the potential confusion caused by our original phrasing. Upon review, we recognize that our original phrasing "without MGB stimulation" may have been unclear and could have led to misinterpretation. To clarify, our intention was to describe the period during which CCK was present without any stimulation of the MGB.
It is important to note that, in the presence of CCK, LTP can be induced even with low-frequency stimulation, including in aged mice. This observation underscores the potent effect of CCK in facilitating thalamocortical LTP, regardless of the specific stimulation protocol used.
To address this issue, we have revised the sentence for improved clarity as follows:
"To investigate whether CCK alone is sufficient to induce thalamocortical LTP without activating thalamocortical projections, we infused CCK-4 into the ACx of young adult mice immediately after baseline fEPSPs recording. Stimulation was then paused for 15 min to allow for CCK degradation, after which recording resumed."
We believe this revision resolves the misunderstanding and provides a clearer and more accurate description of the experimental context. We greatly appreciate your insightful feedback, which has helped us refine the manuscript for clarity and precision.
Reviewer #3 (Recommendations for the authors):
Minor comments:
(1) Line 99, 134, possibly other locations: "site" to "sites".
Thank you for your careful review. We appreciate your attention to detail and have made the necessary corrections in the manuscript.
(2) Throughout the manuscript there are some minor issues with language choice and subtle phrasing errors and I suggest English language editing.
Thank you for your suggestion. In response, we have thoroughly reviewed the manuscript and addressed issues related to language choice and phrasing. The text has been carefully edited to ensure clarity, precision, and consistency. We believe these revisions have significantly enhanced the overall quality of the manuscript. We greatly appreciate your feedback, which has been invaluable in improving the presentation of our work.
(3) Based on the experimental configurations, I do not think it is a problematic caveat, but authors should be aware of the high likelihood of AAV9 jumping synapses relative to other AAV serotypes.
Thank you for bringing up the potential of AAV9 crossing synapses, a recognized characteristic of this serotype. We appreciate your observation regarding its relevance to our experimental design. In our study, we carefully considered the possibility of trans-synaptic transfer during both the experimental design and data interpretation phases. To minimize the likelihood of significant trans-synaptic spread, we implemented several measures, including controlling the injection volume, using a slow injection rate, and limiting the viral expression time. Post-hoc histological analyses confirmed that the expression of AAV9 was largely confined to the intended regions, with limited evidence of synaptic jumping under our experimental conditions.
While we acknowledge the inherent potential for AAV9 to cross synapses, we believe this effect does not substantially confound the interpretation of our findings in the current study. To address this concern, we have added a brief discussion on this point in the revised manuscript to enhance clarity. We greatly appreciate your insightful comment, which has helped us further refine our work.
Discussion section:
“One potential limitation of our study is the trans-synaptic transfer property of AAV9. To mitigate this, we carefully controlled the injection volume, rate, and viral expression time, and conducted post-hoc histological analyses to minimize off-target effects, thereby reducing the likelihood of trans-synaptic transfer confounding the interpretation of our findings.”
(4) The trace identifiers (1-4) do not seem correctly placed/colored in Figure S1D. Please check others carefully.
Thank you for your careful review and for bringing this issue to our attention. We have corrected the trace identifiers in Figure S1D. Additionally, we have carefully reviewed all other figures to ensure their accuracy and consistency. We greatly appreciate your attention to detail, which has helped improve the overall quality of the manuscript.
(5) Please provide a value of the laser power range based on calibrated values.
Thank you for your suggestion. We have included the calibrated laser power range in the revised manuscript as follows:
“The laser stimulation was produced by a laser generator (5-20 mW(30), Wavelength: 473 nm, 620 nm; CNI laser, China) controlled by an RX6 system and delivered to the brain via an optic fiber (Thorlabs, U.S.) connected to the generator.”
We appreciate your feedback, which has helped improve the clarity and precision of our methodological description.
(6) It would be useful to annotate figures in a way that identifies in which transgenic mice experiments are being performed.
Thank you for your valuable suggestion. We will add annotations to the figures to explicitly identify the type of mice used in each experiment. We believe this enhancement will improve the clarity and accessibility of our results. We greatly appreciate your input in making our manuscript more informative.
(7) Please comment on the rigor you use to address the accuracy of viral injections. How often did they spread outside of the MGB/AC?
Thank you for raising this important question regarding the accuracy of viral injections and the potential spread outside the MGB or AC. Below, we provide details for each set of experiments:
shRNA Experiments:
For the shRNA experiments targeting the MGB, our primary goal was to achieve comprehensive coverage of the entire MGB. To this end, we used larger injection volumes and multiple injection sites, which inevitably resulted in some viral spread beyond the MGB. However, this approach was necessary to ensure robust knockdown effects that were representative of the entire MGB. While strict confinement to specific subregions could not be guaranteed, this strategy allowed us to prioritize the effectiveness of the knockdown within the target region.
Fiber photometry Experiments:
For the fiber photometry experiments targeting the auditory cortex (AC), we used larger injection volumes and multiple injection sites to cover its relatively large size. Although this approach might have resulted in some CCK-sensor virus spread outside the AC, the placement of the optic fiber was guided by the location of the auditory cortex. Consequently, any minor viral expression outside the AC would not affect the experimental results, as recordings were confined to the intended area through precise fiber placement.
Optogenetic Experiments:
For the optogenetic experiments targeting the MGB, we specifically injected virus into the MGv subregion. To minimize viral spread, we employed several strategies, including using fine injection needles, waiting for tissue stabilization (7 minutes post-needle insertion), delivering small volumes at a slow rate to prevent backflow, aspirating 5 nL of the solution post-injection, and raising the needle by 100 μm before waiting an additional 5 minutes prior to full retraction. These measures significantly reduced the risk of viral leakage to adjacent regions.
Histological Validation:
After the electrophysiological experiments, we systematically verified the accuracy of viral expression by examining histological sections to ensure that the expression was primarily localized within the intended regions.
Terminology in the Manuscript:
In the manuscript, we deliberately used the term "MGB" rather than specifically "MGv" to transparently acknowledge the potential for viral spread in some experiments.
We hope this explanation clarifies the strategies we employed to address the accuracy of viral injections, as well as how we managed potential viral spread. We have also added brief information to the revised manuscript to reflect these points and acknowledge the inherent variability in viral delivery.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript presents an important machine-learning-based approach to the automated detection of urine and fecal deposits by rodents, key ethological behaviors that have traditionally been very poorly studied. The strength of evidence for the claim is solid, showing accuracy near 90% across several contexts. Training and testing for the specific contexts used by other experimenters, however, is probably warranted to make the model most relevant to the data that may be analyzed.
-
Reviewer #1 (Public review):
Summary:
The manuscript provides a novel method for the automated detection of scent marks from urine and feces in rodents. Given the importance of scent communication in these animals and their role as model organisms, this is a welcome tool.
Strengths:
The method uses a single video stream to allow for the distinction between urine and feces. It is automated.
Weaknesses:
The accuracy is decent but not perfect and may be too low to detect some effects that are biologically real but subtle (e.g. less than 10% differences). For many assays, however, this tool will be useful.
-
Author response:
The following is the authors’ response to the original reviews
We thank the reviewers for their constructive and helpful comments, which led us to make major changes in the model and manuscript, including adding the results of new experiments and analyses. We believe that the revised manuscript is much better than the previous version and that it addresses all issues raised by the reviewers.
Summary of changes made in the revised manuscript:
(1) We increased the training set size from 39 video clips to 97 video clips and the testing set size from 25 video clips to 60 video clips. The increase in training set size improved the overall accuracy from a mean F1 score of 0.81 in the previous version to a mean F1 score of 0.891 (see Figure 2 and Figure 3) in the current version. Specifically, the F1 score for urine detection was improved from 0.79 to 0.88.
(2) We further evaluated the accuracy of the DeePosit algorithm in comparison to a second human annotator and found that the algorithm accuracy is comparable to human-level accuracy.
(3) The additional test videos allowed us to test the consistency of the algorithm performance across gender, space, time, and experiment type (SP, SxP, and ESP). We found consistent levels of performance across all categories (see Figure 3), suggesting that errors made by the algorithm are uniform across conditions and hence should not bias the results.
(4) In addition, we tested the algorithm performance on a second strain of mice (male C57BL/6) in a different environmental condition (white arena instead of a black one) and found that the algorithm achieves comparable accuracy, even though C57BL/6 mice and white arena were not included in the training set. Thus, the algorithm seems to be robust and efficient across various experimental conditions.
(5) Analyzing urination and defecation dynamics in an additional strain of mice revealed interesting strain-specific features, as discussed in the revised manuscript.
(6) Overall, we found DeePosit accuracy to be stable with no significant bias across stages of the experiment, types of the experiment, gender of the mice, strain of mice, and across experimental conditions.
(7) We also compared the performance of DeePosit to a classic object detection algorithm: YOLOv8. We trained YOLOv8 both on a single image input (YOLOv8 Gray) and on 3 image inputs representing a sequence of three time points around the ground truth event (t): t+0, t+10, and t+30 seconds (YOLOv8 RGB). DeePosit achieved significantly better accuracy over both YOLOv8 alternatives. YOLOv8 RGB achieved better accuracy than YOLOv8 Gray, suggesting that temporal information is important for this task. It's worth mentioning that while YOLOv8 requires the annotator to draw rectangles surrounding each urine spot or feces as part of the training set, our algorithm training set used just a single click inside each spot, allowing faster generation of training sets.
(8) As for the algorithm parameters, we tested the effect of the main parameter of the preliminary detection (the temperature threshold for the detection of a new blob) and found that a threshold of 1.6°C gave the best accuracy and used this parameter for all of the experiments instead of 1.1°C which was used in the original manuscript. It's worth mentioning that the performance is quite stable (mean F1 score of 0.88-0.89) for the thresholds between 1.1°C and 3°C (Figure 3—Figure Supplement 2).
(9) We also checked if changing the input length of the video clip that is fed to the classifier affects the accuracy by training the classifier with -11..30 seconds video clips (41 seconds in total) instead of -11..60 seconds (71 seconds in total) and found no difference in accuracy.
(10) In the revised paper, we report recall, precision, and F1 scores in the caption of the relevant figures and also supply Excel files with the full statistics for each of the figures.
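The three metrics reported above can be computed from detection counts as in the following minimal sketch. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the DeePosit code or its results.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one class (e.g. urine), for illustration only.
precision, recall, f1 = precision_recall_f1(tp=88, fp=12, fn=12)
```

A mean F1 score across classes would then be the plain average of the per-class F1 values.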
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The manuscript provides a novel method for the automated detection of scent marks from urine and feces in rodents. Given the importance of scent communication in these animals and their role as model organisms, this is a welcome tool.
We thank the reviewer for the positive assessment of our tool.
Strengths:
The method uses a single video stream (thermal video) to allow for the distinction between urine and feces. It is automated.
Weaknesses:
The accuracy level shown is lower than may be practically useful for many studies. The accuracy of urine is 80%.
We have trained the model better, using a larger number of video clips. The increase in training set size improved the overall accuracy from a mean F1 score of 0.81 in the previous version to a mean F1 score of 0.891 (see Figure 2 and Figure 3) in the current version. Specifically, the F1 score for urine detection was improved from 0.79 to 0.88.
This is understandable given the variability of urine in its deposition, but makes it challenging to know if the data is accurate. If the same kinds of mistakes are maintained across many conditions it may be reasonable to use the software (i.e., if everyone is under/over counted to the same extent). Differences in deposition on the scale of 20% would be challenging to be confident in with the current method, though differences of the magnitude may be of biological interest. Understanding how well the data maintain the same relative ranking of individuals across various timing and spatial deposition metrics may help provide further evidence for the utility of the method.
The additional test videos allowed us to test the consistency of the algorithm performance across gender, space, time and experiment type (SP, SxP, and ESP). We found consistent levels of performance across all categories (see Figure 3), suggesting that errors made by the algorithm are uniform across conditions, hence should not create any bias of the results.
Reviewer #2 (Public Review):
Summary:
The authors built a tool to extract the timing and location of mouse urine and fecal deposits in their laboratory set up. They indicate that they are happy with the results they achieved in this effort.
Yes, we are.
The authors note urine is thought to be an important piece of an animal's behavioral repertoire and communication toolkit so methods that make studying these dynamics easier would be impactful.
We thank the reviewer for the positive assessment of our work.
Strengths:
With the proposed method, the authors are able to detect 79% of the urine that is present and 84% of the feces that is present in a mostly automated way.
Weaknesses:
The method proposed has a large number of design choices across two detection steps that aren't investigated. I.e. do other design choices make the performance better, worse, or the same?
We chose to use a heuristic preliminary detection algorithm for the detection of warm blobs, since warm blobs can be robustly detected with heuristic algorithms without the need for a training set. This design choice might allow easier adaptation of our algorithm to different types of arenas. Another advantage of using a heuristic preliminary detection is the easy control of the preliminary detection parameters, such as the minimum temperature difference for detecting a blob, the size limits of the detected blob, the cooldown rate and so on, which may help in adapting it to new conditions. As for the classifier, we chose to feed it with a relatively small window surrounding each preliminary detection, and hence it is not affected by the arena’s appearance outside of its region of interest. This should allow lower sensitivity to the arena’s appearance.
As for the algorithm parameters, we tested the effect of the main parameter of the preliminary detection (the temperature threshold for the detection of a new blob) and found that a threshold of 1.6°C gave the best accuracy and used this parameter for all of the experiments instead of 1.1°C which was used in the original manuscript. It's worth mentioning that the performance is quite stable (mean F1 score of 0.88-0.89) for the thresholds between 1.1°C and 3°C.
We also checked if changing the input length of the video clip fed to the classifier affects the accuracy by training the classifier with -11..30 seconds video clips (41 seconds in total) instead of -11..60 seconds (71 seconds in total) and found no difference in accuracy.
Overall, the algorithm's accuracy seems to be rather stable across various choices of parameters.
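To make the role of the temperature threshold concrete, the following is a minimal sketch of a threshold-based warm-blob detector. It is not the DeePosit implementation: the function name, the use of a fixed background frame, the 4-connected flood fill, and the `min_pixels` size limit are all assumptions made for illustration, and the cooldown-rate criterion mentioned above is not modeled.

```python
from collections import deque


def detect_warm_blobs(frame, background, threshold_c=1.6, min_pixels=2):
    """Return a list of blobs. Each blob is a list of (row, col) pixels that
    are warmer than the background by more than threshold_c degrees Celsius."""
    rows, cols = len(frame), len(frame[0])
    warm = [[frame[r][c] - background[r][c] > threshold_c for c in range(cols)]
            for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not warm[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over 4-connected warm pixels.
            queue, blob = deque([(r, c)]), []
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and warm[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Hypothetical lower size limit to discard single-pixel noise.
            if len(blob) >= min_pixels:
                blobs.append(blob)
    return blobs


# Toy 4x4 thermal frames in degrees Celsius (illustrative values only).
background = [[22.0] * 4 for _ in range(4)]
frame = [row[:] for row in background]
frame[1][1] = 24.0  # two adjacent pixels, 2.0 degrees above background
frame[1][2] = 24.0
frame[3][0] = 23.0  # one pixel, 1.0 degree above background

strict = detect_warm_blobs(frame, background)
loose = detect_warm_blobs(frame, background, threshold_c=0.5, min_pixels=1)
```

In this toy example the default 1.6°C threshold keeps only the two-pixel blob, while a looser threshold also admits the weaker single-pixel candidate, which would then have to be rejected by the classifier.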
Are these choices robust across a range of laboratory environments?
We tested the algorithm performance on a second strain of mice (male C57BL/6) in a different environmental condition (white arena instead of a black one) and found that the algorithm achieves comparable accuracy, even though C57BL/6 mice and white arena were not included in the training set. Thus, the algorithm seems to be robust and efficient across various experimental conditions.
How much better are the demonstrated results compared to a simple object detection pipeline (i.e. FasterRCNN or YOLO on the raw heat images)?
We compared the performance of DeePosit to a classic object detection algorithm: YOLOv8. We trained YOLOv8 both on a single image input (YOLOv8 Gray) and on 3 image inputs representing a sequence of three time points around the ground truth event (t): t+0, t+10, and t+30 seconds (YOLOv8 RGB). DeePosit achieved significantly better accuracy over both YOLOv8 alternatives. YOLOv8 RGB achieved better accuracy than YOLOv8 Gray, suggesting that temporal information is important for this task. It's worth mentioning that while YOLOv8 requires the annotator to draw rectangles surrounding each urine spot or feces as part of the training set, our algorithm training set used just a single click inside each spot, allowing faster generation of training sets.
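One way to picture the "YOLOv8 RGB" input is as three grayscale thermal frames stacked into the three channels of a single image. The sketch below illustrates that idea only; the function name, the frame-rate handling, and the clamping at the end of the video are assumptions, not the authors' preprocessing code.

```python
def stack_three_frames(frames, fps, event_frame, offsets_s=(0, 10, 30)):
    """Build a pseudo-RGB image with one channel per time offset (in seconds)
    after the event frame. Each frame is a 2D list of pixel values."""
    channels = []
    for offset in offsets_s:
        # Assumption: clamp to the last frame if the video ends early.
        idx = min(event_frame + int(round(offset * fps)), len(frames) - 1)
        channels.append(frames[idx])
    rows, cols = len(channels[0]), len(channels[0][0])
    return [[tuple(ch[r][c] for ch in channels) for c in range(cols)]
            for r in range(rows)]


# Toy video: 100 one-pixel frames whose value equals the frame index.
frames = [[[i]] for i in range(100)]
image = stack_three_frames(frames, fps=1, event_frame=5)
```

Each pixel of `image` then carries the values at t+0, t+10 and t+30 seconds, so a detector sees how a spot changes over time rather than a single snapshot.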
The method is implemented with a mix of MATLAB and Python.
That is right.
One proposed reason why this method is better than a human annotator is that it "is not biased." While they may mean it isn't influenced by what the researcher wants to see, the model they present is still statistically biased since each object class has a different recall score. This wasn't investigated. In general, there was little discussion of the quality of the model.
We tested the consistency of the algorithm performance across gender, space, time and experiment type (SP, SxP, and ESP). We found consistent levels of performance across all categories (see Figure 3), suggesting that errors made by the algorithm are uniform across conditions, hence should not create any bias of the results. Specifically, the detection accuracy is similar between urine and feces, hence should not impose a bias between the various object classes.
Precision scores were not reported.
In the revised paper we report recall, precision, and F1 scores in the caption of the relevant figures and also supply Excel files with the full statistics for each of the figures.
Is a recall value of 78.6% good for the types of studies they and others want to carry out? What are the implications of using the resulting data in a study?
We have trained the model better, using a larger number of video clips. The increase in training set size improved the overall accuracy from a mean F1 score of 0.81 in the previous version to a mean F1 score of 0.891 (see Figure 2 and Figure 3) in the current version. Specifically, the F1 score for urine detection was improved from 0.79 to 0.88.
How do these results compare to the data that would be generated by a "biased human?"
We further evaluated the accuracy of the DeePosit algorithm in comparison to a second human annotator and found that the algorithm accuracy is comparable to human-level accuracy (Figure 3).
5 out of the 6 figures in the paper relate not to the method but to results from a study whose data was generated from the method. This makes a paper, which, based on the title, is about the method, much longer and more complicated than if it focused on the method.
We appreciate the reviewer's comment, but the analysis of this new dataset by DeePosit demonstrates how the algorithm may be used to reveal novel and distinguishable dynamics of urination and defecation activities during social interactions, which were not yet reported.
Also, even in the context of the experiments, there is no discussion of the implications of analyzing data that was generated from a method with precision and recall values of only 70-80%. Surely this noise has an effect on how to correctly calculate p-values etc. Instead, the authors seem to proceed like the generated data is simply correct.
As mentioned above, the increase in training set size improved the overall accuracy from a mean F1 score of 0.81 in the previous version to a mean F1 score of 0.891 (see Figure 2 and Figure 3) in the current version. Specifically, the F1 score for urine detection was improved from 0.79 to 0.88.
Reviewer #3 (Public Review):
Summary:
The authors introduce a tool that employs thermal cameras to automatically detect urine and feces deposits in rodents. The detection process involves a heuristic to identify potential thermal regions of interest, followed by a transformer network-based classifier to differentiate between urine, feces, and background noise. The tool's effectiveness is demonstrated through experiments analyzing social preference, stress response, and temporal dynamics of deposits, revealing differences between male and female mice.
Strengths:
The method effectively automates the identification of deposits
The application of the tool in various behavioral tests demonstrates its robustness and versatility.
The results highlight notable differences in behavior between male and female mice
We thank the reviewer for the positive assessment of our work.
Weaknesses:
The definition of 'start' and 'end' periods for statistical analysis is arbitrary. A robustness check with varying time windows would strengthen the conclusions.
In all the statistical tests conducted in the revised manuscript, we have used a time period of 4 minutes for the analysis. We did not use the last minute of each stage for the analysis since the input of DeePosit requires 1 minute of video after the event. Nevertheless, we also conducted the same tests using a 5-minute period and found similar results (Figure 5—Figure Supplement 1).
The paper could better address the generalizability of the tool to different experimental setups, environments, and potentially other species.
As mentioned above, we tested the algorithm performance on a second strain of mice (male C57BL/6) in a different environmental condition (white arena instead of a black one) and found that the algorithm achieves comparable accuracy, even though C57BL/6 mice and white arena were not included in the training set. Thus, the algorithm seems to be robust and efficient across various experimental conditions.
The results are based on tests of individual animals, and there is no discussion of how this method could be generalized to experiments tracking multiple animals simultaneously in the same arena (e.g., pair or collective behavior tests, where multiple animals may deposit urine or feces).
At the moment, the algorithm cannot be applied for multiple animals freely moving in the same arena. However, in the revised manuscript we explicitly discussed what is needed for adapting the algorithm to perform such analyses.
Recommendations for the authors:
- Add a note and/or perform additional calculations to show that the results do not depend on the specific definitions of 'start' and 'end' periods. For instance, vary the time window thresholds and recalculate the statistics using different windows (e.g., 1-5 minutes instead of 1-4 minutes).
In all the statistical tests conducted in the revised manuscript, we have used a time period of 4 minutes for the analysis. We did not use the last minute of each stage for the analysis since the input of DeePosit requires 1 minute of video after the event. Nevertheless, we also conducted the same tests using a 5-minute period and found similar results (Figure 5—Figure Supplement 1).
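The robustness check described here amounts to recounting the detected events with a different window length. A minimal sketch follows; the function name and the event timestamps are hypothetical.

```python
def count_events_in_window(event_times_s, stage_start_s, window_s=240.0):
    """Count events whose time falls in
    [stage_start_s, stage_start_s + window_s)."""
    end_s = stage_start_s + window_s
    return sum(1 for t in event_times_s if stage_start_s <= t < end_s)


# Hypothetical detection times (seconds from the start of a stage).
events = [10.0, 200.0, 250.0, 290.0]
count_4min = count_events_in_window(events, stage_start_s=0.0, window_s=240.0)
count_5min = count_events_in_window(events, stage_start_s=0.0, window_s=300.0)
```

Running the same statistical test on both counts per animal shows whether a conclusion depends on the chosen window.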
- Condense Figures 4, 5, and 6 to simplify the presentation. Focus on demonstrating the effectiveness of the tool rather than detailed experimental outcomes, as the primary contribution of this paper is methodological.
We have added to the revised manuscript one technical figure (Figure 3) comparing the accuracy of the algorithm performance across gender, space, time, and experiment type (SP, SxP, and ESP) as well as comparing its performance to a second human annotator and to YOLOv8. One more partially technical figure (Figure 5) compares the results of the algorithm between white ICR mice in the black arena and black C57BL/6 mice in the white arena. Thus, only Figures 4 and 6 show detailed experimental outcomes.
- Provide more detail on how the preliminary detection procedure and parameters might need adjustment for different experimental setups or conditions. Discuss potential adaptations for field settings or more complex environments.
As for the algorithm parameters, we tested the effect of the main parameter of the preliminary detection (the temperature threshold for the detection of a new blob) and found that a threshold of 1.6°C gave the best accuracy and used this parameter for all of the experiments instead of 1.1°C which was used in the original manuscript. It's worth mentioning that the performance is quite stable (mean F1 score of 0.88-0.89) for the thresholds between 1.1°C and 3°C.
We also checked if changing the input length of the video clip that is fed to the classifier affects the accuracy by training the classifier with -11..30 seconds video clips (41 seconds in total) instead of -11..60 seconds (71 seconds in total) and found no difference in accuracy.
Overall, the algorithm's accuracy seems to be rather stable across various choices of parameters.
Editor's note:
Should you choose to revise your manuscript, please ensure your manuscript includes full statistical reporting including exact p-values wherever possible alongside the summary statistics (test statistic and df) and 95% confidence intervals. These should be reported for all key questions and not only when the p-value is less than 0.05 in the main manuscript.
We have deposited the detailed statistics of each figure in https://github.com/davidpl2/DeePosit/tree/main/FigStat/PostRevision
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study combines a computational language model, i.e., HM-LSTM, and temporal response function (TRF) modeling to quantify the neural encoding of hierarchical linguistic information in speech, and addresses how hearing impairment affects neural encoding of speech. The analysis has been significantly improved during the revision but remains somewhat incomplete: the TRF analysis should be more clearly described and controlled. The study is of potential interest to audiologists and researchers who are interested in the neural encoding of speech.
-
Reviewer #1 (Public review):
About R squared in the plots:
The authors have used a z-scored R squared in the main ridge regression plots. While this may be interpretable, it seems non-standard and overly complicated. The authors could use a simple Pearson r to be most direct and informative (and in line with similar work, including Goldstein et al. 2022 which they mentioned). This way the sign of the relationships is preserved.
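The point about sign can be illustrated with a small sketch, assuming the score is computed between a model prediction and the recorded signal. The data are invented for illustration and the squared correlation is used here as a simple stand-in for R squared.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Invented example: the prediction is perfectly anti-correlated with the data.
predicted = [1.0, 2.0, 3.0, 4.0]
observed = [4.0, 3.0, 2.0, 1.0]
r = pearson_r(predicted, observed)
r_squared = r ** 2
```

Here `r` is negative while `r_squared` is positive, so a squared measure cannot distinguish this case from a perfect positive relationship.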
About the new TRF analysis:
The new TRF analysis is a necessary addition and much appreciated. However, it is missing the results for the acoustic regressors, which should be there analogous to the HM-LSTM ridge analysis. The authors should also specify which software they have utilized to conduct the new TRF analysis. It also seems that the linguistic predictors/regressors have been newly constructed in a way more consistent with previous literature (instead of using the HM-LSTM features); these specifics should also be included in the manuscript (did it come from Montreal Forced Aligner, etc.?). Now that the original HM-LSTM can be compared to a more standard TRF analysis, it is apparent that the results are similar.
The authors' wording about this suggests that these new regressors have a nonzero sample at each linguistic event's offset, not onset. This should also be clarified. As the authors know, the onset would be more standard, and using the offset has implications for understanding the timing of the TRFs, as a phoneme has a different duration than a word, which has a different duration from a sentence, etc.
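The difference between onset-aligned and offset-aligned regressors can be sketched as follows, assuming each linguistic unit is given as an (onset, offset) pair in seconds. The function name, the sampling rate, and the event times are hypothetical and do not describe the authors' pipeline.

```python
def impulse_regressor(events, n_samples, fs, align="onset"):
    """Return a list of length n_samples with 1.0 at the onset (or offset)
    sample of each event. events is a list of (onset_s, offset_s) pairs."""
    if align not in ("onset", "offset"):
        raise ValueError("align must be 'onset' or 'offset'")
    regressor = [0.0] * n_samples
    for onset_s, offset_s in events:
        t = onset_s if align == "onset" else offset_s
        idx = int(round(t * fs))
        if 0 <= idx < n_samples:
            regressor[idx] = 1.0
    return regressor


# Hypothetical units: a short phoneme-like event and a longer word-like event.
events = [(0.10, 0.18), (0.30, 0.70)]
onset_reg = impulse_regressor(events, n_samples=100, fs=100, align="onset")
offset_reg = impulse_regressor(events, n_samples=100, fs=100, align="offset")
onset_idx = [i for i, v in enumerate(onset_reg) if v]
offset_idx = [i for i, v in enumerate(offset_reg) if v]
```

Because the two events differ in duration, the offset impulses are shifted from the onset impulses by different amounts, which is why TRF latencies estimated from offset-aligned regressors are not directly comparable across linguistic levels.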
About offsets:
TRFs can still be interpretable using the offset timings though; however, the main original analysis seems to be utilizing the offset times in a different, more confusing way. The authors still seem to be saying that only the peri-offset time of the EEG was analyzed at all, meaning the vast majority of the EEG trial durations do not factor into the main HM-LSTM response results whatsoever. The way the authors describe this does not seem to be present in any other literature, including the papers that they cite. Therefore, much more clarification on this issue is needed. If the authors mean that the regressors are simply time-locked to the EEG by aligning their offsets (rather than their onsets, because they have varying onsets or some such experimental design complexity), then this would be fine. But it does not seem to be what the authors want to say. This may be a miscommunication about the methods, or the authors may have actually only analyzed a small portion of the data. Either way, this should be clarified to be able to be interpretable.
-
Reviewer #2 (Public review):
This study presents a valuable finding on the neural encoding of speech in listeners with normal hearing and hearing impairment, uncovering marked differences in how attention to different levels of speech information is allocated, especially when having to selectively attend to one speaker while ignoring an irrelevant speaker. The results overall support the claims of the authors, although a more explicit behavioural task to demonstrate successful attention allocation would have strengthened the study. Importantly, the use of more "temporally continuous" analysis frameworks could have provided a better methodology to assess the entire time course of neural activity during speech listening. Despite these limitations, this interesting work will be useful to the hearing impairment and speech processing research community.
The study compares speech-in-quiet vs. multi-talker scenarios, allowing to assess within-participant the impact that the addition of a competing talker has on the neural tracking of speech. Moreover, the inclusion of a population with hearing loss is useful to disentangle the effects of attention orienting and hearing ability. The diagnosis of high-frequency hearing loss was done as part of the experimental procedure by professional audiologists, leading to a high control of the main contrast of interest for the experiment. Sample size was big, allowing to draw meaningful comparisons between the two populations.
An HM-LSTM model was employed to jointly extract speech features spanning from the stimulus acoustics to word-level and phrase-level information, represented by embeddings extracted at successive layers of the model. The model was specifically expanded to include lower level acoustic and phonetic information, reaching a good representation of all intermediate levels of speech.
Despite conveniently extracting all features jointly, the HM-LSTM model processes linguistic input sentence-by-sentence, and therefore only allows to assess the corresponding EEG data at sentence offset. If I understood correctly, while the sentence information extracted with the HM-LSTM reflects the entire sentence - in terms of its acoustic, phonetic and more abstract linguistic features - it only gives a condensed final representation of the sentence. As such, feature extraction with the HM-LSTM is not compatible with a continuous temporal mapping on the EEG signal, and this is the main reason behind the authors' decision to fit a regression at nine separate time points surrounding sentence offsets.
While valid and previously used in the literature, this methodology, in the particular context of this experiment, might be obscuring important attentional effects affected by hearing loss. By fitting a regression only around sentence-final speech representations, the method might be overlooking the more "online" speech processing dynamics, and only assessing the permanence of information at different speech levels at sentence offset. In other words, the acoustic attentional bias between Attended and Unattended speech might exist even in hearing-impaired participants but, due to a lower encoding or permanence of acoustic information in this population, it might only emerge when using methodologies with a higher temporal resolution, such as Temporal Response Functions (TRFs). If a univariate TRF fit simply on the continuous speech envelope did not show any attentional bias (different trial lengths should not be a problem for fitting TRFs), I would be entirely convinced of the result. For now, I am unsure how to interpret this finding.
Despite my doubts on the appropriateness of condensed speech representations and single-point regression for acoustic features in particular, the current methodology allows the authors to explore their research questions, and the results support their conclusions.
This work presents an interesting finding on the limits of attentional bias in a cocktail-party scenario, suggesting that fundamentally different neural attentional filters are employed by listeners with high-frequency hearing loss, even in terms of the tracking of speech acoustics. Moreover, the rich dataset collected by the authors is a great contribution to open science and will offer opportunities for re-analysis.
-
Author response:
The following is the authors’ response to the original reviews
eLife Assessment
This valuable study investigates how hearing impairment affects neural encoding of speech, in particular the encoding of hierarchical linguistic information. The current analysis provides incomplete evidence that hearing impairment affects speech processing at multiple levels, since the novel analysis based on HM-LSTM needs further justification. The advantage of this method should also be further explained. The study can also benefit from building a stronger link between neural and behavioral data.
We sincerely thank the editors and reviewers for their detailed and constructive feedback.
We have revised the manuscript to address all of the reviewers’ comments and suggestions. The primary strength of our methods lies in the use of the HM-LSTM model, which simultaneously captures linguistic information at multiple levels, ranging from phonemes to sentences. As such, this model can be applied to other questions regarding hierarchical linguistic processing. We acknowledge that our current behavioral results from the intelligibility test may not fully differentiate between the perception of lower-level acoustic/phonetic information and higher-level meaning comprehension. However, it remains unclear what type of behavioral test would effectively address this distinction. We aim to explore this connection further in future studies.
Public Reviews:
Reviewer #1 (Public Review):
The authors are attempting to use the internal workings of a language hierarchy model, comprising phonemes, syllables, words, phrases, and sentences, as regressors to predict EEG recorded during listening to speech. They also use standard acoustic features as regressors, such as the overall envelope and the envelopes in log-spaced frequency bands. This is valuable and timely research, including the attempt to show differences between normal-hearing and hearing-impaired people in these regards. I will start with a couple of broader questions/points, and then focus my comments on three aspects of this study: The HM-LSTM language model and its usage, the time windows of relevant EEG analysis, and the usage of ridge regression.
Firstly, as far as I can tell, the OSF repository of code, data, and stimuli is not accessible without requesting access. This needs to be changed so that reviewers and anybody who wants or needs to can access these materials.
It is my understanding that keeping the repository private during the review process and making it public after acceptance is standard practice. As far as I understand, although the OSF repository was private, anyone with the link should have been able to access it. I have now made the repository public.
What is the quantification of model fit? Does it mean that you generate predicted EEG time series from deconvolved TRFs, and then give the R2 coefficient of determination between the actual EEG and predicted EEG constructed from the convolution of TRFs and regressors? Whether or not this is exactly right, it should be made more explicit.
Model fit was measured by spatiotemporal cluster permutation tests (Maris & Oostenveld, 2007) on the contrasts of the timecourses of the z-transformed coefficient of determination (R<sup>2</sup>). For instance, to assess whether words from the attended stimuli better predict EEG signals during the mixed speech compared to words from the unattended stimuli, we used the 150-dimensional vectors corresponding to the word layer from our LSTM model for the attended and unattended stimuli as regressors. We then fit these regressors to the EEG signals at 9 time points (spanning -100 ms to 300 ms around the sentence offsets, with 50 ms intervals). We then conducted one-tailed two-sample t-tests to determine whether the differences in the contrasts of the R<sup>2</sup> timecourses were statistically significant. Note that we did not perform TRF analyses. We have clarified this description in the “Spatiotemporal clustering analysis” section of the “Methods and Materials” on p.10 of the manuscript.
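The latency-wise regression described in this response can be illustrated with a minimal sketch. It is not the authors' `ridge_lstm.py`: the function names, the single-sensor input, the closed-form ridge solver, and the half/half train-test split are all assumptions made for illustration. Only the 9 latencies (-100 ms to 300 ms in 50 ms steps) and the use of R<sup>2</sup> as the outcome come from the response.

```python
import numpy as np

# Nine latencies around each sentence offset, as described in the response.
LATENCIES_MS = np.arange(-100, 301, 50)

def ridge_r2(X_train, y_train, X_test, y_test, alpha=1.0):
    """Closed-form ridge fit (intercept left unpenalised) and held-out R^2."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    gram = Xc.T @ Xc + alpha * np.eye(Xc.shape[1])
    w = np.linalg.solve(gram, Xc.T @ (y_train - y_mean))
    pred = (X_test - x_mean) @ w + y_mean
    ss_res = np.sum((y_test - pred) ** 2)
    ss_tot = np.sum((y_test - y_test.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def r2_timecourse(features, eeg, offsets, sfreq, latencies_ms=LATENCIES_MS, alpha=1.0):
    """features: (n_sentences, n_dims) sentence-level regressors.
    eeg: (n_samples,) signal from one sensor (hypothetical input layout).
    offsets: (n_sentences,) sample index of each sentence offset."""
    half = len(offsets) // 2  # assumed split: first half trains, second half tests
    scores = []
    for lat in latencies_ms:
        shift = int(round(lat * sfreq / 1000.0))
        y = eeg[offsets + shift]  # EEG amplitude at this latency for every sentence
        scores.append(ridge_r2(features[:half], y[:half],
                               features[half:], y[half:], alpha))
    return np.array(scores)
```

Running `r2_timecourse` per sensor and per participant would give the R<sup>2</sup> timecourses that, after z-transformation, enter the cluster permutation tests.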
About the HM-LSTM:
• In the Methods paragraph about the HM-LSTM, a lot more detail is necessary to understand how you are using this model. Firstly, what do you mean that you "extended" it, and what was that procedure?
The original HM-LSTM model developed by Chung et al. (2017) consists of only two levels: the word level and the phrase level (Figure 1b from their paper). By “extending” the model, we mean that we expanded its architecture to include five levels: phoneme, syllable, word, phrase, and sentence. Since our input consists of phoneme embeddings, we cannot directly apply their model, so we trained our model on the WenetSpeech corpus (Zhang et al., 2021), which provides phoneme-level transcripts. We have added this clarification on p.4 of the manuscript.
• And generally, this is the model that produces most of the "features", or regressors, whichever word we like, for the TRF deconvolution and EEG prediction, correct?
Yes, we extracted the 2048-dimensional hidden layer activity from the model to represent features for each sentence in our speech stimuli at the phoneme, syllable, word, phrase and sentence levels. However, we did not perform any TRF deconvolution; we fit these features (reduced to 150 dimensions using PCA) to the EEG signals at 9 time points around the offset of each sentence using ridge regression. We have now added a multivariate TRF (mTRF) analysis following Reviewer 3’s suggestions, and the results showed similar patterns to the current results (see Figure S2). We have added the clarification in the “Ridge regression at different time latencies” section of the “Methods and Materials” on p.10 of the manuscript.
Results from the mTRF analyses were added on p.7 of the manuscript.
• A lot more detail is necessary then, about what form these regressors take, and some example plots of the regressors alongside the sentences.
The linguistic regressors are simply five 150-dimensional vectors, each corresponding to one linguistic level, as shown in Figure 1B.
• Generally, it is necessary to know what these regressors look like compared to other similar language-related TRF and EEG/MEG prediction studies. Usually, in the case of e.g. Lalor lab papers or Simon lab papers, these regressors take the form of single-sample event markers, surrounded by zeros elsewhere. For example, a phoneme regressor might have a sample up at the onset of each phoneme, and a word onset regressor might have a sample up at the onset of each word, with zeros elsewhere in the regressor. A phoneme surprisal regressor might have a sample up at each phoneme onset, with the value of that sample corresponding to the rarity of that phoneme in common speech. Etc. Are these regressors like that? Or do they code for these 5 linguistic levels in some other way? Either way, much more description and plotting is necessary in order to compare the results here to others in the literature.
No, these regressors were not like that. They were 150-dimensional vectors (after PCA dimension reduction) extracted from the hidden layers of the HM-LSTM model. After training the model on the WenetSpeech corpus, we ran it on our speech stimuli and extracted representations from the five hidden layers to correspond to the five linguistic levels. As mentioned earlier, we did not perform TRF analyses; instead, we used ridge regression to predict EEG signals around the offset of each sentence, a method commonly employed in the literature (e.g., Caucheteux & King, 2022; Goldstein et al., 2022; Schmitt et al., 2021; Schrimpf et al., 2021). For instance, Goldstein et al. (2022) used word embeddings from GPT-2 to predict ECoG activity surrounding the onset of each word during naturalistic listening. We have cited these studies on p.3 of the manuscript, and the method is illustrated in Figure 1B.
• You say that the 5 regressors that are taken from the trained model's hidden layers do not have much correlation with each other. However, the highest correlations are between syllable and sentence (0.22), and syllable and word (0.17). It is necessary to give some reason and interpretation of these numbers. One would think the highest correlation might be between syllable and phoneme, but this one is almost zero. Why would the syllable and sentence regressors have such a relatively high correlation with each other, and what form do those regressors take such that this is the case?
All the regressors are represented as 2048-dimensional vectors derived from the hidden layers of the trained HM-LSTM model. We applied the trained model to all 284 sentences in our stimulus text, generating a set of 284 × 2048-dimensional vectors. Next, we performed Principal Component Analysis (PCA) on the 2048 dimensions and extracted the first 100 principal components (PCs), resulting in 284 × 100-dimensional vectors for each regressor. These 284 × 100 matrices were then flattened into 28,400-dimensional vectors. Subsequently, we computed the correlation matrix for the z-transformed 28,400-dimensional vectors of our five linguistic regressors. The code for this analysis, lstm_corr.py, can be found in our OSF repository. We have added a section “Correlation among linguistic features” in “Materials and Methods” on p.10 of the manuscript.
We consider the observed coefficients of 0.17 and 0.22 to be relatively low compared to prior model-brain alignment studies which report correlation coefficients above 0.5 for linguistic regressors (e.g., Gao et al., 2024; Sugimoto et al., 2024). In Chinese, a single syllable can also function as a word, potentially leading to higher correlations between regressors for syllables and words. However, we refrained from overinterpreting the results to suggest a higher correlation between syllable and sentence compared to syllable and word. A paired t-test of the syllable-word coefficients versus syllable-sentence coefficients across the 284 sentences revealed no significant difference (t(28399)=-3.96, p=1). We have incorporated this information into p.5 of the manuscript.
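The correlation procedure described above (PCA, flattening, z-transformation, correlation matrix) can be sketched as follows. This is an illustrative reconstruction, not the authors' `lstm_corr.py`: the function names are hypothetical, PCA is done with a plain SVD, and the input is assumed to be one activation matrix per linguistic level.

```python
import numpy as np

def pca_reduce(activations, n_components):
    """Project (n_sentences, n_units) activations onto their first principal components.
    The sign of each component is arbitrary (SVD convention)."""
    centered = activations - activations.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def level_correlations(level_activations, n_components=100):
    """level_activations: list of (n_sentences, n_units) arrays, one per linguistic level.
    Returns the (n_levels, n_levels) correlation matrix of the flattened, z-scored PCs."""
    flattened = []
    for acts in level_activations:
        vec = pca_reduce(acts, n_components).ravel()      # e.g. 284 x 100 -> 28,400 values
        flattened.append((vec - vec.mean()) / vec.std())  # z-transform
    return np.corrcoef(np.vstack(flattened))
```

With five arrays of shape 284 × 2048 and `n_components=100`, each level is reduced to a 28,400-dimensional vector before the 5 × 5 correlation matrix is computed, matching the dimensions given in the response.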
• If these regressors are something like the time series of zeros along with single sample event markers as described above, with the event marker samples indicating the onset of the relevant thing, then one would think e.g. the syllable regressor would be a subset of the phoneme regressor because the onset of every syllable is a phoneme. And the onset of every word is a syllable, etc.
All the regressors are aligned to 9 time points surrounding sentence offsets (-100 ms to 300 ms with a 50 ms interval). This is because all our regressors are taken from the HM-LSTM model, where the input is the phoneme representation of a sentence (e.g., “zh ə_4 y ie_3 j iəu_4 x iaŋ_4 sh uei_3 y ii_2 y aŋ_4”). For each unit in the sentence, the model generates five 2048-dimensional vectors, each corresponding to the five linguistic levels of the entire sentence. We have added the clarification on p.11 of the manuscript.
For the time windows of analysis:
• I am very confused, because sometimes the times are relative to "sentence onset", which would mean the beginning of sentences, and sometimes they are relative to "sentence offset", which would mean the end of sentences. It seems to vary which is mentioned. Did you use sentence onsets, offsets, or both, and what is the motivation?
• If you used onsets, then the results at negative times would not seem to mean anything, because that would be during silence unless the stimulus sentences were all back to back with no gaps, which would also make that difficult to interpret.
• If you used offsets, then the results at positive times would not seem to mean anything, because that would be during silence after the sentence is done. Unless you want to interpret those as important brain activity after the stimuli are done, in which case a detailed discussion of this is warranted.
Thank you very much for pointing this out. All instances of “sentence onset” were typos and should be corrected to “sentence offset.” We chose offset because the regressors are derived from the hidden layer activity of our HM-LSTM model, which processes the entire sentence before generating outputs. We have now corrected all the typos. In continuous speech, there is no distinct silence period following sentence offsets. Additionally, lexical or phrasal processing typically occurs 200 ms after stimulus offsets (Bemis & Pylkkanen, 2011; Goldstein et al., 2022; Li et al., 2024; Li & Pylkkänen, 2021). Therefore, we included a 300 ms interval after sentence offsets in our analysis, as our regressors encompass linguistic levels up to the sentence level. We have added this motivation on p.11 of the manuscript.
• For the plots in the figures where the time windows and their regression outcomes are shown, it needs to be explicitly stated every time whether those time windows are relative to sentence onset, offset, or something else.
We completely agree, and thank you very much for the suggestion. We have now added this information to Figures 4-6.
• Whether the running correlations are relative to sentence onset or offset, the fact that you can have numbers outside of the time of the sentence (negative times for onset, or positive times for offset) is highly confusing. Why would the regressors have values outside of the sentence, meaning before or after the sentence/utterance? In order to get the running correlations, you presumably had the regressor convolved with the TRF/impulse response to get the predicted EEG first. In order to get running correlation values outside the sentence to correlate with the EEG, you would have to have regressor values at those time points, correct? How does this work?
As mentioned earlier, we did not perform TRF analyses or convolve the regressors. Instead, we conducted regression analyses at each of the 9 time points surrounding the sentence offsets, following standard methods commonly used in model-brain alignment studies (e.g., Gao et al., 2024; Goldstein et al., 2022). The time window of -100 to 300 ms was selected based on prior findings that lexical and phrasal processing typically occurs 200–300 ms after word offsets (Bemis & Pylkkanen, 2011; Goldstein et al., 2022; Li et al., 2024; Li & Pylkkänen, 2021). Additionally, we included the -100 to 200 ms time period in our analysis to examine phoneme and syllable level processing (cf. Gwilliams et al., 2022). We have added the clarification on p. of the manuscript.
• In general, it seems arbitrary to choose sentence onset or offset, especially if the comparison is the correlation between predicted and actual EEG over the course of a sentence, with each regressor. What is going on with these correlations during the middle of the sentences, for example? In ridge regression TRF techniques for EEG/MEG, the relevant measure is often the overall correlation between the predicted and actual, calculated over a longer period of time, maybe the entire experiment. Here, you have calculated a running comparison between predicted and actual, and thus the time windows you choose to actually analyze can seem highly cherry-picked, because this means that most of the data is not actually analyzed.
The rationale for choosing sentence offsets instead of onsets is that we are aligning the HM-LSTM model’s activity with EEG responses, and the input to the model consists of phoneme representations of the entire sentence at one time. In other words, the model needs to process the whole sentence before generating representations at each linguistic level. Therefore, the corresponding EEG responses should also align with the sentence offsets, occurring after participants have seen the complete sentence. The ridge regression followed the common practice in model-brain alignment studies (e.g., Gao et al., 2024; Goldstein et al., 2022; Huth et al., 2016; Schmitt et al., 2021; Schrimpf et al., 2021), and the time window is not cherry-picked but based on prior literature reporting lexical and sublexical processing in these time periods (e.g., Bemis & Pylkkanen, 2011; Goldstein et al., 2022; Gwilliams et al., 2022; Li et al., 2024; Li & Pylkkänen, 2021).
• In figures 5 and 6, some of the time window portions that are highlighted as significant between the two lines have the lines intersecting. This looks like, even though you have found that the two lines are significantly different during that period of time, the difference between those lines is not of a constant sign, even during that short period. For instance, in figure 5, for the syllable feature, the period of 0 - 200 ms is significantly different between the two populations, correct? But between 0 and 50, normal-hearing are higher, between 50 and 150, hearing-impaired are higher, and between 150 and 200, normal-hearing are higher again, correct? But somehow they still end up significantly different overall between 0 and 200 ms. More explanation of occurrences like these is needed.
The intersecting lines in Figures 5 and 6 represent the significant time windows for within-group comparisons (i.e., significant model fit compared to 0). They do not depict between-group comparisons, as no significant contrasts were found between the groups. For example, in Figure 1, the significant time windows for the acoustic models are shown separately for the hearing-impaired and normal-hearing groups. No significant differences were observed, as indicated by the sensor topography. We have now clarified this point in the captions for Figures 5 and 6.
Using ridge regression:
• What software package(s) and procedure(s) were specifically done to accomplish this? If this is ridge regression and not just ordinary least squares, then there was at least one non-zero regularization parameter in the process. What was it, how did it figure in the modeling and analysis, etc.?
The ridge regression was performed using custom Python code, making heavy use of the sklearn (v1.12.0) package. We used ridge regression instead of ordinary least squares regression because all our linguistic regressors are 150-dimensional dense vectors, and our acoustic regressors are 130-dimensional vectors (see “Acoustic features of the speech stimuli” in “Materials and Methods”). We kept the default regularization parameter (i.e., 1). This ridge regression method is commonly used in model-brain alignment studies, where the regressors are high-dimensional vectors taken from language models (e.g., Gao et al., 2024; Goldstein et al., 2022; Huth et al., 2016; Schmitt et al., 2021; Schrimpf et al., 2021). The code ridge_lstm.py can be found in our OSF repository, and we have added a more detailed description on p.11 of the manuscript.
• It sounds like the regressors are the hidden layer activations, which you reduced from 2,048 to 150 non-acoustic, or linguistic, regressors, per linguistic level, correct? So you have 150 regressors, for each of 5 linguistic levels. These regressors collectively contribute to the deconvolution and EEG prediction from the resulting TRFs, correct? This sounds like a lot of overfitting. How much correlation is there from one of these 150 regressors to the next? Elsewhere, it sounds like you end up with only one regressor for each of the 5 linguistic levels. So these aspects need to be clarified.
• For these regressors, you are comparing the "regression outcomes" for different conditions; "regression outcomes" are the R2 between predicted and actual EEG, which is the coefficient of determination, correct? If this is R2, how is it that you have some negative numbers in some of the plots? R2 should be only positive, between 0 and 1.
Yes, we reduced the 2048-dimensional vectors for each of the 5 linguistic levels to 150 dimensions using PCA, mainly to save computational resources. We used ridge regression, following the standard practice in the field (e.g., Gao et al., 2024; Goldstein et al., 2022; Huth et al., 2016; Schmitt et al., 2021; Schrimpf et al., 2021).
Yes, the regression outcomes are the R<sup>2</sup> values representing the fit between the predicted and actual EEG data. However, we reported normalized R<sup>2</sup> values, which are z-transformed, in the plots. All our spatiotemporal cluster permutation analyses were conducted using the z-transformed R<sup>2</sup> values. We have added this clarification both in the figure captions and on p.11 of the manuscript. As a side note, R<sup>2</sup> values can be negative because they are not the square of a correlation coefficient. Rather, R<sup>2</sup> compares the fit of the chosen model to that of a horizontal straight line (the null hypothesis). If the chosen model fits the data worse than the horizontal line, then the R<sup>2</sup> value becomes negative: https://www.graphpad.com/support/faq/how-can-rsup2sup-be-negative
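The point about negative R<sup>2</sup> can be checked with a small worked example. The numbers below are invented purely for illustration; the formula is the standard coefficient of determination, 1 - SS_res / SS_tot.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

y = [1.0, 2.0, 3.0, 4.0]               # toy "observed" values
good_fit = [1.1, 1.9, 3.2, 3.8]        # close to the data
mean_only = [2.5, 2.5, 2.5, 2.5]       # the horizontal line at the mean
worse_than_mean = [4.0, 3.0, 2.0, 1.0] # predictions with the wrong trend

print(r_squared(y, good_fit))          # about 0.98
print(r_squared(y, mean_only))         # 0.0
print(r_squared(y, worse_than_mean))   # -3.0
```

The last case has a residual sum of squares (20) four times the total sum of squares (5), so R<sup>2</sup> = 1 - 4 = -3: the model is worse than predicting the mean.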
Reviewer #2 (Public Review):
This study compares neural responses to speech in normal-hearing and hearing-impaired listeners, investigating how different levels of the linguistic hierarchy are impacted across the two cohorts, both in a single-talker and multi-talker listening scenario. It finds that, while normal-hearing listeners have a comparable cortical encoding of speech-in-quiet and attended speech from a multi-talker mixture, participants with hearing impairment instead show a reduced cortical encoding of speech when it is presented in a competing listening scenario. When looking across the different levels of the speech processing hierarchy in the multi-talker condition, normal-hearing participants show a greater cortical encoding of the attended compared to the unattended stream in all speech processing layers - from acoustics to sentence-level information. Hearing-impaired listeners, on the other hand, only have increased cortical responses to the attended stream for the word and phrase levels, while all other levels do not differ between attended and unattended streams.
The methods for modelling the hierarchy of speech features (HM-LSTM) and the relationship between brain responses and specific speech features (ridge-regression) are appropriate for the research question, with some caveats on the experimental procedure. This work offers an interesting insight into the neural encoding of multi-talker speech in listeners with hearing impairment, and it represents a useful contribution towards understanding speech perception in cocktail-party scenarios across different hearing abilities. While the conclusions are overall supported by the data, there are limitations and certain aspects that require further clarification.
(1) In the multi-talker section of the experiment, participants were instructed to selectively attend to the male or the female talker, and to rate the intelligibility, but they did not have to perform any behavioural task (e.g., comprehension questions, word detection or repetition), which could have demonstrated at least an attempt to comply with the task instructions. As such, it is difficult to determine whether the lack of increased cortical encoding of Attended vs. Unattended speech across many speech features in hearing-impaired listeners is due to a different attentional strategy, which might be more oriented at "getting the gist" of the story (as the increased tracking of only word and phrase levels might suggest), or instead it is due to hearing-impaired listeners completely disengaging from the task and tuning back in for selected key-words or word combinations. Especially the lack of Attended vs. Unattended cortical benefit at the level of acoustics is puzzling and might indicate difficulties in performing the task. I think this caveat is important and should be highlighted in the Discussion section.
Thank you very much for the suggestion. We admit that the hearing-impaired listeners might adopt different attentional strategies or potentially disengage from the task due to comprehension difficulties. However, we would like to emphasize that our hearing-impaired participants have extended high-frequency (EHF) hearing loss, with impairment only at frequencies above 8 kHz. Their condition is likely not severe enough to cause them to adopt a markedly different attentional strategy for this task. Moreover, it is possible that our normal-hearing listeners may also adopt varying attentional strategies, yet the comparison still revealed notable differences. We have added the caveat in the Discussion section on p.8 of the manuscript.
(2) In the EEG recording and preprocessing section, you state that the EEG was filtered between 0.1Hz and 45Hz. Why did you choose this very broadband frequency range? In the literature, speech responses are robustly identified between 0.5Hz/1Hz and 8Hz. Would these results emerge using a narrower and lower frequency band? Considering the goal of your study, it might also be interesting to run your analysis pipeline on conventional frequency bands, such as Delta and Theta, since you are looking into the processing of information at different temporal scales.
Indeed, we have decomposed the epoched EEG time series for each section into six classic frequency-band components (delta 1–3 Hz, theta 4–7 Hz, alpha 8–12 Hz, beta 12–20 Hz, gamma 30–45 Hz) by convolving the data with complex Morlet wavelets as implemented in MNE-Python (version 0.24.0). The number of cycles in the Morlet wavelets was set to frequency/4 for each frequency bin. The power values for each time point and frequency bin were obtained by taking the square root of the resulting time-frequency coefficients. These power values were normalized to reflect relative changes (expressed in dB) with respect to the 500 ms pre-stimulus baseline. This yielded a power value for each time point and frequency bin for each section. We specifically examined the delta and theta bands, and computed the correlation between the regression outcomes for the five linguistic predictors obtained from these bands and those obtained using data from all frequency bands (the R<sup>2</sup> arrays, of shape subjects × sensors × time points, were flattened before computing the correlation). The results showed high correlation coefficients (see the correlation matrix in Supplementary Figure S2 for the attended and unattended speech). Therefore, we opted to use the epoched EEG data from all frequency bands for our analyses. We have added this clarification in the Results section on p.5 and the “EEG recording and preprocessing” section in “Materials and Methods” on p.11 of the manuscript.
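The wavelet decomposition and baseline normalization described in this response can be approximated with a plain-NumPy sketch for a single frequency. This is a stand-in for illustration, not the MNE-Python implementation the authors used: the function name and the baseline argument are hypothetical, and power is taken here as the squared magnitude of the wavelet coefficients, which is an assumption of this sketch. The cycles rule (frequency/4) and the dB-relative-to-baseline normalization follow the response.

```python
import numpy as np

def morlet_power_db(signal, sfreq, freq, baseline):
    """Power at one frequency over time, in dB relative to a baseline window.
    baseline: slice of sample indices to use as the reference period."""
    n_cycles = freq / 4.0                        # cycles rule stated in the response
    sigma_t = n_cycles / (2.0 * np.pi * freq)    # std of the Gaussian envelope (s)
    half = int(np.ceil(5 * sigma_t * sfreq))     # wavelet support: +/- 5 std
    t = np.arange(-half, half + 1) / sfreq
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t ** 2 / (2 * sigma_t ** 2))
    wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))   # unit-energy normalisation
    coef = np.convolve(signal, wavelet, mode="same")   # complex time-frequency coefficients
    power = np.abs(coef) ** 2                          # assumed definition of power
    return 10.0 * np.log10(power / power[baseline].mean())
```

For example, a 5 Hz sinusoid whose amplitude doubles after the baseline window should show a power increase of roughly 6 dB at 5 Hz.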
(3) A paragraph with more information on the HM-LSTM would be useful to understand the model used without relying on the Chung et al. (2017) paper. In particular, I think the updating mechanism of the model should be clarified. It would also be interesting to modify the updating factor of the model, along the lines of Schmitt et al. (2021), to assess whether a HM-LSTM with faster or slower updates can better describe the neural activity of hearing-impaired listeners. That is, perhaps the difference between hearing-impaired and normal-hearing participants lies in the temporal dynamics, and not necessarily in a completely different attentional strategy (or disengagement from the stimuli, as I mentioned above).
Thank you for the suggestion. We have added more details on our HM-LSTM model on p.10 “Hierarchical multiscale LSTM model” in “Materials and Methods”: Our HM-LSTM model consists of 4 layers. At each layer, the model implements a COPY or UPDATE operation at each time step t. The COPY operation maintains the current cell state without any changes until it receives a summarized input from the lower layer. The UPDATE operation occurs when a linguistic boundary is detected in the layer below, but no boundary was detected at the previous time step t-1. In this case, the cell updates its summary representation, similar to standard RNNs. We agree that exploring modifications to the model’s updating factor would be an interesting direction. However, since we have already observed contrasts between normal-hearing and hearing-impaired listeners using the current model’s update parameters, we believe discussing additional hypotheses would overextend the scope of this paper.
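The COPY/UPDATE rule described above can be written as a small decision function. This is a schematic of the selection logic only, not the authors' trained model: the function and argument names are hypothetical. The FLUSH branch is not mentioned in the response; it is included here because the original HM-LSTM of Chung et al. (2017) defines it for the case where the layer itself detected a boundary at the previous step.

```python
def select_operation(boundary_below_now: int, boundary_same_prev: int) -> str:
    """Choose the cell operation for one layer at time step t.

    boundary_below_now: 1 if the layer below detected a boundary at step t.
    boundary_same_prev: 1 if this layer detected a boundary at step t-1.
    """
    if boundary_same_prev == 1:
        return "FLUSH"   # from Chung et al. (2017); not described in the response
    if boundary_below_now == 1:
        return "UPDATE"  # new summarised input arrives from the layer below
    return "COPY"        # keep the cell state unchanged

# Toy trace for one layer: boundary flags from the layer below over six steps,
# with this layer's own previous-step boundary held at 0.
below = [0, 0, 1, 0, 0, 1]
trace = [select_operation(z, 0) for z in below]
print(trace)  # ['COPY', 'COPY', 'UPDATE', 'COPY', 'COPY', 'UPDATE']
```

The trace shows why higher layers change more slowly: they copy their state forward until the layer below signals that a unit (for example a syllable or word) has been completed.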
(4) When explaining how you extracted phoneme information, you mention that "the inputs to the model were the vector representations of the phonemes". It is not clear to me whether you extracted specific phonetic features (e.g., "p" sound vs. "b" sound), or simply the phoneme onsets. Could you clarify this point in the text, please?
The model inputs were individual phonemes from two sentences, each transformed into a 1024-dimensional vector using a simple lookup table. This lookup table stores embeddings for a fixed dictionary of all unique phonemes in Chinese. This approach is a foundational technique in many advanced NLP models, enabling the representation of discrete input symbols in a continuous vector space. We have added this clarification on p.10 of the manuscript.
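The lookup-table step can be illustrated with a minimal sketch. The phoneme string is the example sentence quoted earlier in the response; the vocabulary built from it, the random initial values, and the variable names are assumptions for illustration (in a trained model the table entries would be learned, and the vocabulary would cover all phonemes of the language).

```python
import numpy as np

EMBED_DIM = 1024  # embedding size stated in the response

# Example phoneme sequence quoted in the response (tone marked after the underscore).
sentence = "zh ə_4 y ie_3 j iəu_4 x iaŋ_4 sh uei_3 y ii_2 y aŋ_4".split()

# Hypothetical vocabulary: here just the unique phonemes of the example sentence.
vocab = {ph: idx for idx, ph in enumerate(sorted(set(sentence)))}

# Lookup table: one row per phoneme. Random values stand in for learned embeddings.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), EMBED_DIM))

# Embedding a sentence is plain row indexing into the table.
indices = np.array([vocab[ph] for ph in sentence])
embedded = embedding_table[indices]   # shape: (n_phonemes, EMBED_DIM)
```

Identical phonemes map to identical rows, so the three occurrences of "y" in the example receive the same input vector; any context dependence comes from the recurrent layers, not from the lookup.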
Reviewer #3 (Public Review):
Summary:
The authors aimed to investigate how the brain processes different linguistic units (from phonemes to sentences) in challenging listening conditions, such as multi-talker environments, and how this processing differs between individuals with normal hearing and those with hearing impairments. Using a hierarchical language model and EEG data, they sought to understand the neural underpinnings of speech comprehension at various temporal scales and identify specific challenges that hearing-impaired listeners face in noisy settings.
Strengths:
Overall, the combination of computational modeling, detailed EEG analysis, and comprehensive experimental design thoroughly investigates the neural mechanisms underlying speech comprehension in complex auditory environments.
The use of a hierarchical language model (HM-LSTM) offers a data-driven approach to dissect and analyze linguistic information at multiple temporal scales (phoneme, syllable, word, phrase, and sentence). This model allows for a comprehensive neural encoding examination of how different levels of linguistic processing are represented in the brain.
The study includes both single-talker and multi-talker conditions, as well as participants with normal hearing and those with hearing impairments. This design provides a robust framework for comparing neural processing across different listening scenarios and groups.
Weaknesses:
The analyses heavily rely on one specific computational model, which limits the robustness of the findings. The use of a single DNN-based hierarchical model to represent linguistic information, while innovative, may not capture the full range of neural coding present in different populations. A low-accuracy regression model-fit does not necessarily indicate the absence of neural coding for a specific type of information. The DNN model represents information in a manner constrained by its architecture and training objectives, which might fit one population better than another without proving the non-existence of such information in the other group. To address this limitation, the authors should consider evaluating alternative models and methods. For example, directly using spectrograms, discrete phoneme/syllable/word coding as features, and performing feature-based temporal response function (TRF) analysis could serve as valuable baseline models. This approach would provide a more comprehensive evaluation of the neural encoding of linguistic information.
Our acoustic features are indeed the broadband envelopes and the log-mel spectrograms of the speech streams, used directly. The amplitude envelope of the speech signal was extracted using the Hilbert transform. The 129-dimension spectrogram and 1-dimension envelope were concatenated to form a 130-dimension acoustic feature at every 10 ms of the speech stimuli. Given the duration of our EEG recordings, which span over 10 minutes, conducting multivariate TRF (mTRF) analysis with such high-dimensional predictors was not feasible. Instead, we used ridge regression to predict EEG responses across 9 temporal latencies, ranging from -100 ms to +300 ms, with additional 50 ms latencies surrounding sentence offsets. To evaluate the model's performance, we extracted the R² values at each latency, providing a temporal profile of regression performance over the analyzed time period. This approach is conceptually similar to TRF analysis.
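A rough sketch of the two steps named above (Hilbert envelope extraction and ridge regression evaluated separately at each latency) is given below. It is an illustration under stated assumptions rather than the authors' pipeline: the 50-ms latency grid, the train/test split, the regularisation strength `alpha`, and the synthetic data are all assumed, and the function names are hypothetical.

```python
import numpy as np


def hilbert_envelope(signal):
    """Amplitude envelope of a real 1-D signal via the FFT-based analytic signal."""
    n = signal.size
    spectrum = np.fft.fft(signal)
    # Keep DC (and Nyquist), double positive frequencies, zero negative ones.
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))


def ridge_r2_per_latency(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit ridge weights for every latency column of Y; return held-out R^2 per latency.

    Assumes features and responses are already centred (no intercept is fitted).
    """
    n_features = X_train.shape[1]
    gram = X_train.T @ X_train + alpha * np.eye(n_features)
    weights = np.linalg.solve(gram, X_train.T @ Y_train)  # closed-form ridge
    residual = Y_test - X_test @ weights
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((Y_test - Y_test.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot


# Synthetic demonstration (all values are placeholders, not study data).
rng = np.random.default_rng(0)
latencies_ms = np.linspace(-100, 300, 9)  # assumed grid: 9 latencies, 50 ms apart
n_sentences, n_features = 600, 130        # 130 = 129 spectrogram bins + 1 envelope
X = rng.standard_normal((n_sentences, n_features))
true_weights = rng.standard_normal((n_features, latencies_ms.size))
true_weights[:, :3] = 0.0                 # no feature-related signal at the first 3 latencies
Y = X @ true_weights + 0.1 * rng.standard_normal((n_sentences, latencies_ms.size))
r2 = ridge_r2_per_latency(X[:400], Y[:400], X[400:], Y[400:], alpha=1.0)

t = np.arange(1000) / 1000.0
env = hilbert_envelope(np.cos(2 * np.pi * 5 * t))  # envelope of a pure tone is flat
```

Plotting `r2` against `latencies_ms` gives the kind of temporal profile of regression performance that the response describes, here for a single simulated channel.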
We agree that including baseline models for the linguistic features is important, and we have now added results from mTRF analysis using phoneme, syllable, word, phrase, and sentence rates as discrete predictors (i.e., marking a value of 1 at each unit boundary offset). Our EEG data spans the entire 10-minute duration for each condition, sampled at 10-ms intervals. The TRF results for our main comparison (attended versus unattended conditions) showed similar patterns to those observed using features from our HM-LSTM model. At the phoneme and syllable levels, normal-hearing listeners showed marginally significantly higher TRF weights for attended speech compared to unattended speech at approximately -80 to 150 ms after phoneme offsets (t=2.75, Cohen’s d=0.87, p=0.057), and 120 to 210 ms after syllable offsets (t=3.96, Cohen’s d=0.73, p=0.083). At the word and phrase levels, normal-hearing listeners exhibited significantly higher TRF weights for attended speech compared to unattended speech at 190 to 290 ms after word offsets (t=4, Cohen’s d=1.13, p=0.049), and around 120 to 290 ms after phrase offsets (t=5.27, Cohen’s d=1.09, p=0.045). For hearing-impaired listeners, marginally significant effects were observed at 190 to 290 ms after word offsets (t=1.54, Cohen’s d=0.6, p=0.059), and 180 to 290 ms after phrase offsets (t=3.63, Cohen’s d=0.89, p=0.09). These results have been added on p.7 of the manuscript, and the corresponding figure is included as Supplementary F2.
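The discrete rate predictors mentioned above (a value of 1 at each unit boundary offset, on a 10-ms grid) could be built roughly as follows. The helper `rate_predictor` and the example offset times are hypothetical and serve only to illustrate the encoding.

```python
import numpy as np


def rate_predictor(offset_times_s, duration_s, step_s=0.01):
    """Return a 0/1 time series with a 1 at the sample nearest each unit offset."""
    n_samples = int(round(duration_s / step_s))
    predictor = np.zeros(n_samples)
    indices = np.round(np.asarray(offset_times_s) / step_s).astype(int)
    indices = indices[(indices >= 0) & (indices < n_samples)]  # drop out-of-range offsets
    predictor[indices] = 1.0
    return predictor


# Made-up offsets (in seconds) for a 1-second stretch of speech.
phoneme_offsets = [0.08, 0.21, 0.35, 0.52]
word_offsets = [0.21, 0.52]

# One column per linguistic level; rows are 10-ms samples.
features = np.column_stack([
    rate_predictor(phoneme_offsets, duration_s=1.0),
    rate_predictor(word_offsets, duration_s=1.0),
])
```

A matrix of this form is the usual input to an mTRF fit, with one impulse train per linguistic level.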
It is not entirely clear if the DNN model used in this study effectively serves the authors' goal of capturing different linguistic information at various layers. Specifically, the results presented in Figure 3C are somewhat confusing. While the phonemes are labeled, the syllables, words, phrases, and sentences are not, making it difficult to interpret how the model distinguishes between these levels of linguistic information. The claim that "Hidden-layer activity for same-vowel sentences exhibited much more similar distributions at the phoneme and syllable levels compared to those at the word, phrase and sentence levels" is not convincingly supported by the provided visualizations. To strengthen their argument, the authors should use more quantified metrics to demonstrate that the model indeed captures phrase, word, syllable, and phoneme information at different layers. This is a crucial prerequisite for the subsequent analyses and claims about the hierarchical processing of linguistic information in the brain.
Quantitative measures such as mutual information, clustering metrics, or decoding accuracy for each linguistic level could provide clearer evidence of the model's effectiveness in this regard.
In Figure 3C, we used color-coding to represent the activity of five hidden layers after dimensionality reduction. Each dot on the plot corresponds to one test sentence. Only phonemes are labeled because each syllable in our test sentences contains the same vowels (see Table S1). The results demonstrate that the phoneme layer effectively distinguishes different phonemes, while the higher linguistic layers do not. We believe these findings provide evidence that different layers capture distinct linguistic information. Additionally, we computed the correlation coefficients between each pair of linguistic predictors, as shown in Figure 3B. We think this analysis serves a similar purpose to computing the mutual information between pairs of hidden-layer activities for our constructed sentences. Furthermore, the mTRF results based on rate models of the linguistic features we presented earlier align closely with the regression results using the hidden-layer activity from our HM-LSTM model. This further supports the conclusion that our model successfully captures relevant information across these linguistic levels. We have added the clarification on p.5 of the manuscript.
The formulation of the regression analysis is somewhat unclear. The choice of sentence offsets as the anchor point for the temporal analysis, and the focus on the [-100ms, +300ms] interval, needs further justification. Since EEG measures underlying neural activity in near real-time, it is expected that lower-level acoustic information, which is relatively transient, such as phonemes and syllables, would be distributed throughout the time course of the entire sentence. It is not evident if this limited time window effectively captures the neural responses to the entire sentence, especially for lower-level linguistic features. A more comprehensive analysis covering the entire time course of the sentence, or at least a longer temporal window, would provide a clearer understanding of how different linguistic units are processed over time. Additionally, explaining the rationale behind choosing this specific time window and how it aligns with the temporal dynamics of speech processing would enhance the clarity and validity of the regression analysis.
Thank you for pointing this out. We chose this time window as lexical or phrasal processing typically occurs 200 ms after stimulus offsets (Bemis & Pylkkanen, 2011; Goldstein et al., 2022; Li et al., 2024; Li & Pylkkänen, 2021). Additionally, we included the -100 to 200 ms time period in our analysis to examine phoneme and syllable level processing (e.g., Gwilliams et al., 2022). Using the entire sentence duration was not feasible, as the sentences in the stimuli vary in length, making statistical analysis challenging. Additionally, since the stimuli consist of continuous speech, extending the time window would risk including linguistic units from subsequent sentences. This would introduce ambiguity as to whether the EEG responses correspond to the current or the following sentence. We have added this clarification on p.12 of the manuscript.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
As I mentioned, I think the OSF repo needs to be changed to give anyone access. I would recommend pursuing the lines of thought I mentioned in the public review to make this study complete and to allow it to fit into the already existing literature to facilitate comparisons.
Yes, the OSF folder is now public. We have made revisions following all reviewers’ suggestions.
There are some typos in figure labels, e.g. 2B.
Thank you for pointing it out! We have now corrected the typo in Figure 2B.
Reviewer #2 (Recommendations For The Authors):
(1) I was able to access all of the audio files and code for the study, but no EEG data was shared in the OSF repository. Unless there is some ethical and/or legal constraint, my understanding of eLife's policy is that the neural data should be made publicly available as well.
The preprocessed EEG data are available in .npy format in the OSF repository.
(2) The line-plots in Figures 4B, 5B, and 6B have very similar colours. They would be easier to interpret if you changed the line appearance as well as the colours. E.g., dotted line for hearing-impaired listeners, thick line for normal-hearing.
Thank you for the suggestion! We have now used thicker lines for normal-hearing listeners in all our line plots.
Reviewer #3 (Recommendations For The Authors):
(1) The authors may consider presenting raw event-related potentials (ERPs) or spatiotemporal response profiles before delving into the more complex regression encoding analysis. This would provide a clearer foundational understanding of the neural activity patterns. For example, it is not clear if the main claims, such as the neural activity in the normal-hearing group encoding phonetic information in attended speech better than in unattended speech, are directly observable. Showing ERP differences or spatiotemporal response pattern differences could support these claims more straightforwardly. Additionally, training pattern classifiers to test if different levels of information can be decoded from EEG activity in specific groups could provide further validation of the findings.
We have now included results from more traditional mTRF analyses using phoneme, syllable, word, phrase, and sentence rates as baseline models (see p.7 of the manuscript and Figure S3). The results show similar patterns to those observed in our current analyses. While we agree that classification analyses would be very interesting, our regression analyses have already demonstrated distinct EEG patterns for each linguistic level. Consequently, classification analyses would likely yield similar results unless a different method for representing linguistic information at these levels is employed. To the best of our knowledge, no other computational model currently exists that can simultaneously represent these linguistic levels.
(2) Is there any behavioral metric suggesting that these hearing-impaired participants do have deficits in comprehending long sentences? The self-rated intelligibility is useful, but cannot fully distinguish between perceiving lower-level phonetic information vs longer sentence comprehension.
In the current study, we included only self-rated intelligibility tests. We acknowledge that this approach might not fully distinguish between the perception of lower-level phonetic information and higher-level sentence comprehension. However, it remains unclear what type of behavioral test would effectively address this distinction. Furthermore, our primary aim was to use the behavioral results to demonstrate that our hearing-impaired listeners experienced speech comprehension difficulties in multi-talker environments, while relying on the EEG data to investigate comprehension challenges at various linguistic levels.
Minor:
(1) Page 2, second line in Introduction, "Phonemes occur over ..." should be lowercase.
According to APA format, the first word after the colon is capitalized if it begins a complete sentence (https://blog.apastyle.org/apastyle/2011/06/capitalization-after-colons.html). Here the sentence is a complete sentence so we used uppercase for “phonemes”.
(2) Page 8, second paragraph "...-100ms to 100ms relative to sentence onsets", should it be onsets or offsets?
This is a typo and it should be offsets. We have now corrected it.
References
Bemis, D. K., & Pylkkanen, L. (2011). Simple composition: An MEG investigation into the comprehension of minimal linguistic phrases. Journal of Neuroscience, 31(8), 2801–2814.
Gao, C., Li, J., Chen, J., & Huang, S. (2024). Measuring meaning composition in the human brain with composition scores from large language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 11295–11308). Association for Computational Linguistics.
Goldstein, A., Zada, Z., Buchnik, E., Schain, M., Price, A., Aubrey, B., Nastase, S. A., Feder, A., Emanuel, D., Cohen, A., Jansen, A., Gazula, H., Choe, G., Rao, A., Kim, C., Casto, C., Fanda, L., Doyle, W., Friedman, D., … Hasson, U. (2022). Shared computational principles for language processing in humans and deep language models. Nature Neuroscience, 25(3), Article 3.
Gwilliams, L., King, J.-R., Marantz, A., & Poeppel, D. (2022). Neural dynamics of phoneme sequences reveal position-invariant code for content and order. Nature Communications, 13(1), Article 1.
Huth, A. G., de Heer, W. A., Griffiths, T. L., Theunissen, F. E., & Gallant, J. L. (2016). Natural speech reveals the semantic maps that tile human cerebral cortex. Nature, 532(7600), 453–458.
Li, J., Lai, M., & Pylkkänen, L. (2024). Semantic composition in experimental and naturalistic paradigms. Imaging Neuroscience, 2, 1–17.
Li, J., & Pylkkänen, L. (2021). Disentangling semantic composition and semantic association in the left temporal lobe. Journal of Neuroscience, 41(30), 6526–6538.
Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods, 164(1), 177–190.
Schmitt, L.-M., Erb, J., Tune, S., Rysop, A. U., Hartwigsen, G., & Obleser, J. (2021). Predicting speech from a cortical hierarchy of event-based time scales. Science Advances, 7(49), eabi6070.
Schrimpf, M., Blank, I. A., Tuckute, G., Kauf, C., Hosseini, E. A., Kanwisher, N., Tenenbaum, J. B., & Fedorenko, E. (2021). The neural architecture of language: Integrative modeling converges on predictive processing. Proceedings of the National Academy of Sciences, 118(45), e2105646118.
Sugimoto, Y., Yoshida, R., Jeong, H., Koizumi, M., Brennan, J. R., & Oseki, Y. (2024). Localizing Syntactic Composition with Left-Corner Recurrent Neural Network Grammars. Neurobiology of Language, 5(1), 201–224.
-
www.researchsquare.com
-
eLife Assessment
This fundamental work has the potential to advance our understanding of brain activity using electrophysiological data, by proposing a completely new approach to reconstructing EEG data that challenges the assumptions typically made in the solutions to Maxwell's equations. However, the evidence supporting the approach is incomplete and requires further evaluations, in particular comparisons with existing, standard reconstruction methods. The work may be of broad interest to neuroscientists and neuroimaging researchers.
-
Reviewer #1 (Public review):
I want to reiterate my comment from the first round of reviews: that I am insufficiently familiar with the intricacies of Maxwell's equations to assess the validity of the assumptions and the equations being used by WETCOW. The work ideally needs assessing by someone more versed in that area, especially given the potential impact of this method if valid.
Effort has been made in these revisions to improve explanations of the proposed approach (a lot of new text has been added) and to add new simulations.
However, the authors have still not compared their method on real data with existing standard approaches for reconstructing data from sensor to physical space. Refusing to do so because existing approaches are deemed inappropriate (i.e. they "are solving a different problem") is illogical.
Similarly, refusing to compare their method with existing standard approaches for spatio-temporally describing brain activity, just because existing approaches are deemed inappropriate, is illogical.
For example, the authors say that "it's not even clear what one would compare [between the new method and standard approaches]". How about:
(1) Qualitatively: compare EEG activation maps. I.e. compare what you would report to a researcher about the brain activity found in a standard experimental task dataset (e.g. their gambling task). People simply want to be able to judge, at least qualitatively on the same data, what the most equivalent output would be from the two approaches. Note, both approaches do not need to be done at the same spatial resolution if there are constraints on this for the comparison to be useful.
and
(2) Quantitatively: compare the correlation scores between EEG activation maps and fMRI activation maps
The abstract claims that there is a "direct comparison with standard state-of-the-art EEG analysis in a well-established attention paradigm", but no actual comparison appears to have been completed in the paper.
-
Reviewer #2 (Public review):
Summary:
The manuscript claims to present a novel method for direct imaging of electric field networks from EEG data with higher spatiotemporal resolution than even fMRI. Validation of the EEG reconstructions with EEG/fMRI, EEG, and iEEG datasets is presented. Subsequently, reconstructions from a large EEG dataset of subjects performing a gambling task are presented.
Strengths:
If true and convincing, the proposed theoretical framework and reconstruction algorithm can revolutionise the use of EEG source reconstructions.
Weaknesses:
There is very little actual information in the paper about either the forward model or the novel method of reconstruction. Only citations to prior work by the authors are given with absolutely no benchmark comparisons, making the manuscript difficult to read and interpret in isolation to their prior body of work.
Comments on revisions:
This is a major rewrite of the paper. The authors have improved the discourse vastly. There is now a lot of didactics included but they are not always relevant to the paper. The section on Maxwell's equation does a disservice to the literature in prior work in bioelectromagnetism and does not even address the issues raised in classic text books by Plonsey et al. There is no logical "backwardness" in the literature. They are based on the relative values of constants in biological tissues. Several sections of the appendix discuss in terms of weather predictions and could just be written specifically for the problem here. There are reinventions of many standard ideas in terms of physics discourses, like Bayesian theory or PCA etc. I think that the paper remains quite opaque and many of the original criticisms remain, especially as they relate to multimodal datasets. The overall algorithm still remains poorly described. The comparisons to benchmark remain unaddressed and the authors state that they couldn't get Loreta to work and so aborted that. The figures are largely unaltered, although they have added a few more, and do not clearly depict the ideas. Again, no benchmark comparisons are provided to evaluate the results and the performance in comparison to other benchmarks.
-
Author response:
The following is the authors’ response to the current reviews.
Reviewer 1 (Public Review):
I want to reiterate my comment from the first round of reviews: that I am insufficiently familiar with the intricacies of Maxwell’s equations to assess the validity of the assumptions and the equations being used by WETCOW. The work ideally needs assessing by someone more versed in that area, especially given the potential impact of this method if valid.
We appreciate the reviewer’s candor. Unfortunately, familiarity with Maxwell’s equations is an essential prerequisite for assessing the veracity of our approach and our claims.
Effort has been made in these revisions to improve explanations of the proposed approach (a lot of new text has been added) and to add new simulations. However, the authors have still not compared their method on real data with existing standard approaches for reconstructing data from sensor to physical space. Refusing to do so because existing approaches are deemed inappropriate (i.e. they “are solving a different problem”) is illogical.
Without understanding the importance of our model for brain wave activity (cited in the paper) derived from Maxwell’s equations in inhomogeneous and anisotropic brain tissue, it is not possible to critically evaluate the fundamental difference between our method and the standard so-called “source localization” method which the Reviewer feels it is important to compare our results with. Our method is not “source localization” which is a class of techniques based on an inappropriate model for static brain activity (static dipoles sprinkled sparsely in user-defined areas of interest). Just because a method is “standard” does not make it correct. Rather, we are reconstructing a whole brain, time dependent electric field potential based upon a model for brain wave activity derived from first principles. It is comparing two methods that are “solving different problems” that is, by definition, illogical.
Similarly, refusing to compare their method with existing standard approaches for spatio-temporally describing brain activity, just because existing approaches are deemed inappropriate, is illogical.
Contrary to the Reviewer’s assertion, we do compare our results with three existing methods for describing spatiotemporal variations of brain activity.
First, Figures 1, 2, and 6 compare the spatiotemporal variations in brain activity between our method and fMRI, the recognized standard for spatiotemporal localization of brain activity. The statistical comparison in Fig 3 is a quantitative demonstration of the similarity of the activation patterns. It is important to note that these data are simultaneous EEG/fMRI in order to eliminate a variety of potential confounds related to differences in experimental conditions.
Second, Fig 4 (A-D) compares our method with the most reasonable “standard” spatiotemporal localization method for EEG: mapping of fields in the outer cortical regions of the brain detected at the surface electrodes to the surface of the skull. The consistency of both the location and sign of the activity changes detected by both methods in a “standard” attention paradigm is clearly evident. Further confirmation is provided by comparison of our results with simultaneous EEG/fMRI spatial reconstructions (E-F) where the consistency of our reconstructions between subjects is shown in Fig 5.
Third, measurements from intra-cranial electrodes, the most direct method for validation, are compared with spatiotemporal estimates derived from surface electrodes and shown to be highly correlated.
For example, the authors say that “it’s not even clear what one would compare [between the new method and standard approaches]”. How about:
(1) Qualitatively: compare EEG activation maps. I.e. compare what you would report to a researcher about the brain activity found in a standard experimental task dataset (e.g. their gambling task). People simply want to be able to judge, at least qualitatively on the same data, what the most equivalent output would be from the two approaches. Note, both approaches do not need to be done at the same spatial resolution if there are constraints on this for the comparison to be useful.
(2) Quantitatively: compare the correlation scores between EEG activation maps and fMRI activation maps
These comparisons were performed and are already in the paper.
(1) Fig 4 compares the results with a standard attention paradigm (data and interpretation from Co-author Dr Martinez, who is an expert in both EEG and attention). Additionally, Fig 12 shows detected regions of increased activity in a well-known brain circuit from an experimental task (‘reward’) with data provided by Co-author Dr Krigolson, an expert in reward circuitry.
(2) Correlation scores between EEG and fMRI are shown in Fig 3.
(3) Very high correlation between the directly measured field from intra-cranial electrodes in an epilepsy patient and those estimated from only the surface electrodes is shown in Fig 9.
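For readers unfamiliar with how a correlation score between two activation maps is typically obtained, a generic sketch follows. It computes a Pearson correlation over the voxels inside a mask. This is an assumed, textbook-style calculation, not necessarily the statistic used for Fig 3 or Fig 9; the function `map_correlation` and the synthetic maps are hypothetical.

```python
import numpy as np


def map_correlation(map_a, map_b, mask=None):
    """Pearson correlation between two same-shaped maps, restricted to a boolean mask."""
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("maps must have the same shape")
    if mask is None:
        mask = np.ones(a.shape, dtype=bool)
    a = a[mask] - a[mask].mean()
    b = b[mask] - b[mask].mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))


# Synthetic example: a second map that is a noisy copy of the first.
rng = np.random.default_rng(0)
eeg_map = rng.standard_normal((8, 8, 8))
fmri_map = eeg_map + 0.5 * rng.standard_normal((8, 8, 8))
brain_mask = np.zeros((8, 8, 8), dtype=bool)
brain_mask[2:6, 2:6, 2:6] = True  # placeholder region standing in for a brain mask

score = map_correlation(eeg_map, fmri_map, mask=brain_mask)
```

Both maps must first be resampled to a common grid, which is one reason the spatial resolution of the two approaches need not match for such a comparison to be informative.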
There are an awful lot of typos in the new text in the paper. I would expect a paper to have been proof read before submitting.
We have cleaned up the typos.
The abstract claims that there is a “direct comparison with standard state-of-the-art EEG analysis in a well-established attention paradigm”, but no actual comparison appears to have been completed in the paper.
On the contrary, as mentioned above, Fig 4 compares the results of our method with the state-of-the-art surface spatial mapping analysis, with the state-of-the-art time-frequency analysis, and with the state-of-the-art fMRI analysis.
Reviewer 2 (Public Review):
This is a major rewrite of the paper. The authors have improved the discourse vastly.
There is now a lot of didactics included but they are not always relevant to the paper.
The technique described in the paper does in fact leverage several novel methods we have developed over the years for analyzing multimodal space-time imaging data. Each of these techniques has been described in detail in separate publications cited in the current paper. However, the Reviewers’ criticisms stated that the methods were non-standard and they were unfamiliar with them. In lieu of the Reviewers’ reading the original publications, we added a significant amount of text indeed intended to be didactic. However, we can assure the Reviewer that nothing presented was irrelevant to the paper. We certainly had no desire to make the paper any longer than it needed to be.
The section on Maxwell’s equation does a disservice to the literature in prior work in bioelectromagnetism and does not even address the issues raised in classic text books by Plonsey et al. There is no logical “backwardness” in the literature. They are based on the relative values of constants in biological tissues.
This criticism highlights the crux of our paper. Contrary to the assertion that we have ignored the work of Plonsey, we have referenced it in the new additional text detailing how we have constructed Maxwell’s Equations appropriate for brain tissue, based on the model suggested by Plonsey that allows the magnetic field temporal variations to be ignored but not the time dependence of the electric fields.
However, the assumption ubiquitous in the vast prior literature of bioelectricity in the brain that the electric field dynamics can be “based on the relative values of constants in biological tissues”, as the Reviewer correctly summarizes, is precisely the problem. Using relative average tissue properties does not take into account the tissue anisotropy necessary to properly account for correct expressions for the electric fields. As our prior publications have demonstrated in detail, taking into account the inhomogeneity and anisotropy of brain tissue in the solution to Maxwell’s Equations is necessary for properly characterizing brain electrical fields, and serves as the foundation of our brain wave theory. This led to the discovery of a new class of brain waves (weakly evanescent transverse cortical waves, WETCOW).
It is this brain wave model that is used to estimate the dynamic electric field potential from the measurements made by the EEG electrode array. The standard model that ignores these tissue details leads to the ubiquitous “quasi-static approximation” that leads to the conclusion that the EEG signal cannot be spatially reconstructed. It is indeed this critical gap in the existing literature that is the central new idea in the paper.
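For orientation, the textbook form of the quasi-static argument under discussion can be written out as below. This is a generic sketch in a source-free region with Ohmic conduction, using assumed notation ($\boldsymbol{\sigma}$ for the conductivity tensor, $\boldsymbol{\varepsilon}$ for the permittivity tensor, $\phi$ for the electric potential, $\omega$ for angular frequency). It is not the WETCOW derivation, which is given in the authors' cited publications.

```latex
% Ampere-Maxwell law with Ohmic current in an inhomogeneous, anisotropic medium
\nabla \times \mathbf{H}
  = \boldsymbol{\sigma}\,\mathbf{E}
  + \frac{\partial}{\partial t}\bigl(\boldsymbol{\varepsilon}\,\mathbf{E}\bigr)

% Taking the divergence (the divergence of a curl vanishes)
\nabla \cdot \bigl(\boldsymbol{\sigma}\,\mathbf{E}\bigr)
  + \frac{\partial}{\partial t}\,
    \nabla \cdot \bigl(\boldsymbol{\varepsilon}\,\mathbf{E}\bigr) = 0

% Neglecting magnetic induction, \mathbf{E} = -\nabla\phi
\nabla \cdot \bigl(\boldsymbol{\sigma}\,\nabla\phi\bigr)
  + \frac{\partial}{\partial t}\,
    \nabla \cdot \bigl(\boldsymbol{\varepsilon}\,\nabla\phi\bigr) = 0

% Quasi-static approximation: for scalar, homogeneous \sigma and \varepsilon
% with \omega\varepsilon/\sigma \ll 1, the time-derivative term is dropped
\nabla^{2}\phi = 0
```

The disagreement between the authors and Reviewer 2 concerns whether the time-derivative term can still be dropped once $\boldsymbol{\sigma}$ and $\boldsymbol{\varepsilon}$ vary in space and with direction, rather than being replaced by average tissue constants.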
There are reinventions of many standard ideas in terms of physics discourses, like Bayesian theory or PCA etc.
The discussion of Bayesian theory and PCA is in response to the Reviewer complaint that they were unfamiliar with our entropy field decomposition (EFD) method and the request that we compare it with other “standard” methods. Again, we have published extensively on this method (as referenced in the manuscript) and therefore felt that extensive elaboration was unnecessary. Having been asked to provide such elaboration and then being pilloried for it therefore feels somewhat inappropriate in our view. This is particularly disappointing as the Reviewer claims we are presenting “standard” ideas when in fact the EFD is a new general framework we developed to overcome the deficiencies in standard “statistical” and probabilistic data analysis methods that are insufficient for characterizing non-linear, nonperiodic, interacting fields that are the rule, rather than the exception, in complex dynamical systems, such as brain electric fields (or weather, or oceans, or ...).
The EFD is indeed a Bayesian framework, as this is the fundamental starting point for probability theory, but it is developed in a unique and more general fashion than previous data analysis methods. (Again, this is detailed in several references in the paper’s bibliography. The Reviewers requested that an explanation be included in the present paper, however, so we did so.) First, Bayes Theorem is expressed in terms of a field theory that allows an arbitrary number of field orders and coupling terms. This generality comes with a penalty, which is that it’s unclear how to assess the significance of the essentially infinite number of terms. The second feature is the introduction of a method by which to determine the significant number of terms automatically from the data itself, via our theory of entropy spectrum pathways (ESP), which is also detailed in a cited publication, and which produces ranked spatiotemporal modes from the data. Rather than being “reinventions of many standard ideas” these are novel theoretical and computational methods that are central to the EEG reconstruction method presented in the paper.
I think that the paper remains quite opaque and many of the original criticisms remain, especially as they relate to multimodal datasets. The overall algorithm still remains poorly described.
It’s not clear how to assess the criticisms that the algorithm is poorly described yet there is too much detail provided that is mistakenly assessed as “standard”. Certainly the central wave equations that are estimated from the data are precisely described, so it’s not clear exactly what the Reviewer is referring to.
The comparisons to benchmark remain unaddressed and the authors state that they couldn’t get Loreta to work and so aborted that. The figures are largely unaltered, although they have added a few more, and do not clearly depict the ideas. Again, no benchmark comparisons are provided to evaluate the results and the performance in comparison to other benchmarks.
As we have tried to emphasize in the paper, and in the Response to Reviewers, the standard so-called “source localization” methods are NOT a benchmark, as they are solving an inappropriate model for brain activity. Once again, static dipole “sources” arbitrarily sprinkled on pre-defined regions of interest bear little resemblance to observed brain waves, nor to the dynamic electric field wave equations produced by our brain wave theory derived from a proper solution to Maxwell’s equations in the anisotropic and inhomogeneous complex morphology of the brain.
The comparison with Loreta was not abandoned because we couldn’t get it to work, but because we could not get it to run under conditions that were remotely similar to whole brain activity described by our theory, or, more importantly, by a rational theory of dynamic brain activity that might reproduce the exceedingly complex electric field activity observed in numerous neuroscience experiments.
We take issue with the rather dismissive mention of “a few more” figures that “do not clearly depict the idea” when in fact the figures that have been added have demonstrated additional quantitative validation of the method.
The following is the authors’ response to the original reviews.
Public Reviews:
Reviewer 1 (Public Review):
The paper proposes a new source reconstruction method for electroencephalography (EEG) data and claims that it can provide far superior spatial resolution than existing approaches and also superior spatial resolution to fMRI. This primarily stems from abandoning the established quasi-static approximation to Maxwell’s equations.
The proposed method brings together some very interesting ideas, and the potential impact is high. However, the work does not provide the evaluations expected when validating a new source reconstruction approach. I cannot judge the success or impact of the approach based on the current set of results. This is very important to rectify, especially given that the work is challenging some long-standing and fundamental assumptions made in the field.
We appreciate the Reviewer’s efforts in reviewing this paper and have included a significant amount of new text to address their concerns.
I also find the description of the methods, and how they link to what is shown in the main results, hard to follow.
We have added significantly more detail on the methods, including more accessible explanations of the technical details, and schematic diagrams to visualize the key processing components.
I am insufficiently familiar with the intricacies of Maxwell’s equations to assess the validity of the assumptions and the equations being used by WETCOW. The work therefore needs assessing by someone more versed in that area. That said, how do we know that the new terms in Maxwell’s equations, i.e. the time-dependent terms that are normally missing from established quasi-static-based approaches, are large enough to need to be considered? Where is the evidence for this?
The fact that the time-dependent terms are large enough to be considered is essentially the entire focus of the original papers [7,8]. Time-dependent terms in Maxwell’s equations are generally not important for brain electrodynamics at physiological frequencies for homogeneous tissues, but this is not true for areas with strong inhomogeneity and anisotropy.
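For orientation, the textbook form of the quasi-static criterion at issue in this exchange can be written as follows. This is the standard argument for a source-free region with scalar conductivity and permittivity, not the authors' WETCOW derivation. The symbols σ (conductivity), ε (permittivity), ω (angular frequency) and φ (electric potential) are introduced here for illustration, and a time dependence of e^{iωt} is assumed.

```latex
\nabla \cdot \left[ \left( \sigma + i\omega\varepsilon \right) \nabla \phi \right] = 0 ,
\qquad
\frac{\omega\varepsilon}{\sigma} \ll 1
\;\;\Longrightarrow\;\;
\nabla \cdot \left( \sigma \nabla \phi \right) = 0 .
```

The response above argues that dropping the time-dependent term in this way is not justified in regions of strong inhomogeneity and anisotropy, where σ and ε become spatially varying tensors.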
I have not come across EFD, and I am not sure many in the EEG field will have. To require the reader to appreciate the contributions of WETCOW only through the lens of the unfamiliar (and far from trivial) approach of EFD is frustrating. In particular, what impact do the assumptions of WETCOW make compared to the assumptions of EFD on the overall performance of SPECTRE?
We have added an entire new section in the Appendix that provides a very basic introduction to EFD and relates it to more commonly known methods, such as Fourier and Independent Components Analyses.
The paper needs to provide results showing the improvements obtained when WETCOW or EFD are combined with more established and familiar approaches. For example, EFD can be replaced by a first-order vector autoregressive (VAR) model, i.e. y<sub>t</sub> = Ay<sub>t−1</sub> + e<sub>t</sub> (where y<sub>t</sub> is [num<sub>gridpoints</sub> ∗ 1] and A is [num<sub>gridpoints</sub> ∗ num<sub>gridpoints</sub>] of autoregressive parameters).
The development of EFD, which is independent of WETCOW, stemmed from the necessity of developing a general method for the probabilistic analysis of finitely sampled non-linear interacting fields, which are ubiquitous in measurements of physical systems, of which functional neuroimaging data (fMRI, EEG) are excellent examples. Standard methods (such as VAR) are inadequate in such cases, as discussed in great detail in our EFD publications (e.g., [12,37]). The new appendix on EFD reviews these arguments. It does not make sense to compare EFD with methods which are inappropriate for the data.
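For reference, the first-order VAR model named in the reviewer's comment, y<sub>t</sub> = Ay<sub>t−1</sub> + e<sub>t</sub>, can be sketched as below. This is a generic least-squares illustration under assumed settings (three grid points, Gaussian noise, a made-up coefficient matrix `A_true`), not code from the paper or from SPECTRE; the function names are hypothetical.

```python
import numpy as np


def simulate_var1(A, n_steps, noise_std=0.1, seed=0):
    """Simulate y_t = A @ y_{t-1} + e_t with Gaussian noise e_t."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    y = np.zeros((n_steps, n))
    for t in range(1, n_steps):
        y[t] = A @ y[t - 1] + noise_std * rng.standard_normal(n)
    return y


def fit_var1(y):
    """Estimate A by ordinary least squares from a (time, gridpoints) array."""
    past, future = y[:-1], y[1:]
    # Solve past @ A.T ~= future in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return A_T.T


# Hypothetical stable coefficient matrix (all eigenvalues inside the unit circle).
A_true = np.array([[0.5, 0.1, 0.0],
                   [0.0, 0.4, 0.2],
                   [0.1, 0.0, 0.3]])
y = simulate_var1(A_true, n_steps=5000)
A_hat = fit_var1(y)
print(np.round(A_hat, 2))
```

Note that A has one entry per pair of grid points, so at the spatial resolutions discussed in this exchange the matrix would be far larger than in this toy example.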
The authors’ decision not to include any comparisons with established source reconstruction approaches does not make sense to me. They attempt to justify this by saying that the spatial resolution of LORETA would need to be very low compared to the resolution being used in SPECTRE, to avoid compute problems. But how does this stop them from using a spatial resolution typically used by the field that has no compute problems, and comparing with that? This would be very informative. There are also more computationally efficient methods than LORETA that are very popular, such as beamforming or minimum norm.
The primary reason for not comparing with ’source reconstruction’ (SR) methods is that we are not doing source reconstruction. Our view of brain activity is that it involves continuous dynamical non-linear interacting fields throughout the entire brain. Formulating EEG analysis in terms of reconstructing sources is, in our view, like asking ’what are the point sources of a sea of ocean waves’. It’s just not an appropriate physical model. A pre-chosen limited distribution of static dipoles is just a very bad model for brain activity, so much so that it’s not even clear what one would compare. In our view, as manifest in our computational implementation, one needs to have a very high density of computational locations throughout the entire brain, including white matter, and the reconstructed modes are waves whose extent can be across the entire brain. Our comments about the low resolution of computational methods for SR techniques really express the more overarching concern that they are not capable of, or even designed for, detecting time-dependent fields of non-linear interacting waves that exist everywhere throughout the brain. Moreover, the SR methods always give some answer, but in our view the initial conditions upon which those methods are based (pre-selected regions of activity with a pre-selected number of ’sources’) are a highly influential but artificial set of strong computational constraints that will almost always provide an answer consistent with (i.e., biased toward) the expectations of the person formulating the problem, and is therefore potentially misleading.
In short, something like the following methods needs to be compared:
(1) Full SPECTRE (EFD plus WETCOW)
(2) WETCOW + VAR or standard (“simple regression”) techniques
(3) Beamformer/min norm plus EFD
(4) Beamformer/min norm plus VAR or standard (“simple regression”) techniques
The reason that no one has previously ever been able to solve the EEG inverse problem is due to the ubiquitous use of methods that are too ’simple’, i.e., are poor physical models of brain activity. We have spent a decade carefully elucidating the details of this statement in numerous highly technical and careful publications. It therefore serves no purpose to return to the use of these ’simple’ methods for comparison. We do agree, however, that a clearer overview of the advantages of our methods is warranted and have added significant additional text in this revision towards that purpose.
This would also allow for more illuminating and quantitative comparisons of the real data. For example, a metric of similarity between EEG maps and fMRI can be computed to compare the performance of these methods. At the moment, the fMRI-EEG analysis amounts to just showing fairly similar maps.
We disagree with this assessment. The correlation coefficient between the spatially localized activation maps is a conservative sufficient statistic for the measure of statistically significant similarity. These numbers were/are reported in the caption to Figure 5, and have now also been moved to, and highlighted in, the main text.
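As a generic illustration of the kind of statistic referred to here, a Pearson correlation coefficient between two spatial maps on the same voxel grid can be computed as sketched below. The response does not state exactly how the reported coefficients were computed, so the function name `map_correlation`, the optional region mask, and the synthetic maps are all assumptions made for this example.

```python
import numpy as np


def map_correlation(map_a, map_b, mask=None):
    """Pearson correlation between two maps defined on the same grid.

    `mask` is an optional boolean array selecting the voxels to compare
    (for example, one atlas region); by default all voxels are used.
    """
    a = np.asarray(map_a, dtype=float).ravel()
    b = np.asarray(map_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("maps must have the same number of voxels")
    if mask is not None:
        keep = np.asarray(mask, dtype=bool).ravel()
        a, b = a[keep], b[keep]
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-ins for two activation maps on a small 3-D grid.
rng = np.random.default_rng(0)
map_one = rng.standard_normal((8, 8, 8))
map_two = map_one + 0.5 * rng.standard_normal((8, 8, 8))  # noisy copy
r = map_correlation(map_one, map_two)
print(f"r = {r:.2f}")
```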
There are no results provided on simulated data. Simulations are needed to provide quantitative comparisons of the different methods, to show face validity, and to demonstrate unequivocally the new information that SPECTRE can ’potentially’ provide on real data compared to established methods. The paper ideally needs at least 3 types of simulations, where one thing is changed at a time, e.g.:
(1) Data simulated using WETCOW plus EFD assumptions
(2) Data simulated using WETCOW plus e.g. VAR assumptions
(3) Data simulated using standard lead fields (based on the quasi-static Maxwell solutions) plus e.g. VAR assumptions
These should be assessed with the multiple methods specified earlier. Crucially the assessment should be quantitative, showing the ability to recover the ground truth over multiple realisations of realistic noise. This type of assessment of a new source reconstruction method is the expected standard.
We have now provided results on simulated data, along with a discussion on what entails a meaningful simulation comparison. In short, our original paper on the WETCOW theory included a significant number of simulations of predicted results on several spatial and temporal scales. The most relevant simulation data to compare with the SPECTRE imaging results are the cortical wave loop predicted by WETCOW theory and demonstrated via numerical simulation in a realistic brain model derived from high resolution anatomical (HRA) MRI data. The most relevant data with which to compare these simulations are the SPECTRE reconstruction from the data that provides the closest approximation to a “Gold Standard” - reconstructions from intra-cranial EEG (iEEG). We have now included results (new Fig 8) that demonstrate the ability of SPECTRE to reconstruct dynamically evolving cortical wave loops in iEEG data acquired in an epilepsy patient that match the loop predicted theoretically by WETCOW and demonstrated in realistic numerical simulations.
The suggested comparison with simple regression techniques serves no purpose, as stated above, since that class of analysis techniques was not designed for non-linear, non-Gaussian, coupled interacting fields predicted by the WETCOW model. The explication of this statement is provided in great detail in our publications on the EFD approach and in the new appendix material provided in this revision. The suggested simulation of the dipole (i.e., quasi-static) model of brain activity also serves no purpose, as our WETCOW papers have demonstrated in great detail that it is not a reasonable model for dynamic brain activity.
Reviewer 2 (Public Review):
Strengths:
If true and convincing, the proposed theoretical framework and reconstruction algorithm can revolutionize the use of EEG source reconstructions.
Weaknesses:
There is very little actual information in the paper about either the forward model or the novel method of reconstruction. Only citations to prior work by the authors are cited with absolutely no benchmark comparisons, making the manuscript difficult to read and interpret in isolation from their prior body of work.
We have now added a significant amount of material detailing the forward model, our solution to the inverse problem, and the method of reconstruction, in order to remedy this deficit in the previous version of the paper.
Recommendations for the authors:
Reviewer 1 (Recommendations):
It is not at all clear from the main text (section 3.1) and the caption, what is being shown in the activity patterns in Figures 1 and 2. What frequency bands and time points etc? How are the values shown in the figures calculated from the equations in the methods?
We have added detailed information on the frequency bands reconstructed and the activity pattern generation and meaning. Additional information on the simultaneous EEG/fMRI acquisition details has been added to the Appendix.
How have the activity maps been thresholded? Where are the color bars in Figures 1 and 2?
We have now included that information in new versions of the figures. In addition, the quantitative comparison between fMRI and EEG is now presented in a new Figure 2 (now Figure 3).
P30 “This term is ignored in the current paper”. Why is this term ignored, but other (time-dependent) terms are not?
These terms are ignored because they represent higher order terms that complicate the processing (and interpretation) but do not substantially change the main results. A note to this effect has been added to the text.
The concepts and equations in the EFD section are not very accessible (e.g. to someone unfamiliar with IFT).
We have added a lengthy general and more accessible description of the EFD method in the Appendix.
Variables in equation 1, and the following equation, are not always defined in a clear, accessible manner. What is
?
We have added additional information on how Eqn 1 (now Eqn 3) is derived, and the variables therein.
In the EFD section, what do you mean conceptually by α, i.e. “the coupled parameters α”?
This sentence has been eliminated, as it was superfluous and confusing.
How are the EFD and WETCOW sections linked mathematically? What is ψ (in eqn 2) linked to in the WETCOW section (presumably ϕ<sub>ω</sub>?) ?
We have added more introductory detail at the beginning of the Results to describe the WETCOW theory and how this is related to the inverse problem for EEG.
What is the difference between data d and signal s in section 6.1.3? How are they related?
We have added a much more detailed Appendix A where this (and other) details are provided.
What assumptions have been made to get the form for the information Hamiltonian in eqn3?
Eq 3 (now Eqn A.5) is actually very general. The approximations come in when constructing the interaction Hamiltonian H<sub>i</sub>.
P33 “using coupling between different spatio-temporal points that is available from the data itself” I do not understand what is meant by this.
This was a poorly worded sentence, but this section has now been replaced by Appendix A, which now contains the sentence that prior information “is contained within the data itself”. This refers to the fact that the prior information consists of correlations in the data, rather than some other measurements independent of the original data. This point is emphasized because in many Bayesian applications, prior information consists of knowledge of some quantities that were acquired independently from the data at hand (e.g., mean values from previous experiments).
Reviewer 2 (Recommendations):
Abstract
The first part presents validation from simultaneous EEG/fMRI data, iEEG data, and comparisons with standard EEG analyses of an attention paradigm. Exactly what constitutes adequate validation or what metrics were used to assess performance is surprisingly absent.
Subsequently, the manuscript examines a large cohort of subjects performing a gambling task and engaging in reward circuits. The claim is that this method offers an alternative to fMRI.
Introduction
Provocative statements require strong backing and evidence. In the first paragraph, the “quasi-static” assumption which is dominant in the field of EEG and MEG imaging is questioned with some classic citations that support this assumption. Instead of delving into why exactly the assumption cannot be relaxed, the authors claim that because the assumption was proved with average tissue properties rather than exact, it is wrong. This does not make sense. Citations to the WETCOW papers are insufficient to question the quasi-static assumption.
The introduction purports to validate a novel theory and inverse modeling method but poorly outlines the exact foundations of both the theory (WETCOW) and the inverse modeling (SPECTRE) work.
We have added a new introductory subsection (“A physical theory of brain waves”) to the Results section that provides a brief overview of the foundations of the WETCOW theory and an explicit description of why the quasi-static approximation can be abandoned. We have expanded the subsequent subsection (“Solution to the inverse EEG problem”) to more clearly detail the inverse modeling (SPECTRE) method.
Section 3.2 Validation with fMRI
Figure 1 supposedly is a validation of this promising novel theoretical approach that defies the existing body of literature in this field. Shockingly, a single subject data is shown in a qualitative manner with absolutely no quantitative comparison anywhere to be found in the manuscript. While there are similarities, there are also differences in reconstructions. What to make out of these discrepancies? Are there distortions that may occur with SPECTRE reconstructions? What are its tradeoffs? How does it deal with noise in the data?
It is certainly not the case that there are no quantitative comparisons. Correlation coefficients, which are the sufficient statistics for comparison of activation regions, are given in Figure 5 for very specific activation regions. Figure 9 (now Figure 11) shows a t-statistic demonstrating the very high significance of the comparison between multiple subjects. And we have now added a new Figure 7 demonstrating the strongly correlated estimates for full vs surface intra-cranial EEG reconstructions. To make this more clear, we have added a new section “Statistical Significance of the Results”.
We note that a discussion of the discrepancies between fMRI and EEG was already presented in the Supplementary Material. Therein we discuss the main point that fMRI and EEG are measuring different physical quantities and so should not be expected to be identical. We also highlight the fact that fMRI is prone to significant geometrical distortions from magnetic field inhomogeneities, and to physiological noise. To provide more visibility for this important issue, we have moved this text into the Discussion section.
We do note that geometric distortions in fMRI data due to suboptimal acquisitions and corrections are all too common. This, coupled with the paucity of open source simultaneous fMRI-EEG data, made it difficult to find good data for comparison. The data on which we performed the quantitative statistical comparison between fMRI and EEG (Fig 5) was collected by co-author Dr Martinez, and was of the highest quality and therefore sufficient for comparison. The data used in Fig 1 and 2 was a well publicized open source dataset but had significant fMRI distortions that made quantitative comparison (i.e., correlation coefficients between subregions in the Harvard-Oxford atlas) suboptimal. Nevertheless, we wanted to demonstrate the method in more than one source, and feel that visual similarity is a reasonable measure for this data.
Figure 2 Are the sample slices being shown? How to address discrepancies? How to assume that these are validations when there are such a level of discrepancies?
It’s not clear what “sample slices” means. The issue of discrepancies is addressed in the response to the previous query.
Figure 3 Similar arguments can be made for Figure 3. Here too, a comparison with source localization benchmarks is warranted because many papers have examined similar attention data.
Regarding the fMRI/EEG comparison, these data are compared quantitatively in the text and in Figure 5.
Regarding the suggestion to perform standard ’source localization’ analysis, see responses to Reviewer 1.
Figure 4 While there is consistency across 5 subjects, there are also subtle and not-so-subtle differences. What to make out of them?
Discrepancies in activation patterns between individuals are a complex neuroscience question that we feel is well beyond the scope of this paper.
Figures 5 & 6 Figure 5 is also a qualitative figure from two subjects with no appropriate quantification of results across subjects. The same is true for Figure 6.
On the contrary, Figure 5 contains a quantitative comparison, which is now also described in the text. A quantitative comparison for the epilepsy data in Fig 6 (and C.4-C.6) is now shown in Fig 7.
Given the absence of appropriate “validation” of the proposed model and method, it is unclear how much one can trust results in Section 4.
We believe that the quantitative comparisons extant in the original text (and apparently missed by the Reviewer) along with the additional quantitative comparisons are sufficient to merit trust in Section 4.
What are the thresholds used in maps for Figure 7? Was correction for multiple comparisons performed? The final arguments at the end of section 4 do not make sense. Is the claim that all results of reconstructions from SPECTRE shown here are significant with no reason for multiple comparison corrections to control for false positives? Why so?
We agree that the last line in Section 4 is misleading and have removed it.
Discussion is woefully inadequate in addition to the inconclusive findings presented here.
We have added a significant amount of text to the Discussion to address the points brought up by the Reviewer. And, contrary to the comments of this Reviewer, we believe the statistically significant results presented are not “inconclusive”.
Supplementary Materials
This reviewer had an incredibly difficult time understanding the inverse model solution. Even though this has been described in a prior publication by the authors, it is important and imperative that all details be provided here to make the current manuscript complete. The notation itself is so nonstandard. What is Σ<sup>ij</sup>, δ<sup>ij</sup>? Where is the reference for equation (1)? What about the equation for <sup>ˆ</sup>(R)? There are very few details provided on the exact implementation details for the Fourier-space pseudo-spectral approach. What are the dimensions of the problem involved? How were different tissue compartments etc. handled? Equation 1 holds for the entire volume but the measurements are only made on the surface. How was this handled? What is the WETCOW brain wave model? I don’t see any entropy term defined anywhere - where is it?
We have added more detail on the theoretical and numerical aspects of the inverse problem in two new subsections “Theory” and “Numerical Implementation” in the new section “Solution to the inverse EEG problem”.
So, how can one understand even at a high conceptual level what is being done with SPECTRE?
We have added a new subsection “Summary of SPECTRE” that provides a high conceptual level overview of the SPECTRE method outlined in the preceding sections.
In order to understand what was being presented here, it required the reader to go on a tour of the many publications by the authors where the difficulty in understanding what they actually did in terms of inverse modeling remains highly obscure and presents a huge problem for replicability or reproducibility of the current work.
We have now included more basic material from our previous papers, and simplified the presentation to be more accessible. In particular, we have now moved the key aspects of the theoretic and numerical methods, in a more readable form, from the Supplementary Material to the main text, and added a new Appendix that provides a more intuitive and accessible overview of our estimation procedures.
How were conductivity values for different tissue types assigned? Is there an assumption that the conductivity tensor is the same as the diffusion tensor? What does it mean that “in the present study only HRA data were used in the estimation procedure?” Does that mean that diffusion MRI data was not used? What is SYMREG? If this refers to the MRM paper from the authors in 2018, that paper does not include EEG data at all. So, things are unclear here.
The conductivity tensor is not exactly the same as the diffusion tensor in brain tissues, but they are closely related. While both tensors describe transport properties in brain tissue, they represent different physical processes. The conductivity tensor is often assumed to share the same eigenvectors as the diffusion tensor. There is a strong linear relationship between the conductivity and diffusion tensor eigenvalues, as supported by theoretical models and experimental measurements. For the current study we only used the anatomical data for estimation and assignment of different tissue types and no diffusion MRI data was used. To register between different modalities, including MNI, HRA, functional MRI, etc., and to transform the tissue assignment into an appropriate space we used the SYMREG registration method. A comment to this effect has been added to the text.
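The relationship described above (shared eigenvectors, linearly related eigenvalues) can be sketched as follows. This is only an illustration of that general assumption, not part of the processing used in the study, which per the response used no diffusion MRI data. The function name, the example diffusion tensor, and the value of `scale` are hypothetical placeholders rather than measured quantities.

```python
import numpy as np


def conductivity_from_diffusion(D, scale):
    """Build a conductivity tensor that shares eigenvectors with D.

    Assumes a purely proportional eigenvalue map, sigma_k = scale * d_k.
    `scale` is a placeholder; a real value would come from the literature
    or from calibration, and an affine map would add an intercept term.
    """
    D = np.asarray(D, dtype=float)
    if D.shape != (3, 3) or not np.allclose(D, D.T):
        raise ValueError("D must be a symmetric 3x3 tensor")
    d_vals, vecs = np.linalg.eigh(D)      # eigen-decomposition of D
    sigma_vals = scale * d_vals           # linear eigenvalue relationship
    return vecs @ np.diag(sigma_vals) @ vecs.T


# Hypothetical anisotropic diffusion tensor (arbitrary units).
D_example = np.array([[1.7, 0.1, 0.0],
                      [0.1, 0.4, 0.0],
                      [0.0, 0.0, 0.3]])
sigma_example = conductivity_from_diffusion(D_example, scale=0.8)
print(np.round(sigma_example, 3))
```

With a purely proportional map the result equals `scale * D`; the explicit decomposition is kept so that a different eigenvalue mapping can be substituted in one place.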
How can reconstructed volumetric time-series of potential be thought of as the EM equivalent of an fMRI dataset? This sentence doesn’t make sense.
This sentence indeed did not make sense and has been removed.
Typical Bayesian inference does not include entropy terms, and entropy estimation doesn’t always lend to computing full posterior distributions. What is an “entropy spectrum pathway”? What is µ∗? Why can’t things be made clear to the reader, instead of incredible jargon used here? How does section 6.1.2 relate back to the previous section?
It is correct that Bayesian inference typically does not include entropy terms. We believe that their introduction via the theory of entropy spectrum pathways (ESP) is a significant advance in Bayesian estimation as it provides highly relevant prior information from within the data itself (and therefore always available in spatiotemporal data) that facilitates a practical methodology for the analysis of complex non-linear dynamical systems, as contained in the entropy field decomposition (EFD).
Section 6.1.3 has now been replaced by a new Appendix A that discusses ESP in a much more intuitive and conceptual manner.
Section 6.1.3 describes entropy field decomposition in very general terms. What is “non-period”? This section is incomprehensible. Without reference to exactly where in the process this procedure is deployed it is extremely difficult to follow. There seems to be an abuse of notation of using ϕ for eigenvectors in equation (5) and potentials earlier. How do equations 9-11 relate back to the original problem being solved in section 6.1.1? What are multiple modalities being described here that require JESTER?
Section 6.1.3 has now been replaced by a new Appendix A that covers this material in a much more intuitive and conceptual manner.
Section 6.3 discusses source localization methods. While most forward lead-field models assume quasistatic approximations to Maxwell’s equations, these are perfectly valid for the frequency content of brain activity being measured with EEG or MEG. Even with quasi-static lead fields, the solutions can have frequency dependence due to the data having frequency dependence. Solutions do not have to be insensitive to detailed spatially variable electrical properties of the tissues. For instance, if a FEM model was used to compute the forward model, this model will indeed be sensitive to the spatially variable and anisotropic electrical properties. This issue is not even acknowledged.
The frequency dependence of the tissue properties is not the issue. Our theoretical work demonstrates that taking into account the anisotropy and inhomogeneity of the tissue is necessary in order to derive the existence of the weakly evanescent transverse cortical waves (WETCOW) that SPECTRE is detecting. We have added more details about the WETCOW model in the new Section “A physical theory of brain waves” to emphasize this point.
Arguments to disambiguate deep vs shallow sources can be achieved with some but not all source localization algorithms and do not require a non-quasi-static formulation. LORETA is not even the main standard algorithm for comparison. It is disappointing that there are no comparisons to source localization and that this is dismissed away due to some coding issues.
Again, we are not doing ’source localization’. The concept of localized dipole sources is anathema to our brain wave model, and so in our view comparing SPECTRE to such methods only propagates the misleading idea that they are doing the same thing. So they are definitely not dismissed due to coding issues. However, because of repeated requests to compare SPECTRE with such methods, we attempted to run a standard source localization method with parameters that would at least provide the closest approximation to what we were doing. This attempt highlighted a serious computational issue in source localization methods that is a direct consequence of the fact that they are not attempting to do what SPECTRE is doing - describing a time-varying wave field, in the technical definition of a ’field’ as an object that has a value at every point in space-time.
-
-
www.medrxiv.org
-
eLife Assessment
The study presents valuable findings on Mendelian randomization-phenome-wide association, with BMI associated with health outcomes, and there is a focus on sex differences. The phenotype and genotype data are convincing. The work will be of interest to researchers and clinicians in epidemiology, public health and medicine.
-
Reviewer #2 (Public review):
Summary:
In this present Mendelian randomization-phenome-wide association study, the authors found BMI to be positively associated with many health-related conditions, such as heart disease, heart failure, and hypertensive heart disease. They also found sex differences in some traits, such as cancer, psychological disorders, and ApoB.
Strengths:
The use of the UK-biobank study with detailed phenotype and genotype information.
Comments on revisions:
I believe the authors have presented convincing arguments for the novelty and interpretation of their study. I have no additional comments.
-
Author response:
The following is the authors’ response to the original reviews
eLife Assessment
The study presents some useful findings on Mendelian randomization-phenome-wide association, with BMI associated with health outcomes, and there is a focus on sex differences. Although there are some solid phenotype and genotype data, some of the data are incomplete and could be better presented, perhaps benefiting from more rigorous approaches. Confirmation and further assessment of the observed sex differences will add further value.
Thank you for your positive comments. We have revised the analysis based on your feedback and that from the two reviewers. Specifically, we implemented a stricter multiple testing correction approach, improved the figures, included additional figures in the Supplementary Materials, considered the sex differences more rigorously and reported them in more detail. A comprehensive description of the revisions is provided below.
Public Reviews:
Reviewer #1 (Public review):
Summary:
This study uses information from the UK Biobank and aims to investigate the role of BMI on various health outcomes, with a focus on differences by sex. They confirm the relevance of many of the well-known associations between BMI and health outcomes for males and females and suggest that associations for some endpoints may differ by sex. Overall their conclusions appear supported by the data. The significance of the observed sex variations will require confirmation and further assessment.
Strengths:
This is one of the first systematic evaluations of sex differences between BMI and health outcomes. The hypothesis that BMI may be associated with health differentially based on sex is relevant and even expected. As muscle is heavier than adipose tissue, and as men typically have more muscle than women, as a body composition measure BMI is sometimes prone to classifying even normal weight/muscular men as obese, while this measure is more lenient when used in women. Confirmation of the many well-known associations is as expected and attests to the validity of their approach. Demonstration of the possible sex differences is interesting, with this work raising the need for further study.
Thank you for your valuable comments. We are grateful for the time and effort you have devoted to reviewing our manuscript. We have strengthened our paper by adding your insightful comment about the rationale for sex-specific analysis to the introduction:
Weaknesses:
(1) Many of the statistical decisions appeared to target power at the expense of quality/accuracy. For example, they chose to use self-reported information rather than doctor diagnoses for disease outcomes for which both types of data were available.
Thank you for your valuable comments. We apologize for the lack of clarity in our original description of the phenotypes. Information about health in the UK Biobank was obtained at baseline from tests, measurements and self-reports. Subsequently, comprehensive data linkage to hospital admissions, death registries and cancer registries was implemented. However, data linkage to primary care data, such as doctor diagnoses, has not been comprehensively implemented for the UK Biobank, possibly for logistic reasons. Doctor diagnoses are only available for about half the cohort (https://www.ukbiobank.ac.uk/enable-your-research/about-our-data/health-related-outcomes-data). So, we used self-reported diagnoses because they are substantially more comprehensive than the doctor diagnoses. We have explained this point by making the following change to the Methods:
“Where attributes were available from both self-report and doctor diagnosis, we used self-reports. This is because comprehensive record linkage to doctor diagnoses has not yet been fully implemented for the UK Biobank, so information from doctor diagnoses may not fully represent the broader UK Biobank cohort.”
(2) Despite known problems and bias arising from the use of one sample approach, they chose to use instruments from the UK Biobank instead of those available from the independent GIANT GWAS, despite the difference in sample size being only marginally greater for UKB for the context. With the way the data is presented, it is difficult to assess the extent to which results are compatible across approaches.
Thank you for your comments. We agree completely about the issues with a one-sample approach; please accept our apologies for not explaining our rationale. The sex-specific GIANT GWAS study is similar in size to the UK Biobank GWAS. However, the sex-specific GIANT GWAS is much less densely genotyped (~2.5 million variants) than the sex-specific UK Biobank GWAS (~10 million variants), so has less power, hence our use of the UK Biobank. To make this clear, we have added the number of variants in each study to the Methods section. Nevertheless, we also repeated the analysis using sex-specific GIANT, as now described in the Methods.
We amended the description in the first paragraph of the results section:
“Initial analysis using sex-specific BMI from GIANT yielded similar estimates as when using sex-specific BMI from the UK Biobank but had fewer SNPs resulting in wider confidence intervals (S Table 1) and fewer significant associations (S Table 1). Analysis using sex-combined GIANT yielded more significant associations but lacks granularity, so we presented the results obtained using sex-specific BMI from the UK Biobank.”
In the discussion we also made the following changes:
“Tenth, although this study primarily utilized sex-specific BMI, we also conducted analyses using overall BMI from GIANT including the UK Biobank, which gave a generally similar interpretation (S Table 1). Using sex-specific BMI from the UK Biobank and GIANT may lead to lower statistical power than using overall population BMI but allows for the detection of traits that are affected differently by BMI by sex. Including findings from the overall population BMI from sex-combined GIANT (S Table 1) makes the results more comparable to previous similar studies.”
(3) The approach to multiple testing correction appears very lenient, although the lack of accuracy in the reporting makes it difficult to know what was done exactly. The way it reads, FDR correction was done separately for men, and then for women (assuming that the duplication in tests following stratification does not affect the number of tests). In the second stage, they compared differences by sex using Z-test, apparently without accounting for multiple testing.
Thank you, we have accounted for multiple comparisons when considering differences by sex and have made corresponding changes. Specifically, in the methods, we changed:
“We obtained differences by sex using a z-test (Paternoster et al., 1998), which as recommended was on a linear scale for dichotomous outcomes (Knol et al., 2007; Rothman, 2008), then we identified which ones remained after allowing for false discovery”
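The z-test named in this Methods change can be sketched as follows. This is a minimal illustration of the Paternoster et al. (1998) formula for comparing two independent coefficients, not the authors' code; the function name and the example estimates are hypothetical.

```python
import math

def sex_difference_z(beta_men, se_men, beta_women, se_women):
    """Z-test for the equality of two independent estimates.

    z = (b_men - b_women) / sqrt(se_men^2 + se_women^2)
    Returns (z, two-sided p-value under a standard normal).
    """
    z = (beta_men - beta_women) / math.sqrt(se_men ** 2 + se_women ** 2)
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical sex-specific MR estimates (linear scale) and standard errors.
z, p = sex_difference_z(beta_men=0.10, se_men=0.02, beta_women=0.04, se_women=0.02)
print(f"z = {z:.3f}, p = {p:.4f}")
```

The p-values produced this way for each phenotype are the ones that would then go into the false discovery step.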
We have made the following changes to the results section:
“We found significant differences by sex in the associations of BMI with 105 health-related attributes (p-value<0.05); 46 phenotypes remained after allowing for false discovery (Table 1). Of these 46 differences most (35) were in magnitude but not direction, such as for SHBG, ischemic heart disease, heart attack, and facial aging, while 11 were directionally different.
Notably, BMI was more strongly positively associated with myocardial infarction, major coronary heart disease events, ischemic heart disease, heart attack, and facial aging in men than in women. BMI was more strongly positively associated with diastolic blood pressure, and hypothyroidism/myxoedema in women than men. BMI was more strongly inversely associated with LDL-c, hay fever and allergic rhinitis in men than women. BMI was more strongly inversely associated with SHBG in women than men.
BMI was inversely associated with ApoB, iron deficiency anemia, hernia, and total testosterone in men, while positively associated with these traits in women (Table 1). BMI was inversely associated with sensitivity/hurt feelings, and ever seeking medical advice for nerves, anxiety, tension, or depression in men. However, BMI was positively associated with sensitivity/hurt feelings and ever seeking medical advice for these same issues in women. BMI was positively associated with muscle or soft tissue injuries and haemorrhage from respiratory passages in men, whilst inversely associated with these traits in women.”
We have correspondingly amended the discussion to reflect these changes by adding:
“Whether the difference in ischemic heart disease rates between men and women that emerged in the US and the UK in the late 19th century (Nikiforov & Mamaev, 1998) is explained by rising BMI remains to be determined.”
(4) Presentation lacks accuracy in a few places, hence assessment of the accuracy of the statements made by the authors is difficult.
Thank you, we have revised the whole manuscript in order to improve clarity.
(5) Conclusion (Abstract) "These findings highlight the importance of retaining a healthy BMI" is rather uninformative, especially as they claim that for some attributes the effects of BMI may be opposite depending on sex/gender.
Thank you for your comments. We have changed the conclusion of the abstract, as given below:
“Our study revealed that BMI might affect a wide range of health-related attributes and also highlights notable sex differences in its impact, including opposite associations for certain attributes, such as ApoB; and stronger effects in men, such as for cardiovascular diseases. Our findings underscore the need for nuanced, sex-specific policy related to BMI to address inequities in health.”
We have changed the Impact statement, as given below:
“BMI may affect a wide range of health-related attributes and there are notable sex differences in its impact, including opposite associations for certain attributes, such as ApoB; and stronger effects in men, such as for cardiovascular diseases. Our findings underscore the need for nuanced, sex-specific policy related to BMI.”
We have changed the conclusion of the paper, as given below:
“Our contemporary systematic examination found BMI associated with a broad range of health-related attributes. We also found significant sex differences in many traits, such as for cardiovascular diseases, underscoring the importance of addressing higher BMI in both men and women possibly as a means of redressing differences in life expectancy. Ultimately, our study emphasizes the harmful effects of obesity and the importance of nuanced, sex-specific policy related to BMI to address inequities in health.”
Reviewer #2 (Public review):
Summary:
In this present Mendelian randomization-phenome-wide association study, the authors found BMI to be positively associated with many health-related conditions, such as heart disease, heart failure, and hypertensive heart disease. They also found sex differences in some traits such as cancer, psychological disorders, and ApoB.
Strengths:
The use of the UK-biobank study with detailed phenotype and genotype information.
Thank you for your valuable comments. We are grateful for the time and effort you have devoted to reviewing our manuscript.
Weaknesses:
(1) Previous studies have performed this analysis using the same cohort, with in-depth analysis. See this paper: Searching for the causal effects of body mass index in over 300,000 participants in UK Biobank, using Mendelian randomization. https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1007951
Thank you for your valuable comments. We checked the paper carefully. It gives sex-specific estimates when the outcome was assessed in different ways in men and women, for example the question about number of children was asked in terms of live births in women and number of children fathered in men. In addition, for some significant findings, the authors investigated differences by sex. However, the paper did not use sex-specific BMI or sex-specific outcomes systematically. We have added this paper to our introduction and amended the text to explain the novelty of our study compared to previous studies.
“Previous phenome-wide association studies using MR (MR-PheWASs) have identified impacts of sex-combined BMI on endocrine disorders, circulatory diseases, inflammatory and dermatological conditions, some biomarkers and feelings of nervousness (Hyppönen et al., 2019; Millard et al., 2015; Millard et al. 2019), but did not systematically use sex-specific BMI for the exposure or sex-specific outcomes.”
(2) I believe that the authors' claim, "To our knowledge, no sex-specific PheWAS has investigated the effects of BMI on health outcomes," is not well supported. They have not cited a relevant paper that conducted both overall and sex-stratified PheWAS using UK Biobank data with a detailed analysis. Given the prior study linked above, I am uncertain about the additional contributions of the present research.
Thank you for your valuable comments; please accept our apologies for this oversight. As explained above, we have checked very carefully. There are three previous PheWAS for BMI: Hyppönen et al., 2019, Millard et al., 2015 and Millard et al. 2019. Hyppönen et al., 2019 and Millard et al., 2015 are not sex-specific. Millard et al. 2019 used sex-combined instruments, but some sex-specific outcomes, when the questions were asked sex-specifically, such as age at puberty asked as “age when periods started (menarche)” in women and “relative age of first facial hair” and “relative age voice broke” in men. When they found a factor significantly associated with BMI, they sometimes analyzed it further, including sex-specific analysis, but they did not do the analysis systematically for men and women with sex-specific BMI and sex-specific outcomes. We have amended the introduction to clarify this point.
“To our knowledge, no sex-specific PheWAS has investigated the effects of BMI on health outcomes (Hyppönen et al., 2019; Millard et al., 2015; Millard et al. 2019). To address this gap, we conducted a sex-specific PheWAS, using the largest available sex-specific GWAS of BMI, to explore the impact of sex-specific BMI on sex-specific health-related attributes”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Presentation, accuracy, and referencing:
(1) The quality of the English language needs to be checked, including that all sentences carry all components required (including verbs).
We thank the reviewer for this suggestion. The manuscript has undergone language editing by a native English-speaker, with particular attention to grammatical completeness (including verb consistency and sentence structure). We have also clarified ambiguities and inconsistencies in terms pointed out by the native English speakers. All revisions have been implemented in the updated manuscript.
(2) The accuracy of statements needs to be checked. For example, in lines 82-83 it is not true that 2015/2019 was 'before the advent of large-scale GWAS studies'. In the context of the above in lines 83-85, how can reference be made to a study published in 2020 calling that 'previous' MR studies and how is a trial published in 2016 'recent'? Please revise, and please also check the manuscript for any other issues with accuracy of this kind.
We thank the reviewer for this suggestion. We have checked the manuscript and revised these sentences to be clearer, by making the following change.
“Previous phenome-wide association studies using MR (MR-PheWASs) have identified impacts of sex-combined BMI on endocrine disorders, circulatory diseases, inflammatory and dermatological conditions, some biomarkers and feelings of nervousness (Hyppönen et al., 2019; Millard et al., 2015; Millard et al. 2019), but did not systematically use sex-specific BMI for the exposure or sex-specific outcomes. Previous MR studies and trials of incretins have expanded our knowledge about a broad range of effects of BMI (Larsson et al., 2020; Marso et al., 2016).”
(3) The adequacy of referencing will need to be checked, e.g. line 136 "as recommended by UK biobank" is vague and needs to be referenced.
We thank the reviewer for this suggestion. We have added citations.
“We categorized attributes as age at recruitment, physical measures, lifestyle and environmental, medical conditions, operations, physiological factors, cognitive function, health and medical history, sex-specific factors, blood assays and urine assays, based on the UK Biobank categories (https://biobank.ndph.ox.ac.uk/ukb/cats.cgi).”
(4) The accurate use of terminology needs to be checked. For example, BMI is a measure of adiposity, while high BMI (typically >30) is used to index obesity.
We thank you for your comments. We have changed the descriptions into “overweight/obesity” throughout.
(5) Figure 1, Please check that complete information is given for 'selection criteria' and that the rationale for all information included is clear. For example, it is currently unclear what is the distinction between the bottom two sections which both present a number of features included in the analyses? Also, the Box detailing exclusion of 3585 variables does not give the criteria for these exclusions. Please add.
Thank you for your comments. We have re-presented and revised Figure 1. Specifically, we have revised the bottom two sections to give each reason for exclusion and the number excluded for that reason. The updated “Excluded: 3,572 phenotypes, for the reason listed below:” box now contains bullet points giving each reason for exclusion (e.g. age of certain diseases/disorders onset: 26, alcohol: 56).
(6) Figure 4, does not look to be of typical publication quality.
We thank you for your comments. We have used different colors to make it smaller and more readable. Please see Table 1.
Analyses:
(1) As it stands, it is very difficult for a reader to confirm the conclusion that similar findings are obtained both when using instruments from the UKB and GIANT based on data presented (S Table 1 and 2). I suggest two things.
a) Organise S Table 1 and 2 by significance and category, with separation by highlighting for those which are significant under correction. I would consider merging these two tables, such that it would be easy for the reader to make the comparisons side by side. Consider presenting separate tables for the analyses for women and men.
We thank you for your comments. We have followed your helpful advice and merged S Table 1 and S Table 2 into S Table 1. Furthermore, we have also merged S Table 5 to S Table 1.
b) In S Table 3, please add information from related comparisons using the GIANT instruments. To support the authors' claim that associations are similar, but only the precision of estimation differed, you could consider adding information for numbers of associations for those that are directionally consistent and which have an association at least under 'nominal significance'. For associations where this does not hold, I would refrain from making a claim that the results are not affected by the choice of instrument (or biases relating to the analysis conducted).
We thank you for your comments. Among 42 significant sex-specific associations identified in both the UK Biobank and the sex-specific GIANT consortium for men, all showed consistent directions of effect. Similarly, for women, all of the 45 significant associations exhibited consistent directions for UK Biobank compared with GIANT instruments.
In the sex-specific UK Biobank, there are 203 significant associations in men, and 232 significant associations in women. We have added: in the sex-specific GIANT, there are 46 significant associations in men, and 84 significant associations in women. In the sex-combined GIANT, there are 246 significant associations in men, and 276 significant associations in women. We have provided all this information in S Table 2.
We added the following descriptions at the end of the results section:
“Of the 42 significant sex-specific associations identified in both the UK Biobank and the sex-specific GIANT consortium for men, all were directionally consistent. Similarly, for women, all 45 such significant associations were directionally consistent.”
We amended the following descriptions in the first paragraph of the results section:
“Initial analysis using sex-specific BMI from the GIANT yielded similar estimates as when using sex-specific BMI from the UK Biobank but had fewer SNPs resulting in wider confidence intervals (S Table 1) and fewer significant associations (S Table 2). Analysis using sex-combined GIANT yielded more significant associations but lacks granularity, so we presented the results obtained using sex-specific BMI from the UK Biobank.”
In the methods, we changed:
“We obtained differences by sex using a z-test (Paternoster et al., 1998), which as recommended was on a linear scale for dichotomous outcomes (Knol et al., 2007; Rothman, 2008), then we identified which ones remained after allowing for false discovery.”
We have made the following changes to the results section:
“We found significant differences by sex in the associations of BMI with 105 health-related attributes (p-value<0.05); 46 phenotypes remained after allowing for false discovery (Table 1). Of these 46 differences most (35) were in magnitude but not direction, such as for SHBG, ischemic heart disease, heart attack, and facial aging, while 11 were directionally different.
Notably, BMI was more strongly positively associated with myocardial infarction, major coronary heart disease events, ischemic heart disease, heart attack, and facial aging in men than in women. BMI was more strongly positively associated with diastolic blood pressure, and hypothyroidism/myxoedema in women than men. BMI was more strongly inversely associated with LDL-c, hay fever and allergic rhinitis in men than women. BMI was more strongly inversely associated with SHBG in women than men.
BMI was inversely associated with ApoB, iron deficiency anemia, hernia, and total testosterone in men, while positively associated with these traits in women (Table 1). BMI was inversely associated with sensitivity/hurt feelings, and ever seeking medical advice for nerves, anxiety, tension, or depression in men. However, BMI was positively associated with sensitivity/hurt feelings and ever seeking medical advice for these same issues in women. BMI was positively associated with muscle or soft tissue injuries and haemorrhage from respiratory passages in men, whilst inversely associated with these traits in women.”
(2) It is not clear what statistical criteria were used to determine sex differences, and the strategy/presentation should be clarified. In lines 229-231, it is implied that the 'significance' in one gender, but not in the other, is used to indicate a difference. However, 'comparison of p-values' is not a valid statistical approach, and a more formal test (accounting for multiple testing) would be warranted. It may be that a systematic approach has been implemented, but please check that it is adequately and accurately described to the reader.
Please accept our apologies for being unclear. Multiple-comparison corrections assume independent phenotypes; here, however, some phenotypes are not independent, so applying the correction separately in men and women is quite strict. We added a multiple-comparison correction for the assessment of sex differences, which is now given in Table 1. Initially, there were 105 significant associations (p value for sex-difference<0.05) (Table 1), and 46 associations remained after FDR correction (Table 1).
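To make the two-stage filtering concrete (nominal p<0.05, then false discovery control), here is a small sketch. It assumes the Benjamini-Hochberg step-up procedure, which the response does not actually name; the function name and the p-values are illustrative only.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans: True where the test survives BH FDR control.

    Assumption: the 'FDR correction' is Benjamini-Hochberg; the source
    does not specify which procedure was used.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    # Every test ranked at or below k_max is declared a discovery.
    keep = [False] * m
    for rank, idx in enumerate(order, start=1):
        keep[idx] = rank <= k_max
    return keep

# Hypothetical p-values for sex differences across eight phenotypes.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
nominal = sum(p < 0.05 for p in pvals)
surviving = sum(benjamini_hochberg(pvals))
print(nominal, "nominally significant;", surviving, "after FDR")
```

As in the response, fewer differences survive the correction than pass the nominal threshold.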
Furthermore, we have made additional minor changes to clarify the wording.
Knol, M. J., van der Tweel, I., Grobbee, D. E., Numans, M. F., & Geerlings, M. I. (2007). Estimating interaction on an additive scale between continuous determinants in a logistic regression model. Int J Epidemiol, 36(5), 1111-1118.
Nikiforov, S. V., & Mamaev, V. B. (1998). The development of sex differences in cardiovascular disease mortality: a historical perspective. Am J Public Health, 88(9), 1348-1353. https://doi.org/10.2105/ajph.88.9.1348
Paternoster, R., Brame, R., Mazerolle, P., & Piquero, A. (1998). Using the correct statistical test for the equality of regression coefficients. Criminology, 36(4), 859-866.
Rothman, K. J., Greenland, S., & Lash, T. L. (Eds.). (2008). Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study investigates both online responses to, and offline replay of, visual motion sequences. Sophisticated MEG analyses provide convincing evidence for both feature-specific and non-specific sequence representations. These intriguing findings will be of interest to perception and learning researchers alike.
-
Reviewer #1 (Public review):
Summary:
The study identifies two types of activation: one that is cue-triggered and non-specific to motion directions, and another that is specific to the exposed motion directions but occurs in a reversed manner. The finding that activity in the medial temporal lobe (MTL) preceded that in the visual cortex suggests that the visual cortex may serve as a platform for the manifestation of replay events, which potentially enhance visual sequence learning.
Evaluations:
Identifying the two types of activation after exposure to a sequence of motion directions is very interesting. The experimental design, procedures and analyses are solid. The findings are interesting and novel.
In the original submission, it was not immediately clear to me why the second type of activation was suggested to occur spontaneously. The procedural differences in the analyses that distinguished between the two types of activation need to be a little better clarified. However, this concern has been satisfactorily addressed in the revision.
-
Reviewer #2 (Public review):
This paper shows and analyzes an interesting phenomenon. It shows that when people are exposed to sequences of moving dots (that is, dots moving in one direction, followed by another direction, etc.), showing either the starting or the ending movement direction causes a coarse-grained brain response that is similar to that elicited by the complete sequence of 4 directions. However, they show by decoding the sensor responses that this brain activity actually does not carry information about the actual sequence and the motion directions, at least not on the time scale of the initial sequence. They also show a reverse replay on a highly compressed time scale, which is elicited during the period of elevated activity, and activated by the first and last elements of the sequence, but not others. Additionally, these replays seem to occur during periods of cortical ripples, similar to what is found in animal studies.
These results are intriguing. They are based on MEG recordings in humans, and finding such replays in humans is novel. Also, this is based on what seems to be sophisticated statistical analysis. The statistical methodology seems valid, but due to its complexity it is not easy to understand. The methods, especially those described in Figures 3 and 4, should be explained better.
Comments on second revised version by editorial team:
In response to the reviewer, the authors have substantially expanded and clarified their description of the methodology in this version of the manuscript.
-
Author response:
The following is the authors’ response to the previous reviews
Public Reviews:
Reviewer #1 (Public review):
Summary:
The study identifies two types of activation: one that is cue-triggered and nonspecific to motion directions, and another that is specific to the exposed motion directions but occurs in a reversed manner. The finding that activity in the medial temporal lobe (MTL) preceded that in the visual cortex suggests that the visual cortex may serve as a platform for the manifestation of replay events, which potentially enhance visual sequence learning.
Evaluations:
Identifying the two types of activation after exposure to a sequence of motion directions is very interesting. The experimental design, procedures and analyses are solid. The findings are interesting and novel.
In the original submission, it was not immediately clear to me why the second type of activation was suggested to occur spontaneously. The procedural differences in the analyses that distinguished between the two types of activation need to be a little better clarified. However, this concern has been satisfactorily addressed in the revision.
We thank the reviewer for his/her positive evaluation and thoughtful comments.
Reviewer #2 (Public review):
This paper shows and analyzes an interesting phenomenon. It shows that when people are exposed to sequences of moving dots (that is, dots moving in one direction, followed by another direction, etc.), showing either the starting or the ending movement direction causes a coarse-grained brain response that is similar to that elicited by the complete sequence of 4 directions. However, they show by decoding the sensor responses that this brain activity actually does not carry information about the actual sequence and the motion directions, at least not on the time scale of the initial sequence. They also show a reverse replay on a highly compressed time scale, which is elicited during the period of elevated activity, and activated by the first and last elements of the sequence, but not others. Additionally, these replays seem to occur during periods of cortical ripples, similar to what is found in animal studies.
These results are intriguing. They are based on MEG recordings in humans, and finding such replays in humans is novel. Also, this is based on what seems to be sophisticated statistical analysis. The statistical methodology seems valid, but due to its complexity it is not easy to understand. The methods, especially those described in Figures 3 and 4, should be explained better.
We thank the reviewer for the detailed evaluation. As suggested, we have further revised the Methods and Results sections, particularly the descriptions related to Figures 3 and 4, to enhance clarity. Please see the revisions highlighted in red in the revised manuscript.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The most important results here are in Figure 4, and they rely on methods explained in Figure 3. Figure 4 and the results in the figure are confusing.
What is the red bar in Figure 4B,E? What are the units of the Y axis in Figure 4B,E?
Does sequenceness have units? How do we interpret these magnitudes apart from the line of statistical significance? Shouldn't there be two lines, one for forward replay and the other for backward replay, rather than a single line with positive and negative values? The term sequenceness is defined in Figure 3, and is key. The replayed sequence in Figure 4A,D seems to last about 120 ms.
What is the meaning of having significance only within a window of 28-36 ms?
We thank the reviewer for the careful reading and insightful comments. We apologize for the lack of clarity regarding these details in the previous version. As mentioned above, we have revised the Methods and Results sections to enhance clarity throughout the manuscript. For convenience, we provide detailed explanations addressing the specific points raised by the reviewer below.
First, the red bars in Figures 4B and 4E indicate the lags when the evidence of sequenceness surpassed the statistical significance threshold, as determined by permutation testing. We have now explicitly clarified this in the revised figure captions.
Second, sequenceness doesn’t have units. It corresponds to the regression coefficient (β) obtained from the second-level GLM in the TDLM framework. Specifically, in the first step of TDLM, we constructed an empirical transition matrix that quantifies the evidence for all possible transitions (e.g., 0° → 90°) at each time lag (Δt). In the second step, we evaluated the extent to which each model transition matrix (e.g., forward or backward transitions) predicts the empirical transition matrix at each Δt, yielding second-level β values. Sequenceness is defined as the difference between the β values for the forward and backward transition models, reflecting the relative strength and directionality of sequential replay. As it is derived from regression coefficients, sequenceness is inherently a unitless measure.
Regarding the interpretation of sequenceness magnitudes beyond statistical significance, the β values reflect the extent to which the model transition matrix explains variance in the empirical transition matrix. While larger β values suggest stronger sequenceness, absolute magnitudes are influenced by various factors, such as between-participant noise. Therefore, the key criterion for interpreting these values is whether they surpass permutation-based significance thresholds, which indicate that the observed sequenceness is unlikely to have occurred by chance.
Third, as the reviewer correctly pointed out, we initially computed two separate regression lines, one for forward replay and the other for backward replay. We then defined sequenceness as the contrast between the forward and backward replay (forward minus backward). This contrast approach is commonly used in previous studies to remove between-participant variance in the sequential replay per se, which may arise due to variability in task engagement or measurement sensitivity (Liu et al., 2021; Nour et al., 2021).
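The two-step logic described above can be sketched in a few lines. This is a simplified illustration, not the authors' pipeline or the TDLM toolbox: the first level uses one univariate lagged regression per state pair, and the second level uses the fact that, for non-overlapping 0/1 forward, backward and self-transition model matrices plus a constant (valid for three or more states), the forward and backward betas share a baseline, so their difference equals the difference of the mean entries. All function names, the lag expressed in samples, and the simulated data are hypothetical.

```python
import random

def lagged_slope(x, y, lag):
    """OLS slope (with intercept) of y[t + lag] regressed on x[t]."""
    xs, ys = x[:len(x) - lag], y[lag:]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

def sequenceness(states, order, lag):
    """Forward-minus-backward sequenceness at a single time lag.

    states: one reactivation time course per state (lists of floats).
    order:  state indices in the presented order, e.g. [0, 1, 2, 3].
    """
    n = len(states)
    # Step 1: empirical transition matrix, emp[i][j] = evidence for i -> j.
    emp = [[lagged_slope(states[i], states[j], lag) for j in range(n)]
           for i in range(n)]
    # Step 2 (simplified): beta_forward - beta_backward reduces to the mean
    # of the forward entries minus the mean of the backward entries.
    pairs = list(zip(order, order[1:]))
    fwd = [emp[a][b] for a, b in pairs]
    bwd = [emp[b][a] for a, b in pairs]
    return sum(fwd) / len(fwd) - sum(bwd) / len(bwd)

def simulate_replay(replay_order, lag, n_samples=2000, n_events=60, seed=0):
    """Noise plus brief reactivations that follow replay_order, lag samples apart."""
    rng = random.Random(seed)
    states = [[rng.gauss(0, 0.1) for _ in range(n_samples)] for _ in replay_order]
    for _ in range(n_events):
        t0 = rng.randrange(0, n_samples - lag * len(replay_order))
        for k, s in enumerate(replay_order):
            states[s][t0 + k * lag] += 1.0
    return states

# Backward replay (3 -> 2 -> 1 -> 0) of a sequence presented as 0 -> 1 -> 2 -> 3.
data = simulate_replay([3, 2, 1, 0], lag=3)
for lag in (1, 3, 5):
    print(lag, round(sequenceness(data, [0, 1, 2, 3], lag), 3))
```

Because the measure is a contrast of regression coefficients, backward replay appears as a negative value at the true lag and values near zero at other lags, which is why a single signed line suffices in plots such as Figures 4B and 4E.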
Finally, regarding the duration of replay events, the example sequences shown in Figures 4A and 4D indeed span about 120 ms in total. However, the time lag (Δt) between successive reactivation peaks within these sequences is about 30 ms. This is in line with the findings shown in Figures 4B and 4E, where statistical significance is observed at a time lag window of 28 – 36 ms on the x-axis. It is important to note that the x-axis in these plots represents the time lag (Δt) between sequential reactivations, rather than absolute time.
We hope these clarifications address the reviewer’s concerns, and we have revised the manuscript accordingly to make these points clearer to readers.
The methods here are not simple and not simple to explain. The new version is easier to understand. From the new version it seems that the methodology is sound. It should be still clarified and better explained.
We have carefully revised the manuscript to better explain the methodology. We appreciate the reviewer’s feedback, which is valuable in improving the clarity of our work.
Now that I understand what they mean by decoding probability, I think that this term is confusing or even misleading. The decoding accuracy is the probability that the direction of motion classification was correct. It seems the so-called decoding probability is the value of the logistic regression after normalizing the sum to 1. If this is a standard term it can probably be kept; if not, another term would be better.
Thank you for the reviewer’s comment. We agree that the term decoding probability may initially seem confusing. However, decoding probability is a commonly used term in the neural decoding literature, particularly in human studies (e.g., Liu et al., 2019; Nour et al., 2021; Turner et al., 2023). To maintain consistency with previous work, we have kept this term in the manuscript. We appreciate the opportunity to clarify this point.
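To make the distinction between decoding probability and decoding accuracy concrete, here is a minimal sketch of the normalization the reviewer describes. It assumes one logistic-regression decision value per motion direction at a single time point; the function name and the numbers are illustrative and do not come from the manuscript.

```python
import math

def decoding_probabilities(decision_values):
    """Pass each classifier's decision value through a sigmoid,
    then rescale the outputs so that they sum to 1 across classes."""
    sigmoids = [1.0 / (1.0 + math.exp(-z)) for z in decision_values]
    total = sum(sigmoids)
    return [s / total for s in sigmoids]

# Hypothetical decision values for four motion directions at one time point.
values = [1.2, -0.4, 0.1, -2.0]
probs = decoding_probabilities(values)
print([round(p, 3) for p in probs])
```

The result is a per-time-point, per-class score, whereas decoding accuracy is the fraction of trials classified correctly.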
References
Liu, Y., Dolan, R. J., Higgins, C., Penagos, H., Woolrich, M. W., Ólafsdóttir, H. F., Barry, C., Kurth-Nelson, Z., & Behrens, T. E. (2021). Temporally delayed linear modelling (TDLM) measures replay in both animals and humans. eLife, 10, e66917. https://doi.org/10.7554/eLife.66917
Liu, Y., Dolan, R. J., Kurth-Nelson, Z., & Behrens, T. E. J. (2019). Human Replay Spontaneously Reorganizes Experience. Cell, 178(3), 640-652.e14. https://doi.org/10.1016/j.cell.2019.06.012
Nour, M. M., Liu, Y., Arumuham, A., Kurth-Nelson, Z., & Dolan, R. J. (2021). Impaired neural replay of inferred relationships in schizophrenia. Cell, 184(16), 4315-4328.e17. https://doi.org/10.1016/j.cell.2021.06.012
Turner, W., Blom, T., & Hogendoorn, H. (2023). Visual Information Is Predictively Encoded in Occipital Alpha/Low-Beta Oscillations. Journal of Neuroscience, 43(30), 5537–5545. https://doi.org/10.1523/JNEUROSCI.0135-23.2023
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study introduces a self-supervised machine learning method to classify C. elegans postures and behaviors directly from video data, offering an alternative to the skeleton-based approaches that rely on often error-prone tracking. This novel approach holds promise for advancing ethology research. That said, the strength of evidence is currently incomplete, as key aspects - including measuring head-tail orientation, increased behavioral interpretability, and quantitative comparisons to established methods - are underdeveloped and would benefit from further validation.
-
Reviewer #1 (Public review):
Summary:
The submitted article reports the development of an unsupervised learning method that enables quantification of behaviour and poses of C. elegans from 15 minute long videos and presents a spatial map of both. The entire pipeline is a two part process, with the first part based on contrastive learning that represents spatial poses onto an embedded space, while the second part uses a transformer encoder to enable estimation of masked parts in a spatiotemporal sequence.
Strengths:
This analysis approach will prove to be useful for the C. elegans community. The application of the method on various age-related videos on various strains presents a good use-case for the approach. The manuscript is well written and presented.
Specific comments:
(1) One of the main motivations as mentioned in the introduction as well as emphasized in the discussion section is that this approach does not require key-point estimation for skeletonization and is also not dependent on the eigenworm approach for pose estimation. However, the eigenworm data has been estimated using the Tierpsy tracker in videos used in this work and stored as metadata. This is subsequently used for interpretation. It is not clear at this point, how else the spatial embedded map may be interpreted without using this kind of pose estimates obtained from other approaches. Please elaborate and comment.
(2) As per the manuscript, the second part of the pipeline is used to estimate the masked sequences of the spatiotemporal behavioral feature. However, it is not clear what the numbers listed in Fig. 2.3 represent?
(3) It is not clear how motion speed is linked to individual poses as mentioned in Figs. 4 (b) and (c).
-
Reviewer #2 (Public review):
Summary:
The manuscript by Maurice and Katarzyna describes a self-supervised, annotation-free deep-learning approach capable of quantitatively representing complex poses and behaviors of C. elegans directly from video pixel values. Their method overcomes limitations inherent to traditional methods relying on skeletonization or keypoint tracking, which often fail with highly coiled or self-intersecting worms. By applying self-supervised contrastive learning and a Transformer-based network architecture, the authors successfully capture diverse behavioral patterns and depict the aging trajectory of behavioral repertoire. This provides a useful new tool for behavioral research in C. elegans and other flexible-bodied organisms.
Strengths:
Reliable tracking and segmentation of complex poses remain significant bottlenecks in C. elegans behavioral research, and the authors made valuable attempts to address these challenges. The presented method offers several advantages over existing tools, including freedom from manual labeling, independence from explicit skeletonization or keypoint tracking, and the capability to capture highly coiled or overlapping poses. Thus, the proposed method would be useful to the C. elegans research community.
The research question is clearly defined. Methods and results are engagingly presented, and the manuscript is concise and well-organized.
Weaknesses:
(1) In the abstract, the claim of an 'unbiased' approach is not well-supported. The method is still affected by dataset biases, as mentioned in the aging results (section 4.3).
(2) In section 3.2, the rationale behind rotating worm images to a vertical orientation is unclear.
(3) The methods section is clearly written but uses overly technical language, making it less accessible to the audience of eLife, the majority of whom are biologists. Clearer explanations of key methods and the rationale behind their selection are needed. For example, in section 3.3, the authors should briefly explain in simple language what contrastive learning is, why they chose it, and why this method potentially achieves their goal.
(4) The reason why the gray data points could not be resolved by Tierpsy is not quantitatively described. Are they all due to heavily coiled or overlapping poses?
(5) In section 4.1, generating pose representations grouped by genetic strains would provide insights into strain-specific differences resolved by the proposed method.
(6) Fig. 3a requires clarification. Highly bent poses (red points) intuitively should be close to highly coiled poses (gray points). The authors should explain the observed greenish/blueish points interfacing with the gray points.
(7) In Fig. 3a, some colored points overlap with the gray point cloud. Why can Tierpsy resolve these overlapping points representing highly coiled poses? A more systematic quantitative comparison between Tierpsy and the proposed method is required.
(8) The claim in section 4.2 regarding strain separation in pose embedding spaces is unsupported by Fig. 3a, which lacks strain-based distinctions. As mentioned in point #5, showing pose representations grouped by different strains is required.
(9) In section 4.2, how could the authors verify the statement, "This likely occurs since most strains share common behaviors such as simple forward locomotion"?
(10) An important weakness of the proposed method is its low direct interpretability, as it is not based on handcrafted features. To better interpret the pose/behavior embedding space, it would be helpful to compare it against more basic Tierpsy features in Fig. 3 and 4. This comparison could reveal what understandable features were learned by the neural network, thereby increasing human interpretability.
(11) The main conclusion of section 4.3 is not sufficiently tested. Is Fig. 5a generated only from data of N2 animals? To quantitatively verify the statement, "Young individuals appear to display a wide range of behaviors, while as they age their behavior repertoire reduces," the authors should perform a formal analysis of behavioral variability throughout aging.
(12) In Fig. 5a, better visualization of aging trajectories could include plotting the center of mass along with variance of the point cloud over time.
(13) To better reveal aging trajectories of behavioral changes for different genetic backgrounds, it would be meaningful to generate behavior representations for different strains as they age.
(14) As a methods paper, the ease of use for other researchers should be explicitly addressed, and source code and datasets should be provided.
-
Reviewer #3 (Public review):
Summary:
In this paper, the authors present an unsupervised learning approach to represent C. elegans poses and temporal sequences of poses in low-dimensional spaces by directly using pixel values from video frames. The method does not rely on the exact identification of the worm's contour/midline, nor on the identification of the head and tail prior to analyzing behavioral parameters. In particular, using contrastive learning, the model represents worm poses in low-dimensional spaces, while a transformer encoder neural network embeds sequences of worm postures over short time scales. The study evaluates this newly developed method using a dataset of different C. elegans genetic strains and aging individuals. The authors compared the representations inferred by the unsupervised learning with features extracted by an established approach, which relies on direct identification of the worm's posture and its head-tail direction.
Strengths:
The newly developed method provides a coarse classification of C. elegans posture types in a low-dimensional space using a relatively simple approach that directly analyzes video frames. The authors demonstrate that representations of postures or movements of different genotypes, based on pixel values, can be distinguishable to some extent.
Weaknesses:
- A significant disadvantage of the presented method is that it does not include the direction of the worm's body (e.g., head/tail identification). This highly limits the detailed and comprehensive identification of the worm's behavioral repertoire (on- and off-food), which requires body directionality in order to infer behaviors (for example, classifying forward vs. reverse movements). In addition, including a mix of opposite postures as input to the new method may create significant classification artifacts in the low-dimensional representation, such that, for example, curvature at opposite parts of the body could cluster together. This concern applies both to the representation of individual postures and to the representation of sequences of postures.
- The authors state that head-tail direction can be inferred during forward movement. This is true when individuals are measured off-food, where they are highly likely to move forward. However, when animals are grown on food, head-tail identification can also be based on quantifying the speed of the two ends of the worm (the head shows side-to-side movements). This does not require identifying morphological features. See, for example, Harel et al. (2024) or Yemini et al. (2013).
- Another confounding parameter that cannot be distinguished using the presented method is the size of individuals. Size can differ between genotypes, as well as with aging. This can potentially lead to clustering of individuals based on their size rather than behavior.
- There is no quantitative comparison between classification based on the presented method and methods that rely on identifying the skeleton.
-
Author response:
We thank the editors and the reviewers for their valuable comments and for taking the time to evaluate our manuscript.
Answers to Reviewer 1:
(1) The core contribution of our method is that it learns meaningful spatiotemporal embeddings directly from image data without requiring pose estimation or eigenworm-based features as input. The learned embedding space can serve as a foundation for downstream tasks such as behavioral classification, clustering, or anomaly detection, further supporting its utility beyond visualization through eigenworm-derived features. Here we use the Tierpsy-derived features for latent space interpretation and for validation that our approach does indeed encode meaningful postural information. Additionally, without any Tierpsy-calculated features users can still color embeddings by known metadata like mutation or age and compare different strains to each other.
(2) The numbers shown in Fig. 2.3 are illustrative placeholders intended to conceptually represent a vector of behavioral features. They do not correspond to any specific measurements or carry intrinsic meaning. We agree that this may lead to confusion, and we will clarify this in the revised manuscript.
(3) The visualizations in Figs. 4 (b) and (c) show the embeddings of sequences of behavior, rather than individual poses. Therefore, motion-related features such as speed are related to temporal patterns in those sequences rather than static postures. The color overlays reflect average motion characteristics (e.g., speed) of short behavior clips projected into the embedding space, rather than being directly linked to any single frame or pose.
Answers to Reviewer 2:
(1) In the abstract, our use of the term "unbiased" refers specifically to the avoidance of human-generated bias through feature engineering; that is, the model does not rely on handcrafted features or predefined pose representations, and the representations are based on data only. However, we agree that the model is still subject to dataset biases and will rectify this in the revised manuscript.
(2) The worm images are rotated to a common vertical orientation to remove orientation as a source of variability in the input. This ensures that the model focuses on learning pose and behavioral dynamics rather than arbitrary head-tail or angular positioning. While data augmentation could in theory account for this variability, we found in our preliminary experiments that applying this preprocessing step led to more stable and interpretable embeddings.
(3) We agree that simplifying the technical explanations would enhance the manuscript’s accessibility. In the revised version, we will briefly introduce contrastive learning in a less technical language.
(4) The gray points in Fig. 3a represent frames that Tierpsy could not resolve, primarily due to coiled, self-intersecting, or overlapping worm postures. Tierpsy uses skeletonization to estimate the centerline, and this approach can fail when such challenging elements are part of the image.
(5) We appreciate this suggestion and will consider it for a revised version of the manuscript.
(6) Although it may seem intuitive for highly bent (red) poses to lie near coiled (gray) ones in the embedding space, the clustering pattern observed reflects how the network organizes pose information. The red/orange cluster consists of distinguishable bent poses that are visually distinct and consistently separable from other postures. In contrast, the greenish and blueish poses are less strongly bent and may share more visual overlap with the unresolved (gray) images.
(7) The overlap occurs because some highly bent or coiled worms can still be (partially) resolved by Tierpsy, depending on specific pose conditions (e.g., head and tail not touching, not self-overlapping). However, Tierpsy fails to consistently resolve such frames. We will describe these cases in more detail in the revised manuscript.
(8) Thank you, we agree this claim needs to be better supported and will develop it in the revision.
(9) To support this statement, we visualized the sequences embedded in this area of the embedding space and found that they mostly consist of common behaviors such as forward locomotion.
(10) We agree that interpretability is important and plan to include additional figures and quantifications of the embedding space using more basic Tierpsy features.
(11) Fig. 5a is indeed based solely on N2 animals. In the revised manuscript we will include quantitative measures of behavioral variability and its change with age.
(12) We appreciate this suggestion and will consider it for a revised version.
(13) We agree this would be a valuable analysis. However, our current dataset primarily includes aging data for N2 animals. We acknowledge this limitation and will consider adding more strains in future work.
(14) We will include links to our source code in the revised manuscript.
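As an illustration of the orientation normalization mentioned in answer (2), the sketch below estimates how far a segmented worm would need to be rotated to lie along the vertical axis. The response does not say how the angle is obtained; the principal-axis estimate from second-order image moments used here is an assumption, and the function names are hypothetical.

```python
import math

def principal_axis_angle(mask):
    """Angle in radians, within [-pi/2, pi/2], between the x-axis and the
    long axis of the foreground pixels of a binary mask (list of rows)."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Central second-order moments of the foreground pixel coordinates.
    mu20 = sum((x - mean_x) ** 2 for x, _ in points) / n
    mu02 = sum((y - mean_y) ** 2 for _, y in points) / n
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)

def rotation_to_vertical(mask):
    """Rotation in radians, modulo pi, that would bring the long axis onto the y-axis."""
    return (math.pi / 2.0 - principal_axis_angle(mask)) % math.pi

horizontal_worm = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
print(round(math.degrees(rotation_to_vertical(horizontal_worm)), 1))
```

The angle is only defined modulo 180 degrees, so an alignment of this kind removes image orientation but leaves head and tail interchangeable.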
Answers to Reviewer 3:
(1-2) Our current method is agnostic to head-tail orientation, which indeed restricts the ability to distinguish behaviors that rely on directional cues. We made this design choice as we believe that correctly identifying head/tail orientation can be a challenging task that may introduce additional biases or fail in difficult imaging conditions. However, we fully agree that integrating directional information would improve behavioral resolution, and this is a natural extension of our current framework. In future work, we aim to incorporate head-tail disambiguation.
(3) We explicitly designed our preprocessing and training pipeline to encourage size invariance, for example by resizing individuals to a consistent scale, as the focus of our work is to encode posture and movement only. However, we acknowledge that absolute size information is lost in this process, which can be informative for distinguishing genotypes or age-related changes.
(4) We agree that a direct quantitative comparison between our embedding-based representations and skeleton-based feature sets would strengthen the paper. Our current focus was to assess whether meaningful behavioral features could be learned from a skeleton-free representation.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This paper presents important findings regarding the rhythmicity of overlapping target and distractor processing and how this affects behaviour. The methods are, in general, clearly laid out and defensible; however, the evidence supporting the central claims is incomplete due to potential biases in the analysis methods. Further control analyses would be beneficial and could greatly strengthen the conclusions that can be drawn from these data.
-
Reviewer #1 (Public review):
Summary:
Using a combination of EEG and behavioural measurements, the authors investigate the degree to which processing of spatially-overlapping targets (coherent motion) and distractors (affective images) are sampled rhythmically and how this affects behaviour. They found that both target processing (via measurement of amplitude modulations of SSVEP amplitude to target frequency) and distractor processing (via MVPA decoding accuracy of bandpassed EEG relative to distractor SSVEP frequency) displayed a pronounced rhythm at ~1Hz, time-locked to stimulus onset. Furthermore, the relative phase of this target/distractor sampling predicted the accuracy of coherent motion detection across participants.
Strengths:
(1) The authors are addressing a very interesting question with respect to sampling of targets and distractors, using neurophysiological measurements to their advantage in order to parse out target and distractor processing.
(2) The general EEG analysis pipeline is sensible and well-described.
(3) The main result of rhythmic sampling of targets and distractors is striking and very clear even on a participant level.
(4) The authors have gone to quite a lot of effort to ensure the validity of their analyses, especially in the Supplementary Material.
(5) It is incredibly striking how the phases of both target and distractor processing are so aligned across trials for a given participant. I would have thought that any endogenous fluctuation in attention or stimulus processing like that would not be so phase aligned. I know there is literature on phase resetting in this context, but the results seem very strong here and it is worth noting. The authors have performed many analyses to rule out signal processing artifacts, e.g., the sideband and beating frequency analyses.
Weaknesses:
(1) In general, the representation of target and distractor processing is a bit of a reach. Target processing is represented by SSVEP amplitude, which is most likely going to be related to the contrast of the dots, as opposed to representing coherent motion energy, which is the actual target. These may well be linked (e.g., greater attention to the coherent motion task might increase SSVEP amplitude), but I would call it a limitation of the interpretation. Decoding accuracy of emotional content makes sense as a measure of distractor processing, and the supplementary analysis comparing target SSVEP amplitude to distractor decoding accuracy is duly noted.
(2) Comparing SSVEP amplitude to emotional category decoding accuracy feels a bit like comparing apples with oranges. They have different units and scales and probably reflect different neural processes. Is the result the authors find not a little surprising in this context? This relationship does predict performance and is thus intriguing, but I think this methodological aspect needs to be discussed further. For example, is the phase relationship with behaviour a result of a complex interaction between different levels of processing (fundamental contrast vs higher order emotional processing)?
-
Reviewer #2 (Public review):
Summary:
In this study, Xiong et al. investigate whether rhythmic sampling - a process typically observed in the attended processing of visual stimuli - extends to task-irrelevant distractors. By using EEG with frequency tagging and multivariate pattern analysis (MVPA), they aimed to characterize the temporal dynamics of both target and distractor processing and examine whether these processes oscillate in time. The central hypothesis is that target and distractor processing occur rhythmically, and the phase relationship between these rhythms correlates with behavioral performance.
Major Strengths:
(1) The extension of rhythmic attentional sampling to include distractors is a novel and interesting question.
(2) The decoding of emotional distractor content using MVPA from SSVEP signals is an elegant solution to the problem of assessing distractor engagement in the absence of direct behavioral measures.
(3) The finding that relative phase (between 1 Hz target and distractor processes) predicts behavioral performance is compelling.
Major Weaknesses and Limitations:
(1) Incomplete Evidence for Rhythmicity at 1 Hz: The central claim of 1 Hz rhythmic sampling is insufficiently validated. The windowing procedure (0.5s windows with 0.25s step) inherently restricts frequency resolution, potentially biasing toward low-frequency components like 1 Hz. Testing different window durations or providing controls would significantly strengthen this claim.
(2) No-Distractor Control Condition: The study lacks a baseline or control condition without distractors. This makes it difficult to determine whether the distractor-related decoding signals or the 1 Hz effect reflect genuine distractor processing or more general task dynamics.
(3) Decoding Near Chance Levels: The pairwise decoding accuracies for distractor categories hover close to chance (~55%), raising concerns about robustness. While statistically above chance, the small effect sizes need careful interpretation, particularly when linked to behavior.
(4) No Clear Correlation Between SSVEP and Behavior: Neither target nor distractor signal strength (SSVEP amplitude) correlates with behavioral accuracy. The study instead relies heavily on relative phase, which - while interesting - may benefit from additional converging evidence.
(5) Phase-analysis: phase analysis is performed between different types of signals hindering their interpretability (time-resolved SSVEP amplitude and time-resolved decoding accuracy).
Appraisal of Aims and Conclusions:
The authors largely achieved their stated goal of assessing rhythmic sampling of distractors. However, the conclusions drawn - particularly regarding the presence of 1 Hz rhythmicity - rest on analytical choices that should be scrutinized further. While the observed phase-performance relationship is interesting and potentially impactful, the lack of stronger and convergent evidence on the frequency component itself reduces confidence in the broader conclusions.
Impact and Utility to the Field:
If validated, the findings will advance our understanding of attentional dynamics and competition in complex visual environments. Demonstrating that ignored distractors can be rhythmically sampled at similar frequencies to targets has implications for models of attention and cognitive control. However, the methodological limitations currently constrain the paper's impact.
Additional Context and Considerations:
(1) The use of EEG-fMRI is mentioned but not leveraged. If BOLD data were collected, even exploratory fMRI analyses (e.g., distractor modulation in visual cortex) could provide valuable converging evidence.
(2) In turn, removal of fMRI artifacts might introduce biases or alter the data. For instance, the authors might consider investigating potential fMRI artifact harmonics around 1 Hz to address concerns regarding induced spectral components.
-
Author response:
Reviewer 1:
(1) In general, the representation of target and distractor processing is a bit of a reach. Target processing is represented by SSVEP amplitude, which is most likely going to be related to the contrast of the dots, as opposed to representing coherent motion energy, which is the actual target. These may well be linked (e.g., greater attention to the coherent motion task might increase SSVEP amplitude), but I would call it a limitation of the interpretation. Decoding accuracy of emotional content makes sense as a measure of distractor processing, and the supplementary analysis comparing target SSVEP amplitude to distractor decoding accuracy is duly noted.
We agree with the reviewer. This is certainly a limitation and will be acknowledged as such in the revised manuscript.
(2) Comparing SSVEP amplitude to emotional category decoding accuracy feels a bit like comparing apples with oranges. They have different units and scales and probably reflect different neural processes. Is the result the authors find not a little surprising in this context? This relationship does predict performance and is thus intriguing, but I think this methodological aspect needs to be discussed further. For example, is the phase relationship with behaviour a result of a complex interaction between different levels of processing (fundamental contrast vs higher order emotional processing)?
Traditionally, the SSVEP amplitude at the distractor frequency is used to quantify distractor processing. Given that the target SSVEP amplitude is stronger than that of the distractor, the distractor SSVEP amplitude may be contaminated by the target SSVEP amplitude through spectral power leakage; see Figure S4 for a demonstration. Because of this issue, we introduce decoding accuracy as an index of distractor processing, which has not been done before in the SSVEP literature. The lack of correlation between the distractor SSVEP amplitude and the distractor decoding accuracy, although somewhat like comparing apples with oranges as the reviewer points out, shows that these two measures do not co-vary. Decoding accuracy is therefore free from the influence of the distractor SSVEP amplitude and, by extension, of the target SSVEP amplitude. This is an important point, and we will discuss it more thoroughly in the revised manuscript.
Reviewer 2:
(1) Incomplete Evidence for Rhythmicity at 1 Hz: The central claim of 1 Hz rhythmic sampling is insufficiently validated. The windowing procedure (0.5s windows with 0.25s step) inherently restricts frequency resolution, potentially biasing toward low-frequency components like 1 Hz. Testing different window durations or providing controls would significantly strengthen this claim.
This is an important point. We plan to follow the reviewer’s suggestion and repeat our analysis using different window sizes to test the robustness of the observed 1Hz rhythmicity. In addition, we plan to also apply the Hilbert transform to extract time-point-by-time-point amplitude envelopes, which will provide a window-free estimation of the distractor strength and further validate the presence of the low-frequency 1Hz dynamics.
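The reviewer's concern about the windowing procedure can be illustrated with a short calculation. It rests on a simplifying assumption: the windowed amplitude estimate is treated as roughly a 0.5 s moving average of the underlying modulation, which is not the authors' exact estimator. The function name is illustrative.

```python
import math

def boxcar_gain(freq_hz, window_s):
    """Amplitude gain of a moving average of length window_s for a
    sinusoidal modulation at freq_hz (magnitude of a sinc function)."""
    x = math.pi * freq_hz * window_s
    return 1.0 if x == 0 else abs(math.sin(x) / x)

window_s, step_s = 0.5, 0.25
sampling_hz = 1.0 / step_s       # one estimate every 0.25 s
nyquist_hz = sampling_hz / 2.0   # fastest modulation the time course can represent

print("Nyquist frequency of the windowed time course (Hz):", nyquist_hz)
for f in (0.5, 1.0, 1.5, 2.0):
    print(f, "Hz gain:", round(boxcar_gain(f, window_s), 3))
```

Under this approximation the windowing acts as a low-pass filter that attenuates faster modulations progressively and suppresses them entirely at 2 Hz, which is why repeating the analysis with other window lengths, or with a window-free envelope, is a useful control.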
(2) No-Distractor Control Condition: The study lacks a baseline or control condition without distractors. This makes it difficult to determine whether the distractor-related decoding signals or the 1 Hz effect reflect genuine distractor processing or more general task dynamics.
We agree with the reviewer. This is certainly a limitation and will be acknowledged as such in the revised manuscript.
(3) Decoding Near Chance Levels: The pairwise decoding accuracies for distractor categories hover close to chance (~55%), raising concerns about robustness. While statistically above chance, the small effect sizes need careful interpretation, particularly when linked to behavior.
This is a good point. In addition to acknowledging this in the revised manuscript, we will carry out two additional analyses to test this issue further. First, we will implement a random permutation procedure, in which the trial labels are randomly shuffled to build the null-hypothesis distribution for decoding accuracy, and compare the decoding accuracy from the actual data to this distribution. Second, we will perform a temporal generalization analysis to examine whether the neural representations of the distractor drift over the course of an entire trial, which is 11 seconds long. Recent studies suggest that even when a stimulus stays the same, its neural representation may drift over time.
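The label-shuffling procedure proposed above might look like the following minimal sketch. The nearest-centroid classifier, the one-dimensional features, the leave-one-out scheme, and all names are assumptions made for illustration; the response does not specify the classifier or cross-validation used.

```python
import random

def loo_accuracy(features, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier on 1-D features."""
    correct = 0
    for i in range(len(features)):
        sums, counts = {}, {}
        for j, (x, y) in enumerate(zip(features, labels)):
            if j == i:
                continue  # hold out trial i
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        centroids = {y: sums[y] / counts[y] for y in sums}
        predicted = min(centroids, key=lambda y: abs(features[i] - centroids[y]))
        correct += predicted == labels[i]
    return correct / len(features)

def permutation_p_value(features, labels, n_perm=200, seed=0):
    """Compare the observed accuracy with accuracies obtained after shuffling labels."""
    rng = random.Random(seed)
    observed = loo_accuracy(features, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if loo_accuracy(features, shuffled) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)

# Toy data: two clearly separated hypothetical distractor categories.
features = [0.1 * i for i in range(10)] + [2.0 + 0.1 * i for i in range(10)]
labels = ["pleasant"] * 10 + ["unpleasant"] * 10
observed, p_value = permutation_p_value(features, labels)
print("observed accuracy:", observed)
```

With real data near 55% accuracy, the null distribution built this way indicates how often such a value arises from label shuffling alone.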
(4) No Clear Correlation Between SSVEP and Behavior: Neither target nor distractor signal strength (SSVEP amplitude) correlates with behavioral accuracy. The study instead relies heavily on relative phase, which - while interesting - may benefit from additional converging evidence.
We felt that what the reviewer pointed out is actually the main point of our study: it is not the overall target or distractor strength that matters for behavior, but their temporal relationship. This reveals a novel neuroscience principle that has not been reported previously. We will stress this point further in the revised manuscript.
(5) Phase-analysis: phase analysis is performed between different types of signals hindering their interpretability (time-resolved SSVEP amplitude and time-resolved decoding accuracy).
The time-resolved SSVEP amplitude is used to index the temporal dynamics of target processing whereas the time-resolved decoding accuracy is used to index the temporal dynamics of distractor processing. As such, they can be compared, using relative phase for example, to examine how temporal relations between the two types of processes impact behavior. This said, we do recognize the reviewer’s concern that these two processes are indexed by two different types of signals. We plan to normalize each time course, make them dimensionless, and then compute the temporal relations between them.
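One way the planned normalization and phase comparison could work is sketched below. It assumes z-scoring as the normalization and a single Fourier coefficient at 1 Hz as the phase estimate; the function names, sampling rate, and synthetic time courses are illustrative and are not taken from the study.

```python
import cmath
import math

def zscore(x):
    """Remove the mean and divide by the standard deviation (dimensionless)."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    return [(v - mean) / sd for v in x]

def fourier_coefficient(x, freq_hz, fs_hz):
    """Complex Fourier coefficient of x at freq_hz for sampling rate fs_hz."""
    return sum(v * cmath.exp(-2j * math.pi * freq_hz * n / fs_hz)
               for n, v in enumerate(x))

def relative_phase(target, distractor, freq_hz, fs_hz):
    """Phase of target minus phase of distractor at freq_hz, wrapped to (-pi, pi]."""
    ct = fourier_coefficient(zscore(target), freq_hz, fs_hz)
    cd = fourier_coefficient(zscore(distractor), freq_hz, fs_hz)
    return cmath.phase(ct * cd.conjugate())

# Synthetic time courses on very different scales, both modulated at 1 Hz.
fs_hz, n_samples = 4.0, 40
t = [n / fs_hz for n in range(n_samples)]
target_amplitude = [5.0 + 2.0 * math.cos(2 * math.pi * s + 0.3) for s in t]
distractor_accuracy = [0.55 + 0.03 * math.cos(2 * math.pi * s + 1.5) for s in t]

rel = relative_phase(target_amplitude, distractor_accuracy, 1.0, fs_hz)
print(round(rel, 3))
```

Because phase does not depend on offset or scale, the normalization leaves the relative phase unchanged while putting both signals on a common, dimensionless footing.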
Appraisal of Aims and Conclusions:
The authors largely achieved their stated goal of assessing rhythmic sampling of distractors. However, the conclusions drawn - particularly regarding the presence of 1 Hz rhythmicity - rest on analytical choices that should be scrutinized further. While the observed phase-performance relationship is interesting and potentially impactful, the lack of stronger and convergent evidence on the frequency component itself reduces confidence in the broader conclusions.
Impact and Utility to the Field:
If validated, the findings will advance our understanding of attentional dynamics and competition in complex visual environments. Demonstrating that ignored distractors can be rhythmically sampled at similar frequencies to targets has implications for models of attention and cognitive control. However, the methodological limitations currently constrain the paper's impact.
Thanks for these comments and positive assessment of our work’s potential implications and impact. We will try our best in the revision process to address the concerns.
Additional Context and Considerations:
(1) The use of EEG-fMRI is mentioned but not leveraged. If BOLD data were collected, even exploratory fMRI analyses (e.g., distractor modulation in visual cortex) could provide valuable converging evidence.
Indeed, leveraging fMRI data in EEG studies would be very beneficial, as has been demonstrated in our previous work. However, given that this study concerns the temporal relationship between target and distractor processing, we feel that fMRI, given its well-known limitation in temporal resolution, has limited potential to contribute. We will be exploring this rich dataset in other ways where the two modalities are integrated to gain more insights not possible with either modality used alone.
(2) In turn, removal of fMRI artifacts might introduce biases or alter the data. For instance, the authors might consider investigating potential fMRI artifact harmonics around 1 Hz to address concerns regarding induced spectral components.
We have done extensive work in the area of simultaneous EEG-fMRI and have not encountered artifacts with a 1 Hz rhythmicity. Also, the fact that the temporal relations between target processing and distractor processing at 1 Hz predict behavior is another indication that the 1 Hz rhythmicity is a neuroscientific effect, not an artifact. However, we will look into this carefully and address it in the revision process.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study proposes an updated analysis technique that allows researchers to identify rhythms in behavior. If the proposed analyses control the rate of false positives, this will be an important contribution for all neuroscientists interested in rhythmic cognition. At present, the strength of evidence is incomplete, as the simulations ignore one crucial aspect of temporal structure in behavior.
-
Reviewer #1 (Public review):
Summary:
Some years ago, Brookshire proposed a method to identify oscillations in behavioural data that controls for effects of aperiodic trends. Such trends can produce false positive results if not controlled for. Although this method successfully controlled for this issue, it was also relatively insensitive to true effects, and it remained unclear whether it was unable to replicate published evidence for behavioural oscillations because they were false positives or the method could not detect them. In simulated data, Harris & Beale show that their revised version of the method proposed by Brookshire is more sensitive to effects and equally unsusceptible to false positives. When applied to available data, this new version indeed revealed evidence for behavioural oscillations. This paper is therefore an important piece in the puzzle of the ongoing debate on behavioural oscillations.
Strengths:
(1) The paper is well written and compact.
(2) The new method proposed is tested thoroughly, and its application in simulated data shows its properties.
(3) It is very important that the code is made publicly available.
(4) The fact that this new version identifies behavioural oscillations in available datasets can resolve the current debate on the existence of such oscillations.
Weaknesses:
I see the following weaknesses as minor.
(1) I wonder whether the frequency-dependent results (e.g., Figures 7 and 8) need to be seen in light of the sampling rate used in the simulations. For example, a lower sampling rate might be sufficient if only low frequencies are of interest in the data and lead to higher sensitivity as the number of trials (per time point) can be increased. Conversely, a higher sampling rate might lead to a higher sensitivity for the detection of effects at higher frequencies.
(2) The behavioural oscillations from individual participants do not need to have common phases for this analysis to reveal an effect. However, this also means that in a scenario where they do have common phases, this similarity remains "unused" by the analysis (e.g., due to similar phases, the oscillation could be easier to identify on the group level as signals that are not phase locked are averaged out). In such a scenario, it remains unclear whether the analysis proposed is the most sensitive one.
-
Reviewer #2 (Public review):
Summary
Dozens of published studies have investigated rhythms in behavior. These studies have typically tested for oscillations by shuffling the timestamps of the individual observations and comparing the resulting shuffled spectra with the empirical spectrum. However, that shuffling-in-time method leads to strongly inflated rates of false positives. Brookshire (2022) suggested a method that controls the rate of false positives (the "AR-surrogate method"). In the current study, Harris and Beale propose a modification of the AR-surrogate analysis method with the goal of increasing the sensitivity while maintaining a low rate of false positives.
This study is carefully conducted and it addresses an interesting question. However, the simulations were performed in a way that ignores one important source of temporal structure: non-oscillatory patterns that are consistent across subjects. In order to know whether the updated AR-surrogate method would control the rate of false positives in real behavioral data, we need to know whether it controls the rate of false positives when the data includes aperiodic patterns that are consistent across subjects.
Strengths
This study was constructed carefully and written up very clearly. It's a clever idea to analyze the time series separately for each participant. After examining how the updated AR-surrogate method behaves when the simulated data includes consistency across subjects, this will be a useful contribution to the field.
Weaknesses
When describing their simulations of behavioral data, the authors write: "Each participant's data was produced by creating an independent idealised time-course of 1-second length, sampled at 60 Hz."
Because these simulations generated a totally independent time-course for every subject, they don't capture an important source of aperiodic structure in real behavior: consistent non-oscillatory patterns that occur across subjects. In other words, these simulations do not account for any pattern that remains after averaging across subjects. The literature is rich with patterns that persist across subjects, including all the studies of behavioral oscillations that analyze their data after averaging across subjects (e.g., Landau & Fries, 2012; Fiebelkorn, Saalmann, & Kastner, 2013). As a consequence, I suspect that the reported increase in power comes at the expense of a corresponding increase in false positives, but that the false positives aren't captured here due to the lack of consistency across simulated subjects.
It's therefore possible that the authors' updated AR-surrogate method would mistakenly conclude that behavior oscillates when it only includes aperiodic consistency across subjects. Since that kind of aperiodic structure is ubiquitous, this analysis could lead to very high rates of false positives. Luckily, it's easy to find out whether this is the case - the authors could simulate data using an idealized time-course that is consistent across subjects.
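A minimal sketch of the suggested control simulation is given below. It keeps the 1-second, 60 Hz time-course quoted above, but the shape of the shared aperiodic trend (a saturating rise in accuracy), the number of subjects, and the number of trials per time point are hypothetical choices made purely for illustration.

```python
import numpy as np

def simulate_shared_trend(n_subjects=20, n_trials=50, fs=60, duration=1.0, seed=0):
    """Simulate per-subject accuracy time courses that share one aperiodic trend.

    The trend contains no oscillation; it is identical for every subject,
    so it survives averaging across subjects.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    # Assumed idealized time course: accuracy rises from 0.6 and saturates near 0.8.
    trend = 0.8 - 0.2 * np.exp(-t / 0.2)
    # Each subject's accuracy at each time point is a binomial sample around the trend.
    hits = rng.binomial(n_trials, trend, size=(n_subjects, t.size))
    return t, trend, hits / n_trials

t, trend, accuracy = simulate_shared_trend()
group_mean = accuracy.mean(axis=0)  # the shared trend remains visible here
```

Feeding `accuracy` row by row into the per-participant analysis would show whether a purely aperiodic but consistent pattern is flagged as an oscillation.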
-
Reviewer #3 (Public review):
Summary:
This work revises the autoregressive surrogate (AR-surrogate) method proposed in 2022 by Brookshire to estimate the oscillatory content of behavioural time series. The main issue raised by Brookshire was the inadequacy of methods used in a series of papers that rely on shuffling the time axis of the behavioural data. Brookshire argued that while this approach tests for temporal structures, it does not differentiate between 1/f activity and true oscillatory signals. The AR-surrogate, on the other hand, removes aperiodic activity and should therefore provide a more accurate representation of oscillatory behaviour.
In this well-written paper, Harris and Beale clearly describe an improvement to Brookshire's method, which has been called into question for its low sensitivity.
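The general idea of an AR-surrogate test can be sketched as below. This is a generic illustration and not Brookshire's or Harris & Beale's implementation: the AR(1) model order, the least-squares fit, the amplitude spectrum as test statistic, and the max-statistic correction across frequencies are all assumptions, and the sketch treats a single time series only.

```python
import numpy as np

def fit_ar1(x):
    """Least-squares AR(1) fit: x[t] = phi * x[t-1] + noise."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    phi = np.dot(x[1:], x[:-1]) / np.dot(x[:-1], x[:-1])
    resid = x[1:] - phi * x[:-1]
    return phi, resid.std()

def simulate_ar1(phi, sigma, n, rng, burn=100):
    """Generate an AR(1) series, discarding a burn-in period."""
    noise = rng.normal(scale=sigma, size=n + burn)
    x = np.zeros(n + burn)
    for i in range(1, n + burn):
        x[i] = phi * x[i - 1] + noise[i]
    return x[burn:]

def amplitude_spectrum(x):
    """FFT amplitude at all positive frequencies (DC dropped)."""
    x = np.asarray(x, dtype=float)
    return np.abs(np.fft.rfft(x - x.mean()))[1:]

def ar_surrogate_test(x, n_surr=1000, seed=0):
    """P-value per frequency against the maximum of each surrogate spectrum."""
    rng = np.random.default_rng(seed)
    phi, sigma = fit_ar1(x)
    empirical = amplitude_spectrum(x)
    surr_max = np.array([
        amplitude_spectrum(simulate_ar1(phi, sigma, len(x), rng)).max()
        for _ in range(n_surr)
    ])
    exceed = np.sum(surr_max[:, None] >= empirical[None, :], axis=0)
    return (1 + exceed) / (1 + n_surr)

# Toy series: strong 8 Hz oscillation plus white noise, 1 s sampled at 60 Hz.
fs, n = 60, 60
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 8 * t) + 0.2 * np.random.default_rng(2).normal(size=n)
freqs = np.fft.rfftfreq(n, d=1 / fs)[1:]
p_values = ar_surrogate_test(x)
```

Because the surrogates share the fitted autocorrelation of the data, an aperiodic 1/f-like trend raises the null spectrum as well, which is what distinguishes this approach from shuffling in time.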
Strengths:
The starting point of this work is that oscillatory patterns should be tested at the individual participant level rather than at the group level. This is critical because anyone working with behavioural data will know that averaging across participants generates distorted time series. Averaging also assumes phase consistency across participants, which may not always be valid.
Once freed from this limitation, the results presented here are exciting and convincingly demonstrate a significant improvement over the original implementation.
The authors have devised a series of tests that systematically assess the effects of participant and trial number, and effect size on the accuracy of AR-surrogate results. This is particularly useful, as it may guide researchers in designing appropriate behavioural experiments.
Weaknesses:
The method proposed here is undoubtedly an improvement on the original. However, its biggest limitation is the restriction on the frequencies that can be investigated. This is acknowledged by the authors, who rightly point out that there is still room for improvement. Another issue is that modulation depths below 10-15% may be difficult to detect.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This carefully conducted study aims to understand how the early visual experience of premature infants induces lasting deficits, including compromised motion processing. This important question is addressed in a ferret animal model, in which the developing visual system was exposed prematurely to patterned visual input by opening one or both eyes early, at a time when both retinal waves and light traveling through closed lids drive sensory responses. Convincing evidence is presented, suggesting that eye opening at this time impacts temporal frequency tuning and elevates spontaneous firing rates. These findings will have great relevance for neuroscientists studying visual system development, particularly in the context of premature birth.
-
Reviewer #1 (Public review):
The authors note that very premature infants experience the visual world early and, as a consequence, sustain lasting deficits including compromised motion processing. Here they investigate the effects of early eye opening in ferret, choosing a time point after birth when both retinal waves and light traveling through closed lids drive sensory responses. The laboratory has long experience in quantitative studies of visual response properties across development and this study reflects their expertise.
The investigators find little or no difference in mean orientation and direction selectivity, or in spatial frequency tuning, as a result of early eye opening but marked differences in temporal frequency tuning. These changes are especially interesting as they relate to deficits seen in prematurely delivered children. Temporal frequency bandwidths for responses evoked from early-opened contralateral eyes were broader than for controls; this is the case for animals in which either one or both eyes were opened prematurely. Further, when only one eye was opened early, responses to low temporal frequencies were relatively stronger.
The investigators also found changes in firing rate and signs of response to visual stimuli. Premature eye-opening increased spontaneous rates in all test configurations. When only one eye was opened early, firing rates recorded from the ipsilateral cortex were strongly suppressed, with more modest effects in other test cases.
As the authors' discussion notes, these observations are just a starting point for studies of the underlying mechanisms. The experiments are so difficult to perform and so carefully described that the results will be foundational for future studies of how premature birth influences cortical development.
-
Reviewer #2 (Public review):
In this paper, Griswold and Van Hooser investigate what happens if animals are exposed to patterned visual experience too early, before its natural onset. To this end, they make use of the benefits of the ferret as a well-established animal model for visual development. Ferrets naturally open their eyes around postnatal day 30; here, Griswold and Van Hooser opened either one or both eyes prematurely. Subsequent recordings in the mature primary visual cortex show that while some tuning properties like orientation and direction selectivity developed normally, the premature visual exposure triggered changes in temporal frequency tuning and overall firing rates. These changes were widespread, in that they occurred even for neurons responding to the eye that was not opened prematurely. These results demonstrate that the nature of the visual input well before eye opening can have profound consequences on the developing visual system.
The conclusions of this paper are well supported by the data, but some aspects of the data could be clarified, and the discussion could be extended.
(1) The assessment of the tuning properties is based on fits to the data. Presumably, neurons for which the fits were poor were excluded? It would be useful to know what the criteria were, how many neurons were excluded, and whether there was a significant difference between the groups in the numbers of neurons excluded (which could further point to differences between the groups).
(2) For the temporal frequency data, low- and high-frequency cut-offs are defined, but then only used for the computation of the bandwidth. Given that the responses to low temporal frequencies change profoundly with premature eye opening, it would be useful to directly compare the low- and high-frequency cut-offs between groups, in addition to the index that is currently used.
(3) In addition to the tuning functions and firing rates that have been analyzed so far, are there any differences in the temporal profiles of neural responses between the groups (sustained versus transient responses, rates of adaptation, latency)? If the temporal dynamics of the responses are altered significantly, that could be part of an explanation for the altered temporal tuning.
(4) It would be beneficial for the general interpretation of the results to extend the discussion. First, it would be useful to provide a more detailed discussion of what type of visual information might make it through the closed eyelids (the natural state), in contrast to the structured information available through open eyes. Second, it would be useful to highlight more clearly that these data were collected in peripheral V1 by discussing what might be expected in binocular, more central V1 regions. Third, it would be interesting to discuss the observed changes in firing rates in the context of the development of inhibitory neurons in V1 (which still undergo significant changes through the time period of premature visual experience chosen here).
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The study presents solid evidence that the location of synapses on dendritic branches, as well as synaptic plasticity of excitatory and inhibitory synapses, influences the ability of a neuron to discriminate combinations of sensory stimuli. The ideas in this work are very interesting, presenting an important direction in the computational neuroscience field about how to harness the computational power of "active dendrites" for solving learning tasks.
-
Reviewer #1 (Public review):
Summary:
This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." In the absence of inhibitory plasticity, the proposed mechanisms result in good performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Interestingly, adding inhibitory plasticity improves classification performance even when input features are randomly distributed.
Strengths:
The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.
Weaknesses:
The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation.
-
Reviewer #2 (Public review):
Summary:
The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure the stability of synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanistic insights into complex computations at the single-neuron level.
Strengths:
The paper introduces a novel learning rule for training a single multicompartmental neuron model to solve the nonlinear feature binding problem (NFBP). It has two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible; and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, in contrast to many previous studies that use simplified models.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
Summary:
This computational modeling study builds on multiple previous lines of experimental and theoretical research to investigate how a single neuron can solve a nonlinear pattern classification task. The authors construct a detailed biophysical and morphological model of a single striatal medium spiny neuron, and endow excitatory and inhibitory synapses with dynamic synaptic plasticity mechanisms that are sensitive to (1) the presence or absence of a dopamine reward signal, and (2) spatiotemporal coincidence of synaptic activity in single dendritic branches. The latter coincidence is detected by voltage-dependent NMDA-type glutamate receptors, which can generate a type of dendritic spike referred to as a "plateau potential." The proposed mechanisms result in moderate performance on a nonlinear classification task when specific input features are segregated and clustered onto individual branches, but reduced performance when input features are randomly distributed across branches. Given the high level of complexity of all components of the model, it is not clear which features of which components are most important for its performance. There is also room for improvement in the narrative structure of the manuscript and the organization of concepts and data.
Strengths:
The integrative aspect of this study is its major strength. It is challenging to relate low-level details such as electrical spine compartmentalization, extrasynaptic neurotransmitter concentrations, dendritic nonlinearities, spatial clustering of correlated inputs, and plasticity of excitatory and inhibitory synapses to high-level computations such as nonlinear feature classification. Due to high simulation costs, it is rare to see highly biophysical and morphological models used for learning studies that require repeated stimulus presentations over the course of a training procedure. The study aspires to prove the principle that experimentally-supported biological mechanisms can explain complex learning.
Weaknesses:
The high level of complexity of each component of the model makes it difficult to gain an intuition for which aspects of the model are essential for its performance, or responsible for its poor performance under certain conditions. Stripping down some of the biophysical detail and comparing it to a simpler model may help better understand each component in isolation. That said, the fundamental concepts behind nonlinear feature binding in neurons with compartmentalized dendrites have been explored in previous work, so it is not clear how this study represents a significant conceptual advance. Finally, the presentation of the model, the motivation and justification of each design choice, and the interpretation of each result could be restructured for clarity to be better received by a wider audience.
Thank you for the feedback! We agree that the complexity of our model can make it challenging to intuitively understand the underlying mechanisms. To address this, we have revised the manuscript to include additional simulations and clearer explanations of the mechanisms at play.
In the revised introduction, we now explicitly state our primary aim: to assess to what extent a biophysically detailed neuron model can support the theory proposed by Tran-Van-Minh et al. and explore whether such computations can be learned by a single neuron, specifically a projection neuron in the striatum. To achieve this, we focus on several key mechanisms:
(1) A local learning rule: We develop a learning rule driven by local calcium dynamics in the synapse and by reward signals from the neuromodulator dopamine. This plasticity rule is based on the known synaptic machinery for triggering LTP or LTD in the corticostriatal synapse onto dSPNs (Shen et al., 2008). Importantly, the rule does not rely on supervised learning paradigms, nor does it need separate training and testing phases.
(2) Robust dendritic nonlinearities: According to Tran-Van-Minh et al. (2015), sufficient supralinear integration is needed to ensure that, for example, two inputs (i.e. one feature combination in the NFBP, Figure 1A) on the same dendrite generate greater somatic depolarization than if those inputs were distributed across different dendrites. To accomplish this, we generate sufficiently robust dendritic plateau potentials using the approach in Trpevski et al. (2023).
(3) Metaplasticity: Although not discussed much in more theoretical work, our study demonstrates the necessity of metaplasticity for achieving stable and physiologically realistic synaptic weights. This mechanism ensures that synaptic strengths remain within biologically plausible ranges during training, regardless of initial synaptic weights.
We have also clarified our design choices and the rationale behind them, as well as restructured the interpretation of our results for greater accessibility. We hope these revisions make our approach and findings more transparent and easier to engage with for a broader audience.
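Why dendritic nonlinearities matter for the NFBP can be illustrated with a toy abstraction, sketched below under stated assumptions. It is not the biophysical model from the manuscript: the feature names other than "yellow" and "banana", the all-or-none branch threshold, the clustered wiring, and the weight grid used for the linear comparison are all hypothetical.

```python
import numpy as np
from itertools import product

FEATURES = ["yellow", "banana", "red", "strawberry"]
# Assumed task: respond (1) to two relevant combinations, stay silent (0) otherwise.
STIMULI = {
    ("yellow", "banana"): 1,
    ("red", "strawberry"): 1,
    ("yellow", "strawberry"): 0,
    ("red", "banana"): 0,
}

def encode(stimulus):
    """Binary vector marking which features are present."""
    return np.array([1.0 if f in stimulus else 0.0 for f in FEATURES])

def dendrite(x, weights, threshold=1.5):
    """All-or-none branch: a plateau only if the summed input crosses threshold."""
    return 1.0 if weights @ x >= threshold else 0.0

def neuron_with_dendrites(x):
    d1 = dendrite(x, np.array([1, 1, 0, 0]))  # yellow and banana clustered on branch 1
    d2 = dendrite(x, np.array([0, 0, 1, 1]))  # red and strawberry clustered on branch 2
    return int(d1 + d2 >= 1.0)                # soma fires if either branch plateaus

X = np.array([encode(s) for s in STIMULI])
y = np.array(list(STIMULI.values()))
dendritic_accuracy = np.mean([neuron_with_dendrites(x) for x in X] == y)

# Point neuron: brute-force search over linear weights and a firing threshold.
grid = np.arange(-2, 2.5, 0.5)
best_linear = 0.0
for w in product(grid, repeat=4):
    drive = X @ np.array(w)
    for theta in grid:
        acc = np.mean((drive >= theta).astype(int) == y)
        best_linear = max(best_linear, acc)
```

The linear search cannot exceed three of four stimuli correct on any grid: the two relevant pairs together require the sum of all four weights to reach at least twice the threshold, while the two irrelevant pairs require the same sum to stay below it.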
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
This study extends three previous lines of work:
(1) Prior computational/phenomenological work has shown that the presence of dendritic nonlinearities can enable single neurons to perform linearly non-separable tasks like XOR and feature binding (e.g. Tran-Van-Minh et al., Front. Cell. Neurosci., 2015).
Prior computational and phenomenological work, such as Tran-Van-Minh et al. (Front. Cell. Neurosci., 2015), directly inspired our study, as we now explicitly state in the introduction (page 4, lines 19-22). While Tran-Van-Minh et al. theoretically demonstrated that these principles could solve the NFBP, it remains untested to what extent this can be achieved quantitatively in biophysically detailed neuron models using biologically plausible learning rules, which is what we test here.
(2) This study and a previous biophysical modeling study (Trpevski et al., Front. Cell. Neurosci., 2023) rely heavily on the finding from Chalifoux & Carter, J. Neurosci., 2011 that blocking glutamate transporters with TBOA increases dendritic calcium signals. The proposed model thus depends on a specific biophysical mechanism for dendritic plateau potential generation, where spatiotemporally clustered inputs must be co-activated on a single branch, and the voltage compartmentalization of the branch and the voltage-dependence of NMDARs is not enough, but additionally glutamate spillover from neighboring synapses must activate extrasynaptic NMDARs. If this specific biophysical implementation of dendritic plateau potentials is essential to the findings in this study, the authors have not made that connection clear. If it is a simple threshold nonlinearity in dendrites that is important for the model, and not the specific underlying biophysical mechanisms, then the study does not appear to provide a conceptual advance over previous studies demonstrating nonlinear feature binding with simpler implementations of dendritic nonlinearities.
We appreciate the feedback on the hypothesized role of glutamate spillover in our model. The current manuscript and Trpevski et al. (2023) emphasize glutamate spillover as a plausible biophysical mechanism for producing sufficiently robust, supralinear plateau potentials, but we acknowledge that supralinear dendritic integration might not depend solely on this mechanism in other types of neurons. In Trpevski et al. (2023) we found, however, that when dendritic plateaus were too ‘graded’, as they are with the quite shallow Mg block reported in experiments, it was difficult to solve the NFBP. The conceptual advance of our study lies in demonstrating that sufficiently nonlinear dendritic integration is needed and that, in SPNs, it can be accounted for by assuming spillover. Regardless of its biophysical source (e.g. glutamate spillover, steeper NMDA Mg-block activation curves, or other voltage-dependent conductances that cause supralinear dendritic integration), such nonlinearity enables biophysically detailed neurons to solve the nonlinear feature binding problem. To address this point and clarify the generality of our conclusions, we have revised the relevant sections in the manuscript to state this explicitly.
(3) Prior work has utilized "sliding-threshold," BCM-like plasticity rules to achieve neuronal selectivity and stability in synaptic weights. Other work has shown coordinated excitatory and inhibitory plasticity. The current manuscript combines "metaplasticity" at excitatory synapses with suppression of inhibitory strength onto strongly activated branches. This resembles the lateral inhibition scheme proposed by Olshausen (Christopher J. Rozell, Don H. Johnson, Richard G. Baraniuk, Bruno A. Olshausen; Sparse Coding via Thresholding and Local Competition in Neural Circuits. Neural Comput 2008; 20 (10): 2526-2563. doi: https://doi.org/10.1162/neco.2008.03-07-486). However, the complexity of the biophysical model makes it difficult to evaluate the relative importance of the additional complexity of the learning scheme.
We initially tried solving the NFBP with only excitatory plasticity, which worked reasonably well, especially if we assume a small population of neurons collaborates under physiological conditions. However, we observed that plateau potentials from distally located inputs were less effective, and we now explain this limitation in the revised manuscript (page 14, lines 23-37).
To address this, we added inhibitory plasticity inspired by mechanisms discussed in Castillo et al. (2011), Ravasenga et al., and Chapman et al. (2022), as now explicitly stated in the text (page 32, lines 23-26). While our GABA plasticity rule is speculative, it demonstrates that distal GABAergic plasticity can enhance nonlinear computations. These results are particularly encouraging, as they show that implementing these mechanisms at the single-neuron level produces behavior consistent with network-level models such as BCM-like plasticity rules and those proposed by Rozell et al. We hope this will inspire further experimental work on inhibitory plasticity mechanisms.
P2, paragraph 2: Grammar: "multiple dendritic regions, preferentially responsive to different input values or features, are known to form with close dendritic proximity." The meaning is not clear. "Dendritic regions" do not "form with close dendritic proximity."
Rewritten (current page 2, line 35)
P5, paragraph 3: Grammar: I think you mean "strengthened synapses" not "synapses strengthened".
Rewritten (current page 14, line 36)
P8, paragraph 1: Grammar: "equally often" not "equally much".
Updated (current page 10, line 2)
P8, paragraph 2: "This is because of the learning rule that successively slides the LTP NMDA Ca-dependent plasticity kernel over training." It is not clear what is meant by "sliding," either here or in the Methods. Please clarify.
We have updated the text and removed the word “sliding” throughout the manuscript to clarify that the calcium dependence of the kernels is in fact updated.
P10, Figure 3C (left): After reading the accompanying text on P8, para 2, I am left not understanding what makes the difference between the two groups of synapses that both encode "yellow," on the same dendritic branch (d1) (so both see the same plateau potentials and dopamine) but one potentiates and one depresses. Please clarify.
Some "yellow" and "banana" synapses are initialized with weak conductances, limiting their ability to learn due to the relatively slow dynamics of the LTP kernel. These weak synapses fail to reach the calcium thresholds necessary for potentiation during a dopamine peak, yet they remain susceptible to depression under LTD conditions. Initially, the dynamics of the LTP kernel do not allow significant potentiation, even in the presence of appropriate signals such as plateau potentials and dopamine (page 10, lines 22–26). We have added a more detailed explanation of how the learning rule operates in the section “Characterization of the Synaptic Plasticity Rule” on page 9 and have clarified the specific reason why the weaker yellow synapses undergo LTD (page 11, lines 1–7).
As shown in Supplementary Figure 6, during subthreshold learning, the initial conductance is also low, which similarly hinders the synapses' ability to potentiate. However, with sufficient dopamine, the LTP kernel adapts by shifting closer to the observed calcium levels, allowing these synapses to eventually strengthen. This dynamic highlights how the model enables initially weak synapses to "catch up" under consistent activation and favorable dopaminergic conditions.
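The "catch-up" behaviour described above can be illustrated with a minimal sketch. This is not the model's actual learning rule: the Gaussian kernel shape, the function names (`ltp_kernel`, `train_with_dopamine_peaks`), all parameter values, and the simplification that calcium stays fixed across trials are assumptions made purely for illustration.

```python
import math


def ltp_kernel(ca, midpoint, width=0.5):
    """Toy bell-shaped LTP kernel (assumed shape): potentiation is
    largest when the calcium level is close to the kernel midpoint."""
    return math.exp(-((ca - midpoint) / width) ** 2)


def train_with_dopamine_peaks(ca_observed, midpoint, n_trials=50,
                              shift_rate=0.1, learning_rate=0.05):
    """Repeated dopamine-peak trials for one synapse (hypothetical rule).

    On each trial the weight grows in proportion to the kernel value, and
    the kernel midpoint moves a fraction `shift_rate` of the way toward
    the observed calcium level (the metaplastic step). Calcium is held
    fixed here; in a full model it would depend on the weight.
    """
    weight = 0.0
    gains = []
    for _ in range(n_trials):
        gain = ltp_kernel(ca_observed, midpoint)
        gains.append(gain)
        weight += learning_rate * gain                    # potentiation step
        midpoint += shift_rate * (ca_observed - midpoint)  # kernel shift
    return weight, midpoint, gains


# A weak synapse: low calcium (1.0, arbitrary units) far below the
# initial kernel midpoint (3.0, arbitrary units).
weight, midpoint, gains = train_with_dopamine_peaks(ca_observed=1.0, midpoint=3.0)
print(f"first-trial gain: {gains[0]:.2e}, last-trial gain: {gains[-1]:.3f}")
print(f"final midpoint: {midpoint:.3f}, final weight: {weight:.3f}")
```

In this toy setting the first trials produce essentially no potentiation because the calcium level sits far from the kernel, and potentiation grows only once the midpoint has drifted close to the observed calcium level.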
P9, paragraph 1: The phrase "the metaplasticity kernel" is introduced here without prior explanation or motivation for including this level of complexity in the model. Please set it up before you use it.
A sentence introducing metaplasticity has been added to the introduction (page 3, lines 36-42) as well as on page 9, where the kernel is introduced (page 9, lines 26-35).
P10, Figure 3D: "kernel midline" is not explained.
We have replotted Fig. 3 to make it easier to understand what is shown. An explanation of the kernel midpoint has also been added to the legend (current page 12, line 19).
P11, paragraph 1; P13, Fig. 4C: My interpretation of these data is that clustered connectivity with specific branches is essential for the performance of the model. Randomly distributing input features onto branches (allowing all 4 features to innervate single branches) results in poor performance. This is bad, right? The model can't learn unless a specific pre-wiring is assumed. There is not much interpretation provided at this stage of the manuscript, just a flat description of the result. Tell the reader what you think the implications of this are here.
Thanks for the suggestion - we have updated this section of the manuscript, adding an interpretation of the results that the model often fails to learn both relevant stimuli if all four features are clustered onto the same dendrite (page 13, lines 31-42).
In summary, when multiple feature combinations are encoded in the same dendrite with similar conductances, the ability to determine which combination to store depends on the dynamics of the other dendrite. Small variations in conductance, training order, or other stochastic factors can influence the outcome. This challenge, known as the symmetry-breaking problem, has been previously acknowledged in abstract neuron models (Legenstein and Maass, 2011). To address this, additional mechanisms such as branch plasticity—amplifying or attenuating the plateau potential as it propagates from the dendrite to the soma—can be employed (Legenstein and Maass, 2011).
P12, paragraph 2; P13, Figure 4E: This result seems suboptimal, that only synapses at a very specific distance from the soma can be used to effectively learn to solve a NFBP. It is not clear to what extent details of the biophysical and morphological model are contributing to this narrow distance-dependence, or whether it matches physiological data.
We have added Figure 5—figure supplement 1A to clarify why distal synapses may not optimally contribute to learning. This figure illustrates how inhibitory plasticity improves performance by reducing excessive LTD at distal dendrites, thereby enhancing stimulus discrimination. Relevant explanations have been integrated into Page 18, Lines 25-39 in the revised manuscript.
P14, paragraph 2: Now the authors are assuming that inhibitory synapses are highly tuned to stimulus features. The tuning of inhibitory cells in the hippocampus and cortex is controversial but seems generally weaker than excitatory cells, commensurate with their reduced number relative to excitatory cells. The model has accumulated a lot of assumptions at this point, many without strong experimental support, which again might make more sense when proposing a new theory, but this stitching together of complex mechanisms does not provide a strong intuition for whether the scheme is either biologically plausible or performant for a general class of problem.
We acknowledge that it is not currently known whether inhibitory synapses in the striatum are tuned to stimulus features. However, given that the striatum is a purely inhibitory structure, it is plausible that lateral inhibition from other projection neurons could be tuned to features, even if feedforward inhibition from interneurons is not. Therefore, we believe this assumption is reasonable in the context of our model. As noted earlier, the GABA plasticity rule in our study is speculative. However, we hope that our work will encourage further experimental investigations, as we demonstrate that if GABAergic inputs are sufficiently specific, they can significantly enhance computations (This is discussed on page 17, lines 8-15.).
P16, Figure 5E legend: The explanation of the meaning of T_max and T_min in the legend and text needs clarification.
The abbreviations T<sub>min</sub> and T<sub>max</sub> have been updated to CTL and CTH to better reflect their role in calcium threshold tracking. The Figure 5E legend and relevant text have been revised for clarity. Additionally, the Methods section has been reorganized for better readability.
P16, Figure 5B, C: When the reader reaches this paper, the conundrums presented in Figure 4 are resolved. The "winner-takes-all" inhibitory plasticity both increases the performance when all features are presented to a single branch and increases the range of somatodendritic distances where synapses can effectively be used for stimulus discrimination. The problem, then, is in the narrative. A lot more setup needs to be provided for the question related to whether or not dendritic nonlinearity and synaptic inhibition can be used to perform the NFBP. The authors may consider consolidating the results of Fig. 4 and 5 so that the comparison is made directly, rather than presenting them serially without much foreshadowing.
In order to facilitate readability, we have updated the following sections of the manuscript to clarify how inhibitory plasticity resolves challenges from Figure 4:
Figure 5B and Figure 5–figure supplement 1B: Two new panels illustrate the role of inhibitory plasticity in addressing symmetry problems.
Figure 5–figure supplement 1A: Shows how inhibitory plasticity extends the effective range of somatodendritic distances.
P18, Figure 6: This should be the most important figure, finally tying in all the previous complexity to show that NFBP can be partially solved with E and I plasticity even when features are distributed randomly across branches without clustering. However, now bringing in the comparison across spillover models is distracting and not necessary. Just show us the same plateau generation model used throughout the paper, with and without inhibition.
Figure updated. Accumulative spillover and no-spillover conditions have been removed.
P18, paragraph 2: "In Fig. 6C, we report that a subset of neurons (5 out of 31) successfully solved the NFBP." This study could be significantly strengthened if this phenomenon could (perhaps in parallel) be shown to occur in a simpler model with a simpler plateau generation mechanism. Furthermore, it could be significantly strengthened if the authors could show that, even if features are randomly distributed at initialization, a pruning mechanism could gradually transition the neuron into the state where fewer features are present on each branch, and the performance could approach the results presented in Figure 5 through dynamic connectivity.
Modeling structural plasticity is a good suggestion that should be investigated in later work; however, we feel that it goes beyond what we can do in the current manuscript. We now acknowledge that structural plasticity might play a role. For example, we show that if we assume ‘branch-specific’ spillover, which leads to sufficient development of local dendritic nonlinearities, the neuron can also learn with distributed inputs. In reality, structural plasticity is likely important here, as we now state (current page 22, lines 35-42).
P17, paragraph 2: "As shown in Fig. 6B, adding the hypothetical nonlinearities to the model increases the performance towards solving part of the NFBP, i.e. learning to respond to one relevant feature combination only. The performance increases with the amount of nonlinearity." This is not shown in Figure 6B.
Sentence removed. We have added a Figure 6 - figure supplement 1 to better explain the limitations.
P22, paragraph 1: The "w" parameter here is used to determine whether spatially localized synapses are co-active enough to generate a plateau potential. However, this is the same w learned through synaptic plasticity. Typically LTP and LTD are thought of as changing the number of postsynaptic AMPARs. Does this "w" also change the AMPAR weight in the model? Do the authors envision this as a presynaptic release probability quantity? If so, please state that and provide experimental justification. If not, please justify modifying the activation of postsynaptic NMDARs through plasticity.
This is an important remark. Our plasticity model differs from classical LTP models as it depends on the link between LTP and increased spillover as described by Henneberger et al., (2020).
We have updated the Methods section (page 27, lines 6-11). We acknowledge, however, that in a real cell, learning might first strengthen the AMPA component, while after learning the NMDA/AMPA ratio is unchanged (Watt et al., 2004). This rebalancing between NMDA and AMPA might be a slower process.
Reviewer #2 (Public Review):
Summary:
The study explores how single striatal projection neurons (SPNs) utilize dendritic nonlinearities to solve complex integration tasks. It introduces a calcium-based synaptic learning rule that incorporates local calcium dynamics and dopaminergic signals, along with metaplasticity to ensure stability for synaptic weights. Results show SPNs can solve the nonlinear feature binding problem and enhance computational efficiency through inhibitory plasticity in dendrites, emphasizing the significant computational potential of individual neurons. In summary, the study provides a more biologically plausible solution to single-neuron learning and gives further mechanical insights into complex computations at the single-neuron level.
Strengths:
The paper introduces a novel learning rule for training a single multicompartmental neuron model to perform nonlinear feature binding tasks (NFBP), highlighting two main strengths: the learning rule is local, calcium-based, and requires only sparse reward signals, making it highly biologically plausible, and it applies to detailed neuron models that effectively preserve dendritic nonlinearities, contrasting with many previous studies that use simplified models.
Weaknesses:
I am concerned that the manuscript was submitted too hastily, as evidenced by the quality and logic of the writing and the presentation of the figures. These issues may compromise the integrity of the work. I would recommend a substantial revision of the manuscript to improve the clarity of the writing, incorporate more experiments, and better define the goals of the study.
Thanks for the valuable feedback. We have now gone through the whole manuscript updating the text, and also improved figures and added some supplementary figures to better explain model mechanisms. In particular, we state more clearly our goal already in the introduction.
Major Points:
(1) Quality of Scientific Writing: The current draft does not meet the expected standards. Key issues include:
i. Mathematical and Implementation Details: The manuscript lacks comprehensive mathematical descriptions and implementation details for the plasticity models (LTP/LTD/Meta) and the SPN model. Given the complexity of the biophysically detailed multicompartment model and the associated learning rules, the inclusion of only nine abstract equations (Eq. 1-9) in the Methods section is insufficient. I was surprised to find no supplementary material providing these crucial details. What parameters were used for the SPN model? What are the mathematical specifics for the extra-synaptic NMDA receptors utilized in this study? For instance, Eq. 3 references [Ca2+]-does this refer to calcium ions influenced by extra-synaptic NMDARs, or does it apply to other standard NMDARs? I also suggest the authors provide pseudocodes for the entire learning process to further clarify the learning rules.
The model is quite detailed but builds on previous work. For this reason, for model components used in earlier published work (and where models are already available via model repositories, such as ModelDB), we refer the reader to these resources in order to improve readability and to highlight what is novel in this paper: the learning rule itself. The learning rule is now explained in detail. For modelers who want to run the model, we have also provided a GitHub link to the simulation code. We hope this is a reasonable compromise for all readers, i.e., those who only want to understand what is new here (the learning rule) and those who also want to test the model code. We explain this to the readers at the beginning of the Methods section.
ii. Figure quality. The authors seem not to carefully typeset the images, resulting in overcrowding and varying font sizes in the figures. Some of the fonts are too small and hard to read. The text in many of the diagrams is confusing. For example, in Panel A of Figure 3, two flattened images are combined, leading to small, distorted font sizes. In Panels C and D of Figure 7, the inconsistent use of terminology such as "kernels" further complicates the clarity of the presentation. I recommend that the authors thoroughly review all figures and accompanying text to ensure they meet the expected standards of clarity and quality.
Thanks for directing our attention to these oversights. We have gone through the entire manuscript, updating the figures where needed, and we are making sure that the text and the figure descriptions are clear and adequate and use consistent terminology for all quantities.
iii. Writing clarity. The manuscript often includes excessive and irrelevant details, particularly in the mathematical discussions. On page 24, within the "Metaplasticity" section, the authors introduce the biological background to support the proposed metaplasticity equation (Eq. 5). However, much of this biological detail is hypothesized rather than experimentally verified. For instance, the claim that "a pause in dopamine triggers a shift towards higher calcium concentrations while a peak in dopamine pushes the LTP kernel in the opposite direction" lacks cited experimental evidence. If evidence exists, it should be clearly referenced; otherwise, these assertions should be presented as theoretical hypotheses. Generally, Eq. 5 and related discussions should be described more concisely, with only a loose connection to dopamine effects until more experimental findings are available.
The “Metaplasticity” section (pages 30-32) has been updated to be more concise, and the abundant references to dopamine have been removed.
(2) Goals of the Study: The authors need to clearly define the primary objective of their research. Is it to showcase the computational advantages of the local learning rule, or to elucidate biological functions?
We have explicitly stated our goal in the introduction (page 4, lines 19-22). Please also see the response to reviewer 1.
i. Computational Advantage: If the intent is to demonstrate computational advantages, the current experimental results appear inadequate. The learning rule introduced in this work can only solve for four features, whereas previous research (e.g., Bicknell and Hausser, 2021) has shown capability with over 100 features. It is crucial for the authors to extend their demonstrations to prove that their learning rule can handle more than just three features. Furthermore, the requirement to fine-tune the midpoint of the synapse function indicates that the rule modifies the "activation function" of the synapses, as opposed to merely adjusting synaptic weights. In machine learning, modifying weights directly is typically more efficient than altering activation functions during learning tasks. This might account for why the current learning rule is restricted to a limited number of tasks. The authors should critically evaluate whether the proposed local learning rule, including meta-plasticity, actually offers any computational advantage. This evaluation is essential to understand the practical implications and effectiveness of the proposed learning rule.
Thank you for your feedback. To address the concern regarding feature complexity, we extended our simulations to include learning with 9 and 25 features, achieving accuracies of 80% and 75%, respectively (Figure 6—figure supplement 1A). While our results demonstrate effective performance, the absence of external stabilizers—such as error-modulated functions used in prior studies like Bicknell and Hausser (2021)—means that the model's performance can be more sensitive to occasional incorrect outcomes. For instance, while accuracy might reach 90%, a few errors can significantly affect overall performance due to the lack of mechanisms to stabilize learning.
In order to clarify the setup of the rule, we have added pseudocode in the revised manuscript (Pages 31-32) detailing how the learning rule and metaplasticity update synaptic weights based on calcium and dopamine signals. Additionally, we have included pseudocode for the inhibitory learning rule on Pages 34-35. In future work, we also aim to incorporate biologically plausible mechanisms, such as dopamine desensitization, to enhance stability.
ii. Biological Significance: If the goal is to interpret biological functions, the authors should dig deeper into the model behaviors to uncover their biological significance. This exploration should aim to link the observed computational features of the model more directly with biological mechanisms and outcomes.
As now clearly stated in the introduction, the goal of the study is to see whether and to what quantitative extent the theoretical solution of the NFBP proposed in Tran-Van-Minh et al. (2015) can be achieved with biophysically detailed neuron models and with a biologically inspired learning rule. The problem has so far been solved with abstract and phenomenological neuron models (Schiess et al., 2014; Legenstein and Maass, 2011) and also with a detailed neuron model but with a precalculated voltage-dependent learning rule (Bicknell and Häusser, 2021).
We have also tried to better explain the model mechanisms by adding supplementary figures.
Reviewer #2 (Recommendations For The Authors):
Minor:
(1) The [Ca]NMDA in Figure 2A and 2C can have large values even when very few synapses are activated. Why is that? Is this setting biologically realistic?
The elevated [Ca²⁺]NMDA with minimal synaptic activation arises from high spine input resistance, small spine volume, and NMDA receptor conductance, which scales calcium influx with synaptic strength. Physiological studies report spine calcium transients typically up to ~1 μM (Franks and Sejnowski 2002, DOI: 10.1002/bies.10193), while our model shows ~7 μM for 0.625 nS and ~3 μM for 0.5 nS, exceeding this range. The calcium levels of the model might therefore be somewhat high compared to biologically measured levels. However, this does not impact the learning rule, as the functional dynamics of the rule remain robust across calcium variations.
(2) In the distributed synapses session, the study introduces two new mechanisms "Threshold spillover" and "Accumulative spillover". Both mechanisms are not basic concepts but quantitative descriptions of them are missing.
Thank you for your feedback. Based on the recommendations from Reviewer 1, we have simplified the paper by removing the "Accumulative spillover" and focusing solely on the "Thresholded spillover" mechanism. In the updated version of the paper, we refer to it only as glutamate spillover. However, we acknowledge (page 22, lines 40-42) that to create sufficient non-linearities, other mechanisms, like structural plasticity, might also be involved (although testing this in the model will have to be postponed to future work).
(3) The learning rule achieves moderate performance when feature-relevant synapses are organized in pre-designed clusters, but for more general distributed synaptic inputs, the model fails to faithfully solve the simple task (with its performance of ~ 75%). Performance results indicate the learning rule proposed, despite its delicate design, is still inefficient when the spatial distribution of synapses grows complex, which is often the case on biological neurons. Moreover, this inefficiency is not carefully analyzed in this paper (e.g. why the performance drops significantly and the possible computation mechanism underlying it).
The drop in performance when using distributed inputs (to a mean performance of 80%) is similar to the mean performance in the same situation in Bicknell and Hausser (2021), see their Fig. 3C. The drop arises because i) the relevant feature combinations are often not colocalized on the same dendrite, so they cannot be strengthened together, and ii) even when they are, there may not be enough synapses to trigger the supralinear response from the branch spillover mechanism, i.e. the inputs are not summed supralinearly (Fig. 6B, most input configurations only reach 75%).
Because of this, at most one relevant feature combination can usually be learned. In the several cases where the random distribution of synapses is favorable for both relevant feature combinations to be learned, the NFBP is solved (Fig. 6B, some performance lines reach 100%; Fig. 6C, an example of such a case). We have extended the relevant sections of the paper to highlight the above-mentioned mechanisms.
Further, the theoretical results in Tran-Van-Minh et al. (2015) already show that solving the NFBP with supralinear dendrites requires features to be pre-clustered in order to evoke the supralinear dendritic response, which would activate the soma. The same number of synapses distributed across the dendrites i) would not excite the soma as strongly, and ii) would summate in the soma as in a point neuron, i.e. no supralinear events, which are necessary to solve the NFBP, can be activated. Hence, one does not expect distributed synaptic inputs to solve the NFBP with any kind of learning rule.
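The argument about pre-clustering can be made concrete with a toy comparison, offered only as a sketch. The step-function branch nonlinearity, unit synaptic weights, the wiring in `CLUSTERED`, and all function names are hypothetical choices for illustration and are not taken from the biophysical model.

```python
from itertools import product

# Hypothetical wiring: each dendrite receives one relevant feature pair.
CLUSTERED = {"d1": {"red", "strawberry"}, "d2": {"yellow", "banana"}}
RELEVANT = {frozenset({"red", "strawberry"}), frozenset({"yellow", "banana"})}


def dendrite_output(n_active, threshold=2):
    """Toy supralinear branch: a plateau (output 1) occurs only when
    enough co-located inputs are active at the same time."""
    return 1 if n_active >= threshold else 0


def clustered_neuron(stimulus):
    """Soma fires if any branch generates a plateau."""
    return int(any(dendrite_output(len(stimulus & features))
                   for features in CLUSTERED.values()))


def point_neuron(stimulus, threshold=2):
    """All inputs sum linearly at the soma (unit weights assumed)."""
    return int(len(stimulus) >= threshold)


# The four color/shape combinations of the feature binding task.
stimuli = [frozenset(pair)
           for pair in product(("red", "yellow"), ("strawberry", "banana"))]

for s in stimuli:
    label = "relevant" if s in RELEVANT else "irrelevant"
    print(sorted(s), label,
          "clustered:", clustered_neuron(s), "point:", point_neuron(s))
```

With unit weights the point neuron receives the same summed drive for all four stimuli and so cannot separate them, whereas the clustered neuron responds only to the two relevant combinations. More generally, no single linear threshold unit can solve this task whatever its weights: summing the two "respond" inequalities and the two "do not respond" inequalities yields a contradiction, as in XOR.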
(4) Figure 5B demonstrates that on average adding inhibitory synapses can enhance the learning capabilities to solve the NFBP for different pattern configurations (2, 3, or 4 features), but since the performance for excitatory-only setup varies greatly between different configurations (Figure 4B, using 2 or 3 features can solve while 4 cannot), can the results be more precise about whether adding inhibitory synapses can help improve the learning with 4 features?
In response to the question, we added a panel to Figure 5B showing that without inhibitory synapses, 5 out of 13 configurations with four features successfully learn, while with inhibitory synapses, this improves to 7 out of 13. Figure 5—figure supplement 1B provides an explanation for this improvement (page 18, lines 10-24).
(5) Also, in terms of the possible role of inhibitory plasticity in learning, as only on-site inhibition is studied here, can other types of inhibition be considered, like on-path or off-path? Do they have similar or different effects?
This is an interesting suggestion for future work. We observed relevant dynamics in Figure 6A, where inhibitory synapses increased their weights on-site when randomly distributed. Previous work by Gidon and Segev (2012) examined the effects of different inhibitory types on NMDA clusters, highlighting the role of on-site and off-path inhibition in shunting. In our context, on-site inhibition in the same branch appears more relevant for maintaining compartmentalized dendritic processing.
(6) Figure 6A is mentioned in the context of excitatory-only setup, but it depicts the setup when both excitatory and inhibitory synapses are included, which is discussed later in the paper. A correction should be made to ensure consistency.
We have updated the figure and the text to make it clearer that simulations are run both with and without inhibition in this context (page 21, lines 4-13).
(7) In the "Ca and kernel dynamics" plots (Fig 3,5), some of the kernel midlines (solid line) are overlapped by dots, e.g. the yellow line in Fig 3D, and some kernel midlines look like dots, which leads to confusion. Suggest to separate plots of Ca and kernel dynamics for clarity.
The design of the figures has been updated to improve the visibility of the calcium and kernel dynamics during training.
(8) The formulations of the learning rule are not well-organized, and the naming of parameters is kind of confusing, e.g. T_min, T_max, which by default represent time, means "Ca concentration threshold" here.
The abbreviations of the thresholds (T<sub>min</sub>, T<sub>max</sub> in the initial version) have been updated to CTL and CTH, respectively, to better reflect their role in tracking calcium levels. The mathematical formulations have further been reorganized for better readability. The revised Methods section now follows a more structured flow, first explaining the learning mechanisms, followed by the equations and their dependencies.
-
eLife Assessment
This important study by Jeong and Choi examined neural activity in the medial prefrontal cortex (mPFC) while rats performed a foraging paradigm in which they forage for rewards in the absence or presence of a threatening object (Lobsterbot). The authors present interesting observations suggesting that the mPFC population activity switches between distinct functional modes conveying distinct task variables (such as the distance to the reward location and types of threat-avoidance behaviors) depending on the location of the animal. Although the specific information represented by individual neurons remains to be clarified through further investigation, the reviewers thought that this study is solid, appreciated the value of studying neural coding in naturalistic settings, and felt that this work offers significant insights into how the mPFC operates during foraging behavior involving reward-threat conflict.
-
Reviewer #1 (Public review):
Summary:
In this study, Jeong and Choi examine neural correlates of behavior during a naturalistic foraging task in which rats must dynamically balance resource acquisition (foraging) with the risk of threat. Rats first learn to forage for sucrose reward from a spout, and when a threat is introduced (an attack-like movement from a "LobsterBot"), they adjust their behavior to continue foraging while balancing exposure to the threat, adopting anticipatory withdrawal behaviors to avoid encounters with the LobsterBot. Using electrode recordings targeting the medial prefrontal cortex (PFC), they identify heterogeneous encoding of task variables across prelimbic and infralimbic cortex neurons, including correlates of distance to the reward/threat zone and correlates of both anticipatory and reactionary avoidance behavior. Based on analysis of population responses, they show that prefrontal cortex switches between different regimes of population activity to process spatial information or behavioral responses to threat in a context-dependent manner. Characterization of the heterogeneous coding scheme by which frontal cortex represents information in different goal states is an important contribution to our understanding of brain mechanisms underlying flexible behavior in ecological settings.
Strengths:
As many behavioral neuroscience studies employ highly controlled task designs, relatively less is generally known about how the brain organizes navigation and behavioral selection in naturalistic settings, where environment states and goals are more fluid. Here, the authors take advantage of a natural challenge faced by many animals - how to forage for resources in an unpredictable environment - to investigate neural correlates of behavior when goal states are dynamic. Related to this, they also investigate how prefrontal cortex (PFC) activity is structured to support different functional "modes" (here, between a navigational mode and a threat-sensitive foraging mode) for flexible behavior. Overall, an important strength and real value of this study is the design of the behavioral experiment, which is trial-structured, permitting strong statistical methods for neural data analysis, yet still rich enough to encourage natural behavior structured by the animal's volitional goals. The experiment is also phased to measure behavioral changes as animals first encounter a threat, and then learn to adapt their foraging strategy to its presence. Characterization of this adaptation process is itself quite interesting and sets a foundation for further study of threat learning and risk management in the foraging context. Finally, the characterization of single-neuron and population dynamics in PFC in this naturalistic setting with fluid goal states is an important contribution to the field. Previous studies have identified neural correlates of spatial and behavioral variables in frontal cortex, but how these representations are structured, or how they are dynamically adjusted when animals shift their goals, has been less clear. The authors synthesize their main conclusions into a conceptual model for how PFC activity can support mode switching, which can be tested in future studies with other task designs and functional manipulations.
Weaknesses:
While the task design in this study is intentionally stimulus-rich and places minimal constraint on the animal to preserve naturalistic behavior, this also introduces confounds that limit interpretability of the neural analysis. For example, some variables which are the target of neural correlation analysis, such as spatial/proximity coding and coding of threat and threat-related behaviors, are naturally entwined. To their credit, the authors have included careful analyses and control conditions to disambiguate these variables and significantly improve clarity.
The authors also claim that the heterogeneous coding of spatial and behavioral variables in PFC is structured in a particular way that depends on the animal's goals or context. As the authors themselves discuss, the different "zones" contain distinct behaviors and stimuli, and since some neurons are modulated by these events (e.g., licking sucrose water, withdrawing from the LobsterBot, etc.), differences in population activity may to some extent reflect behavior/event coding. The authors have included a control analysis, removing timepoints corresponding to salient events, to substantiate the claim that PFC neurons switch between different coding "modes." While this significantly strengthens evidence for their conclusion, this analysis still depends on relatively coarse labeling of only very salient events. Future experiment designs, which intentionally separate task contexts (e.g. navigation vs. foraging), could serve to further clarify the structure of coding across contexts and/or goal states.
Finally, while the study includes many careful, in-depth neural and behavioral analyses to support the notion that modal coding of task variables in PFC may play a role in organizing flexible, dynamic behavior, the study still lacks functional manipulations to establish any form of causality. This limitation is acknowledged in the text, and the report is careful not to overinterpret suggestions of causal contribution, instead setting a foundation for future investigations.
-
Reviewer #2 (Public review):
Summary:
Jeong & Choi (2023) use a semi-naturalistic paradigm to tackle the question of how the activity of neurons in the mPFC might continuously encode different functions. They offer two possibilities: either there are separate dedicated populations encoding each function, or cells alter their activity dependent on the current goal of the animal. In a threat-avoidance task, rats procured sucrose in an area of a chamber where, after remaining there for some amount of time, a 'Lobsterbot' robot attacked. In order to initiate the next trial, rats had to move through the arena to another area before returning to the robot encounter zone. Therefore the task has two key components: threat avoidance and navigating through space. Recordings in the IL and PL of the mPFC revealed encoding that depended on what stage of the task the animal was currently engaged in. When animals were navigating, neuronal ensembles in these regions encoded distance from the threat. However, whilst animals were directly engaged with the threat and simultaneously consuming reward, it was possible to decode from a subset of the population whether animals would evade the threat. Therefore the authors claim that neurons in the mPFC switched between two functional modes: representing allocentric spatial information, and representing egocentric information pertaining to the reward and threat. Finally, the authors propose a conceptual model based on these data whereby this switching of population encoding is driven by either bottom-up sensory information or top-down arbitration.
Strengths:
Whilst these multiple functions of activity in the mPFC have generally been observed in tasks dedicated to the study of a singular function, less work has been done in contexts where animals continuously switch between different modes of behaviour in a more natural way. Being able to assess whether previous findings of mPFC function apply in natural contexts is very valuable to the field, even outside of those interested in the mPFC directly. This also speaks to the novelty of the work; although mixed selectivity encoding of threat assessment and action selection has been demonstrated in some contexts (e.g. Grunfeld & Likhtik, 2018) understanding the way in which encoding changes on-the-fly in a self-paced task is valuable both for verifying whether current understanding holds true and for extending our models of functional coding in the mPFC.
The authors are also generally thoughtful in their analyses and use a variety of approaches to probe the information encoded in the recorded activity. In particular, they use relatively close analysis of behaviour as well as manipulating the task itself by removing the threat to verify their own results. The use of such a rich task also allows them to draw comparisons, e.g. in different zones of the arena or different types of responses to threat, that a more reduced task would not otherwise allow. Additional in-depth analyses in the updated version of the manuscript, particularly the feature importance analysis, as well as complementary null findings (a lack of cohesive place cell encoding, and no difference in location coding dependent on direction of trajectory) further support the authors' conclusion that populations of cells in the mPFC are switching their functional coding based on task context rather than behaviour per se. Finally, the authors' updated model schematic proposes an intriguing and testable implementation of how this encoding switch may be manifested by looking at differentiable inputs to these populations.
Weaknesses:
The main existing weakness of this study is that its findings are correlational (as the authors highlight in the discussion). Future work might aim to verify and expand the authors' findings - for example, whether the elevated response of Type 2 neurons directly contributes to the decision-making process or just represents fear/anxiety motivation/threat level - through direct physiological manipulation. However, I appreciate the challenges of interpreting data even in the presence of such manipulations and some of the additional analyses of behaviour, for example the stability of animals' inter-lick intervals in the E-zone, go some way towards ruling out alternative behavioural explanations. Yet the most ideal version of this analysis is to use a pose estimation method such as DeepLabCut to more fully measure behavioural changes. This, in combination with direct physiological manipulation, would allow the authors to fully validate that the switching of encoding by this population of neurons in the mPFC has the functional attributes as claimed here.
-
Reviewer #3 (Public review):
Summary:
This study investigates how various behavioral features are represented in the medial prefrontal cortex (mPFC) of rats engaged in a naturalistic foraging task. The authors recorded electrophysiological responses of individual neurons as animals transitioned between navigation, reward consumption, avoidance, and escape behaviors. Employing a range of computational and statistical methods, including artificial neural networks, dimensionality reduction, hierarchical clustering, and Bayesian classifiers, the authors sought to predict from neural activity distinct task variables (such as distance from the reward zone and the success or failure of avoidance behavior). The findings suggest that mPFC neurons alternate between at least two distinct functional modes, namely spatial encoding and threat evaluation, contingent on the specific location.
Strengths:
This study attempts to address an important question: understanding the role of mPFC across multiple dynamic behaviors. The authors highlight the diverse roles attributed to mPFC in previous literature and seek to explain this apparent heterogeneity. They designed an ethologically relevant foraging task that facilitated the examination of complex dynamic behavior, collecting comprehensive behavioral and neural data. The analyses conducted are both sound and rigorous.
Weaknesses:
Because the study still lacks experimental manipulation, the findings remain correlational. The authors have appropriately tempered their claims regarding the functional role of the mPFC in the task. The nature of the switch between functional modes encoding distinct task variables (i.e., distance to reward, and threat-avoidance behavior type) is not established. Moreover, the evidence presented to dissociate movement from these task variables is not fully convincing, particularly without single-session video analysis of movement. Specifically, while the new analyses in Figure 7 are informative, they may not fully account for all potential confounding variables arising from changes in context or behavior.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer 1 (Public Review):
Thank you for the helpful comments. Below, we have quoted the relevant sections from the revised manuscript as we respond to the reviewer’s comments item-by-item.
Weaknesses:
While the task design in this study is intentionally stimulus-rich and places a minimal constraint on the animal to preserve naturalistic behavior, this is, unfortunately, a double-edged sword, as it also introduces additional variables that confound some of the neural analysis. Because of this, a general weakness of the study is a lack of clear interpretability of the task variable neural correlates. This is a limitation of the task, which includes many naturally correlated variables - however, I think with some additional analyses, the authors could strengthen some of their core arguments and significantly improve clarity.
We acknowledge the weakness and have included additional analyses to compensate for it. The details are as follows in our reply to the subsequent comments.
For example, the authors argue, based on an ANN decoding analysis (Figure 2b), that PFC neurons encode spatial information - but the spatial coordinate that they decode (the distance to the active foraging zone) is itself confounded by the fact that animals exhibit different behavior in different sections of the arena. From the way the data are presented, it is difficult to tell whether the decoder performance reflects a true neural correlate of distance, or whether it is driven by behavior-associated activity that is evoked by different behaviors in different parts of the arena. The author's claim that PFC neurons encode spatial information could be substantiated with a more careful analysis of single-neuron responses to supplement the decoder analysis. For example, 1) They could show examples of single neurons that are active at some constant distance away from the foraging site, regardless of animal behavior, and 2) They could quantify how many neurons are significantly spatially modulated, controlling for correlates of behavior events. One possible approach to disambiguate this confound could be to use regression-based models of neuron spiking to quantify variance in neuron activity that is explained by spatial features, behavioral features, or both.
First of all, we would like to point out that while the recording was made during naturalistic foraging with minimal behavioral constraints, a well-trained rat displayed an almost fixed sequence of actions within each zone. The behavioral repertoires performed in the three zones were very different from each other: exploratory behaviors in the N-zone, navigating back and forth in the F-zone, and licking sucrose while avoiding attacks in the E-zone. Therefore, the entire arena is divided not only by geographical features but also by the distinct set of behaviors performed in each zone. This is evident in the data showing a higher decoding accuracy of spatial distance in the F-zone than in the N- or E-zone. In this sense, the heterogeneous encoding reflects the heterogeneous distribution of dominant behaviors (navigation in the F-zone and attack avoidance while foraging in the E-zone) and hence corroborates the reviewer’s comment at a macroscopic scale encompassing the entire arena.
Having said that, the more critical question is whether the neural activity is more correlated with microscopic behaviors at every moment rather than the location decoded in the F-zone. As the reviewer suggested, the first step is to analyze single-neuron activity to identify whether direct neural correlates of location exist. To this end, traditional place maps were constructed for individual neurons. Most neurons did not show cohesive place fields across different regions, indicating little-to-no direct place coding by individual neurons. Only a few neurons displayed recognizable place fields in a consistent manner. However, even these place fields were irregular and patchy, and therefore not comparable to those of the place cells or grid cells found in the hippocampus or entorhinal cortex. Some example firing maps have been added to Figure 2 and characterized in the text as below.
“To determine whether location-specific neural activity exists at the single-cell level in our mPFC data, a traditional place map was constructed for individual neurons. Although most neurons did not show cohesive place fields across different regions in the arena, a few neurons modulated their firing rates based on the rat’s current location. However, even these neurons were not comparable to place cells in the hippocampus (O’Keefe & Dostrovsky, 1971) or grid cells in the entorhinal cortex (Hafting et al., 2005) as the place fields were patchy and irregular in some cases (Figure 2B; Units 66 and 125) or too large, spanning the entire zone rather than a discrete location within it (Units 26 and 56). The latter type of neuron has been identified in other studies (e.g., Kaefer et al., 2020).”
Next, to verify whether the location decoding reflects neuronal activity due to external features or a particular type of action, the predicted location was compared between the opposite directions within the F-zone, inbound and outbound in reference to the goal area (Lobsterbot). If the encoding were specifically tied to a particular action or environmental stimulus, there should be a discrepancy when the ANN decoder trained with the outbound trajectory is tested for predictions on the inbound path, and vice versa. However, the results showed no significant difference between the two trajectories, suggesting that the decoded distance was not simply reflecting neural responses to location-specific activities or environmental cues during navigation.
“To determine whether the accuracy of the regressor varied depending on the direction of movement, we compared the decoding accuracy of the regressor for outbound (from the N- to E-zone) vs. inbound (from the E- to N- zone) navigation within the F-zone. There was no significant difference in decoding accuracy between outbound vs. inbound trips (paired t-test; t(39) = 1.52, p =.136), indicating that the stability of spatial encoding was maintained regardless of the moving direction or perceived context (Figure 2E).”
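As a rough illustration of the comparison quoted above, the following sketch computes a paired t statistic from per-session decoding errors. It is not the authors' code: the function name `paired_t` and the input lists are hypothetical, and only the t statistic and degrees of freedom are computed (a p-value would need a t-distribution routine from a statistics package). With 40 paired sessions the degrees of freedom would be 39, which is consistent with the reported t(39).

```python
import math
import statistics

def paired_t(errors_a, errors_b):
    """Return (t statistic, degrees of freedom) for paired samples.

    errors_a and errors_b are hypothetical per-session decoding errors
    (for example outbound and inbound MAE), one value per session.
    """
    if len(errors_a) != len(errors_b) or len(errors_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    # The paired test works on within-session differences.
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation).
    # Note: this is zero, and the division fails, if all differences are equal.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff / std_err, len(diffs) - 1
```

A t value close to zero relative to its degrees of freedom, as in the quoted result, indicates that the direction of travel made little difference to decoding error.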
Additionally, we applied the same regression analysis to a subset of data that were recorded while the door to the robot compartment was closed during the Lobsterbot sessions. This way, it is possible to test the decoding accuracy when the most salient spatial feature, the Lobsterbot, is blocked out of sight. The subset represents an average of 38.92% of the entire session. Interestingly, the decoding accuracy with the subset of data was higher than that with the entire dataset, indicating that the neural activities were not driven by a single salient landmark. This finding supports our conclusion that the location information can be decoded from a population of neurons rather than from individual neurons that are associated with environmental or proprioceptive cues. We have added the following description of results in the manuscript.
“Previous analyses indicated that the distance regressor performed robustly regardless of movement direction, but there is a possibility that the decoder detects visual cues or behaviors specific to the E-zone. For example, neural activity related to Lobsterbot confrontation or licking behavior might be used by the regressor to decode distance. To rule out this possibility, we analyzed a subset of data collected when the compartment door was closed, preventing visual access to the Lobsterbot and sucrose port and limiting active foraging behavior. The regressor trained on this subset still decoded distance with a MAE of 12.14 (± 3.046) cm (paired t-test; t(39) = 12.17, p <.001). Notably, the regressor's performance was significantly higher with this subset than with the full dataset (paired t-test; t(39) = 9.895, p <.001).”
As for the comment on “using regression-based models of neuron spiking to quantify variance in neuron activity that is explained by spatial features, behavioral features, or both”, it is difficult to isolate a particular behavioral event, let alone timestamp it, since the rat’s location was being monitored in a constantly moving, naturalistic stream of behaviors. However, as mentioned above, a new section entitled “Overlapping populations of mPFC neurons adaptively encode spatial information and defensive decision” argues against a single-neuron-based account by performing the feature importance analysis. The results showed that even when the top 20% of the most informative neurons were excluded, the remaining neural population could still decode both distance and events. This analysis supports the idea of a population-wide mode shift rather than distinct subgroups of neurons specialized in processing different sensory or motor events. This idea is also expressed in the schematic diagrams featured in Figure 8 of the revision.
To substantiate the claim that PFC neurons really switch between different coding "modes," the authors could include a version of this analysis where they have regressed out, or otherwise controlled for, these confounds. Otherwise, the claim that the authors have identified "distinctively different states of ensemble activity," as opposed to simple coding of salient task features, seems premature.
A key argument in our study is that the mPFC neurons encode different abstract internal representations (distance and avoidance decision) at the level of population. This has been emphasized in the revision with additional analyses and discussions. Most of all, we performed single neuron-based analysis for both spatial encoding (place fields for individual neurons) and avoidance decision (PETHs for head entry and head withdrawal) and contrasted the results with the population analysis. Although some individual neurons displayed a fractured “place cell-like” activity, and some others showed modulated firing at the head-entry and the head-withdrawal events, the ensemble decoding extracted distance information for the current location of the animal at a much higher accuracy. Furthermore, the PCA analysis identified abstract feature dimensions especially regarding the activity in the E-zone that cannot be attributable to a small number of sensory- or motor-related neurons.
To mitigate the possibility that the PCA is driven primarily by a small subset of units responsive to salient behavioral events, we also applied PCA to the dataset excluding the activity in the 2-second time window surrounding the head entry and withdrawal. While this approach does not eliminate all cue- or behavior-related activity within the E-zone, it does remove the neural activity associated with emotionally significant events, such as entry into the E-zone, the first drop of sucrose, head withdrawal, and the attack. Even without these events, the PC identified in the E-zone was still separated from those in the F-zone and N-zone. This result again argues in support of distinct states of ensemble activity formed in accordance with different categories of behaviors performed in different zones. Finally, the Naïve Bayesian classifier trained with ensemble activity in the E-zone was able to predict the success and failure of avoidance that occur a few seconds later, indicating that the same population of neurons are encoding the avoidance decision rather than the location of the animal.
Reviewer 1 (Recommendations):
The authors include an analysis (Figure 4) of population responses using PCA on session-wide data, which they use to support the claim that PFC neurons encode distinctive neural states, particularly between the encounter zone and nesting/foraging zones. However, because the encounter zone contains unique stimulus and task events (sucrose, threat, etc.), and the samples for PCA are drawn from the entire dataset (including during these events), it seems likely that the Euclidean distance measures analyzed in Figure 4b are driven mostly by the neural correlates of these events rather than some more general change in "state" of PFC dynamics. This does not invalidate this analysis but renders it potentially redundant with the single neuron results shown in Figure 5 - and I think the interpretation of this as supporting a state transition in the coding scheme is somewhat misleading. The authors may consider performing a PCA/population vector analysis on the subset of timepoints that do not contain unique behavior events, rather than on session-wide data, or otherwise equalizing samples that correspond to behavioral events in different zones. Observing a difference in PC-projected population vectors drawn from samples that are not contaminated by unique encounter-related events would substantiate the idea that there is a general shift in neural activity that is more related to the change in context or goal state, and less directly to the distinguishing events themselves.
Thank you for the comments. Indeed, this is a recurring theme where the reviewers expressed concerns and doubts about heterogenous encoding of different functional modes. Besides the systematic presentation of the results in the manuscript, from PETH to ANN and to Bayesian classifier, we argue, however, that the activity of the mPFC neurons is better represented by the population rather than loose collection of stimulus- or event-related neurons.
The PCA results that we included as evidence of distinct functional separation might reflect activities driven by a small number of event-coding neurons in different zones. As mentioned in the public review, we conducted the same analysis on a subset of data that excluded neural activity potentially influenced by significant events in the E-zone. The critical times were defined as ± 1 second from these events and excluded from the neural data. Despite these exclusions, the results continued to show population-level differences between zones, reinforcing the notion that neurons encode abstract behavioral states (decision to avoid or stay) without the sensory- or motor-related activity. Although this analysis does not completely eliminate all possible confounding factors emerging in different external and internal contexts, it provides extra support for the population-level switch occurring in different zones.
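The exclusion step described here can be pictured with a small sketch. It is a minimal illustration under stated assumptions, not the authors' pipeline: `mask_event_windows`, `sample_times`, and `event_times` are hypothetical names, times are in seconds, and the ± 1 second half-window is taken from the response above. The flags it returns would be used to drop samples before running PCA on the remaining data.

```python
def mask_event_windows(sample_times, event_times, half_window=1.0):
    """Return one keep-flag per sample.

    A sample is flagged False when it lies within half_window seconds
    (inclusive) of any event, such as a head entry or head withdrawal.
    """
    keep = []
    for t in sample_times:
        near_event = any(abs(t - e) <= half_window for e in event_times)
        keep.append(not near_event)
    return keep


def apply_mask(rows, keep):
    """Keep only the rows (for example binned population vectors) flagged True."""
    return [row for row, flag in zip(rows, keep) if flag]
```

For long recordings a sorted-event search would be faster than this nested scan, but the simple form makes the exclusion rule explicit.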
In Figure 7, the authors include a schematic that suggests that the number of neurons representing spatial information increases in the foraging zone, and that they overlap substantially with neurons representing behaviors in the encounter zone, such as withdrawal. They show in Figure 3 that location decoding is better in the foraging zone, but I could not find any explicit analysis of single-neuron correlates of spatial information as suggested in the schematic. Is there a formal analysis that lends support to this idea? It would be simple, and informative, to include a quantification of the fraction of spatial- and behavior-modulated neurons in each zone to see if changes in location coding are really driven by "larger" population representations. Also, the authors could quantify the overlap between spatial- and behavior-modulated neurons in the encounter zone to explicitly test whether neurons "switch" their coding scheme.
Figure 7 (now Figure 8) has been completely revised. The schematic diagram is modified to show spatial and avoidance decision encoding by the overlapping population of mPFC neurons (Figure 8a). Most notably, there are very few neurons that encode location but not the avoidance decision or vice versa. This is indicated by the differently colored units in F-zone vs. E-zone. The model also includes units that are “not” engaged in any type of encoding or are engaged in only one type of encoding, although they are not the majority.
We have also added a schematic for hypothetical switching mechanisms (Figure 8b) to describe the conceptual scheme for the initiation of encoding-mode switching (sensory-driven vs. arbitrator-driven process).
“Two main hypotheses could explain this switch. A bottom-up hypothesis suggests sensory inputs or upstream signals dictate encoding priorities, while a top-down hypothesis proposes that an internal or external “arbitrator” selects the encoding mode and coordinates the relevant information (Figure 8B). Although the current study is only a first step toward finding the regulatory mechanism behind this switch, our control experiment, where rats reverted to a simple shuttling task, provides evidence that might favor the top-down hypothesis. The absence of the Lobsterbot degraded spatial encoding rather than enhancing it, indicating that simply reducing the task demand is not sufficient to activate one particular type of encoding mode over another. The arbitrator hypothesis asserts that the mPFC neurons are called on to encode heterogenous information when the task demand is high and requires behavioral coordination beyond automatic, stimulus-driven execution. Future studies incorporating multiple simultaneous tasks and carefully controlling contextual variables could help determine whether these functional shifts are governed by top-down processes involving specific neural arbitrators or by bottom-up signals.”
Related to this difference in location coding throughout the environment, the authors suggest in Figure 3a-b that location coding is better in the foraging zone compared to the nest or encounter zones, evidenced by better decoder performance (smaller error) in the foraging zone (Figure 3b). The authors use the same proportion of data from the three zones for setting up training/test sets for cross-validation, but it seems likely that overall, there are substantially more samples from the foraging zone compared to the other two zones, as the animal traverses this section frequently, and whenever it moves from the nest into the encounter zone (based on the video). What does the actual heatmap of animal location look like? And, if the data are down-sampled such that each section contributes the same proportion of samples to decoder training, does the error landscape still show better performance in the foraging zone? It is important to disambiguate the effects of uneven sampling from true biological differences in neural activity.
Thank you for the comment. We agree with the concern regarding uneven data size from different sections of the arena. Indeed, as the heatmap below indicates, the rats spent most of their time in two critical locations, one being a transition area between the N- and F-zones and the other near the sucrose port. This imbalance needs to be corrected. In fact, we have included a method to correct this biased sampling. In the result section “Non-navigational behavior reduces the accuracy of decoded location”, we report the following results.
Author response image 1.
Heatmap of the animal’s position during one example session. (Left) Unprocessed occupancy plot. Each dot represents 0.2 seconds. (Right) Smoothed occupancy plot using a Gaussian filter (sigma: 10 pixels, filter size: 1001 pixels). The white line indicates a 10 cm length.
“To correct for the unequal distribution of location visits (more visits to the F- than to other zones), the regressor was trained using a subset of the original data, which was equalized for the data size per distance range (see Materials and Methods). Despite the correction, there was a significant main effect of the zone (F(1.16, 45.43) = 119.2, p <.001) and the post hoc results showed that the MAEs in the N-zone (19.52 ± 4.46 cm; t(39) = 10.45; p <.001) and the E-zone (26.13 ± 7.57 cm; t(39) = 11.40; p <.001) were significantly higher than in the F-zone (14.10 ± 1.64 cm).”
Also in the method section, we have stated that:
“In the dataset adjusted for uneven location visits, we divided distance values into five equally sized bins. Then, a sub-dataset was created that contains an equal number of data points for each of these bins.”
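The equalization described in the quoted method can be sketched as follows. This is an illustration only: the function name `balance_by_distance` is hypothetical, "equally sized" is read here as equal-width bins over the observed distance range, and the per-bin count is assumed to be set by the sparsest non-empty bin. The manuscript may define these details differently.

```python
import random

def balance_by_distance(distances, n_bins=5, seed=0):
    """Return sorted sample indices with an equal count drawn from each bin.

    distances holds one distance value per sample (hypothetical input).
    """
    lo, hi = min(distances), max(distances)
    if hi == lo:
        # All samples share one distance, so there is nothing to balance.
        return list(range(len(distances)))
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for i, d in enumerate(distances):
        # Clamp so that the maximum distance falls into the last bin.
        b = min(int((d - lo) / width), n_bins - 1)
        bins[b].append(i)
    # Every non-empty bin contributes as many samples as the sparsest one.
    per_bin = min(len(b) for b in bins if b)
    rng = random.Random(seed)
    keep = []
    for b in bins:
        if b:
            keep.extend(rng.sample(b, per_bin))
    return sorted(keep)
```

Training the regressor only on the returned indices removes the occupancy bias, at the cost of discarding data from heavily visited locations.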
Why do the authors choose to use a multi-layer neural network (Figure 2b-c) to decode the animal's distance to the encounter zone?(…) The authors may consider also showing an analysis using simple regression, or maybe something like an SVM, in addition to the ANN approach.
We began with a simple linear regression model and progressed to more advanced methods, including SVM and multi-layer neural networks. As shown below, simpler methods could decode distance to some extent, but neural networks and random forest regressors outperformed others (Neural Network: 16.61 cm ± 3.673; Linear Regression: 19.85 cm ± 2.528; Quadratic Regression: 18.68 cm ± 4.674; SVM: 18.88 cm ± 2.676; Random Forest: 13.59 cm ± 3.174).
We chose the neural network model for two main reasons: (1) previous studies demonstrated its superior performance compared to Bayesian regressors commonly used for decoding neural ensembles, and (2) its generalizability and robustness against noisy data. Although the random forest regressor achieved the lowest decoding error, we avoided using it due to its tendency to overfit and its limited generalization to unseen data.
Overall, we expect similar results with other regressors but with different statistical power for decoding accuracy. We also speculate that the neural network’s use of multiple nodes contributes to robustness against noise from single-unit recordings and enables the network to capture distributed processing within neural ensembles.
In Figure 6c, the authors show a prediction of withdrawal behavior based on neural activity seconds before the behavior occurs. This is potentially very interesting, as it suggests that something about the state of neural dynamics in PFC is potentially related to the propensity to withdraw, or to the preparation of this behavior. However, another possibility is that the animal behaves differently, in more subtle ways, while it is anticipating threat and preparing withdrawal behavior - since PFC neurons are correlated with behavior, this could explain decoder performance before the withdrawal behavior occurs. To rule out this possibility, it would be useful to analyze how well, and how early, withdrawal success can be decoded only on the basis of behavioral features from the video, and then to compare this with the time course of the neural decoder. Another approach might be to decode the behavior on the basis of video data as well as neural data, and using a model comparison, measure whether inclusion of neural features significantly increases decoder performance.
We appreciate this important point, as mPFC activity might indeed reflect motor preparation preceding withdrawal behavior. Another reviewer raised a similar concern regarding potential micro-behavioral influences on mPFC activity prior to withdrawal responses. However, our behavioral analysis suggests that highly trained rats engage in sucrose licking, which has little variability regardless of the subsequent behavioral decision. In support of this, 95% of inter-lick intervals were less than 0.25 seconds, which is not enough time to perform any additional behavior during encounters.
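The inter-lick interval statistic cited above amounts to a simple proportion. The sketch below shows one way to compute it, assuming lick timestamps in seconds sorted in ascending order. The function name and input are hypothetical, and the 0.25-second threshold is the one quoted in the response.

```python
def fraction_short_intervals(lick_times, threshold=0.25):
    """Return the fraction of inter-lick intervals shorter than threshold.

    lick_times is a hypothetical list of lick timestamps in seconds,
    sorted in ascending order.
    """
    if len(lick_times) < 2:
        raise ValueError("need at least two licks to form an interval")
    # Differences between consecutive lick timestamps.
    intervals = [b - a for a, b in zip(lick_times, lick_times[1:])]
    short = sum(1 for iv in intervals if iv < threshold)
    return short / len(intervals)
```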
Author response image 2.
To further clarify this, we included additional video showing both avoidance and escape withdrawals at close range. This video was recorded during the development of the behavioral paradigm, though we did not routinely collect this view, as animals consistently exhibited stable licking behavior in the E-zone. As demonstrated in the video, the rat remains highly focused on the lick port with minimal body movement during encounters. Therefore, we believe that the neural ensemble dynamics observed in the mPFC are unlikely to be driven by micro-behavioral changes.
Reviewer 2 (Public Review):
Thank you for the positive comment on our behavior paradigm and constructive suggestions on additional analysis. We came to think that the role of mPFC could be better portrayed as representing and switching between different encoding targets under different contexts, which in part, was more clearly manifested by the naturalistic behavioral paradigm. In the revision we tried to convey this message more explicitly and provide a new perspective for this important aspect of mPFC function.
It is not clear what proportion of each of the ensembles recorded is necessary for decoding distance from the threat, and whether it is these same neurons that directly 'switch' to responding to head entry or withdrawal in the encounter phase within the total population. The PCA gets closest to answering this question by demonstrating that activity during the encounter is different from activity in the nesting or foraging zones, but in principle this could be achieved by neurons or ensembles that did not encode spatial parameters. The population analyses are focused on neurons sensitive to behaviours relating to the threat encounter, but even before dividing into subtypes etc., this is at most half of the recorded population.
In our study, the key idea we aim to convey is that mPFC neurons adapt their encoding schemes based on the context or functional needs of the ongoing task. Other reviewers also suggested strengthening the evidence that the same neurons directly switch between encoding two different tasks. The counteracting hypothesis to "switching functions within the same neurons" posits that there are dedicated subsets of neurons that modulate behavior—either by driving decisions/behaviors themselves or being driven by computations from other brain regions.
To test this idea, we included an additional analysis chapter in the results section titled Overlapping populations of mPFC neurons adaptively encode spatial information and defensive decision. In this section, we directly tested this hypothesis by examining each neuron's contribution to the distance regressor and the event classifier. The results showed that the histogram of feature importance—the contribution to each task—is highly skewed towards zero for both decoders, and removing neurons with high feature importance does not impair the decoder’s performance. These findings suggest that 1) there is no direct division among neurons involved in the two tasks, and 2) information about spatial/defensive behavior is distributed across neurons.
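The ablation logic in this paragraph can be sketched briefly. The code assumes that a per-neuron importance score is already available from some feature importance method, which this excerpt does not specify. `drop_top_fraction` is a hypothetical helper, and the default fraction of 0.2 follows the "top 20%" figure mentioned in the response to Reviewer 1.

```python
def drop_top_fraction(importances, fraction=0.2):
    """Return indices of neurons kept after removing the most important ones.

    importances holds one hypothetical importance score per neuron.
    The number removed is the product of count and fraction, rounded down.
    """
    n_drop = int(len(importances) * fraction)
    # Rank neuron indices from most to least important.
    ranked = sorted(range(len(importances)),
                    key=lambda i: importances[i], reverse=True)
    dropped = set(ranked[:n_drop])
    return [i for i in range(len(importances)) if i not in dropped]
```

Retraining the decoder on the kept neurons and comparing its error with the full-population decoder would then test whether information is concentrated in a few units or distributed across the ensemble.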
Furthermore, we tested whether there is a negative correlation between the feature importance of spatial encoding and avoidance encoding. Even if there were no “key neurons” that transmit a significant amount of information about either spatial or defensive behavior, it is still possible that neurons with higher information in the navigation context might carry less information in the active-foraging context, or vice versa. However, we did not observe such a trend, suggesting that mPFC neurons do not exhibit a preference for encoding one type of information over the other.
Lastly, another reviewer raised the concern that the PCA results, which we used as evidence of functional separation of different ensemble functions, might be driven by a small number of event-coding neurons. To address this, we conducted the same analysis on a subset of data that excluded neural activity potentially influenced by significant events in the E-zone. In the Peri-Event Time Histogram (PETH) analysis, we observed that some neurons exhibit highly-modulated activity upon arrival at the E-zone (head entry; HE) and immediately following voluntary departure or attack (head withdrawal; HW). We defined 'critical event times' as ± one second from these events and excluded neural data from these periods to determine if PCA could still differentiate neural activities across zones. Despite these exclusions, the results continued to show populational differences between zones, reinforcing the notion that neurons adapt their activity according to the context. We acknowledge that this analysis still cannot eliminate all of the confounding factors due to the context change, but we confirmed that excluding two significant events (delivery onset of sucrose and withdrawal movement) does not alter our result.
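The exclusion of 'critical event times' described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name, the 0.5 s bin size, the example event times, and the choice to treat a bin exactly one second from an event as inside the window are all assumptions.

```python
from bisect import bisect_left


def outside_event_windows(bin_times, event_times, half_width=1.0):
    """Return one boolean per time bin.

    True means the bin lies more than `half_width` seconds from every
    event and is kept; False means it falls in a critical event window.
    """
    events = sorted(event_times)
    keep = []
    for t in bin_times:
        i = bisect_left(events, t)
        # The nearest event is either just before or just after t.
        neighbours = events[max(i - 1, 0):i + 1]
        keep.append(all(abs(t - e) > half_width for e in neighbours))
    return keep


# Hypothetical session: 0.5 s bins, a head entry at 2.0 s and a head
# withdrawal at 5.0 s (illustrative values, not data from the study).
bin_times = [0.5 * k for k in range(16)]  # 0.0, 0.5, ..., 7.5
mask = outside_event_windows(bin_times, [2.0, 5.0])
kept = [t for t, k in zip(bin_times, mask) if k]
print(kept)
```

Only the bins flagged True would then be passed to the PCA, so that any remaining separation between zones cannot be attributed to activity in the excluded windows.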
To summarize, these additional results further support the conclusion that spatial and avoidance information is distributed across the neural population rather than being handled by distinct subsets. The analyses revealed no negative correlation between spatial and avoidance encoding, and excluding event-driven neural activity did not alter the observed functional separation, confirming that mPFC neurons dynamically adjust their activity to meet contextual demands.
A second concern is also illustrated by Fig. 7: in the data presented, separate reward and threat encoding neurons were not shown - in the current study design, it is not possible to dissociate reward and threat responses as the data without the threat present were only used to study spatial encoding integrity.
Thank you for this valuable feedback. Other reviewers have also noted that Figure 7 (now Figure 8) is misleading and contains assertions not supported by our experiments. In response, we have revised the model to more accurately reflect our findings. We have eliminated the distinction between reward-coding and threat-coding neurons, simplifying the model to focus on spatial-encoding and avoidance-encoding neurons. The updated figure aligns more closely with our findings and claims.
(A) Distinct functional states (spatial vs. avoidance decision) encoded by the same population of neurons are separable by region (F-zone vs. E-zone). (B) Hypothetical control models by which mPFC neurons assume different functional states.
Thirdly, the findings of this work are not mechanistic or functional but are purely correlational. For example, it is claimed that analyzing activity around the withdrawal period allows for ascertaining their functional contributions to decisions. But without a direct manipulation of this activity, it is difficult to make such a claim. The authors later discuss whether the elevated response of Type 2 neurons might simply represent fear or anxiety motivation or threat level, or whether they directly contribute to the decision-making process. As is implicit in the discussion, the current study cannot differentiate between these possibilities. However, the language used throughout does not reflect this.
We acknowledge that our experiments are purely correlational and that this is a weakness. Although we chose our wording carefully to avoid deterministic claims, we agree that some of the language might mislead readers into thinking that we found a direct functional contribution. We therefore changed the expressions as shown below.
“We then further analyzed the (functional contribution ->)correlation between neural activity and success and failure of avoidance behavior. If the mPFC neurons (encode ->)participate in the avoidance decisions, avoidance withdrawal (AW; withdrawal before the attack) and escape withdrawal (EW; withdrawal after the attack) may be distinguishable from decoded population activity even prior to motor execution.”
We also added the passage below to the discussion section to clarify the limitations of the study.
“Despite this interesting conjecture, any analysis based on recording data is only correlational, mandating further studies with direct manipulation of the subpopulation to confirm its functional specificity.”
Fourthly, the authors mention the representation of different functions in 'distinct spatiotemporal regions' but the bulk of the analyses, particularly in terms of response to the threat, do not compare recordings from PL and IL although - as the authors mention in the introduction - there is prior evidence of functional separation between these regions.
Thank you for bringing this part to our attention. As we mentioned in the introduction, we acknowledge the functional differences between the PL and IL regions. Although differences in spatial encoding between these two areas were not deeply explored, we anticipated finding differences in event encoding, given the distinct roles of the PL and IL in fear and threat processing. However, our initial analysis revealed no significant differences in event encoding between the regions, and as a result, we did not emphasize these differences in the manuscript. To address this point, we have reanalyzed the data separately and included the following findings in the manuscript.
“However, we did not observe a difference in decoding accuracy between the PL and IL ensembles, and there were no significant interactions between regressor type (shuffled vs. original) and regions (mixed-effects model; regions: p=.996; interaction: p=.782). These results indicate that the population activity in both the PL and IL contains spatial information (Figure 2D, Video 3).
[…]
Furthermore, we analyzed whether there is a difference in prediction accuracy between sessions with different recorded regions, the PL and the IL. A repeated two-way ANOVA revealed no significant difference between recorded regions, nor any interaction (regions: F(1, 38) = 0.1828, p = 0.671; interaction: F(1, 38) = 0.1614, p = 0.690).
[…]
We also examined whether there is a significant difference between the PL and IL in the proportion of Type 1 and Type 2 neurons. In the PL, among 379 recorded units, 143 units (37.73%) were labeled as Type 1, and 75 units (19.79%) were labeled as Type 2. In contrast, in the IL, 156 units (61.66%) and 19 units (7.51%) of 253 recorded units were labeled as Type 1 and Type 2, respectively. A Chi-square analysis revealed that the PL contains a significantly higher proportion of Type 2 neurons (χ²(1, 632) = 18.07, p < .001), while the IL contains a significantly higher proportion of Type 1 neurons compared to the other region (χ²(1, 632) = 34.85, p < .001).”
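The proportion comparison quoted above can be recomputed from the reported unit counts alone. The sketch below assumes a Pearson chi-square test on a 2×2 table (region × whether a unit belongs to the given type) without continuity correction; the response does not state which variant was used, and the function name is illustrative.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Unit counts as reported in the quoted passage.
pl_total, il_total = 379, 253
counts = {"Type 1": (143, 156), "Type 2": (75, 19)}  # (PL, IL)

for label, (pl, il) in counts.items():
    # Rows: PL, IL. Columns: units of this type, all other units.
    stat = chi_square_2x2(pl, pl_total - pl, il, il_total - il)
    print(
        f"{label}: PL {pl / pl_total:.2%}, IL {il / il_total:.2%}, "
        f"chi2(1, N={pl_total + il_total}) = {stat:.2f}"
    )
```

Each type is tested against all remaining units in the same region, which is why both tests have one degree of freedom and the same total N of 632.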
To summarize our additional results, we did not observe performance differences in distance decoding or event decoding. The only difference we observed was the proportional variation of Type 1 and Type 2 neurons when we separated the analysis by brain region. These results are somewhat counterintuitive, considering the distinct roles of the two regions—particularly the PL in fear expression and the IL in extinction learning. However, since the studies mentioned in the introduction primarily used lesion and infusion methods, this discrepancy may be due to the different approach taken in this study. Considering this, we have added the following section to the discussion.
“Interestingly, we found no difference between the PL and IL in the decoding accuracy of distance or avoidance decision. This is somewhat surprising considering the distinct roles of these regions in the long line of fear conditioning and extinction studies, where the PL has been linked to fear expression and the IL to fear extinction learning (Burgos-Robles et al., 2009; Dejean et al., 2016; Kim et al., 2013; Quirk et al., 2006; Sierra-Mercado et al., 2011; Vidal-Gonzalez et al., 2006). On the other hand, more Type 2 neurons were found in the PL and more Type 1 neurons were found in the IL. To recap, typical Type 1 neurons increased their activity briefly after the head entry and then remained inhibited, while Type 2 neurons showed a burst of activity during head entry and sustained increased activity. One study employing a context-dependent fear discrimination task (Kim et al., 2013) also identified two distinct types of PL units: short-latency CS-responsive units, which increased firing during the initial 150 ms of tone presentation, and persistently firing units, which maintained firing for up to 30 seconds. Given the temporal dynamics of Type 2 neurons, it is possible that our unsupervised clustering method may have merged the two types of neurons found in Kim et al.’s study.
While we did not observe decreased IL activity during dynamic foraging, prior studies have shown that IL excitability decreases after fear conditioning (Santini et al., 2008), and increased IL activity is necessary for fear extinction learning. In our paradigm, extinction learning was unlikely, as the threat persisted throughout the experiment. Future studies with direct manipulation of these subpopulations, particularly examining head withdrawal timing after such interventions, could provide insight into how these subpopulations guide behavior.”
Additionally, we made some changes in the introduction, mainly replacing the PL/IL with mPFC to be consistent with the main body of results and conclusion and also specifying the correlational nature of the recording study.
“Machine learning-based populational decoding methods, alongside single-cell analyses, were employed to investigate the correlations between neuronal activity and a range of behavioral indices across different sections within the foraging arena.”
Reviewer 2 (Recommendations):
The authors consistently use parametric statistical tests throughout the manuscript. Can they please provide evidence that they have checked whether the data are normally distributed? Otherwise, non-parametric alternatives are more appropriate.
Thank you for mentioning this important issue in the analysis. We re-ran the test of normality for all our data using the Shapiro-Wilk test with a significance threshold of p = .05 and found that the data sets summarized in Author response table 1 below require non-parametric tests. For the analyses that did not pass the normality test, we used a non-parametric alternative instead. We also updated the methods section. For instance, the repeated measures ANOVA for Supplementary Figure S1 and the PCA results was changed to the Friedman test with Dunn’s multiple comparison test.
Author response table 1.
Line 107: it is not clear here or in the methods whether a single drop of sucrose solution is delivered per lick or at some rate during the encounter, both during the habituation or in the final task. This is important information in order to understand how animals might make decisions about whether to stay or leave and how to interpret neural responses during this time period. Or is it a large drop, such that it takes multiple licks to consume? Please clarify.
The apparatus we used incorporated an IR-beam sensor-controlled solenoid valve. As the beam sensor was located right in front of the pipe, the rat’s tongue activated the sensor. As a result, each lick opened the valve for a brief period, releasing a small amount of liquid, and the rat had to continuously lick to gain access to the sucrose. We carefully regulated the flow of the liquid and installed a small sink connected to a vacuum pump, so any remaining sucrose not consumed by the rat was instantly removed from the port. We clarified how sucrose was delivered in the methods section and also in the results section.
Method:
“The sucrose port has an IR sensor which was activated by a single lick. The rat usually stayed in front of the lick port and licked continuously at a rate of up to 6.3 times per second to obtain sucrose. Any sucrose droplets that fell into the bottom sink were immediately removed by negative pressure so that the rat’s behavior was focused on licking.”
Result:
“The lick port was activated by an IR-beam sensor, triggering the solenoid valve when the beam was interrupted. The rat gradually learned to obtain rewards by continuously licking the port.”
However, I'm not sure I understand the authors' logic in the interpretation: does the S-phase not also consist of goal-directed behaviour? To me, the core difference is that one is mediated by threat and the other by reward. In addition, it would be helpful to visualize the behaviour in the S-phase, particularly the number of approaches. This difference in the amount of 'experience' so to speak might drive some of the decrease in spatial decoding accuracy, even if travel distance is similar (it is also not clear how travel distance is calculated - is this total distance?) Ideally, this would also be included as a predictor in the GLM.
We agree that the behaviors observed during the shuttling phase can also be considered goal-directed, as the rat moves purposefully toward explicit goals (the sucrose port and the N-zone during the return trip). However, we argue that there is a significant difference in the level of complexity of these goals.
During the L-phase, the rat not only has to successfully navigate to the E-zone for sucrose but also pay attention to the robots, either to avoid an attack from the robot's forehead or escape the fast-striking motion of the claw. When the rat runs toward the E-zone, it typically takes a side-approaching path, similar to Kim and Choi (2018), and exhibits defensive behaviors such as a stretched posture, which were not observed in the S-phase. This behavioral characteristic differs from the S-phase, where the rat adopted a highly stereotyped navigation pattern fairly quickly (within 3 sessions), evidenced by more than 50 shuttling trajectories per session. In this phase, the rat exhibited more stimulus-response behavior, simply repeating the same actions over time without deliberate optimization.
In our additional experiment with two different levels of goal complexity (reward-only vs. reward/threat conflict), we used a between-subject design in which both groups experienced both the S-phase and L-phase before surgery and underwent only one type of session afterward. This approach ruled out the possibility of differences in contextual experience. Additionally, since we initially designed the S-phase as extended training, behaviors in the apparatus tended to stabilize after rats completed both the S-phase and L-phase before surgery. As a result, we compared the post-surgery Lobsterbot phase to the post-surgery shuttling phase to investigate how different levels of goal complexity shape spatial encoding strength.
To clarify our claim, we edited the paragraph below.
“This absence of spatial correlates may result from a lack of complex goal-oriented navigation behavior, which requires deliberate planning to acquire more rewards and avoid potential threats.
[…]
After the surgery, unlike the Lob-Exp group, the Ctrl-Exp group returned to the shuttling phase, during which the Lobsterbot was removed. With this protocol, both groups experienced sessions with the Lobsterbot, but the Ctrl-Exp group's task became less complex, as it was reduced to mere reward collection.
Given these observations, along with the mPFC’s lack of consistency in spatial encoding, it is plausible that the mPFC operates in multiple functional modes, and the spatial encoding mode is preempted when the complexity of the task requires deliberate spatial navigation.”
Additionally, we added behavior data during initial S-phase into Supplementary Figure 1.
It is a good point that the amount of experience might drive the decrease in spatial decoding accuracy. To test this hypothesis, we added a new variable, the number of Lobsterbot sessions after surgery, to the previous GLM analysis. The updated model predicted the outcome variable with significant accuracy (F(4,44) = 10.31, p < .001), with an R-squared value of 0.4838. The regression coefficients were as follows: presence of the Lobsterbot (2.76, standard error [SE] = 1.11, t = 2.42, p = .020), number of recorded cells (-0.43, SE = .08, t = -5.22, p < .001), recording location (0.90, SE = 1.11, p = .424), and number of L sessions (0.002, SE = 0.11, p = .981). These results indicate that the number of exposures to the Lobsterbot sessions, as a measure of experience, did not affect spatial decoding accuracy.
As a minor edit, we changed the term to “total travel distance”.
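The reported F statistic and R-squared value are tied together by the standard identity for an ordinary least-squares model with an intercept, F = (R²/k) / ((1 − R²)/df_residual). The short check below is only a sketch under that assumption (the response does not specify how the GLM was fitted), and the function name is illustrative.

```python
def f_from_r_squared(r_squared, n_predictors, df_residual):
    """Overall F statistic implied by R-squared for an OLS model with intercept."""
    explained = r_squared / n_predictors
    unexplained = (1.0 - r_squared) / df_residual
    return explained / unexplained


# Values reported in the response: R^2 = 0.4838, 4 predictors, 44 residual df.
f_stat = f_from_r_squared(0.4838, n_predictors=4, df_residual=44)
print(f"F(4, 44) = {f_stat:.2f}")  # prints "F(4, 44) = 10.31"
```

The same identity can be used to sanity-check any model summary in which both the F statistic and R-squared are reported.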
Relating to the previous point, it should be emphasized in both sections on removing the Lobsterbot and on non-navigational behaviours that the spatial decoding is all in reference to distance from the threat (or reward location). The language in these sections differs from the previous section where 'distance from the goal' is mentioned. If the authors wish to discuss spatial decoding per se, it would be helpful to perform the same analysis but relative to the animals' own location which might have equal accuracy across locations in the arena. Otherwise, it is worth altering the language in e.g. line 258 onwards to state the fact that distance to the goal is only decodable when animals are actively engaged in the task.
Thank you for this comment. We changed the term to “distance from the conflict zone” or “distance of the rat to the center of the E-zone” to clarify our experimental setup.
In Fig. 5, why is the number of neurons shown in the PETHs less than the numbers shown in the pie charts?
The number of neurons differs between the PETHs and the pie charts in Figure 5 because PETHs are drawn only for 'event-responsive' units. For visualization, we included only the neurons that met the criteria described in the Methods section (Behavior-responsive unit analysis). We have updated the caption for Figure 5 as follows to minimize confusion.
“Multiple subpopulations in the mPFC react differently to head entry and head withdrawal.
(A) Top: The PETH of head entry-responsive units is color-coded based on the Z-score of activity.
(C) The PETH of head withdrawal-responsive units is color-coded based on the Z-score of activity.”
I appreciate the amount of relatively unprocessed data plotted in Figure 5, but it would be great to visualize something similar for AW vs. EW responses within the HW2 population. In other words, what is there that's discernably different within these responses that results in the findings of Fig. 6?
To visualize the difference in neural activity between AW and EW, we included an additional supplementary figure (Supplementary Figure 5). We divided the neurons into Type 1 and Type 2 and plotted PETH during Avoidance Withdrawal (AW) and Escape Withdrawal (EW). Consistent with the results shown in Figure 6d, we could visually observe increased activity in Type 2 neurons before the execution of AW compared to EW. However, we couldn’t find a similar pattern in Type 1 neurons.
On a related note, it would add explanatory power if the authors were able to more tightly link the prediction accuracy of the ensemble (particularly the Type 2 neurons) to the timing of the behaviour. Earlier in the manuscript it would be helpful to show latency to withdraw in AW trials; are animals leaving many seconds before the attack happens, or are they just about anticipating the timing of the attack? And therefore when using ensemble activity to predict the success of the AW, is the degree to which this can be done in advance (as the authors say, up to 6 seconds before withdrawal) also related to how long the animal has been engaged with the threat?
We agree that the timing of head withdrawal, particularly in AW trials, is a critical factor in describing the rat's strategy toward the task. To test whether the rat uses a precise timing strategy (for instance, leaving several seconds before the attack or exploiting the discrete 3- and 6-second attack durations), we plotted all head withdrawal timepoints during the 6-second trials. The distribution was approximately even, without distinguishable peaks (e.g., at the very initial period or at the 3- or 6-second mark). This indicates a lack of precise temporal strategy by the rat. We included additional data in the supplementary figure (Supplementary Figure 6) and added the following to the results section.
“We monitored all head withdrawal timepoints to assess whether rats developed a temporal strategy to differentiate between the 3-second and 6-second attacks. We found no evidence of such a strategy, as the timings of premature head withdrawals during the 6-second attack trials were evenly distributed (see Supplementary Figure S1).”
As depicted in the new supplementary figure, head withdrawal times during avoidance behavior vary from sub-seconds to the 3- or 6-second attack timepoints. After receiving the reviewer’s comment, we became curious whether there is a decoding accuracy difference depending on how long the animal engaged with the threat. We selected all 6-second attack and avoidance withdrawal trials and checked if correctly classified trials (AW trials classified as AW) had different head withdrawal times—perhaps shorter durations—compared to misclassified trials (AW trials classified as EW). As shown in Author response image 3 below, there was no significant difference between these two types, indicating that the latency of head withdrawal does not affect prediction accuracy.
Author response image 3.
Finally, there remain some open questions. One is how much encoding strength - of either space or the decision to leave during the encounter - relates to individual differences in animal performance or behaviour, particularly because this seems so variable at baseline. A second is how stable this encoding is. The authors mention that the distance encoding must be stable to an extent for their regressor to work; I am curious whether this stability is also found during the encounter coding, and also whether it is stable across experience. For example, in a session when an individual has a high proportion of anticipatory withdrawals, is the proportion of Type 2 neurons higher?
Thank you for these questions. To recap, we used five rats in the Lobsterbot experiments and three rats in the control experiment, in which the Lobsterbot was removed after training. Indeed, there were individual differences in performance (i.e., avoidance success rate), number of recorded units (related to recording quality), and baseline behaviors. These differences are shown in Author response image 4 below.
Author response image 4.
We used a GLM to measure how much of the decoder’s accuracy was explained by individual differences. The results showed that 38.96% of the distance regressor’s performance and 12.14% of the event classifier’s performance were explained by individual differences. Since recording quality was highly dependent on the animal, the high subject variability detected in the distance regression might be attributed to the number of recorded cells. Rat00, which had the lowest average mean absolute error, also had the highest number of recorded cells, with an average of 18. Compared to the distance regression, there was less subject variability in event classification. Indeed, the GLM results showed that the variability explained by the number of cells was only 0.62% in event classification.
The reason we mentioned that "distance encoding must be stable for our regressor to work" is entirely based on the population-level analysis. Because we used neural data and behaviors from entire trials within a session, the regressor or classifier would have low accuracy if encoding dynamics changed within the session. In other words, if the way neurons encode avoidance/escape predictive patterns changed within a training set, the classifier would fail to generate an optimized separation function that works well across all datasets.
To further investigate whether changes in experience affect event classification results over time, we plotted an additional graph below. Although there are individual and daily fluctuations in decoding accuracy, there was no observable trend throughout the experiments.
Author response image 5.
Regarding the correlation between the ratio of avoidance withdrawal and the proportion of Type 2 neurons, we were also curious and analyzed the data. Across 40 sessions, the correlation was -0.0716. For Type 1 neurons, it was slightly higher at 0.1459. We believe this indicates no significant relationship between the two variables.
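Whether correlations of this size are distinguishable from zero with 40 sessions can be checked with the usual t test for a correlation coefficient. The sketch below assumes the reported values are Pearson correlations and uses the tabulated two-tailed 5% critical value for 38 degrees of freedom; both are assumptions, since the response does not name the test, and the function name is illustrative.

```python
import math


def correlation_t(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Tabulated two-tailed critical value at alpha = .05 for 38 df (assumed threshold).
T_CRIT_DF38 = 2.024

n_sessions = 40
for label, r in [("Type 2", -0.0716), ("Type 1", 0.1459)]:
    t = correlation_t(r, n_sessions)
    significant = abs(t) > T_CRIT_DF38
    print(f"{label}: r = {r:+.4f}, t(38) = {t:+.2f}, significant = {significant}")
```

Under these assumptions both t statistics fall well inside the critical value, in line with the conclusion that there is no significant relationship between the two variables.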
Minor points:
I struggled with the overuse of acronyms in the paper. Some might be helpful but F-zone/N-zone, for example, or HE/HW, AW/EW are a bit of a struggle. After reading the paper a few times I learned them but a naive reader might need to often refer back to when they were first defined (as I frequently had to).
To increase readability, we removed acronyms that are not often used and changed HE/HW to head-entry/head-withdrawal.
I have a few questions about Figure 1F: in the text (line 150) it says that 'surgery was performed after three L sessions when the rats displayed a range of 30% to 60% AW'. This doesn't seem consistent with what is plotted, which shows greater variability in the proportion of AW behaviours both before and after surgery. It also appears that several rats only experienced two days of the L1 phase; please make clear if so. And finally, what is the line at 50% indicating? Neither the text nor the legend discuss any sort of thresholding at 50%. Instead, it would be best to make the distinction between pre- and post-surgery behaviour visually clearer.
Thank you for pointing out this issue. We acknowledge there was an error in the text description. As noted in the Methods section, we proceeded with surgery after three Lobsterbot sessions. We have removed the incorrect part from the Results section and revised the Methods section for clarity.
“After three days of Lobsterbot sessions, the rats underwent microdrive implant surgery, and recording data were collected from subsequent sessions, either Lobsterbot or shuttling sessions, depending on the experiment. For all post-surgery sessions, those with fewer than 20 approaches in 30 minutes were excluded from further analysis.”
Among the five rats, Rat2 and Rat3 did not approach the robot during the entire Lob2 session, which is why these two rats do not have Lob2 data points. We updated the caption to address this issue.
Initially, we added a 50% reference line, but we agree it is unnecessary as we do not discuss this reference. We have updated the figure to include the surgery point, as shown in Supplementary Figure 1.
Fig. 2C: each dot is an ensemble of simultaneously recorded neurons, i.e. a subset of the total 800-odd units if I understand correctly. How many ensembles does each rat contribute? Similarly, is this evenly distributed across PL and IL?
Yes, each dot represents a single session, with a total of 40 sessions. Five rats contributed 11, 9, 8, 7, and 5 sessions, respectively. Although each rat initially had more than 10 sessions, we discarded sessions with a low unit count (fewer than 10 units, as detailed in Materials and Methods - Data Collection). We collected 25 sessions from the PL and 15 sessions from the IL. Our goal was to collect more than 200 units per region.
Please show individual data points for Fig. 2D.
We updated the figure with individual data points.
Is there a reason why the section on removing the Lobsterbot (lines 200 - 215) does not have associated MAE plots? Particularly the critical comparison between Lob-Exp and Ctl-Exp.
We intentionally removed some graphs to create a more compact figure, but we appreciate your suggestion and have included the graph in Figure 2.
Some references to supplementary materials are not working, e.g. line 333.
The submitted version of our manuscript contained reference errors. In the current version, we used plain text, and the references are fixed.
The legend for Supp. Fig. 2B is incorrect.
We greatly appreciate this point. We changed the caption to match the figure.
Reviewer 3 (Public Review):
Thank you for recognizing our efforts in designing an ethologically relevant foraging task to uncover the multiple roles of the mPFC. While we acknowledge certain limitations in our methodology—particularly that we only observed correlations between neural activity and behavior without direct manipulation—we have conducted additional analyses to further strengthen our findings.
Weakness:
The primary concern with this study is the absence of direct evidence regarding the role of the mPFC in the foraging behavior of the rats. The ability to predict heterogeneous variables from the population activity of a specific brain area does not necessarily imply that this brain area is computing or using this information. In light of recent reports revealing the distributed nature of neural coding, conducting direct causal experiments would be essential to draw conclusions about the role of the mPFC in spatial encoding and/or threat evaluation. Alternatively, a comparison with the activity from a different brain region could provide valuable insights (or at the very least, a comparison between PL and IL within the mPFC).
Thank you for the comment. Indeed, the fundamental limitation of a recording study is that it is only correlational, and any causal relationship between neural activity and behavioral indices is speculative. We made this clearer in the revision and refrained from expressing speculative ideas suggesting causality. While we did not provide direct evidence that the mPFC is computing or utilizing spatial/foraging information, we based our assertion on previous studies that have directly demonstrated the mPFC's role in complex decision-making tasks (Martin-Fernandez et al., 2023; Orsini et al., 2018; Zeeb et al., 2015) and in certain types of spatial tasks (De Bruin et al., 1994; Sapiurka et al., 2016). We would like to emphasize that, to the best of our knowledge, no previous study has investigated mPFC function while an animal is solving multiple heterogeneous problems in a semi-naturalistic environment. Therefore, although our recording study supports only speculative causal inference, it provides a foundation for investigating mPFC function. Future studies employing more sophisticated, cell-type-specific manipulations will be needed to confirm the hypotheses from the current study.
One of the key questions of this manuscript is how multiple pieces of information are represented in the recorded population of neurons. Most of the studies mentioned above use highly structured experimental designs, which allow researchers to study only one function of the mPFC. In the current study, the semi-naturalistic environment allows rats to freely switch between multiple behavioral sets, and our decoding analysis quantitatively assesses the extent to which spatial/foraging information is embedded during these sets. Our goal is to demonstrate that two different task hyperspaces are co-expressed in the same region and that the degree of this expression varies according to the rat’s current behavior (See Figure 8(b) in the revised manuscript).
Alternatively, we added multiple analyses. First, we included a single unit-level analysis looking at the place cell-like property to contrast with the ensemble decoding. Most neurons did not show well-defined place fields although there were some indications for place cell-like property. For example, some neurons displayed fragmented place fields or unusually large place fields only at particular spots in the arena (mostly around the gates). The accuracy from this place information at the single-neuron level is much lower than that acquired from population decoding. Likewise, although there were neurons with modulated firing around the time of particular behavior (head entry and withdrawal), overall prediction accuracy of avoidance decision was much higher when the ensemble-based classifier was applied.
Moreover, given that high-dimensional movement has been shown to be reflected in the neural activity across the entire dorsal cortex, more thorough comparisons between the neural encoding of task variables and movement would help rule out the possibility that the heterogeneous encoding observed in the mPFC is merely a reflection of the rats' movements in different behavioral modes.
Thanks for the comment. We acknowledge that the neural activity may reflect various movement components across different zones in the arena. We performed several analyses to test this idea. First, we want to recap that our run-and-stop event analysis may provide insight into whether the mPFC neurons encode location despite significant motor events. The rats typically move across the F-zone fairly routinely and swiftly (as if they are “running”) to reach the E-zone, at which point they reduce their moving speed to almost a halt (“stopping”). The PETHs around these critical motor events, however, did not show any significant modulation of neural activity, indicating that most neurons we recorded from the mPFC did not respond to movement.
We added this analysis to demonstrate that these sudden stops did not evoke the characteristic activation of Type 1 and Type 2 neurons observed during head entry into the E-zone. When we isolated these sudden stops outside the E-zone, we did not observe this neural signature (Supplementary Figure 2).
Second, our PCA results showed that population activity in the E-zone during dynamic foraging behavior was distinct from the activity observed in the N- and F-zones during navigation. However, there is a possibility that the two behaviorally significant events—entry into the E-zone and voluntary or sudden exit—might be driving the differences observed in the PCA results. To account for this, we designated ±1 second from head entry and head withdrawal as "critical event times," excluded the corresponding neural data, and reanalyzed the data. This method removed neural activity associated with sudden movements in specific zones. Despite this exclusion, the PCA still revealed distinct population activity in the E-zone, different from the other zones (Supplementary Figure 4). This result reduces the likelihood that the observed heterogeneous neural activity is merely a reflection of zone-specific movements.
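The exclusion step described above can be illustrated with a minimal sketch. It assumes binned data with one timestamp per sample; the function name, the bin width, and the event times are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def event_exclusion_mask(sample_times, event_times, half_width=1.0):
    """Return a boolean mask that is True for samples to KEEP.

    A sample is dropped when it lies within +/- half_width seconds
    (boundary included) of any event, e.g. a head entry or withdrawal.
    """
    sample_times = np.asarray(sample_times, dtype=float)
    event_times = np.asarray(event_times, dtype=float)
    if event_times.size == 0:
        return np.ones(sample_times.shape, dtype=bool)
    # Distance from every sample to its nearest event.
    nearest = np.abs(sample_times[:, None] - event_times[None, :]).min(axis=1)
    return nearest > half_width

# Hypothetical session: 0.5 s bins over 10 s, head entry at 3 s, withdrawal at 7 s.
times = np.arange(0, 10, 0.5)
keep = event_exclusion_mask(times, event_times=[3.0, 7.0])
print(int(keep.sum()), "of", times.size, "bins retained")
```

The resulting mask would be applied to the rows of the population activity matrix before rerunning the PCA, so that any remaining zone separation cannot be attributed to the bins immediately surrounding entry or withdrawal.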
Lastly, the main claim of the paper is that the mPFC population switches between different functional modes depending on the context. However, no dynamic analysis or switching model has been employed to directly support this hypothesis.
Thank you for this comment. Since we did not conduct a manipulation experiment, there is a clear limitation in uncovering how switching occurs between the two task contexts. To make the most of our population recording data, we added an additional results section that examines how individual neurons contribute to both the distance regressor and the event classifier. Our findings support the idea that distance and dynamic foraging information are distributed across neurons, with no distinct subpopulations dedicated to each context. This suggests that mPFC neurons adjust their coding schemes based on the current task context, aligning with Duncan’s (2001) adaptive coding model, which posits that mPFC neurons adapt their coding to meet the task's current demands.
Reviewer 3 (Recommendations):
The evidence for spatial encoding is relatively weak. In the F-zone (50 x 48 cm), the average error was approximately 17 cm, constituting about a third of the box's width and likely not significantly smaller than the size of a rat's body. The errors in the shuffled data are also not substantially greater than those in the original data. An essential test indicates that spatial decoding accuracy decreases when the Lobsterbot is removed. However, assessing the validity of the results is difficult in the current state. There is no figure illustrating the results, and no statistics are provided regarding the test for matching the number of neurons.
We acknowledge that the average error (~17 cm) measured in our study is relatively large, even though it is significantly smaller than that of the shuffled control model (22.6 cm). Previous studies reported smaller prediction errors but under different experimental conditions: 16 cm in Kaefer et al. (2020) and less than 10 cm in Ma et al. (2023) and Mashhoori et al. (2018). Most notably, the average number of units used in our study (15.8 units per session) is considerably smaller than in the previous works, which used 63, 49, and 40 units, respectively. As our GLM results demonstrated, the number of recorded cells significantly influenced decoding accuracy (β = -0.43 cm/neuron). With a similar number of recorded cells, we would have achieved comparable decoding accuracy. In addition, unlike other studies that have employed a dedicated maze such as the virtual track or the 8-shaped maze, we exposed rats to a semi-naturalistic environment where they exhibited a variety of behaviors beyond simple navigation. As argued throughout the manuscript, we believe that the spatial information represented in the mPFC is susceptible to disruption when the animal engages in other activities. A similar phenomenon was reported by Mashhoori et al. (2018), where the decoder, which typically showed a median error of less than 10 cm, exhibited a much higher error, nearly 100 cm, near the feeder location.
As for the reviewer’s request for comparing spatial decoding without the Lobsterbot, we added a new figure to illustrate the spatial decoding results, including statistical details. We also applied a Generalized Linear Model to regress out the effect of the number of recorded neurons and statistically assess the impact of Lobsterbot removal. This adjustment directly addresses the reviewer's request for a clearer presentation of the results and helps contextualize the decoding performance in relation to the number of recorded neurons.
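To make the idea of regressing out the number of recorded neurons concrete, here is a minimal sketch. It assumes a Gaussian model with an identity link (ordinary least squares), which may differ from the GLM actually used; the function name is hypothetical, and the session values and coefficients are synthetic and invented purely for illustration.

```python
import numpy as np

def fit_linear_model(n_neurons, robot_removed, decoding_error):
    """Fit error ~ intercept + b1 * n_neurons + b2 * robot_removed by least squares."""
    X = np.column_stack([
        np.ones(len(n_neurons)),   # intercept
        n_neurons,                 # nuisance covariate: units per session
        robot_removed,             # 1 if the robot was absent, else 0
    ]).astype(float)
    y = np.asarray(decoding_error, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return {"intercept": coef[0], "per_neuron": coef[1], "removal": coef[2]}

# Synthetic, noise-free sessions generated from made-up coefficients
# (these are NOT the values reported in the response).
n = np.array([8, 12, 16, 20, 10, 14, 18, 22])
removed = np.array([0, 0, 0, 0, 1, 1, 1, 1])
err = 25.0 - 0.5 * n + 2.0 * removed

fit = fit_linear_model(n, removed, err)
print(fit)
```

In this form, the coefficient on the removal indicator estimates the change in decoding error attributable to the condition after accounting for how many units each session contributed.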
As indicated in the public review, drawing conclusions about the role of the mPFC in navigation and avoidance behavior during the foraging task is challenging due to the exclusively correlational nature of the results. The accuracy in AW/EW discrimination increases a few seconds before the response, implying that changes in mPFC activity precede the avoidance/escape response. However, one must question whether this truly reflects the case. Could this phenomenon be attributed to rats modifying their "micro-behavior" (as evidenced by changes in movement observed in the video) before executing the escape response, and subsequently influencing mPFC activity?
We appreciate the reviewer's thoughtful observation regarding the correlational nature of our results and the potential influence of pre-escape micro-behaviors on mPFC activity. We acknowledge that the increased accuracy in AW/EW discrimination preceding the response could also be correlated with micro-behaviors. However, there is very little room for extraneous behavior other than licking the sucrose delivery port within the E-zone, as the rats are highly trained to perform this stereotypical behavior. To support this, we measured the time delays between licking events (inter-lick intervals). The results show a sharp distribution, with 95% of the intervals falling within a quarter second, indicating that the rats were stable in the E-zone, consistently licking without altering their posture.
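The inter-lick interval summary can be sketched as follows. This is an illustrative calculation only: the function name and the lick timestamps are hypothetical, and the quarter-second threshold is simply the figure quoted in the response.

```python
import numpy as np

def fraction_short_intervals(lick_times, threshold=0.25):
    """Return the inter-lick intervals and the fraction at or below threshold (s)."""
    lick_times = np.sort(np.asarray(lick_times, dtype=float))
    intervals = np.diff(lick_times)
    if intervals.size == 0:
        raise ValueError("need at least two lick timestamps")
    return intervals, float(np.mean(intervals <= threshold))

# Hypothetical lick timestamps (seconds) with one long pause at the end.
licks = [0.0, 0.125, 0.25, 0.375, 1.375]
intervals, frac = fraction_short_intervals(licks)
print(intervals, frac)
```

A sharply peaked interval distribution with a high fraction below the threshold is what would be expected if licking were continuous and stereotyped.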
To complement the data presented in Author response image 2, a video clip showing a rat engaged in licking behavior was included. We carefully designed the robot compartment and adjusted the distance between the Lobsterbot and the sucrose port to ensure that rats could exhibit only limited behaviors inside the E-zone. The video confirms that no significant micro-behaviors were observed during the rat’s activity in the E-zone.
If mPFC activity indeed switches mode, the results do not clearly indicate whether individual cells are specifically dedicated to spatial representation and avoidance or if they adapt their function based on the current goal. Figure 7, presented as a schematic illustration, suggests the latter option. However, the proportion of cells in the HE and HW categories that also encode spatial location has not been demonstrated. It has also not been shown how the switch is manifested at the level of the population.
Thank you for this comment. As the reviewer pointed out, we suggest that mPFC neurons do not diverge based on their functions, but rather adapt their roles according to the current goal. To support this assertion, we added an additional results section that calculates the feature importance of decoders. This analysis allows us to quantitatively measure each neuron’s contribution to both the distance regressor and the event decoder. Our results indicate that distance and defensive behavior are not encoded by a small subset of neurons; instead, the information is distributed across the population. Shuffling the neural data of a single neuron resulted in a median increase in decoding error of 0.73 cm for the distance regressor and 0.01% for the event decoder, demonstrating that the decoders do not rely on a specific subset of neurons that exclusively encode spatial and/or defensive behavior.
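The shuffling procedure described here is a form of permutation feature importance. The sketch below illustrates the idea under simplifying assumptions: a plain linear least-squares decoder stands in for the actual decoders (which are not specified in this response), error is evaluated on the fitting data rather than held-out data, and all names and data are hypothetical.

```python
import numpy as np

def permutation_importance(X, y, n_repeats=20, seed=0):
    """Mean increase in absolute decoding error when one neuron's data is shuffled.

    X: (n_samples, n_neurons) activity matrix; y: (n_samples,) decoded variable.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    def design(features):
        return np.column_stack([np.ones(len(features)), features])

    # Stand-in decoder: ordinary least squares with an intercept.
    w, *_ = np.linalg.lstsq(design(X), y, rcond=None)

    def mean_abs_error(features):
        return float(np.mean(np.abs(design(features) @ w - y)))

    baseline = mean_abs_error(X)
    importance = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            shuffled = X.copy()
            # Break the link between neuron j and the target, leave others intact.
            shuffled[:, j] = rng.permutation(shuffled[:, j])
            increases.append(mean_abs_error(shuffled) - baseline)
        importance[j] = np.mean(increases)
    return baseline, importance

# Synthetic population: neuron 0 strongly informative, neuron 1 weakly, neuron 2 not at all.
data_rng = np.random.default_rng(1)
X_demo = data_rng.normal(size=(200, 3))
y_demo = 5.0 * X_demo[:, 0] + 0.5 * X_demo[:, 1]
baseline_demo, importance_demo = permutation_importance(X_demo, y_demo)
print(baseline_demo, importance_demo)
```

If information were concentrated in a few units, a handful of importance values would dominate; a flat profile with small per-neuron increases is the pattern consistent with a distributed code.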
Although we found supporting evidence that mPFC neurons encode two different types of information depending on the current context, we acknowledge that we could not go further in answering how this switch is manifested. One simple explanation is that the function is driven by current contextual information and goals—in other words, a bottom-up mechanism. However, in our control experiment, simplifying the navigation task worsened the encoding of spatial information in the mPFC. Therefore, we speculate that an external or internal arbitrator circuit determines what information to encode. A precise temporal analysis of the timepoint when the switch occurs in more controlled experiments might answer these questions. We have added this discussion to the discussion section.
PL and IL are two distinct regions; however, there is no comparison between the two areas regarding their functional properties or the representations of the cells. Are the proportions of cell categories (HE vs HW or HE1 vs HE2, spatial encoding vs no spatial encoding) different in IL and PL? Are areas differentially active during the different behaviors?
Thank you for bringing up this issue. As mentioned in our response to the public review, we included a comparison between the PL and IL regions. While we did not observe any differences in spatial encoding (feature importance scores), the only distinction was in the proportion of Type 1 and Type 2 neurons, as the reviewer suggested. We have incorporated our interpretation of these results into the discussion section.
The results and interpretations of the cluster analysis appear to be highly dependent on the parameters used to define a cluster. For example, the HE2 category includes cells with activity that precedes events and gradually decreases afterward, as well as cells with activity that only follows the events.
We strongly agree that dependency on hyperparameters is a crucial point when using unsupervised clustering methods. To eliminate any subjective criteria in defining clusters, we carefully selected our clustering approach, which requires only two hyperparameters: the number of initial clusters (set to 8) and the minimum number of cells required to be considered a valid cluster (cutoff limit, 50). The rationale behind these choices was: 1) a higher number of initial clusters would fail to generalize neural activity, 2) clusters with fewer than 50 neurons would be difficult to analyze, and 3) to prevent the separation of clusters that show noisy responses to the event.
Author response table 2 shows the differences in the number of cell clusters when we varied these two parameters. As demonstrated, changing these two variables does result in different numbers of clusters. However, when we plotted each cluster type’s activity around head entry (HE) and head withdrawal (HW), an increased number of clusters resulted in the addition of small subsets with low variation in activity around the event, without affecting the general activity patterns of the major clusters.
The example mentioned by the reviewer (possible separation of HE2) appears when using a hyperparameter set that results in 4 clusters, not 3. In this result, 83 units, which were labeled as HE2 in the 3-cluster hyperparameter set, form a new group, HE3 (Group 3). This group of units shows increased activity after head entry and exhibits characteristics similar to HE2, with most of the units classified as HW2, maintaining high activity until head withdrawal. Among the 83 HE3 units, 36 were further classified as HW2, 44 as non-significant, and 3 as HW1. Therefore, we believe this does not affect our analysis, as we observed the separation of two major groups, Type 1 (HE1-HW1) and Type 2 (HE2-HW2), and focused our analysis on these groups afterward.
Despite this validation, there remains a strong possibility that our method might not fully capture small yet significant subpopulations of mPFC units. As a result, we have included a sentence in the methods section addressing the rationale and stability of our approach.
“(Materials and Methods) To compensate for the limited number of neurons recorded per session, the hyperparameter set was chosen to generalize their activity and categorize them into major types, allowing us to focus on neurons that appeared across multiple recording sessions. Although changes in the hyperparameter sets resulted in different numbers of clusters, the major activity types remained consistent (Supplementary Figure S8). However, there is a chance that this method may not differentiate smaller subsets of neurons, particularly those with fewer than 50 recorded neurons.”
Author response table 2.
Minor points:
Line 333: Error! Reference source not found. This was probably the place for citing Figure S2?
Lines 339, 343: Error! Reference source not found.
Thank you for mentioning these comments. In the new version, all reference functions from Word have been replaced with plain text.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study examines how self-citations in selected neurology, neuroscience, and psychiatry journals differ according to seniority, geography, gender and subfield. The evidence supporting the claims is convincing, and the article is a valuable addition to the literature on self-citations.
-
Joint Public Review:
Editors’ note: This is the third version of this article, and it addresses the points made during the peer review of the second version by performing additional analyses and clarifying some of the limitations of the study.
Comments made during the peer review of the first version, along with the authors' responses to these comments, are available with previous versions of the article.
The following summary of the article is taken from comments made by Reviewer #1 about version 2 of the article:
In this manuscript, the authors use a large dataset of neuroscience publications to elucidate the nature of self-citation within the neuroscience literature. The authors initially present descriptive measures of self-citation across time and author characteristics; they then produce an inclusive model to tease apart the potential role of various article and author features in shaping self-citation behavior. This is a valuable area of study, and the authors approach it with a rich dataset and solid methodology.
-
Author response:
The following is the authors’ response to the previous reviews
Public Reviews:
Reviewer #1 (Public review):
In this manuscript, the authors use a large dataset of neuroscience publications to elucidate the nature of self-citation within the neuroscience literature. The authors initially present descriptive measures of self-citation across time and author characteristics; they then produce an inclusive model to tease apart the potential role of various article and author features in shaping self-citation behavior. This is a valuable area of study, and the authors approach it with a rich dataset and solid methodology.
The revisions made by the authors in this version have greatly improved the validity and clarity of the statistical techniques, and as a result the paper's findings are more convincing.
This paper's primary strengths are: 1) its comprehensive dataset that allows for a snapshot of the dynamics of several related fields; 2) its thorough exploration of how self-citation behavior relates to characteristics of research and researchers.
Thank you for your positive view of our paper and for your previous comments.
Its primary weakness is that the study stops short of digging into potential mechanisms in areas where it is potentially feasible to do so - for example, studying international dynamics by identifying and studying researchers who move between countries, or quantifying more or less 'appropriate' self-citations via measures of abstract text similarity.
We agree that these are limitations of the existing study. We updated the limitations section as follows (page 15, line 539):
“Similarly, this study falls short in several potential mechanistic insights, such as by investigating citation appropriateness via text similarity or international dynamics in authors who move between countries.”
Yet while these types of questions were not determined to be in scope for this paper, the study is quite effective at laying the important groundwork for further study of mechanisms and motivations, and will be a highly valuable resource for both scientists within the field and those studying it.
Reviewer #2 (Public review):
The study presents valuable findings on self-citation rates in the field of Neuroscience, shedding light on potential strategic manipulation of citation metrics by first authors, regional variations in citation practices across continents, gender differences in early-career self-citation rates, and the influence of research specialization on self-citation rates in different subfields of Neuroscience. While some of the evidence supporting the claims of the authors is solid, some of the analysis seems incomplete and would benefit from more rigorous approaches.
Thank you for your comments. We have addressed your suggestions presented in the “Recommendations for the authors” section by performing your recommended sensitivity analysis that specifically identifies authors who could be considered neurologists, neuroscientists, and psychiatrists (as opposed to just papers that are published in these fields). Please see the “Recommendations for the authors” section for more details.
Reviewer #3 (Public review):
This paper analyses self-citation rates in the field of Neuroscience, comprising in this case, Neurology, Neuroscience and Psychiatry. Based on data from Scopus, the authors identify self-citations, that is, whether references from a paper by some authors cite work that is written by one of the same authors. They separately analyse this in terms of first-author self-citations and last-author self-citations. The analysis is well-executed and the analysis and results are written down clearly. The interpretation of some of the results might prove more challenging. That is, it is not always clear what is being estimated.
This issue of interpretability was already raised in my review of the previous revision, where I argued that the authors should take a more explicit causal framework. The authors have now revised some of the language in this revision, in order to downplay causal language. Although this is perfectly fine, this misses the broader point, namely that it is not clear what is being estimated. Perhaps it is best to refer to Lundberg et al. (2021) and ask the authors to clarify "What is your Estimand?" In my view, the theoretical estimands the authors are interested in are causal in nature. Perhaps the authors would argue that their estimands are descriptive. In either case, it would be good if the authors could clarify that theoretical estimand.
Thank you for your comment and for highlighting this insightful paper. After reading this paper, we believe that our theoretical estimand is descriptive in nature. For example, in the abstract of our paper, we state: “This work characterizes self-citation rates in basic, translational, and clinical Neuroscience literature by collating 100,347 articles from 63 journals between the years 2000-2020.” This goal seems consistent with the idea of a descriptive estimand, as we are not interested in any particular intervention or counterfactual at this stage. Instead, we seek to provide a broad characterization of subgroup differences in self-citations such that future work can ask more focused questions with causal estimands.
Our analysis included subgroup means and generalized additive models, both of which were described as empirical estimands for a theoretical descriptive estimand in Lundberg et al. We added the following text to the paper (page 3, line 112):
“Throughout this work, we characterized self-citation rates with descriptive, not causal, analyses. Our analyses included several theoretical estimands that are descriptive (ref. 17), such as the mean self-citation rates among published articles as a function of field, year, seniority, country, and gender. We adopted two forms of empirical estimands. First, we showed subgroup means in self-citation rates. We then developed smooth curves with generalized additive models (GAMs) to describe trends in self-citation rates across several variables.”
In addition, we added to the limitations section as follows (page 15, line 539):
“Yet, this study may lay the groundwork for future works to explore causal estimands.”
Finally, in my previous review, I raised the issue of when self-citations become "problematic". The authors have addressed this issue satisfactorily, I believe, and now formulate their conclusions more carefully.
Thank you for your previous comments. We agree that they improved the paper.
Lundberg, I., Johnson, R., & Stewart, B. M. (2021). What Is Your Estimand? Defining the Target Quantity Connects Statistical Evidence to Theory. American Sociological Review, 86(3), 532-565. https://doi.org/10.1177/00031224211004187
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
Thank you for your thorough revisions and responses to the reviews.
Reviewer #2 (Recommendations for the authors):
I appreciate the authors' responses and am satisfied with all their replies except for my second comment. I still find the message conveyed slightly misleading, as the results seem to be generalized to neurologists, neuroscientists, and psychiatrists. It is important to refine the analysis to focus specifically on neuroscientists, identified as first or last authors based on their publication history. This approach is common in the science of science literature and would provide a more accurate representation of the findings specific to neuroscientists, avoiding the conflation with other related fields. This refinement could serve as a robustness check in the supplementary. I think adding this sub-analysis is essential to the validity of the results claimed in this paper.
Thank you for your comment. We added a sensitivity analysis where fields are defined by an author’s publication history, not by the journal of each article.
In the main text, we added the following:
(Page 3, line 129) “When determining fields by each author’s publication history instead of the journal of each article, we observed similar rates of self-citation (Table S7). The 95% confidence intervals for each field definition overlapped in most cases, except for Last Author self-citation rates in Neuroscience (7.54% defined by journal vs. 8.32% defined by author) and Psychiatry (8.41% defined by journal vs. 7.92% defined by author).”
Further details are provided in the methods section (page 21, line 801):
“4.11 Journal-based vs. author-based field sensitivity analyses
We refined our field-based analysis to focus only on authors who could be considered neuroscientists, neurologists, and psychiatrists. For each author, we looked at the number of articles they had in each subfield, as defined by Scopus. We considered 12 subfields that fell within Neurology, Neuroscience, and Psychiatry. These subfields are presented in Table S12. For each First Author and Last Author, we excluded them if any of their three most frequently published subfields did not include one of the 12 subfields of interest. If an author’s top three subfields included multiple broader fields (e.g., both Neuroscience and Psychiatry), then that author was categorized according to the field in which they published the most articles. Among First Authors, there were 86,220 remaining papers, split between 33,054 (38.33%) in Neurology, 23,216 (26.93%) in Neuroscience, and 29,950 (34.73%) in Psychiatry. Among Last Authors, there were 85,954 remaining papers, split between 31,793 (36.98%) in Neurology, 25,438 (29.59%) in Neuroscience, and 28,723 (33.42%) in Psychiatry.”
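The author-based field assignment can be sketched as below. Several points are assumptions: the subfield names are placeholders (the actual 12 subfields are listed in Table S12, which is not reproduced here), the exclusion rule is ambiguous as worded and is therefore exposed as a flag, and article counts are summed across in-scope subfields of the same broad field.

```python
from collections import Counter

# Placeholder mapping from subfield to broad field; the real list is in Table S12.
SUBFIELD_TO_FIELD = {
    "Clinical Neurology": "Neurology",
    "Cellular Neuroscience": "Neuroscience",
    "Cognitive Neuroscience": "Neuroscience",
    "Psychiatry and Mental Health": "Psychiatry",
}

def assign_field(subfield_counts, require_all=False):
    """Return the broad field for an author, or None if the author is excluded.

    subfield_counts: dict mapping subfield name -> number of articles.
    require_all=False keeps authors with at least one in-scope subfield among
    their top three; require_all=True demands that all top three be in scope.
    """
    top_three = [name for name, _ in Counter(subfield_counts).most_common(3)]
    in_scope = [s for s in top_three if s in SUBFIELD_TO_FIELD]
    if not in_scope or (require_all and len(in_scope) < len(top_three)):
        return None
    # Sum article counts per broad field and pick the largest
    # (ties fall back to the order in which fields were first encountered).
    totals = Counter()
    for subfield in in_scope:
        totals[SUBFIELD_TO_FIELD[subfield]] += subfield_counts[subfield]
    return totals.most_common(1)[0][0]

author = {
    "Cellular Neuroscience": 10,
    "Cognitive Neuroscience": 8,
    "Psychiatry and Mental Health": 15,
}
print(assign_field(author))
```

Under the stricter reading (`require_all=True`), an author whose most frequent subfield lies outside the three fields would be dropped even if the other two are in scope.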
Reviewer #3 (Recommendations for the authors):
I would like to thank the authors for their responses to the points that I raised. I do not have any new comments or further responses.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important contribution to the field demonstrates the role of a single transcription factor with cell-autonomous functions in the differentiation of two distinct neuronal populations in regulating the interactions between those cells in a non-autonomous manner to generate their final organized projection pattern. There are additional quantifications and controls that would enhance the study and would improve the strength of the evidence from incomplete if they were performed.
-
Reviewer #1 (Public review):
Summary:
This study seeks to investigate the role of the transcription factor Bcl11b/Ctip2 in regulating subcerebral projection neuron (SCPN) axon development through both cell-autonomous and non-cell-autonomous mechanisms. The authors demonstrate that Bcl11b is required within SCPNs for axonal outgrowth and proper entry into the internal capsule, while its expression in medium spiny neurons (MSNs) influences SCPN axon fasciculation and pathfinding in a non-cell-autonomous manner. Notably, through transcriptomic analysis, immunocytochemistry, and in vivo growth cone purification, the study identifies Cdh13 as a downstream mediator of Bcl11b function, localizing along axons and at growth cone surfaces to regulate SCPN axonal outgrowth.
Strengths:
To me the most interesting aspect of this study is how common transcriptional programs across neuronal cell types cooperate to facilitate axon pathfinding, this is a very interesting concept.
Overall, it could be of interest to the brain development field.
Weaknesses:
My main concern is that, as presented in the figures, many phenotypes are too subtle to be convincing and would require quantitative analyses to corroborate the claims of the study.
I also think that the growth cones transcription data needs additional validation to be incorporated into the manuscript. In fact, I am not even sure that it really brings anything to the story.
I also think that the CRISPR in utero electroporation experiments lack appropriate controls.
-
Reviewer #2 (Public review):
Summary:
Itoh et al. investigate the role of the zinc finger transcription factor Bcl11b/Ctip2 in subcerebral projection neuron (SCPN) development. They dissect the cell-autonomous and non-cell-autonomous functions of Bcl11b in subcerebral projection neurons. In addition, they identify Cdh13 as a downstream target of Bcl11b in the process of SCPN axon outgrowth.
Strengths:
Itoh et al. take advantage of a mouse CRE/Lox genetic system as a powerful tool to distinguish Bcl11b cell-autonomous function on cortical layer V subcerebral projection neurons and its non-cell-autonomous function mediated by the striatal medium spiny neurons (MSN).
Besides the description of the cellular and anatomical defects of the corticofugal projection neurons' outgrowth and fasciculation, they perform a transcriptomic analysis of SCPN somata to identify Bcl11b target genes. As a result, they find that Cdh13, a membrane-anchored cadherin, is downstream of Bcl11b and mediates its cell-autonomous role on axon outgrowth. To validate the role of Cdh13 as a mediator of Bcl11b on SCPN development, they set up a new technique to identify and quantify superficial antigens on growth cone membranes.
Weaknesses:
While the authors shed light on the role of Bcl11b in SCPN development, they do not contextualize their findings in light of the previously described interplay between Bcl11a and Bcl11b.
In addition, this work is another example of the common practice of picking from a list of differentially expressed genes the most likely ones. This approach, while useful, does not allow the identification of new and unknown players.
-
Author response:
We appreciate that the reviewers recognize the conceptual novelty of our work and find our work interesting.
Reviewer #1:
We thank Reviewer #1 for making us aware that the image presentation of some of what we see as very clear phenotypes in our work might not have been optimal in the reviewed pdf file, presumably due to the relatively low resolution and lack of appropriately magnified images in the merged pdf file. This issue, if not caught and corrected now, might have caused future readers to similarly not appreciate these clear phenotypes. We will carefully revise the figures and ensure maintenance of appropriate pdf resolution in the merged file so that image presentation is optimal and our findings are appropriately represented.
We appreciate that Reviewer #1 carefully and critically assessed the growth cone transcriptomic data. We agree that future additional validation is warranted, and this will be clearly stated in our revised paper. Because we judge that these data – even in their current form – will be of potential interest to other investigators sooner rather than later, we respectfully offer and request that we should share them in this paper as our attempt so far to identify elements of the relevant growth cone biology, rather than waiting for years before completing additional validation.
Even upon repeated reflection, we judge and respectfully submit that our CRISPR in utero electroporation experiments are, indeed, conducted with appropriate controls. We thought through the potential controls deeply prior to completing these complex experiments. We will describe our reasoning in detail in our point-by-point response.
Reviewer #2:
We thank Reviewer #2 for encouraging us to elaborate on the direction and cross-repressive interplay between Bcl11a and Bcl11b, which we previously identified (Woodworth*, Greig* et al., Cell Rep, 2016). We omitted deep discussion because we had already published this result, cited that work, and did not want to seem overly self-referential, as well as for reasons of length. Though we know and have reported that Bcl11a and Bcl11b are cross-repressive in SCPN development, we currently do not know whether increased Bcl11a expression in Bcl11b-null SCPN contributes to reduced Cdh13 expression. Also, we do not know if there is a similar Bcl11a-Bcl11b cross-repression in striatal medium spiny neurons. This will be clarified in our revised paper.
We agree fully with the reviewer that “the common practice of picking from a list of differentially expressed genes the most likely ones” has been useful for and has substantially contributed to the elucidation of molecular mechanisms in many systems, including in CNS development. Indeed, the current paper identifies Cdh13 as a newly recognized functional molecule in SCPN axon development by in part using this approach. Cdh13 belongs to a well-known gene family, and its expression by SCPN was already reported by us (Arlotta*, Molyneaux* et al., Neuron, 2005). Despite these two facts, we newly identify its function in SCPN development, which has never been investigated or reported. We appreciate the reviewer encouraging us to elaborate on this here.
Recent technical advances allow functional screening of a larger list of genes in vivo (Jin et al., Science, 2020; Ramani et al., bioRxiv, 2024; Zheng et al., Cell, 2024). That said, it is still a challenge to specifically access SCPN in vivo and apply such a high-throughput screening assay for axon development. We agree and predict that future work of this type will likely lead to identification of other new and unknown molecular regulators. We respectfully submit that our work reported here will provide a useful foundation for many such future studies.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study investigates how neural representations in the postrhinal and medial entorhinal cortices evolve with the learning of a visual associative memory task in mice. The findings provide new insights into how non-spatial information is differentially encoded across interconnected brain areas, with strong evidence that stimulus encoding is robust in the postrhinal cortex and emerges more weakly in the medial entorhinal cortex with learning. The evidence is solid overall, particularly in the use of sophisticated population-level analyses and two-photon imaging across learning phases, although the interpretation of regression models and clustering would benefit from additional clarity and control. The work will be of broad interest to systems neuroscientists studying learning, memory, and cortical circuit function.
-
Reviewer #1 (Public review):
Summary:
Nysten et al. use in vivo 2-photon calcium imaging in behaving mice learning a visual associative memory task to understand how neural dynamics in the postrhinal cortex and medial entorhinal cortex evolve over task learning and through reversal learning. Using a combination of analyses to measure trial-averaged neural responses, regression models, and population decoding methods, the authors argue that both POR and MEC dynamics evolve over learning, with relatively more neurons in MEC becoming responsive. The impact of this study comes from comparing neural dynamics across multiple medial temporal lobe circuits to show how different aspects of task structure are differentially encoded. Below, I have listed several major concerns that need to be addressed to ensure the findings are robust.
Strengths:
(1) The study employs a well-controlled behavioral paradigm alongside powerful cellular-resolution two-photon imaging, enabling high-throughput recordings of hundreds of neurons simultaneously in deep brain structures.
(2) The simplicity of the task allows for a detailed examination of learning dynamics across multiple stages, including early and late learning in the main task, as well as during reversal learning.
(3) The use of sophisticated analysis methods to compare and contrast learning dynamics in large neuronal populations strengthens the study, though additional steps are needed to ensure their robustness (detailed below).
(4) Two-photon imaging enables the investigation of functional topography, further supporting previous findings of functional clustering in MEC across different task and behavioral domains.
Weaknesses:
(1) GLM Robustness & Behavioral Attribution: The current GLM design may misattribute neural activity by lacking appropriate time lags for velocity and not accounting for distinct neural states (e.g., rest vs. run). Given MEC's known speed-invariant coding, the observed decrease in speed-modulated neurons may be an artifact rather than a true learning effect. Additionally, gradual behavioral stabilization over training could influence neural dynamics in ways not fully accounted for.
(2) Licking vs. Movement Encoding: The increase in lick-modulated neurons raises questions about whether these neurons encode reward anticipation or motor execution. Without a detailed analysis of error trials and the timing of licking vs. movement adjustments, it remains unclear whether MEC activity reflects predictive coding of reward or simply motor feedback.
(3) Clustering Interpretation Issues: The functional clustering approach does not control for correlations between behavioral features, making it difficult to determine whether speed modulation plays a role in cluster assignments. The anatomical analysis in Figure 6 relies heavily on clusters that may be predominantly defined by a single regressor, requiring further clarification.
(4) Data Presentation & Statistical Support: Some key claims, particularly the increase in task-modulated neurons with learning (Figure 3), lack statistical quantification.
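The time-lag concern in point (1) can be illustrated with a small sketch of a lagged design matrix, in which each behavioral regressor (for example, running speed) enters the model at several temporal offsets so that delayed or anticipatory neural responses are not attributed to another regressor. This is only an illustration of the general technique, not the authors' GLM; the function name `lagged_design`, the lag convention, and the zero-padding at the trace edges are all assumptions.

```python
def lagged_design(signal, lags):
    """Build a design matrix with one column per time lag (hypothetical helper).

    Convention assumed here: a positive lag k means the column at time t
    holds signal[t - k] (activity follows behavior); a negative lag means
    the column holds a future sample (activity precedes behavior).
    Samples that fall outside the recording are zero-padded.
    """
    n = len(signal)
    rows = []
    for t in range(n):
        row = []
        for lag in lags:
            src = t - lag
            row.append(signal[src] if 0 <= src < n else 0.0)
        rows.append(row)
    return rows


# Toy speed trace with lags of -1, 0 and +1 frames.
speed = [1.0, 2.0, 3.0, 4.0]
design = lagged_design(speed, [-1, 0, 1])
```

Each row of `design` can then be passed to whatever regression routine is used for the GLM; fitting separate models for rest and run epochs would be one way to address the neural-state concern raised in the same point.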
-
Reviewer #2 (Public review):
Summary:
The authors examine medial entorhinal cortex (MEC) and postrhinal cortex (POR) responses using Ca imaging during a non-spatial, Go/No-Go visual association task. The authors specifically consider whether MEC encodes stimulus information, as previously seen and hypothesized in POR, as well as other task elements such as reward, and whether and how these responses evolve with learning in both regions. The authors find that, in general, POR encodes task-related information more strongly compared to MEC. In particular, POR encodes the stimulus even before the animal reaches expert performance, whereas MEC shows considerably weaker stimulus encoding that emerges with learning. Both regions also display licking-related coding, although notably this activity reflects choice or licking-preparation, which emerges with learning. Further, despite its overall reduced coding, MEC exhibits greater anatomical clustering of cells with similar functional properties compared to POR.
Strengths:
These data are generally well-presented, both in the description of the experimental paradigm - which is simple yet highly informative - and in the individual results for each section. A major strength is the dataset, which includes many cells, including a subset that are tracked across learning. I found the core findings - (1) that POR has robust stimulus encoding while MEC develops weaker stimulus information with learning, and (2) that both POR and MEC exhibit an increase in lick-modulated cells, although POR has more, and stronger, lick-modulated cells - to be generally well-supported by the data presented. The general question of whether and how MEC encodes non-spatial task-relevant features and how these responses (if they exist) emerge with learning is of general interest. In addition, how MEC activity contrasts with activity in an upstream region, thereby indicating what information MEC gets and what it does with it, is also of broad interest.
Weaknesses:
I perceived two primary weaknesses.
The first was that it was not entirely clear to me what was expected of MEC and POR responses, and whether the observations the authors made were surprising or entirely in line with what would've been predicted based on prior work. In some ways, the results seem expected - POR had visual signals, MEC had few visual signals but some reward signals.
The second is that it took me a long time to extract what I perceived to be the core results of the paper, and in some places, it was a little hard for me to understand all the analyses and results together as one cohesive step forward in our understanding of MEC and POR coding properties.
I think this was most evident in the results presented in Figure 4. Up until Figure 4, it seemed to me that the core results were:
(1) Visual (stimulus) information is present in POR responses from very early learning, whereas weak stimulus information develops in MEC with learning, and in both cases, there is a preference for the plus stimulus.
(2) Both POR and MEC show an increase in lick-modulated cells with learning, although more cells encode licking at all stages in POR.
This is nicely summarized in my view by Figure 3e. However, I became confused when Figure 4 entered the picture. Here, it seems that by far the most predominant coefficient in the model is the lick response, with stimulus features playing a smaller role - specifically, at the end of learning, 60% of POR cells were characterized as predominantly lick/non lick, compared to 25% defined by their coding to the stimulus. I can appreciate that there might be nuances to these and previous analyses such that all the results sit cohesively together, but I think that needs to be clarified.
A second example - Figure 2b - shows that many (75%) of MEC neurons seem to be selectively active for the plus stimulus, but when doing the GLM analysis with the plus stimulus (and reward/licking) as features, many fewer neurons (35%) are determined to be encoding task information. It was not clear to me what was contributing to the discrepancy between these two results - is it that MEC activity often increases with learning, but doesn't increase by that much?
I think in general this can be helped by specifically pointing out how the results of these different analyses relate to each other, including specifically mentioning where the results might seem unaligned (at least on the surface).
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
In this important study, Baniulyte and Wade provide convincing evidence that translation of a short ORF denoted toiL positioned upstream of the topAI-yjhQP operon is responsive to different ribosome-targeting antibiotics, consequently controlling translation of the TopAI toxin as well as Rho-dependent transcription termination. Strengths of the study include combining a genetic screen to identify 23S rRNA mutations that affect topAI expression and a creative approach to map the different locations of ribosome stalling within toiL induced by different antibiotics, with ribosome profiling and RNA structure probing by SHAPE to examine consequences of different antibiotics on toiL-mediated regulation. The work leaves unanswered how bacteria benefit by activating expression of the genes using the proposed strategy and the mechanism underlying ToiL's sensing of structurally distinct antibiotics.
-
Reviewer #1 (Public review):
Summary:
The manuscript reports that expression of the E. coli operon topAI/yjhQ/yjhP is controlled by the translation status of a small open reading frame, which the authors have discovered and named toiL, located in the leader region upstream of the operon. The authors propose the following model for topAI activation: Under normal conditions, toiL is translated but topAI is not expressed because of Rho-dependent transcription termination within the topAI ORF and because its ribosome binding site and start codon are trapped in an mRNA hairpin. Ribosome stalling at various codons of the toiL ORF, prompted in this work by some ribosome-targeting antibiotics, triggers an mRNA conformational switch which allows translation of topAI and, in addition, activation of the operon's transcription because the presence of translating ribosomes at the topAI ORF blocks Rho from terminating transcription. The model is appealing and the experimental data mainly support it. However, it remains unanswered what the true trigger of the translation arrest at toiL is and what the physiological role of the induced expression of the topAI/yjhQ/yjhP operon is.
-
Reviewer #2 (Public review):
Summary:
Baniulyte and Wade describe how translation of an 8-codon uORF denoted toiL upstream of the topAI-yjhQP operon is responsive to different ribosome-targeting antibiotics, consequently controlling translation of the TopAI toxin as well as Rho-dependent termination within the gene.
Strengths:
The authors used multiple different approaches such as a genetic screen to identify factors such as 23S rRNA mutations that affect topAI expression and ribosome profiling to examine the consequences of various antibiotics on toiL-mediated regulation.
Weaknesses:
Future experiments will be needed to better understand the physiological role of the toiL-mediated regulation and elucidate the mechanism of specific antibiotic sensing.
The results are clearly described, and the revisions have helped to improve the presentation of the data.
-
Reviewer #3 (Public review):
In this revised manuscript, the authors provide convincing data to support an elegant model in which ribosome stalling by ToiL promotes downstream topAI translation and prevents premature Rho-dependent transcription termination. However, the physiological consequences of activating topAI-yjhQP expression upon exposure to various ribosome-targeting antibiotics remain unresolved. The authors have satisfactorily addressed all major concerns raised by the reviewers, particularly regarding the SHAPE-seq data. Overall, this study underscores the diversity of regulatory ribosome-stalling peptides in nature, highlighting ToiL's uniqueness in sensing multiple antibiotics and offering significant insights into bacterial gene regulation coordinated by transcription and translation.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
Summary:
The manuscript reports that expression of the E. coli operon topAI/yjhQ/yjhP is controlled by the translation status of a small open reading frame, which the authors have discovered and named toiL, located in the leader region of the operon. The authors propose the following model for topAI activation: Under normal conditions, toiL is translated but topAI is not expressed because of Rho-dependent transcription termination within the topAI ORF and because its ribosome binding site and start codon are trapped in an mRNA hairpin. Ribosome stalling at various codons of the toiL ORF, caused by the presence of some ribosome-targeting antibiotics, triggers an mRNA conformational switch which allows translation of topAI and, in addition, activation of the operon's transcription because the presence of translating ribosomes at the topAI ORF blocks Rho from terminating transcription. Even though the model is appealing and several of the experimental data support some aspects of it, several inconsistencies remain to be resolved. In addition, even though TopAI was shown to be an inhibitor of topoisomerase I (Yamaguchi & Inouye, 2015, NAR 43:10387), the authors suggest, without offering any experimental support, that, because ribosome-targeting antibiotics act as inducers, expression of the topAI/yjhQ/yjhP operon may confer resistance to these drugs.
Strengths:
- There is good experimental support of the transcriptional repression/activation switch aspect of the model, derived from well-designed transcriptional reporters and ChIP-qPCR approaches.
- There is a clever use of the topAI-lacZ reporter to find the 23S rRNA mutants where expression of topAI was upregulated. This eventually led the authors to identify that translation events occurring at toiL are important to regulate the topAI/yjhQ/yjhP operon. Is there any published evidence that ribosomes with the identified mutations translate slowly (decreased fidelity does not necessarily mean slow translation, does it?)?
G2253 is in helix 80 of the 23S rRNA, which has been proposed to be involved in correct positioning of the tRNA. Mutations in helix 80 have been reported to cause defects in peptidyl transferase center activity, which could reduce the rate of ribosome movement along the mRNA. If ribosomes are sufficiently slowed when translating toiL, this could induce expression of topAI. G1911 and Ψ1917 are in helix 69 of the 23S rRNA, which is involved in forming the inter-subunit bridge, as well as interactions with release factors. Mutations in helix 69 cause a decrease in the processivity of translation, suggesting that the mutations we identified may increase the occupancy of ribosomes within toiL, thereby inducing expression of topAI. We have added text to the Discussion section to include this speculation.
- Authors incorporate relevant links to the antibiotic-mediated expression regulation of bacterial resistance genes. Authors can also mention the tryptophan-mediated ribosome stalling at the tnaC leader ORF that activates the expression of tryptophan metabolism genes through blockage of Rho-mediated transcriptional attenuation.
We have added a citation to a recent structural study of ribosomes translating the tnaC uORF. Specifically, we speculate in the Discussion that toiL may have evolved to sense a ribosome-targeting antibiotic, or another ribosome-targeting small molecule such as an amino acid.
Weaknesses:
The main weaknesses of the work are related to several experimental results that are not consistent with the model, or related to a lack of data that needs to be included to support the model.
The following are a few examples:
- It is surprising that the authors do not mention that several published Ribo-seq data from E. coli cells show active translation of toiL (for example Li et al., 2014, Cell 157: 624). Therefore, it is hard to reconcile with the model that start codon/Shine-Dalgarno mutations in the toiL-lux reporter have no effect on luciferase expression (Figure 2C, bar graphs of the no antibiotic control samples).
These data are for a topAI-lux reporter construct rather than toiL-lux. In our model, ribosome stalling within toiL is required to induce expression of the downstream genes; preventing translation of toiL by mutating the start codon or Shine-Dalgarno sequence would not cause ribosome stalling, consistent with the lack of an effect on topAI expression.
- The SHAPE reactivity data shown in Figure 5A are not consistent with the toiL ORF being translated. In addition, it is difficult to visualize the effect of tetracycline on mRNA conformation with the representation used in Figure 5B. It would be better to show SHAPE reactivity without/with Tet (as shown in panel A of the figure).
We have modified this figure (now Figure 6) so that we no longer show the SHAPE-seq data +/- tetracycline overlaid on the predicted RNA structure, since at best, the predicted structure likely only represents the uninduced state. We have included the predicted structure together with the SHAPE-seq data for untreated cells as a separate panel because it is part of the basis for our model. We have also added a supplementary figure showing a similar RNA structure prediction based on conservation of the topAI upstream region across species (Figure 6 – figure supplement 1), and we describe this in the text.
- The "increased coverage" of topAI/yjhQ/yjhP in the presence of tetracycline from the Ribo-seq data shown in Figure 6A can be due to activation of translation, transcription, or both. For readers to know which of these possibilities apply, authors need to provide RNA-seq data and show the profiles of the topAI/yjhQ/yjhP genes in control/Tet-treated cells.
A previous study (Li et al., 2014, PMID 24766808) compared RNA-seq and Ribo-seq data for E. coli to measure normalized ribosome occupancy for each gene. However, sequence coverage for topAI was too low to confidently quantify either the RNA-seq or the Ribo-seq data. Presumably RNA levels were low because of Rho termination. Hence, we were not confident that RNA-seq would provide information on the regulation of topAI-yjhQP. Other data in our study provide strong evidence that regulation is primarily at the level of translation. And the key conclusion from Figure 6 (now Figure 7) is that tetracycline stalls ribosomes on start codons.
- Similarly, to support the data of increased ribosomal footprints at the toiL start codon in the presence of Tet (Figure 6B), authors should show the profile of the toiL gene from control and Tet-treated cells.
Figure 6B shows data for both treated and untreated cells. The overall ribosome occupancy is much lower for untreated cells, making it difficult to draw strong conclusions about the relative distribution of ribosomes across toiL.
- Representation of the mRNA structures in the model shown in Figure 5, does not help with visualizing 1) how ribosomes translate toiL since the ORF is trapped in double-stranded mRNA, and 2) how ribosome stalling on toiL would lead to the release of the initiation region of topAI to achieve expression activation.
We now show the predicted structure with only SHAPE-seq data for untreated cells. The comparison of SHAPE-seq +/- tetracycline is shown without reference to the predicted structure.
- The authors speculate that, because ribosome-targeting antibiotics act as expression inducers [by the way, authors should mention and comment that, more than a decade ago, it had been reported that kanamycin (PMID: 12736533) and gentamycin (PMID: 19013277) are inducers of topAI and yjhQ], the genes of the topAI/yjhQ/yjhP operon may confer resistance to these antibiotics. Such a suggestion can be experimentally checked by simply testing whether strains lacking these genes have increased sensitivity to the antibiotic inducers.
We thank the reviewer for pointing out these references, which we now cite. The fact that another group found that gentamycin induces topAI expression – it is one of the most highly induced genes in that paper – strongly suggests that we missed the key inducing concentrations for one or more antibiotics, meaning that topAI is induced by even more ribosome-targeting antibiotics than we realized.
We did some preliminary experiments to look for effects of TopAI, YjhQ, and/or YjhP on antibiotic sensitivity, but generated only negative results. Since these experiments were preliminary and far from exhaustive, we have chosen not to include them in the manuscript. Other studies of genes regulated by ribosome stalling in a uORF have looked at genes whose functions in responding to translation stress were already known, so the environmental triggers were more obvious. With so many possible triggers for topAI-yjhQP, it will likely require considerable effort to find the relevant trigger(s). Hence, we consider this an important question, but beyond the scope of this manuscript.
Reviewer #2 (Public Review):
Summary:
In this important study, Baniulyte and Wade describe how the translation of an 8-codon uORF denoted toiL upstream of the topAI-yjhQP operon is responsive to different ribosome-targeting antibiotics, consequently controlling translation of the TopAI toxin as well as Rho-dependent termination within the gene.
Strengths:
I appreciate that the authors used multiple different approaches such as a genetic screen to identify factors such as 23S rRNA mutations that affect topAI expression and ribosome profiling to examine the consequences of various antibiotics on toiL-mediated regulation. The results are convincing and clearly described.
Weaknesses:
I have relatively minor suggestions for improving the manuscript. These mainly relate to the figures.
Reviewer #3 (Public Review):
Summary:
The authors nicely show that the translation and ribosome stalling within the ToiL uORF upstream of the co-transcribed topAI-yjhQ toxin-antitoxin genes unmask the topAI translational initiation site, thereby allowing ribosome loading and preventing premature Rho-dependent transcription termination in the topAI region. Although similar translational/transcriptional attenuation has been reported in other systems, the base pairing between the leader sequence and the repressed region by the long RNA looping is somewhat unique in toiL-topAI-yjhQP. The experiments are solidly executed, and the manuscript is clear in most parts, with areas that could be improved or better explained. The real impact of such a study is not easy to appreciate due to a lack of investigation of the physiological consequences of topAI-yjhQP activation upon antibiotic exposure (see details below).
Strengths:
The conclusion/model is supported by the integrated approaches consisting of genetics, in vivo SHAPE-seq and Ribo-Seq.
The study adds an elegant example of a cis-acting regulatory peptide to a growing list of functional small proteins in bacterial proteomes.
Recommendations for the authors:
Reviewing Editor Comments:
(1) Examine the consequences of mutations impeding translation of the topAI/yjhQ/yjhP operon on cell growth in the presence and absence of antibiotics.
See response to Reviewer 1’s comment.
(2) Resolve discrepancies between the SHAPE data indicating constitutive sequestration of the toiL Shine Dalgarno sequence with antibiotic-regulated translation of the toiL ORF.
See response to Reviewer 1’s comment.
(3) Reconcile published Ribo-Seq data with the model that start codon/Shine-Dalgarno mutations in the toiL-lux reporter have no effect on luciferase expression in the absence of antibiotics.
See response to Reviewer 1’s comment.
(4) Clarify whether antibiotic MIC values were employed to select antibiotic concentrations for different experiments.
The antibiotic concentrations we used are in line with reported MICs for E. coli. We now list the reported ECOFFs/MICs and include relevant citations.
(5) Provide RNA-seq data to complement the Ribo-Seq data for the topAI/yjhQ/yjhP genes in control vs. Tet-treated cells.
See response to Reviewer 1’s comment.
(6) Revise the text to address as many of the reviewers' suggestions as reasonably possible.
Changes to the text have been made as indicated in the responses to the reviewers’ comments.
Reviewer #2 (Recommendations for the Authors):
(1) Page 6: I would have liked to have more information about the 39 suppressor mutations in rho. Do any of the cis-acting mutations give support for the model proposed in Figure 8?
We only know the specific mutation for some of the strains, and we now list those mutations in the Methods section. For other mutants, we mapped the mutation to either the rho gene or to Rho activity, but we did not sequence the rho gene. Most of the specific mutations we did identify fall within the primary RNA-binding site of Rho and hence should be considered partial-loss-of-function mutations (complete loss of function would be lethal).
We identified cis-acting mutations by re-transforming the lacZ reporter plasmid into a wild-type strain. We did not sequence any of these plasmids.
(2) Page 12-13, Section entitled "Mapping ribosome stalling sites induced by different antibiotics": This section should start with a better transition regarding the logic of why the experiments were carried out and should end with an interpretation of the results.
We have added a few sentences at the start of this section to explain the rationale. We have also added two sentences at the end of this section to summarize the interpretation of the data.
(3) Page 15: The authors should discuss under what conditions the expression of TopAI (and YjhQ/YjhP) might be induced. Is expression also elevated upon amino acid starvation?
We have looked through public RNA-seq data but have not identified growth conditions other than antibiotic treatment that induce expression of topAI, yjhQ or yjhP.
(4) References: The authors should be consistent about capitalization, italics, and abbreviations in the references.
These formatting errors will be fixed in the proofing stage.
(5) All graph figures: There should be more uniformity in the sizes of individual data points (some are almost impossible to see) and error bars across the figures.
We have tried to make the data points and error bars more visible for figures where they were smaller.
(6) Figure 1B: I do not think the left arrow labeling is very intuitive and suggest renaming these constructs.
We have removed the arrows to improve clarity.
(7) Figure 2A: toiL should be introduced at the first mention of Figure 2A.
We have added a schematic of the topAI-yjhQ-yjhP region as Figure 1A, including the toiL ORF, which we briefly mention in the text. We have opted to split Figure 2C into two panels. In Figure 2C we now only show data for the wild-type construct. Data for the mutant constructs are now shown in a new figure (Figure 5), alongside data for the wild-type constructs. We have simplified Figure 2A, since the mutations are not relevant to this revised figure, and we now show the schematic with the mutations as Figure 5A.
(8) Figure 3C and 3D: I suggest giving these graphs headings (or changing the color of the bars in Figure 3D) to make it more obvious that different things are measured in the two panels.
We have added headers to panels B-D to make it clear which graphs show ChIP-qPCR data and which graph shows qRT-PCR data.
(9) Figure 6: It might be nice to show the topAI-yjhQP operon here.
We now show the operon in Figure 1A.
(10) Figure 8: This figure could be optimized by adding 5' and 3' end labels and having more similarity with the model in Figure 7.
The constructs shown in Figure 7 lack most of the topAI upstream region, so they aren’t readily comparable to the schematic in Figure 8. However, we have changed the color of the ribosome in Figure 7 to match that in Figure 8. We also indicate the 5’ end of the RNA in Figure 8.
Reviewer #3 (Recommendations for the Authors):
Areas to improve:
(1) While it's important to learn about ToiL-dependent regulation of the downstream topAI-yjhQ toxin-antitoxin genes, the physiological consequence of topAI-yjhQ activation seems to be lost in the manuscript. Everything was done with a lacZ/lux reporter. In the absence of toiL translation (i.e. SD mutant) and/or ribosome stalling, does premature transcription termination result in non-stoichiometric synthesis of toxin vs. antitoxin, leading to growth arrest or another measurable phenotype? Knowing the impact of ToiL in the native topAI-yjhQ context will be valuable.
See response to Reviewer 1’s comment.
(2) It was indicated in Figure 4-figure supplement 1 that toiL homologs are found in many other proteobacteria. Do the UR sequences in those species also form a similar inhibitory RNA loop? The nt sequence identity of toiL is likely to be constrained by the base pairing of the topAI 5' region.
We have added a supplementary figure panel showing an RNA structure prediction for the topAI upstream region based on sequence alignment of homologous regions from other species (Figure 6 – figure supplement 1).
What is the genome-wide frequency of the MLENVII heptapeptide in E. coli? Is the sequence disfavored to avoid spurious multi-antibiotic sensing?
LENVII is not found in any annotated E. coli K-12 protein. However, this is a sufficiently long sequence that we would expect few to no instances in the E. coli proteome.
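The expectation stated in this response can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: it assumes uniform, independent amino acid usage (which real proteomes do not follow) and uses assumed round numbers for the size of the E. coli K-12 proteome (about 4,300 proteins and 1.3 million residues). The helper names `expected_kmer_hits` and `count_motif` are hypothetical.

```python
def expected_kmer_hits(k, total_residues, n_proteins, alphabet_size=20):
    """Expected number of exact matches to one specific k-mer.

    Assumes uniform, independent residues. A protein of length L offers
    L - k + 1 start positions (proteins shorter than k are ignored here).
    """
    start_positions = total_residues - n_proteins * (k - 1)
    return max(start_positions, 0) / alphabet_size ** k


def count_motif(proteins, motif):
    """Count occurrences (overlaps allowed) of motif in a dict of sequences."""
    total = 0
    for seq in proteins.values():
        start = seq.find(motif)
        while start != -1:
            total += 1
            start = seq.find(motif, start + 1)
    return total


# Assumed round numbers for illustration, not measured values.
APPROX_RESIDUES = 1_300_000
APPROX_PROTEINS = 4_300

hexamer_expectation = expected_kmer_hits(6, APPROX_RESIDUES, APPROX_PROTEINS)
heptamer_expectation = expected_kmer_hits(7, APPROX_RESIDUES, APPROX_PROTEINS)
```

Under these assumptions, the expected number of chance matches to a specific hexamer such as LENVII is on the order of 0.02, and smaller still for the full heptamer, so the absence of matches is not by itself evidence that the sequence is disfavored. `count_motif` shows how the observed count could be obtained from a dictionary of protein sequences.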
(3) Figure 1A, it would be helpful to indicate the location of the toiL (red arrow as in Figure 2A) relative to the putative rut site early in the beginning of the results. Does TSS mark the transcription start site? There is no annotation of TSS in the figure legend. Was TSS previously mapped experimentally? Please include relevant citations.
We now indicate the position of the TSS relative to the topAI start codon. Similarly, we indicate the position of the start of toiL relative to the topAI start codon in Figure 2A. We now explain “TSS” in the figure legend. There is a reference in the text for the TSS (Thomason et al., 2015).
(4) Please consider rearranging the results section, perhaps more helpful to introduce the toiL in Figure 1 or earlier. The current format requires readers to switch back-and-forth between Figure 4 and Figure 2.
We have added a schematic of the topAI upstream region as Figure 1A, and we have separated Figure 2C as described in a response to a comment from Reviewer 2.
(5) Figure 2A and Figure 2-Figure Suppl 1A, for clarity, please mark the rut site upstream of the red arrow.
Rather than mark the rut on Figure 2A, which would make for a busy schematic, readers can compare the positions of the rut to those of toiL, which we have now added to Figures 1B (formerly Figure 1A) and 2A.
(6) The following conclusion seems speculative: "...but does not trigger termination until RNAP ..., >180 nt further downstream…". Shouldn't the authors already know where the termination site is based on their previous Term-seq data (see Ref 1, Adams PP et al 2021)?
Sites of Rho-dependent transcription termination cannot be mapped precisely from Term-seq data because exoribonucleases rapidly process the unstructured RNA 3’ ends.
(7) Genetic screen: Please discuss why the 23S rRNA mutations that cause translational infidelity could promote topAI translation. Wouldn't the mutant ribosome be affected in translating toiL?
See response to Reviewer 1’s comment.
(8) Although antibiotic concentrations were provided in Figure 2 legend, please provide the MIC values of each antibiotic, e.g., in Table S2, for the tested E. coli strain, to inform readers how specific subinhibitory concentrations were chosen.
See response to Reviewing Editor.
(9) Please clarify how the luciferase units on the y-axis of Figure 2A were calculated. Why is the scale drastically higher than that of Figure 7C, which uses the same antibiotics?
These reporter assays use different constructs. The reporter construct used for experiments in Figure 7 includes a portion of the ermCL gene and associated downstream sequence. We have enlarged Figure 7A to highlight the difference in reporter constructs.
(10) Table S4 needs a few more details. It is unclear how those numbers in columns G-H were generated. Do those numbers correspond to ribosome density per nt/ORF?
We have added footnotes to Table S4 to indicate that the numbers in columns G and H represent sequence read coverage normalized by region length and by the upper quartile of gene expression.
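As a rough illustration of the normalization described in this response, the sketch below divides read counts by region length and then by the upper quartile of per-gene read densities. The function name, the argument names, the order of the two steps, and the quartile method are all assumptions made for illustration; the authors' exact procedure is given only in the Table S4 footnotes.

```python
from statistics import quantiles


def normalized_coverage(read_counts, region_lengths, gene_densities):
    """Length- and upper-quartile-normalized coverage (illustrative only).

    read_counts    -- reads mapped to each region of interest
    region_lengths -- length of each region in nucleotides
    gene_densities -- per-gene read densities (reads per nt) for the sample,
                      used only to compute the upper quartile
    """
    # Upper quartile (75th percentile) of gene expression in this sample.
    # The "inclusive" method is an arbitrary choice for this sketch.
    _, _, upper_quartile = quantiles(gene_densities, n=4, method="inclusive")
    return [
        (count / length) / upper_quartile
        for count, length in zip(read_counts, region_lengths)
    ]


# Hypothetical numbers, not taken from the manuscript.
print(normalized_coverage([80, 40], [10, 20], [1, 2, 3, 4, 5]))
```

Dividing by the upper quartile rather than by total reads is a common way to reduce the influence of a few very highly expressed genes on the scaling factor.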
(11) Figure 5, if the SHAPE results were true, the Shine Dalgarno sequence of toiL is sequestered in the hairpin structure with and without tetracycline treatment. It is inconceivable that translational initiation will occur efficiently, please discuss.
Our representation of the SHAPE-seq data was confusing since we overlaid the SHAPE-seq changes on a predicted structure that likely corresponds to the uninduced state. We hope that the new version of Figure 5 is clearer.
We presume the reviewer is referring to the Shine-Dalgarno sequence of topAI rather than toiL, since the Shine-Dalgarno sequence of toiL is predicted to be unstructured even in the absence of tetracycline treatment. The ribosome-binding site of topAI is more accessible in cells treated with tetracycline, although the SHAPE-seq data suggest that this is a transient event. The binding of the initiating ribosome may also reduce reactivity in this region under inducing conditions. We now discuss this briefly in the text.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides the most comprehensive analysis of Salmonella Dublin to date, uncovering distinct genotypic adaptations, antimicrobial resistance patterns, and virulence strategies that influence epidemiological success. The revised manuscript is very valuable, rigorous and compelling.
-
Reviewer #1 (Public review):
The manuscript consists of two separate but interlinked investigations: genomic epidemiology and virulence assessment of Salmonella Dublin. ST10 dominates the epidemiological landscape of S. Dublin, while ST74 was uncommonly isolated. Detailed genomic epidemiology of ST10 unfolded the evolutionary history of this common genotype, highlighting clonal expansions linked to each distinct geography. Notably, North American ST10 was associated with more antimicrobial resistance compared to others. The authors also performed long-read sequencing on a subset of isolates (ST10 and ST74), and uncovered a novel recombinant virulence plasmid in ST10 (IncX1/IncFII/IncN). Separately, the authors performed cell invasion and cytotoxicity assays on the two S. Dublin genotypes, showing differential responses between the two STs. ST74 replicates better intracellularly in macrophages compared to ST10, but both STs induced comparable cytotoxicity levels. Comparative genomic analyses between the two genotypes showed certain genetic content unique to each genotype, but no further analyses were conducted to investigate which genetic factors are likely associated with the observed differences. The study provides a comprehensive and novel understanding of the evolution and adaptation of two S. Dublin genotypes, which can inform public health measures. The methodologies used in both approaches were sound and described in sufficient detail, and data analyses were performed with rigour. Source data were fully presented and accessible to readers.
Comments on revised version:
The authors have addressed all the points raised by the reviewer. The manuscript is now much enhanced in clarity and accuracy. The rewritten Discussion is more relevant and brings in comparison with other invasive Salmonella serotypes.
-
Reviewer #2 (Public review):
This is a comprehensive analysis of Salmonella Dublin genomes that offers insights into the global spread of this pathogen and region-specific traits that are important to understand its evolution. The phenotyping of isolates of ST10 and ST74 also offers insights into the variability that can be seen in S. Dublin, which is also seen in other Salmonella serovars, and reminds the field that it is important to look beyond lab-adapted strains to truly understand these pathogens. This is a valuable contribution to the field. The only limitation, which the authors also acknowledge, is the bias towards S. Dublin genomes from high-income settings. However, there is no selection bias; this is simply a consequence of publicly available sequences.
-
Author response:
The following is the authors’ response to the previous reviews
Public Reviews:
Reviewer #1 (Public review):
The manuscript consists of two separate but interlinked investigations: genomic epidemiology and virulence assessment of Salmonella Dublin. ST10 dominates the epidemiological landscape of S. Dublin, while ST74 was uncommonly isolated. Detailed genomic epidemiology of ST10 unfolded the evolutionary history of this common genotype, highlighting clonal expansions linked to each distinct geography. Notably, North American ST10 was associated with more antimicrobial resistance compared to others. The authors also performed long-read sequencing on a subset of isolates (ST10 and ST74), and uncovered a novel recombinant virulence plasmid in ST10 (IncX1/IncFII/IncN). Separately, the authors performed cell invasion and cytotoxicity assays on the two S. Dublin genotypes, showing differential responses between the two STs. ST74 replicates better intracellularly in macrophages compared to ST10, but both STs induced comparable cytotoxicity levels. Comparative genomic analyses between the two genotypes showed certain genetic content unique to each genotype, but no further analyses were conducted to investigate which genetic factors are likely associated with the observed differences. The study provides a comprehensive and novel understanding of the evolution and adaptation of two S. Dublin genotypes, which can inform public health measures. The methodologies used in both approaches were sound and described in sufficient detail, and data analyses were performed with rigour. Source data were fully presented and accessible to readers.
Comments on revised version:
The authors have addressed all the points raised by the reviewer. The manuscript is now much enhanced in clarity and accuracy. The re-written Discussion is more relevant and brings in comparison with other invasive Salmonella serotypes.
Comments:
In light of the metadata supplied in this revision, for Australian isolates, all human cases of ST74 (n=7) were from faeces (presumably from gastroenteritis), while 18/40 of ST10 were from invasive specimens (blood and abscess). This may contradict the manuscript's findings and discussion of the different experimental phenotypes of the two STs, with ST74 showing more replication in macrophages and being potentially more invasive. Thus, the reviewer suggests that the authors mention this disparity in the Discussion and discuss possible underlying reasons. This can strengthen the authors' rationale for further in vivo studies.
We thank the reviewer for pointing out this important observation. We have amended the text in the Discussion to address the differences in source of human cases as suggested by the Reviewer (lines 392-430). We have also included text highlighting the important knowledge gaps in understanding the drivers for emerging iNTS with broad host ranges and identifying future avenues of research that could be explored to better understand the observed differences in the host-pathogen interactions.
Reviewer #2 (Public review):
This is a comprehensive analysis of Salmonella Dublin genomes that offers insights into the global spread of this pathogen and region-specific traits that are important to understand its evolution. The phenotyping of isolates of ST10 and ST74 also offers insights into the variability that can be seen in S. Dublin, which is also seen in other Salmonella serovars, and reminds the field that it is important to look beyond lab-adapted strains to truly understand these pathogens. This is a valuable contribution to the field. The only limitation, which the authors also acknowledge, is the bias towards S. Dublin genomes from high-income settings. However, there is no selection bias; this is simply a consequence of publicly available sequences.
We thank the reviewer for their comments and acknowledge the limitations of this study.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
In this interesting and important work, the authors use detailed behavioral analysis and rigorous quantitative modeling to support the idea that C. elegans uses an "accept-reject" strategy to make behavioral decisions upon encountering food patches based on learned features of its environment. The work convincingly expands our understanding of the behavioral repertoire of this species and provides a strong foundation for future mechanistic studies.
-
Reviewer #1 (Public review):
Summary:
This work uses a novel, ethologically relevant behavioral task to explore decision-making paradigms in C. elegans foraging behavior. By rigorously quantifying multiple features of animal behavior as they navigate in a patch food environment, the authors provide strong evidence that worms exhibit one of three qualitatively distinct behavioral responses upon encountering a patch: (1) "search", in which the encountered patch is below the detection threshold; (2) "sample", in which animals detect a patch encounter and reduce their motor speed, but do not stay to exploit the resource and are therefore considered to have "rejected" it; and (3) "exploit", in which animals "accept" the patch and exploit the resource for tens of minutes. Interestingly, the probability of these outcomes varies with the density of the patch as well as the prior experience of the animal. Together, these experiments provide an interesting new framework for understanding the ability of the C. elegans nervous system to use sensory information and internal state to implement behavioral state decisions.
Strengths:
- The work uses a novel, neuroethologically-inspired approach to studying foraging behavior
- The studies are carried out with an exceptional level of quantitative rigor and attention to detail
- Powerful quantitative modeling approaches including GLMs are used to study the behavioral states that worms enter upon encountering food, and the parameters that govern the decision about which state to enter
- The work provides strong evidence that C. elegans can make 'accept-reject' decisions upon encountering a food resource
- Accept-reject decisions depend on the quality of the food resource encountered as well as on internally represented features that provide measurements of multiple dimensions of internal state, including feeding status and time
-
Reviewer #2 (Public review):
This study provides an experimental and computational framework to examine and understand how C. elegans make decisions while foraging in environments with patches of food. The authors show that C. elegans reject or accept food patches depending on a number of internal and external factors.
The key novelty of this paper is the explicit demonstration of behavior analysis and quantitative modeling to elucidate decision-making processes. In particular, the paper describes the exploring vs. exploiting phases and the sensing vs. non-sensing categories of foraging behavior, based on the clustering of behavioral states defined in a multi-dimensional behavior-metrics space, and implements a generalized linear model (GLM) whose parameters can provide quantitative biological interpretations.
The work builds on the literature of C. elegans foraging by adding the reject/accept framework.
-
Reviewer #3 (Public review):
Summary:
In this study by Haley et al, the authors investigated explore-exploit foraging using C. elegans as a model system. Through an elegant set of patchy environment assays, the authors built a GLM based on past experience that predicts whether an animal will decide to stay on a patch to feed and exploit that resource, instead of choosing to leave and explore other patches.
Strengths:
I really enjoyed reading this paper. The experiments are simple and elegant, and address fundamental questions of foraging theory in a well-defined system. The experimental design is thoroughly vetted, and the authors provide a considerable volume of data to prove their points. My only criticisms have to do with the data interpretation, which I think are easily addressable.
Weaknesses:
History-dependence of the GLM
The logistic GLM seems like a logical way to model a binary choice, and I think the parameters you chose are certainly important. However, the framing of them seems odd to me. I do not doubt the animals are assessing the current state of the patch with an assessment of past experience; that makes perfect logical sense. However, it seems odd to reduce past experience to the categories of recently exploited patch, recently encountered patch, and time since last exploitation. This implies the animals have some way of discriminating these past patch experiences and committing them to memory. Also, it seems logical that the time on these patches, not just their density, should also matter, just as the time without food matters. Time is inherent to memory. This model also imposes a prior categorization in trying to distinguish between sensed vs. not-sensed patches, which I criticized earlier. Only "sensed" patches are used in the model, but it is questionable whether worms genuinely do not "sense" these patches.
It seems more likely that the worm simply has some memory of chemosensation and relative satiety, both of which increase on patches and decrease while off of patches. The magnitudes are likely a function of patch density. That being said, I leave it up to the reader to decide how best to interpret the data.
osm-6
The argument is that osm-6 animals can't sense food very well, so when they sense it, they enter the exploitation state by default. That is what they appear to do, but why? Clearly they are sensing the food in some other way, correct? Are ciliated neurons the only way worms can sense food? Don't they also actively pump on food, and can therefore sense the food entering their pharynx? I think you could provide further insight by commenting on this. Perhaps your decision model is dependent on comparing environmental sensing with pharyngeal sensing? Food intake certainly influences their decision, no? Perhaps food intake triggers exploitation behavior, which can be over-run by chemo/mechanosensory information?
Impact:
I think this work will have a solid impact on the field, as it provides tangible variables to test how animals assess their environment and decide to exploit resources. I think this research could be strengthened by a reassessment of their model that would both simplify it and provide testable timescales of satiety/starvation memory.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
(1) The authors repeatedly assert that an individual's behavior in the foraging assay depends on its prior history (particularly cultivation conditions). While this seems like a reasonable expectation, it is not fully fleshed out. The work would benefit from studies in which animals are raised on more or less abundant food before the behavioral task.
Cultivation density: While we agree with the reviewer that testing the effects of varying bacterial density during animal development (cultivation) is an interesting experiment, it is not feasible at this time. We previously attempted this experiment but found it nontrivial to maintain stable bacterial density conditions over long timescales as this requires matching the rate of bacterial growth with the rate of bacterial consumption. Despite our best efforts, we have not been able to identify conditions that satisfy these requirements. Thus, we focused our revised manuscript to include only assertions about the effects of recent experiences and added this inquiry as a future direction (lines 618-624).
(2) The authors convincingly show that the probability of particular behavioral outcomes occurring upon patch encounter depends on time-associated parameters (time since last patch encounter, time since last patch exploitation). There are two concerns here. First, it is not clear how these values are initialized - i.e., what values are used for the first occurrence of each behavioral state? More importantly, the authors don't seem to consider the simplest time parameter, the time since the start of the assay (or time since worm transfer). Transferring animals to a new environment can be associated with significant mechanical stimulus, and it seems quite possible that transferring animals causes them to enter a state of arousal. This arousal, which certainly could alter sensory function or decision-making, would likely decay with time. It would be interesting to know how well the model performs using time since assay starts as the only time-dependent parameter.
Parameter Initialization: We thank the reviewer for pointing out an oversight in our methods section regarding the model parameter values used for the first encounter. We clarified the initialization of parameters in the manuscript (lines 1162-1179). In short, for the first patch encounter where k = 1:
● ρ<sub>k</sub> is the relative density of the first patch.
● τ<sub>s</sub> is the duration of time spent off food since the beginning of the recorded experiment. For the first patch, this is equivalent to the total time elapsed.
● ρ<sub>h</sub> is the approximated relative density of the bacterial patch on the acclimation plates (see Assay preparation and recording in Methods). Acclimation plates contained one large 200 µL patch seeded with OD<sub>600</sub> = 1 and grown for a total of ~48 hours. As with all patches, the relative density was estimated from experiments using fluorescent bacteria OP50-GFP as described in Bacterial patch density estimation in Methods.
● ρ<sub>e</sub> is equivalent to ρ<sub>h</sub>.
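The initialization rules listed above can be summarized in a minimal sketch. The class and function names are hypothetical, and the one-line descriptions of ρ<sub>h</sub> and ρ<sub>e</sub> (most recently encountered and most recently exploited patch density) are inferred from the reviewers' description of the model rather than quoted from the manuscript.

```python
from dataclasses import dataclass


@dataclass
class EncounterCovariates:
    rho_k: float  # relative density of the current (k-th) patch
    tau_s: float  # time spent off food before this encounter
    rho_h: float  # assumed: relative density of the most recently encountered patch
    rho_e: float  # assumed: relative density of the most recently exploited patch


def first_encounter_covariates(first_patch_density, elapsed_time, acclimation_density):
    """Covariates for the first patch encounter (k = 1).

    Before any patch has been encountered in the assay, both history terms
    fall back to the estimated density of the acclimation-plate patch, and
    the time off food equals the total time elapsed in the recording.
    """
    return EncounterCovariates(
        rho_k=first_patch_density,
        tau_s=elapsed_time,
        rho_h=acclimation_density,
        rho_e=acclimation_density,
    )


# Hypothetical values for illustration only.
print(first_encounter_covariates(0.5, 120.0, 1.0))
```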
Transfer Method: We thank the reviewer for their thoughtful comment on how the stress of transferring animals to a new plate may have resulted in an increased arousal state and thus a greater probability of rejecting patches. We anticipated this possibility and, in order to mitigate the stress of moving, we used an agar plug method where animals were transferred using the flat surface of small cylinders of agar. Importantly, the use of agar as a medium to transfer animals provides minimal disruption to their environment as all physical properties (e.g. temperature, humidity, surface tension) are maintained. Qualitatively, we observed no marked change in behavior from before to after transfer with the agar plug method, especially as compared to the often drastic changes observed when using a metal or eyelash pick. We added these additional methodological details to the methods (lines 791-796).
Time Parameter: However, the reviewer’s concern that the simplest time parameter (time since start of the assay) might better predict animal behavior is valid. We thank the reviewer for pointing out the need to specifically test whether the time-dependent change in explore-exploit decision-making corresponds better with satiety (time off patch) or arousal (time since transfer/start of assay) state. To test this hypothesis, we ran our model with varying combinations of the satiety term τ<sub>s</sub> and a transfer term τ<sub>t</sub>. We found that when both terms were included in the model, the coefficient of the transfer term was non-significant. This result suggests that the relevant time-dependent term is more likely related to satiety than transfer-induced stress (lines 343-358; Figure 4 - supplement 4D).
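To make the comparison between the satiety and transfer terms concrete, the following sketch evaluates a logistic GLM with both time covariates. It is not the authors' fitted model: the coefficient values are invented, the covariate set is reduced to three terms, and the function name is hypothetical. It only illustrates that a transfer coefficient near zero leaves the predicted acceptance probability insensitive to time since transfer.

```python
import math


def accept_probability(covariates, coefficients, intercept):
    """Probability of accepting a patch under a logistic GLM.

    covariates   -- mapping from covariate name to its value
    coefficients -- mapping from covariate name to its weight
    """
    eta = intercept + sum(coefficients[name] * value
                          for name, value in covariates.items())
    # Numerically stable logistic function.
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


# Invented coefficients: a positive satiety weight and a zero transfer weight.
coefs = {"rho_k": 1.5, "tau_s": 0.02, "tau_t": 0.0}

early = accept_probability({"rho_k": 1.0, "tau_s": 30.0, "tau_t": 60.0}, coefs, -2.0)
late = accept_probability({"rho_k": 1.0, "tau_s": 30.0, "tau_t": 600.0}, coefs, -2.0)
print(early, late)  # identical, because the transfer weight is zero
```

In a real analysis the weights would be estimated from the encounter data, and the significance of each weight would be assessed from its standard error, as the authors report for the transfer term.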
(3) Similarly, Figures 2L and M clearly show that the probability of a search event occurring upon a patch encounter decreases markedly with time. Because search events are interpreted as a failure to detect a patch, this implies that the detection of (dilute) patches becomes more efficient with time. It would be useful for the authors to consider this possibility as well as potential explanations, which might be related to the point above.
Time-dependent changes in sensing: We agree with the reviewer that we observe increased responsiveness to dilute patches with time. Although this is interesting, our primary focus was on what decision an animal made given that they clearly sensed the presence of the bacterial patch. Nonetheless, we added this observation to the discussion as an area of future work to investigate the sensory mechanisms behind this effect (lines 563-568).
(4) Based on their results with mec-4 and osm-6 mutants, the authors assert that chemosensation, rather than mechanosensation, likely accounts for animals' ability to measure patch density. This argument is not well-supported: mec-4 is required only for the function of the six non-ciliated light-touch neurons (AVM, PVM, ALML/R, PLML/R). In contrast, osm-6 is expected to disrupt the function of the ciliated dopaminergic mechanosensory neurons CEP, ADE, and PDE, which have previously been shown to detect the presence of bacteria (Sawin et al 2000). Thus, the paper's results are entirely consistent with an important role of mechanosensation in detecting bacterial abundance. Along these lines, it would be useful for the authors to speculate on why osm-6 mutants are more, rather than less, likely to "accept" when encountering a patch.
Sensory mutant behavior: We thank the reviewer for pointing out the error in our interpretation of the behavior of osm-6 and mec-4 animals. We further elaborated on our findings and edited the text to better reflect that osm-6 mutants lack both chemosensory and mechanosensory ciliated sensory neurons (lines 406-448; lines 567-577). Specifically, we provided some commentary on the finding that osm-6 mutants show an augmented ability to detect the presence of bacterial patches but a reduced ability to assess their bacterial density. While this finding seems contradictory, it suggests that in the absence of the ability to assess bacterial density, animals must prioritize exploiting food resources when available.
(5) While the evidence for the accept-reject framework is strong, it would be useful for the authors to provide a bit more discussion about the null hypothesis and associated expectations. In other words, what would worm behavior in this assay look like if animals were not able to make accept-reject decisions, relying only on exploit-explore decisions that depend on modulation of food-leaving probability?
Accept-reject vs. stay-switch: We thank the reviewer for alerting us to this gap in our discussion. We have revised the text to elaborate on our point of view on this somewhat philosophical distinction and what it predicts about C. elegans behavior (lines 507-533).
Reviewer #3 (Public review):
(1) Sensing vs. non-sensing
The authors claim that when animals encounter dilute food patches, they do not sense them, as evidenced by the shallow deceleration that occurs when animals encounter these patches. This seems ethologically inaccurate. There is a critical difference between not sensing a stimulus, and not reacting to it. Animals sense numerous stimuli from their environment, but often only behaviorally respond to a fraction of them, depending on their attention and arousal state. With regard to C. elegans, it is well-established that their amphid chemosensory neurons are capable of detecting very dilute concentrations of odors. In addition, the authors provide evidence that osm-6 animals have altered exploit behaviors, further supporting the importance of amphid chemosensory neurons in this behavior.
Interpretation of “non-sensing” encounters: We thank the reviewer for their comment and agree that we do not know for certain whether the animals sensed these patches or were merely non-responsive to them. We are, however, confident that these encounters lack evidence of sensing. Specifically, we note that our analyses used to classify events as sensing or non-sensing examined whether an animal’s slow-down upon patch entry could be distinguished from either that of events where animals exploited or that of encounters with patches lacking bacteria. We found that “non-sensing” encounters are indeed indistinguishable from encounters with bacteria-free patches where there are no bacteria to be sensed (see Figure 2 - Supplement 8A-C and Patch encounter classification as sensing or non-responding in Methods). Regardless, we agree with the reviewer that all that can be asserted about these events is that animals do not appear to respond to the bacterial patch in any way that we measured. Therefore, we have replaced the term “non-sensing” with “non-responding” to better indicate the ethological interpretation of these events and clarified the text to reflect this change (lines 193-200; lines 211-212).
(2) Search vs. sample & sensing vs. non-sensing
In Figures 2H and 2I, the authors claim that there are three behavioral states based on quantifying average velocity, encounter duration, and acceleration, but I only see two. Based on density distributions alone, there really only seem to be 2 distributions, not 3. The authors claim there are three, but to come to this conclusion, they used a QDA, which inherently is based on the authors training the model to detect three states based on prior annotations. Did the authors perform a model test, such as the Bayesian Information Criterion, to confirm whether 2 vs. 3 Gaussians is statistically significant? It seems like the authors are trying to impose two states on a phenomenon with a broad distribution. This seems very similar to the results observed for roaming vs. dwelling experiments, which again, are essentially two behavioral states.
Validation of sensing clusters: We are grateful to the reviewer for pointing out the difficulty in visualizing the clusters and the need for additional clarity in explaining the semi-supervised QDA approach. We added additional visualizations and methods to validate the clusters we have discovered. Specifically, we used Silverman’s test to show that the sensing vs. non-responding data were bi-modal (i.e. a two-cluster classification method fits best) and accompanied this statistical test with heat maps which better illustrate the clusters (lines 171-173; lines 190-191; lines 948-972; lines 1003-1005; Figure 2 - supplement 6A-C; Figure 2 - supplement 7C-F).
Further, it seems that there may be some confusion as to how we arrived at 3 encounter types (i.e. search, sample, exploit). It’s important to note that two methods were used on two different (albeit related) sets of parameters. We first used a two-cluster GMM to classify encounters as explore or exploit. We then used a two-cluster semi-supervised QDA to classify encounters as sensing or non-sensing (now changed to “non-responding”, see above response) using a different set of parameters. We thus separated the explore cluster into two (sensing and non-responding exploratory events) resulting in three total encounter types: exploit, sample (explore/sensing), and search (explore/non-responding).
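The way the two binary classifications combine into three encounter types can be written out as a short sketch. The function name and boolean arguments are hypothetical; in the study the two labels come from the GMM and the semi-supervised QDA, which are not reproduced here.

```python
def encounter_type(is_exploit: bool, is_sensing: bool) -> str:
    """Combine the two binary labels into one of three encounter types.

    is_exploit -- label from the explore/exploit classification
    is_sensing -- label from the sensing/non-responding classification
    """
    if is_exploit:
        # Exploit encounters are not subdivided further.
        return "exploit"
    # Only the explore cluster is split by the second classifier.
    return "sample" if is_sensing else "search"


for labels in [(True, True), (False, True), (False, False)]:
    print(labels, "->", encounter_type(*labels))
```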
(4) History-dependence of the GLM
The logistic GLM seems like a logical way to model a binary choice, and I think the parameters you chose are certainly important. However, the framing of them seems odd to me. I do not doubt the animals are assessing the current state of the patch with an assessment of past experience; that makes perfect logical sense. However, it seems odd to reduce past experience to the categories of recently exploited patch, recently encountered patch, and time since last exploitation. This implies the animals have some way of discriminating these past patch experiences and committing them to memory. Also, it seems logical that the time on these patches, not just their density, should also matter, just as the time without food matters. Time is inherent to memory. This model also imposes a prior categorization in trying to distinguish between sensed vs. not-sensed patches, which I criticized earlier. Only "sensed" patches are used in the model, but it is questionable whether worms genuinely do not "sense" these patches.
Model design: We thank the reviewer for their thoughtful comments on the model. We completed a number of analyses involving model selection including model selection criteria (AIC, BIC) and optimization with regularization techniques (LASSO and elastic nets) and found that the problem of model selection was compounded by the enormous array of highly-correlated variables we had to choose from. Additionally, we found that both interaction terms and non-linear terms of our task variables could be predictive of accept-reject decisions but that the precise set of terms selected depended sensitively on which model selection technique was used and generally made rather small contributions to prediction. The diverse array of results and combinatorial number of predictors to possibly include failed to add anything of interpretable value. We therefore chose to take a different approach to this problem. Rather than trying to determine what the “best” model was, we instead asked whether a minimal model could be used to answer a set of core questions. Indeed, our goal was not maximal predictive performance but rather to distinguish between the effects of different influences enough to determine if encounter history had a significant, independent effect on decision making. We thus chose to only include task variables that spanned the most basic components of behavioral mechanisms to ask very specific questions. For example, we selected a time variable that we thought best encapsulated satiety. While we could have included many additional terms, or made different choices about which terms to include, based on our analyses these choices would not have qualitatively changed our results. Further, we sought to validate the parameters we chose with additional studies (i.e. food-deprived and sensory mutant animals). We regard our study as an initial foray into demonstrating accept-reject decision-making in nematodes. The exact mechanisms and, consequently, the best model design are therefore beyond the scope of this study.
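For readers unfamiliar with the selection criteria mentioned in this response, the sketch below computes AIC and BIC from a model's maximized log-likelihood using their standard definitions. The log-likelihoods, parameter counts, and sample size in the example are invented and do not come from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Invented example: a minimal model versus one with extra interaction terms.
n_encounters = 1000
minimal = {"log_likelihood": -520.0, "n_params": 5}
expanded = {"log_likelihood": -515.0, "n_params": 12}

for name, model in [("minimal", minimal), ("expanded", expanded)]:
    print(name,
          aic(model["log_likelihood"], model["n_params"]),
          bic(model["log_likelihood"], model["n_params"], n_encounters))
```

Lower values are preferred under both criteria. BIC penalizes additional parameters more heavily than AIC once the sample size exceeds about eight observations, which is one reason different criteria can select different sets of terms.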
Lastly, with regard to the use of only sensed patches in the model: while we acknowledge that we are not certain whether the “non-responding” encounters are truly not sensed, we find qualitatively similar results when including all exploratory patches in our analyses. However, we take the position that sensation is necessary for decision-making and thus believe that, while our model’s predictive performance may be better using all encounters, the interpretation of our findings is stronger when we only include sensing events. We have added additional commentary about our model to the discussion section (lines 667-695).
(5) osm-6
The osm-6 results are interesting. This seems to indicate that the worms are still sensing the food, but are unable to assess quality, therefore the default response is to exploit. How do you think the worms are sensing the food? Clearly, they sense it, but without the amphid sensory neurons, and not mechanosensation. Perhaps feeding is important? Could you speculate on this?
We thank the reviewer for their thoughtful remarks. We have added additional commentary about the result of our sensory mutant experiments as described above in response to Reviewer #1 under Sensory mutant behavior.
(7) Impact:
I think this work will have a solid impact on the field, as it provides tangible variables to test how animals assess their environment and decide to exploit resources. I think this research could be strengthened by a reassessment of the model that would both simplify it and provide testable timescales of satiety/starvation memory.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
The authors title the work as an "ethological study" and emphasize the theme of "foraging in naturalistic environments" in contrast to typical laboratory conditions. The only difference in this study relative to typical laboratory conditions is that the food bacteria is distributed in many small patches as compared to one large patch. First, it is not clear to the reviewer that the size of the food patches in these experiments is more relevant to C. elegans in its natural context than the standard sizes of food patches. Furthermore, all the other highly unnatural conditions typical of laboratory cultivation still apply: the use of a 2D agar substrate, a single food bacteria that is not a component of a naturalistic diet, and the use of a laboratory-adapted strain of C. elegans with behavior quite distinct from that of natural isolates. The reviewer is not suggesting that the authors need to make their experiments more naturalistic, only that the experiments as described here should not be described as naturalistic or ethological as there is no support for such claims.
Ethological interpretation: We thank the reviewer for their comments about the use of the term ethological to describe this study. We chose to develop a patchy bacterial assay to mimic the naturalistic “boom-or-bust” environment. While we agree with the reviewer that we do not know if the size and distribution of the food patches in these experiments are more relevant to C. elegans, we maintain that these experiments were ecologically-inspired and revealed behavior that is difficult to observe in environments with large, densely-seeded bacterial patches. We have updated our text to better reflect that this study was “ecologically-inspired” rather than truly “ethological” in nature (lines 94, 693).
The main finding of the paper is that worms explore and then exploit, i.e. they frequently reject several bacterial patches before accepting one. This result requires additional scrutiny to reject other possible interpretations. In particular, when worms are transferred to a new plate we would expect some period of increased arousal due to the stressful handling process. A high arousal state might cause rejection of food patches. Could the measured accept/reject decisions be influenced by this effect? One approach to addressing this concern would be to allow the animals to acclimate to the new plate on a bare region before encountering the new food patches.
We thank the reviewer for their comment on how the stress of transferring animals to a new plate may have resulted in an increased arousal state and thus a greater probability of rejecting patches. We addressed this above in response to Reviewer #1 under Transfer Method and Time Parameter. In brief, we used a worm picking method that mitigated stress and added additional analyses showing that a transfer-related term was less predictive than a satiety-related term.
Related to the above, in what circumstances exactly are the authors claiming that worms first explore and then exploit? After being briefly deprived of food? After being handled?
Explore-then-exploit: All animals were well-fed and handled gently as described above under Transfer Method (lines 787-795). Our results suggest that the appearance of an explore-then-exploit strategy is a byproduct of being transferred from an environment with high bacterial density to an environment with low bacterial density as described in the manuscript (lines 461-466).
The authors emphasize their analysis of the accept/reject decision as a critical innovation. However, the accept/reject decision does not strike me as substantially different from the previously described stay/switch decision. When a worm encounters a new patch of bacteria, accepting this bacteria is equivalent to staying on it and rejecting (leaving) it is equivalent to switching away from it. The authors should explain how these concepts are significantly distinct.
Accept-reject vs. stay-switch: We thank the reviewer for alerting us to this gap in our discussion. We have revised the text to elaborate on our point of view on this somewhat philosophical distinction and what it predicts about C. elegans behavior (lines 507-533).
During patch encounter classification, the authors computed three of the animals' behavioral metrics (Line 801-804) and claimed that the combination of these three metrics reveals two non-Gaussian clusters representing encounters where animals sensed the patch or did not appear to sense the patch. The authors also refer to a video to demonstrate the two clusters by rotating the 3-dimension scatter plot. However, the supposed clusters, if any, are difficult to see in a 3D (Video 5) or in a 2D scatter plot (Figure 3I). The authors need to clearly demonstrate the distinct clustering as claimed in the paper as this feature is fundamental and necessary for the model implementation and interpretation of results.
We are grateful to the reviewer for pointing out the difficulty in visualizing the clusters. We added additional visualizations and methods to validate the clusters we have discovered as described in our above response to Reviewer #3 under Validation of sensing clusters.
When selecting parameters (covariates) for their model, it is critical to avoid overfitting. Therefore, the authors used AIC and BIC (Figure 4- supplement 1) to demonstrate that the full GLM model has a better model performance than the other models which contain only a subset of the full covariates (in a total of 5). However, the authors compare the full set with only 4 other models whereas the total number of models that need to be compared with is 2^5-2. The authors at least need to include the AIC and BIC scores of all possible models in order to draw the conclusion about the performance of the full model.
Model selection criterion: We thank the reviewer for pointing out this gap in our methodology. We have now run the model with all combinations of subsets of model parameters and have confirmed that the model with all 5 covariates outperforms all other models even when using BIC, the strictest criterion for overfitting (Figure 1 - supplement 1A). The only other model that performs well (though not as often as the 5-term model) is the 4-term model lacking ρ<sub>h</sub>. This result is not surprising as ρ<sub>h</sub> only changes substantially once in an animal’s encounter history for the single-density, multi-patch data that this model was fit to. For example, for an animal foraging on patches of density 10, on the first encounter ρ<sub>h</sub> = ~200 (see Parameter initialization above), but on every subsequent encounter ρ<sub>h</sub> = ~10. As a result, the effect of ρ<sub>h</sub> on the probability of exploiting is somewhat binary on the single-density, multi-patch data set. Nevertheless, we see significantly improved prediction of behavior in the novel multi-density, multi-patch data (Figure 4F) as we observe an effect of the most recently encountered patch. Additionally, we observe a similar impact (i.e., significant coefficient of negative sign) of the ρ<sub>h</sub> term when the model is fit to the multi-density, multi-patch data set (Figure 4 - supplement 4D).
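As a rough illustration of the exhaustive comparison described above, the sketch below enumerates every non-empty subset of five covariates and scores a fitted model with AIC and BIC. It is not the authors' code: the covariate names `x1` to `x5` and the helper names are placeholders, and the log-likelihood and parameter count `k` are assumed to come from whatever routine fits each candidate model.

```python
import math
from itertools import combinations

# Placeholder covariate names; the actual five model terms are defined in the manuscript.
COVARIATES = ["x1", "x2", "x3", "x4", "x5"]


def all_subsets(covariates):
    """Return every non-empty subset of the covariates as a tuple."""
    return [
        subset
        for size in range(1, len(covariates) + 1)
        for subset in combinations(covariates, size)
    ]


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return k * math.log(n) - 2 * log_lik


def best_by_bic(log_liks, n):
    """Pick the subset with the lowest BIC.

    log_liks maps each subset (tuple) to its fitted log-likelihood.
    Assumes one coefficient per covariate plus an intercept.
    """
    return min(log_liks, key=lambda s: bic(log_liks[s], len(s) + 1, n))
```

With five covariates this yields 31 non-empty subsets, or 30 once the full model is set aside, which matches the 2^5-2 comparisons the reviewer asks for. Because the BIC penalty grows with ln(n), it penalizes extra terms more heavily than AIC whenever n exceeds about 7.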
In any bacterial patch, the edges have a higher density of bacteria than the patch center. Thus, it is possible that a worm scans the patch edge density, on the basis of which it decides to accept or reject the patch whose average density is smaller. This could potentially cause an underestimate of the bacteria density used in the model. Furthermore, the potential inhomogeneity of the patch may further complicate the worm's decision-making, and the discrepancy between the reality and the model assumption will reduce the validity of the model. The authors need to estimate the inhomogeneity of the bacterial patches used in their assays and discuss how the edge effects may affect their results and conclusions.
Bacterial patch inhomogeneity: We extensively tested the landscape of the bacterial patches by imaging fluorescently-labeled bacteria OP50-GFP (Bacterial Patch Density in Methods; Figure 2 - supplement 1-3). As the reviewer mentions, we observe significantly greater bacterial density at the patch edge. This within-patch spatial inhomogeneity results from areas of active proliferation of bacteria and likely complicates an animal’s ability to accurately assess the quantity of bacteria within a patch and, consequently, our ability to accurately compute a metric related to our assumptions of what the animal is sensing. In our study, we used the relative density of the patch edge where bacterial density is highest as a proxy for an animal’s assessment of bacterial patch density (Figure 2 – supplement 1). This decision was based on a previous finding that the time spent on the edge of a bacterial patch affected the dynamics of subsequent area-restricted search. While within-patch spatial inhomogeneity likely affects an animal’s ability to assess patch density, we do not believe that this qualitatively affects the results of our study. Both the patch densities tested (Figure 2 – supplement 3A) as well as our observations of time-dependent changes in exploitation (Figure 2E,N-O; Figure 3H-I) maintained a monotonic relationship. Therefore, alternative methods of patch density estimation should yield similar results. We have added additional discussion on this topic to our manuscript (lines 578-593).
The authors claim that their methods (GMM and semi-supervised QDA) are unbiased. This seems unlikely as the QDA involves supervision. The authors need to provide additional explanation on this point.
Semi-supervised QDA labelling: We have removed the term “unbiased” to avoid any misinterpretation of the methodology and clarified our method of labelling used for “supervising” QDA. Specifically, we made two simple assumptions: 1) animals must have sensed the patch if they exploited it and 2) animals must not have sensed the patch if there were no bacteria to sense. Thus, we labeled encounters as sensing if they were found to be exploitatory as we assume that sensation is prerequisite to exploitation; and we labeled encounters as non-sensing for events where animals encountered patches lacking bacteria (OD<sub>600</sub> = 0). All other points were non-labeled prior to learning the model. In this way, our labels were based on the experimental design and results of the GMM, an unsupervised method, rather than on any expectations we had about what sensing should look like. The semi-supervised QDA method then used these initial labels to iteratively fit a paraboloid that best separated these clusters, by minimizing the posterior variance of classification (lines 1012-1021). See Figure 2 - supplement 8A-B for a visualization showing the labelled data.
Based on the authors' result, worms behaviorally exhibit their preferences toward food abundance (density), which results in a preference scale for a range of densities. Does this scale vary with the worms' initial cultivation states? The author partially verified that by observing starved worms. This hypothesis could be better tested if the authors could analyze the decision-making of the worms that were initially cultivated with different densities of bacterial food.
While we agree with the reviewer that testing the effects of varying bacterial density during animal development (cultivation) is a very interesting experiment, it is not feasible at this time. We focused our revised manuscript to include only assertions about the effects of recent experiences and added this inquiry as a future direction as described above in our response to Reviewer #1 under Cultivation density.
It would be helpful to elaborate more on how the framework developed in this paper can be applied more broadly to other behaviors and/or organisms and how it may influence our understanding of decision-making across species.
We thank the reviewer for alerting us to this gap in our discussion. We have added additional commentary about our model and its utility to the discussion section (lines 667-695).
Reviewer #3 (Recommendations for the authors):
Sensing vs. non-sensing
Perhaps a more ethologically accurate term to describe this behavior would be "ignoring" rather than "not sensing". If the authors feel strongly about using the term "not sensing", then they should provide experimental evidence supporting this claim. However, I think simply changing the terminology negates these experiments.
We thank the reviewer for their thoughtful comments. While we agree with the reviewer that the term “non-sensing” may not be ethologically accurate (see response to Public Review above under Interpretation of “non-sensing” encounters), we interpret the term “ignoring” to mean that the animal sensed the patches but decided not to react. We have chosen to replace the term “non-sensing” with “non-responding” to best indicate the ethological interpretation of our observation. Nonetheless, we believe that it remains possible that animals are truly not sensing the bacterial patches as our method of classification compared the behavior against encounters with patches lacking bacteria (as described above in response to Reviewer #2 under Semi-supervised QDA labelling).
History-dependence of the GLM
Perhaps a simpler approach would be to say the worm senses everything, and this accumulative memory affects the decision to exploit. For example, the animal essentially experiences two feeding states: feeding on patches, and starvation off of patches.
The level of satiety could be modeled linearly:
Satiety(t_enter:t_leave) = k_feed*patch_density*delta_t
Where k_feed is some model parameter for rate of satiety signal accumulation, t_enter is the time the animal entered the patch, t_leave is the time the animal left the patch, and delta_t is the difference between the two. Perhaps you could add a saturation limit to this, but given your data, I doubt that is the case.
Starvation could be modeled as simply a decay from the last satiety signal:
Starvation(t_leave:t_enter) = Satiety(t_leave)*exp(-k_starve*delta_t).
Where k_starve is the rate constant for the decay of the satiety signal.
For the logistic model, the logistic parameter is simply the difference between the current patch density and the current satiety signal.
A nice thing about this approach is that it negates the need to categorize your patches. All patch encounters matter. Brief patch encounters (categorized as non-sensing and not used in the prior GLM) naturally produce a very small satiety signal and contribute very little to the exploit decision. Another nice thing about this approach is that it gives you memory timescales, that are testable. There is a rate of satiety accumulation and a rate of satiety loss. You should be able to predict behavior with lower patch density, assuming the rate constants hold. (I am not advocating you do more experiments here, just pointing out a nice feature of this approach).
You could possibly apply this to a GLM for velocity on a non-exploited patch as well, though I assume this would be a linear GLM, given the velocity distributions you provided.
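The reviewer's proposal above can be sketched in a few lines of Python. This is only one possible reading of it, not a model from the manuscript: the function name and the starting satiety of zero are assumptions, as is the choice to add newly accumulated satiety on top of whatever signal remains after decay.

```python
import math


def simulate_satiety(encounters, k_feed, k_starve, satiety0=0.0):
    """Track a satiety signal across patch encounters.

    encounters: list of (t_enter, t_leave, patch_density), ordered in time.
    Returns one (satiety_at_entry, p_exploit) pair per encounter.
    """
    satiety = satiety0
    t_prev_leave = None
    results = []
    for t_enter, t_leave, density in encounters:
        # Off patch: exponential decay since the last departure.
        if t_prev_leave is not None:
            satiety *= math.exp(-k_starve * (t_enter - t_prev_leave))
        # Logistic parameter: current patch density minus current satiety.
        p_exploit = 1.0 / (1.0 + math.exp(-(density - satiety)))
        results.append((satiety, p_exploit))
        # On patch: linear accumulation, k_feed * patch_density * delta_t
        # (assumed here to add to the remaining signal).
        satiety += k_feed * density * (t_leave - t_enter)
        t_prev_leave = t_leave
    return results
```

In this sketch a brief encounter adds little satiety because the accumulation term scales with time on patch, which is the property the reviewer highlights. The two rate constants `k_feed` and `k_starve` are the testable timescales.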
We thank the reviewer for their time and thoughtfulness in thinking about our model. The reviewer’s proposed model seems entirely reasonable and could aid in elucidating the time component of how prior experience affects decision-making. However, we decided to keep our paper focused on using a minimal model to answer a set of core questions (e.g., Does encounter history or satiety influence decision-making?) (see above under Model design for a more detailed response). Future studies investigating the mechanisms of these foraging decisions should open the door for more mechanistically accurate models. We have expanded our discussion of the model to include this assertion (lines 667-695).
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study employed a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence. The data presented to support the main conclusion are solid. This work will be of interest to both basic researchers who work on reproductive biology and clinicians who practice reproductive medicine.
-
Reviewer #1 (Public review):
Summary:
The use of a multi-omics approach to elucidate the regulatory mechanism underlying parturition and myometrial quiescence adds novelty to the study. The identification of myometrial cis-acting elements and their association with gene expression, particularly the regulation of the PLCL2 gene by PGR opens the door to further investigate the impact of PGR and other regulators.
Strengths:
(1) Multi-Omic Approach: The paper employs a comprehensive multi-omic approach, combining ChIP-Seq, RNA-Seq, and CRISPRa-based Perturb-Seq assays, which allow for a thorough investigation of the regulatory mechanisms underlying myometrial gene expression.
(2) Clinical Relevance: Investigating human myometrial specimens provides direct clinical relevance, as understanding the molecular mechanisms governing parturition and myometrial quiescence can have significant implications for the management of pregnancy-related disorders.
(3) Functional work: For functional screening, the authors used CRISPRa-based screening of PLCL2 gene regulation in the immortalized human cell lines hTERT-HM and T-hESC, which adds another dimension to the work and strengthens their finding of PGR-dependent regulation of the PLCL2 gene in human myometrial cells.
Weaknesses:
(1) Variability in epigenomic mapping: The significant variations in the number and location of H3K27ac-positive intervals across different samples and studies suggest potential challenges in accurately mapping the myometrial epigenome. This variability may introduce uncertainty and complicate the interpretation of results.
(2) Sample specificity: The study focuses on term pregnant nonlabor myometrial specimens, limiting the generalizability of the findings to other stages of pregnancy or labor.
(3) Limited Understanding of Regulatory Mechanisms: While the study identifies potential regulatory programs within super-enhancers, the exact mechanisms by which these enhancers regulate gene expression and cellular functions in the myometrium remain unclear. Further mechanistic studies are needed to elucidate these processes.
(4) Discordant analysis: Why are regular enhancers analyzed in terms of transcription factor motif enrichment, whereas super-enhancers are analyzed in terms of pathways enriched for active genes? This needs a clear rationale.
-
Reviewer #2 (Public review):
Summary:
In "Assessment of the Epigenomic Landscape in Human Myometrium at Term Pregnancy" the authors generate a number of genome-wide data sets to investigate epigenomic and transcriptomic regulation of the myometrium at term pregnancy. These data provide a useful resource for further evaluation of gene regulatory mechanisms in the myometrium and include the first Hi-C data published for this tissue. There is a comparison to previously published histone modification data and integration with RNA-seq to highlight potential enhancer-gene regulatory relationships. The authors further investigate putative enhancers upstream of the PLCL2 gene and identify a candidate region that may be regulated by the PGR (progesterone receptor) signaling.
Strengths:
The strengths of this study are in the multi-omics nature of the design as several genome-wide data sets are generated from the same patient samples. Extending this type of approach in the future to a larger number of samples will allow for additional investigation into gene regulation as correlation between epigenomic features and gene expression across a larger number of samples can reveal regulatory relationships.
Weaknesses:
One of the most interesting aspects of this study is the generation of the first Hi-C data for the human pregnant myometrium, however, there is minimal description in the results section of the Hi-C data analysis and the only data shown are the number of loops identified and one such loop that includes the PLCL2 promoter shown in figure 3A. The manuscript would benefit from a more extensive analysis of the Hi-C data, for example, the analysis of TADs (topological associating domains) would be interesting to add and could be used to evaluate to what extent H3K27ac domains and putative regulated genes fall within the same TAD.
The authors present some convincing evidence on the transcriptional regulation of the PLCL2 gene using Perturb-Seq to identify putative upstream enhancer regions and PGR over-expression showing PGR can act as an activator. These two experiments on their own are interesting, however, they are not as mechanistically integrated as they could be to clarify the molecular mechanisms. Deletion of the putative enhancer upstream of PLCL2 followed by over-expression of PGR would clarify the mechanistic relationship between the proposed enhancer, PGR and PLCL2 expression. Does PGR act through the proposed enhancer? In addition, reporter assays using this proposed enhancer region with and without increased expression of PGR and mutation of any PRE sequences would also provide mechanistic insight. Although CRISPRa and Perturb-Seq can be used to identify potential regulatory regions, the best approach to verify the requirement for a particular enhancer in regulating a specific gene is a deletion approach.
Comments on revisions:
The authors have addressed the comments that were sent to them directly; however, my comments in the public review, specifically those on the superficial nature of the Hi-C analysis, were not addressed.
In addition, many of Reviewer 3's comments were left unaddressed and declared out of the scope of this study. As these were points of accuracy in the data analysis, they are very much in scope.
I hope the authors reconsider presenting a more thorough analysis.
-
Reviewer #3 (Public review):
In this manuscript, Wu et al. investigate active H3K27ac and H3K4me1 marks in term pregnant nonlabor myometrial biopsies, linking putative enhancers and super enhancers to gene expression levels. Through their findings, they reveal the PGR-dependent regulation of the PLCL2 gene in human myometrial cells via a cis-acting element located 35 kilobases upstream of the PLCL2 gene. By targeting this region using a CRISPR activation system, they were able to elevate endogenous PLCL2 mRNA levels in immortalized human myometrial cells.
This research offers novel insights into the molecular mechanisms governing gene expression in myometrial tissues, advancing our understanding of pregnancy-related processes.
Major comments:
(1) A more comprehensive analysis of the epigenetic and transcriptomic data would have strengthened the paper, moving beyond basic association studies. Currently, it is challenging to assess the quality and significance of the data as much of the information is lacking.
Strengths:
- The combination of ChIP-Seq, RNA-Seq, and CRISPRa Perturb-Seq approaches to investigate gene regulation and expression in myometrial cells.
- The use of CRISPR activation system to specifically target cis-acting elements.
Weaknesses:
- The manuscript would strongly benefit from a deeper analysis of the Omic datasets. Furthermore, expanding figures/graphs to effectively contextualize these datasets would be greatly beneficial and would add more value to this research.
- Limited sample size, coupled with variability in results and overall lack of details, compromises the robustness of result interpretation.
- Additional efforts are needed to dissect the proposed regulatory mechanisms.
- While the discussion provided helpful context for understanding some of the experiments performed, it lacked interpretation of the results in relation to the existing literature.
Comments on revisions:
The authors have improved the manuscript by enhancing its readability and organization. Tables were added to present key information more clearly, and figures were refined for better visualization. Additionally, more details were included, particularly in the methods and bioinformatics analyses sections, ensuring a more comprehensive and precise presentation of the data.
However, in many cases, reviewers' questions and concerns were addressed in the response to reviewers rather than incorporated into the manuscript, or it was noted that these points would be explored in future studies.
-
Author response:
The following is the authors’ response to the original reviews
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) Sample size: If the sample size of the study is increased, more confidence and new insights can be inferred about myometrial enhancer-mediated gene regulation in term pregnancy. Such a small sample size (N = 3) limits the statistical power of the study. As mentioned in the manuscript, the authors failed to identify chromatin loops in the second subject's biopsy because of limited sample material.
We agree with the reviewer’s comment about the sample size. We sincerely hope the results of this study will increase stakeholders' interest in funding future projects on a larger scale.
(2) Figure quality: There is a lack of good representations of the results (e.g., screenshots of tables as figure panels!) as well as missing interpretations that might add value to the manuscript.
Figures 1B and 2B have been converted to pie charts.
(3) Definition of super-enhancer: The definition of super-enhancer is not clear. Also, the computational merging of enhancers to define super-enhancers should be described better.
We added more detail about the tool and parameter settings to the Methods section “Identification of super enhancers”:
“Identification of super enhancers
H3K27ac-positive enhancers were defined as regions of H3K27ac ChIP-seq peaks in each sample. The enhancers within 12.5Kb were merged by using bedtools merge function with parameter “-d 12500”. The combined enhancer regions were called super enhancers if they were larger than 15Kb. The common super enhancers from multiple samples were used for downstream analysis.”
Reference:
Whyte WA, Orlando DA, Hnisz D, Abraham BJ, Lin CY, Kagey MH, Rahl PB, Lee TI, Young RA. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013 Apr 11;153(2):307-19. doi: 10.1016/j.cell.2013.03.035. PMID: 23582322; PMCID: PMC3653129.
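For readers who want to see the merging rule spelled out, the following Python sketch mimics the two steps quoted above: merging peaks separated by at most 12.5 kb, then keeping merged regions longer than 15 kb. It is an illustrative stand-in for `bedtools merge -d 12500`, not the authors' pipeline; the function names and example coordinates are hypothetical, intervals are assumed to be BED-style (chrom, start, end) tuples, and the final step of intersecting super enhancers across samples is not shown.

```python
def merge_enhancers(peaks, max_gap=12_500):
    """Merge peaks on the same chromosome whose gap is at most max_gap bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            # Extend the current merged region.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


def call_super_enhancers(peaks, max_gap=12_500, min_size=15_000):
    """Keep merged regions strictly larger than min_size bp."""
    return [
        (chrom, start, end)
        for chrom, start, end in merge_enhancers(peaks, max_gap)
        if end - start > min_size
    ]


# Hypothetical H3K27ac peaks for one sample.
example_peaks = [
    ("chr1", 0, 2_000),
    ("chr1", 10_000, 16_000),
    ("chr1", 40_000, 41_000),
    ("chr2", 100, 500),
]
super_enhancers = call_super_enhancers(example_peaks)
```

In this toy input the first two chr1 peaks are 8 kb apart and merge into a 16 kb region, which passes the 15 kb cutoff, while the isolated peaks do not.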
(4) Assay-Specific Limitations: Each assay employed in the study, such as ChIP-Seq and CRISPRa-based Perturb-Seq, has its limitations, including potential biases, sensitivity issues, and technical challenges, which could impact the accuracy and reliability of the results. These limitations should be addressed properly to avoid false-positive results and improve the interpretability of the results.
The major limitations of the CRISPRa-based Perturb-Seq protocol in this study are the use of hTERT-HM cells and the two-vector system for transduction. While hTERT-HM cells are a much easier platform in terms of technical operation, primary human myometrial cells are generally considered to retain a molecular context that is closer to that of in vivo tissues. Because of the limited efficiency of having two vectors simultaneously present in the same cell, hTERT-HM cells make the experiment far more affordable and operationally feasible. Future increases in viral vector payload capacity may overcome this challenge and make it possible to perform the assay on primary human myometrial cells.
(5) Sample collection and comparison: There is mention of matched gravid term and non-gravid samples whereas no description or use of control samples was found in the results. Also, the comparison of non-labor samples with labor samples would provide a better understanding of epigenomic and transcriptomic events of myometrium leading to laboring events.
The description has been updated:
“Collection of myometrial specimens
Permission to collect human tissue specimens was prospectively obtained from individuals undergoing hysterectomy or cesarean section for benign clinical indications (H-33461). Gravid myometrial tissue was obtained from the margin of the hysterotomy in women undergoing term cesarean sections (>38 weeks estimated gestational age) without evidence of labor. Non-gravid myometrial tissue was collected from pre-menopausal women undergoing hysterectomy for benign conditions. Specimens from gravid women receiving treatment for pre-eclampsia, eclampsia, pregnancy-related hypertension, or pre-term labor were excluded.”
(6) Lack of clarity:
(6a) It is written as 'Chromatin Conformation Capture (Hi-C)'. I think Hi-C is Histone Capture and 3C is Chromosome Conformation Capture! This needs clear writing.
As the reviewer suggested, to make it clear, we have changed the text “A high throughput chromatin conformation capture (Hi-C) assay” to “A High-throughput Chromosome Conformation Capture (Hi-C) assay”.
(6b) In multiple places, 'PLCL2' gene is written as 'PCLC2'.
Corrected as suggested.
(6c) What is the biological relevance of considering 'active' genes with FPKM ≥ 1? This needs clarification.
In RNA-seq analysis, gene expression levels are often quantified using FPKM (Fragments Per Kilobase of transcript per Million mapped reads). Setting an FPKM threshold for defining "active" genes is biologically relevant because it helps to distinguish genuinely expressed genes from background noise, and it helps researchers focus on genes that are more likely to have a significant biological impact. A common threshold for defining "active" genes is FPKM ≥ 1. Genes with FPKM values below this threshold may be transcribed at very low levels or may reflect background noise.
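To make the threshold concrete, the sketch below computes FPKM from its standard definition and applies the FPKM ≥ 1 cutoff. The function names, gene labels, and counts are invented for illustration and are not taken from the study's data.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def active_genes(fpkm_by_gene, threshold=1.0):
    """Return the names of genes whose FPKM meets the threshold, sorted."""
    return sorted(gene for gene, value in fpkm_by_gene.items() if value >= threshold)


# Hypothetical library of 20 million mapped fragments.
TOTAL = 20_000_000
example = {
    "geneA": fpkm(100, 2_000, TOTAL),  # 2.5
    "geneB": fpkm(10, 2_000, TOTAL),   # 0.25
}
```

Here `geneA` would count as active and `geneB` would not, even though both produced reads.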
(6d) According to the authors, the understanding of differentially methylated genes at promoters is underrated. However, it is unclear why they set DNA methylation aside and selected histone modification as the basis of epigenetic reprogramming in the myometrium.
DNA methylation indeed plays a crucial role in evaluating the impact of cis-acting elements on gene regulation. Large-scale studies, such as the comprehensive analysis of the myometrial methylome landscape in human biopsies (Paul et al., JCI Insight, 2022, PMID: 36066972), have provided valuable insights. When integrated with histone modification and chromatin looping data, contributed by our group and collaborators, future secondary analyses leveraging machine learning are poised to further elucidate the mechanisms underlying myometrial transcriptional regulation.
(6e) How does the identification of PGR as an upstream regulator of PLCL2 gene expression in human myometrial cells contribute to our understanding of progesterone signaling in myometrial function?
In a previous study, we demonstrated a positive correlation between PLCL2 and PGR expression in a mouse model and identified PLCL2's role in negatively modulating oxytocin-induced myometrial cell contraction (Peavy et al., PNAS, 2021, PMID: 33707208). The present study builds on this by providing evidence for a direct regulatory mechanism in which PGR influences PLCL2 transcription, likely through a cis-acting element located 35 kb upstream. These findings suggest that PLCL2 acts as a mediator of PGR-dependent myometrial quiescence prior to labor, rather than merely participating in a parallel pathway. Further in vivo studies are necessary to delineate the extent to which PLCL2 mediates PGR activity, particularly the contraction-dampening function of the PGR-B isoform.
(7) Grammatical error: The manuscript has numerous grammatical errors. Please correct them.
Corrections have been made as suggested.
(8) Use of single-cell data: Though from the Methods section, it can be understood that single-cell RNA-seq was done to identify CRISPRa gRNA expressing cells to characterize the effect of gene activation, some results from single-cell data e.g., cell clustering, cell types, gRNA expression across clusters could be added for better elucidation.
As the reviewer suggested, we have prepared a file “PerturbSeq_summary.xlsx” (Dataset S9) to provide additional results from the Perturb-seq data analysis. It includes two spreadsheets: “Cell_per_gRNA” for clustering and “Protospacer_calls_per_cell” for gRNA expression across clusters.
Reviewer #2 (Recommendations For The Authors):
(1) The following are a number of grammatical issues in the abstract. I suggest having a careful read of the entire manuscript to identify additional grammatical issues as I may not be able to highlight all of these issues.
(1a) "The myometrium plays a critical component during pregnancy." change component to role.
(1b) "It is responsible for the uterus' structural integrity and force generation at term," → replace "," with "."
(1c) Also, I suggest rephrasing the first 2 sentences to: The myometrium plays a critical role during pregnancy as it is responsible for both the structural integrity of the uterus and force generation at term.
(1d) "Here we investigated the human term pregnant nonlabor myometrial biopsies for transcriptome, enhancer histone mark cistrome, and chromatin conformation pattern mapping." Remove "the", and modify to "Here we investigated human term pregnant".
(1e) Missing period and sentence fragment, "PGR overexpression facilitated PLCL2 gene expression in myometrial cells Using CRISPR activation the functionality of a PGR putative enhancer 35-kilobases upstream of the contractile-restrictive gene PLCL2.
Corrections have been made as suggested.
(2) Sentence fragment: Studies on the role of steroid hormone receptors in myometrial remodeling have provided evidence that the withdrawal of functional progesterone signaling at term is due to a stoichiometric increase of progesterone receptor (PGR) A to B isoform-related estrogen receptor (ESR) alpha expression activation at term. (Mesiano, Chan et al. 2002) (Merlino, Welsh et al. 2007) (Nadeem, Shynlova et al. 2016).
The statement has been updated:
“Studies on the role of steroid hormone receptors in myometrial remodeling suggest that the withdrawal of functional progesterone signaling at term results from a stoichiometric shift favoring the PGR-A isoform over PGR-B. This shift is associated with increased activation of estrogen receptor alpha (ESR1) expression at term (Mesiano, Chan et al. 2002) (Merlino, Welsh et al. 2007) (Nadeem, Shynlova et al. 2016).”
(3) FOS:JUN heterodimers are implicated to be critical for the initiation of labor through transcriptional regulation of gap junction proteins such as Cx43 (Nadeem, Farine et al. 2018) (Balducci, Risek et al. 1993).
Use Gja1 (Gap junction alpha 1) as the current correct gene, not Cx43.
Also, several references predate Nadeem, Farine et al. 2018 and are more appropriate to use as references for the role of Ap-1 proteins in regulating Gja1; PMID: 15618352 and PMID: 12064606 were the first to show this relationship in myometrial cells.
The statement has been updated as suggested:
“FOS:JUN heterodimers are implicated to be critical for the initiation of labor through transcriptional regulation of gap junction proteins such as GJA1 (Nadeem, Farine et al. 2018) (Balducci, Risek et al. 1993)”
(4) Define PLCL2 on first use.
Updated as suggested.
(5) There are a number of issues with this section, "Matched sSpecimens of gravid myometrium were collected at the margin of hysterotomy from women undergoing clinically indicated cesarean section at term (>38 weeks estimated gestation age) without evidence of labor. Specimens of healthy, non-gravid myometrium were also pecimens were collected from uteri removed from pre-menopausal women undergoing hysterectomy for benign clinical indications."
The description has been updated:
“Collection of myometrial specimens
Permission to collect human tissue specimens was prospectively obtained from individuals undergoing hysterectomy or cesarean section for benign clinical indications (H-33461). Gravid myometrial tissue was obtained from the margin of the hysterotomy in women undergoing term cesarean sections (>38 weeks estimated gestational age) without evidence of labor. Non-gravid myometrial tissue was collected from pre-menopausal women undergoing hysterectomy for benign conditions. Specimens from gravid women receiving treatment for pre-eclampsia, eclampsia, pregnancy-related hypertension, or pre-term labor were excluded.”
(6) Enriched motifs were identified by HOMER (Hypergeometric Optimization of Motif EnRichment) v4.11 (Heinz, Benner et al. 2010).
Please clarify what background is used for motif enrichment.
We used the default background sequences generated by HOMER from a set of random genomic sequences matching the input sequences in terms of basic properties, such as GC content and length. We have added more details in the Method section:
“DNA-binding factor motif enrichment analysis
Enriched motifs were identified by HOMER (Hypergeometric Optimization of Motif EnRichment) v4.11 with default background sequences matching the input sequences (Heinz, Benner et al. 2010).”
(7) "Six of the seven regions are also co-localized with previously published genome occupancy of transcription regulators curated by the ReMap Atlas"
Please clarify if this Atlas includes myometrial tissues or not and clarify the cell types included in the atlas.
According to the UCSC Genome Browser and the reference by Hammal et al. (2022), the current ReMap database includes PGR ChIP-seq data from human myometrial biopsies, available under NCBI GEO accession number GSE137550, alongside data from various other cell and tissue types. ReMap provides valuable insights into potential functional cis-acting elements in the genome from a systems biology perspective. However, tissue specificity requires independent validation.
(8) "Notably, 76% of the putative super-enhancers are co-localized with known PGR-occupied regions in the human myometrial tissue (Figure S2). This is significantly higher than the 20% co-localization in the regular enhancer group (Figure S2)."
Because there is a huge difference in the size of the putative super enhancer regions and the isolated enhancers this comparison is not appropriate as conducted. The comparison needs to account for the difference in size of the regions. Please provide P values for significance statements.
We acknowledge the reviewer's concern that our initial statement was overstated and potentially misleading, given the substantial difference in size between putative super-enhancer regions and regular enhancers. Rather than emphasizing the enrichment, it would be more accurate to simply describe our observation that super-enhancers encompass more PGR-occupied regions.
Here is the updated version:
“Notably, 76% of the putative super-enhancers co-localize with known PGR-occupied regions in human myometrial tissue, compared to 20% co-localization observed in regular enhancers (Figure S2).”
Reviewer #3 (Recommendations For The Authors):
(1) Title is extremely misleading, as here we do not get a view of the epigenomic landscape, but rather sparse data related to H3K27ac and H3K4me (focusing on enhancers) and chromatin conformation associated with the PLCL2 transcription start site (TSS).
As suggested, the title is modified to “Assessment of the Histone Mark-based Epigenomic Landscape in Human Myometrium at Term Pregnancy”.
(2) Improve the first result paragraph by providing a clear rationale for the experiments and their objectives, as well as introducing the samples used. Rather than simply listing approaches and end results in Table 1, offer concise explanations for the experiments alongside the supporting data presented in detailed figures. Using appropriate figures/graphs to effectively contextualize these datasets would be greatly appreciated by readers and would add more value to this research. Currently, it is difficult for us to assess and appreciate the quality of the data.
The following statement is included at the beginning of the Results section:
"To better understand the regulatory network shaping the myometrial transcriptome before labor, we analyzed transcriptome and putative enhancers in individual human myometrial specimens. Using RNA-seq, we identified actively expressed RNAs, while ChIP-seq for H3K27ac and H3K4me1 was used to map putative enhancers. Active genes were associated with nearby putative enhancers based on their genomic proximity. Additionally, chromatin looping patterns were mapped using Hi-C to further link active genes and putative enhancers within the same chromatin loops."
(3) The statistics for every sequencing approach need to be provided for each sample (e.g., RNA-seq: number of total reads, number of mapped reads, % of mapped reads; ChIP-Seq: number of mapped reads, % of mapped reads, % of duplicates).
We have generated the summary table of each dataset included in this study (Dataset S7) [NGS-summary.xls].
(4) Figure S1: The rationale behind comparing the Dotts study and yours regarding H3K27ac-positive regions needs to be better defined. Why is this performed if the data will not be used afterwards? What are the conserved regions associated with vs the ones that are variable? Is this biologically relevant? Why not use only the regions conserved between the 6 samples, to have more robust conclusions?
The purpose of comparing our data with the Dotts dataset is to highlight the degree of variation across studies. In this study, we focused on addressing specific biological questions using our own dataset rather than developing methodologies for meta-analysis. Future advancements in meta-analysis techniques could leverage the combined power of multiple datasets to provide deeper insights.
(5) Perhaps due to a lack of details, I am unable to ascertain how the putative myometrial enhancers were defined. In Dataset S1, it is stated, "we define the regions that have overlapping H3K27ac and H3K4me1 marks as putative myometrial enhancers at the term pregnant nonlabor stage (Dataset S1)". Within Dataset S1, for subjects 1, 2, and 3, H3K27ac and H3K4me1 double-positive enhancers are shown in term pregnant, non-labor human myometrial specimens, with approximately 100 regions corresponding to 131 (sample 1), 127 (sample 2), and 140 (sample 3) common peaks. However, in Figure 1a, reference is made to the 13114 putative enhancers commonly present across the three specimens. Is Dataset S1 intended to represent only a small fraction of the 13114 putative enhancers? Detailed analyses need to be conducted and better showcased.
Dataset S1 has been updated to list all 13,114 putative enhancers.
(6) For the gene expression analyses of RNA-seq data, FPKM values were utilized. However, it is unclear why the gene expression count matrix was normalized based on the ratio of total mapped read pairs in each sample to 56.5 million for the term myometrial specimens. I would recommend exercising caution regarding the use of FPKM expression units, as samples are normalized only within themselves, lacking cross-sample normalization. Consequently, due to external factors unaccounted for by this normalization method, a value of 10 in one sample may not equate to 10 in another.
We value the reviewer’s input. This question will be addressed in future secondary data analyses with suitable methodologies, as it is beyond the scope of this study.
(7) In Figure 1b, the authors have categorized their 12157 active genes into 3 bins based on FPKM values: >5 FPKM >1, >15 FPKM >5, and >15 FPKM. However, in the text, they describe these as 'actively high-expressing genes (FPKM >= 15)'. I would advise caution regarding the interpretation of these values, as an FPKM of 15 is not typically associated with highly expressed genes. According to literature and resources such as the Expression Atlas, an FPKM of 15 is generally considered to represent a low to medium expression level.
We appreciate the reviewer’s feedback. This question will be revisited during secondary data analyses using appropriate methodologies, as it falls outside the scope of the present study.
To increase readability and clarity, we modified the sentence as follows: More than 40% of the 540 putative super enhancers are located within a 100-kilobase distance to high-expressing genes (FPKM >= 15), while only 7.3% of putative myometrial super enhancers are found near low-expressing genes (5 > FPKM >=1) (Figure 2B).
(8) Out of the 12157 active genes, approximately two-thirds have an FPKM >15. Was this expected? How does this correspond to what is observed in the literature, particularly in other similar studies (https://pubmed.ncbi.nlm.nih.gov/30988671/ ; https://pubmed.ncbi.nlm.nih.gov/35260533/ ) .
This is indeed an intriguing question that merits further exploration in future secondary analyses.
(9) It is also surprising to see that for the motif enrichment analysis (Fig. 1C), the P-values are small. This is probably because the percentage of target sequences with the motif is very similar to the percentage of background sequences with the motif. For instance, for selected genes in Figure 1C: AP-1 (50.68% vs. 46.50%), STAT5 (28.08% vs. 25.04%), PGR (17.90% vs. 16.12%), etc. Can one really say that you have a biologically relevant enrichment for values that are so close between target sequences and background sequences?
The reviewer’s comment is noted. Biological relevance will be examined experimentally through wet-lab assays in future studies.
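To make the reviewer's point in (9) concrete, the effect size behind each reported motif can be expressed as a simple fold enrichment (percentage of target sequences with the motif divided by the percentage of background sequences with the motif). The sketch below uses only the percentages quoted in the comment. It does not reproduce HOMER's P-value calculation, which also depends on the number of sequences and is not given here.

```python
# (target %, background %) as quoted in the reviewer's comment for Figure 1C
motifs = {
    "AP-1": (50.68, 46.50),
    "STAT5": (28.08, 25.04),
    "PGR": (17.90, 16.12),
}


def fold_enrichment(target_pct, background_pct):
    """Ratio of motif frequency in target versus background sequences."""
    return target_pct / background_pct


folds = {name: round(fold_enrichment(t, b), 2) for name, (t, b) in motifs.items()}
print(folds)  # {'AP-1': 1.09, 'STAT5': 1.12, 'PGR': 1.11}
```

Fold enrichments this close to 1 can still yield small P-values when many sequences are tested, which is why statistical significance alone does not settle biological relevance.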
(10) For Figure 2, again not convinced that FPKM >= 15 can be used to say: Compared with the regular putative enhancers, the putative myometrial super-enhancers are found more frequently near active genes that are expressed at relatively higher levels (Figure 1B and Figure 2B). A higher threshold should be used if they want to say this.
To compare the association of putative enhancers with active genes expressed at different levels, we categorized the active genes into three groups based on their FPKM (Fragments Per Kilobase of transcript per Million mapped reads) values. These groups are defined as follows: the top third active genes (FPKM ≥ 15), the middle third active genes (5 ≤ FPKM < 15), and the bottom third active genes (1 ≤ FPKM < 5). By "active genes expressed at relatively higher levels," we refer specifically to the top third active genes with FPKM values of 15 or higher, indicating their relatively higher expression levels compared to the other groups of active genes.
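The three expression groups defined above can be written as a small classification rule. This is a sketch of the stated cutoffs only; the function name and the group labels are illustrative and are not taken from the authors' code.

```python
def expression_group(fpkm):
    """Assign an active gene to one of the three FPKM-based groups.

    Returns None for genes below the 'active' cutoff (FPKM < 1).
    """
    if fpkm >= 15:
        return "top"      # FPKM >= 15
    if fpkm >= 5:
        return "middle"   # 5 <= FPKM < 15
    if fpkm >= 1:
        return "bottom"   # 1 <= FPKM < 5
    return None


for value in (0.5, 1, 4.9, 5, 14.9, 15, 120):
    print(value, expression_group(value))
```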
(11) More detailed explanations and methods are needed regarding how the data for Figure S2 was obtained.
The following details were added to the methods section:
“Colocalization of super enhancers and PGR genome occupancy was compared by calling peaks from previously published PGR ChIP-seq data (GSM4081683 and GSM4081684). The percentages of enhancers and super enhancers that manifest PGR occupancy were calculated by overlapping the genomic regions in each category with PGR occupancy regions.”
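The overlap calculation described in this methods text can be sketched as follows. This is a minimal illustration, assuming half-open (chromosome, start, end) intervals and counting a region as PGR-occupied if it overlaps at least one peak by one base or more. The coordinates are hypothetical, and the authors' actual tool and overlap criterion are not stated in the response.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def percent_with_overlap(regions, peaks):
    """Percentage of regions that overlap at least one peak."""
    if not regions:
        return 0.0
    hits = sum(1 for region in regions if any(overlaps(region, p) for p in peaks))
    return 100.0 * hits / len(regions)


# Hypothetical enhancer regions and PGR ChIP-seq peaks
enhancers = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 100, 200), ("chr2", 500, 600)]
pgr_peaks = [("chr1", 150, 160), ("chr2", 590, 700)]

pct = percent_with_overlap(enhancers, pgr_peaks)
print(pct)  # 50.0
```

As the reviewer noted in point (8) of the earlier recommendations, a per-region percentage of this kind favours longer regions, so it describes co-localization without correcting for region size.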
(12) In Figure 2C, there is no information provided on the genes used to obtain the results. It would be helpful to include examples of these genes, along with their expression values, for instance.
The expression levels of the 346 active genes that are associated with myometrial super enhancers are included in Dataset S4, along with results of the updated gene ontology enrichment analysis using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) of Knowledgebase v2024q4. Selected pathways of interest are listed in updated Figure 2C.
(13) The linking of PLCL2-related data to the first part of the story is lacking, and the rationale behind it is missing. This entire section should be more detailed, and the data should be expanded to better reflect the context.
As suggested, we included the following statement at the beginning of the section “Cis-acting elements for the control of the contractile gene PLCL2”:
“We previously demonstrated the positive correlation of PLCL2 and PGR expression in a mouse model and PLCL2’s function on negatively modulating oxytocin-induced myometrial cell contraction (Peavy et al., 2021). However, the mechanism underlies the PGR regulation of PLCL2 remains unclear. Taking advantage of the mapped myometrial cis-acting elements, we aimed to identify the cis-acting elements that may contribute to the PLCL2 transcriptional regulation with a special interest on the PGR-related enhancers.”
The context is that our results provide additional evidence to support a direct regulatory mechanism of PGR on PLCL2 transcription, likely through the 35-kb upstream cis-acting element. This finding suggests that PLCL2 likely acts as a mediator of PGR-dependent myometrial quiescence before labor rather than a mere passenger on a parallel pathway. Further studies using in vivo models are needed to determine the extent to which PLCL2 mediates PGR activity, especially the contraction-dampening function of the PGR-B isoform.
(14) The entire Hi-C data should be presented to allow for the assessment of its quality and further value.
The revised manuscript has included the Hi-C quality control summary in Dataset S8 [HiC-QC-Summary.xlsx].
(15) The authors state: "For the purpose of functional screening, we focus on H3K27ac signals instead of using H3K27ac/H3K4me1 double positive criterium to cast a wider net." However, it is unclear how many of the targeted regions contained H3K27ac/H3K4me1 peaks. Were enhancers or super-enhancers targeted, and if so, how did they compare to H3K27ac sites?
The numbers of H3K27ac/H3K4me1 double positive peaks are recorded in Figure 1A. Compared to the numbers of H3K27ac intervals (Table 1), the H3K27ac/H3K4me1 double positive peaks are 62.9%, 70.7%, and 61.2% of corresponding H3K27ac intervals in each individual specimen.
(16) For the first set of data (Table 1), the authors state, "Together, these results reveal an epigenomic landscape in the human term pregnant myometrial tissue before the onset of labor, which we use as a resource to investigate the molecular mechanisms that prepare the myometrium for subsequent parturition." While it is acknowledged that an epigenetic landscape exists in all tissues, there is a lack of clarity regarding this landscape in the current manuscript, as we are only presented with a table containing numbers.
This sentence has been revised to: “Together, these results delineate a map of H3K27ac and H3K4me1 positive signals in the human term pregnant myometrial tissue before the onset of labor, which we use as a resource to investigate the molecular mechanisms that prepare the myometrium for subsequent parturition.”
(17) For S1, the authors conclude: These data together highlight the degree of variation in mapping the epigenome among specimens and datasets. This conclusion seems somewhat perplexing, and I find myself in partial disagreement. Firstly, providing a clear rationale for this section would strengthen the conclusions. It's important to consider what factors may contribute to this variability. It could simply be attributed to differences in experimental settings, such as variations in samples, protocols used, antibodies, sequencing departments, or overall data quality. Deeper analyses of the data could have provided more information.
We agree with the reviewer that deeper analyses are needed in order to extract more information among studies. However, appropriate methods for meta-analyses should be carefully evaluated and employed for this purpose. We humbly believe that such a task should belong to future studies that may combine available datasets for secondary analyses, leveraging the collective contribution of the reproductive biology community.
(18) In the methods section, please include an explanation of how enhancers and super-enhancers were defined or add appropriate citations for reference.
We added more details about the tool and parameter settings to the Methods section “Identification of super enhancers”:
“Identification of super enhancers
H3K27ac-positive enhancers were defined as regions of H3K27ac ChIP-seq peaks in each sample. The enhancers within 12.5Kb were merged by using bedtools merge function with parameter “-d 12500”. The combined enhancer regions were called super enhancers if they were larger than 15Kb. The common super enhancers from multiple samples were used for downstream analysis.”
Reference:
Whyte WA, Orlando DA, Hnisz D, Abraham BJ, Lin CY, Kagey MH, Rahl PB, Lee TI, Young RA. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell. 2013 Apr 11;153(2):307-19. doi: 10.1016/j.cell.2013.03.035. PMID: 23582322; PMCID: PMC3653129.
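The merge-and-filter rule in the “Identification of super enhancers” text above can be sketched in a few lines. This is an illustrative re-implementation, not the authors' pipeline: it mimics merging of peaks separated by at most 12.5 kb (as with the bedtools merge parameter “-d 12500”) and keeps merged regions larger than 15 kb. The peak coordinates are hypothetical, and the final step of taking super enhancers common to multiple samples is not shown.

```python
def merge_within(peaks, max_gap=12500):
    """Merge (chrom, start, end) peaks whose gap is at most max_gap bp."""
    merged = []
    for chrom, start, end in sorted(peaks):
        same_chrom = merged and merged[-1][0] == chrom
        if same_chrom and start - merged[-1][2] <= max_gap:
            # Extend the current merged region
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


def super_enhancers(peaks, max_gap=12500, min_size=15000):
    """Keep merged regions strictly larger than min_size bp."""
    return [r for r in merge_within(peaks, max_gap) if r[2] - r[1] > min_size]


# Hypothetical H3K27ac peaks from one sample
peaks = [
    ("chr1", 1000, 3000),
    ("chr1", 10000, 12000),
    ("chr1", 20000, 22000),
    ("chr1", 60000, 62000),
    ("chr2", 500, 2500),
]

print(merge_within(peaks))
print(super_enhancers(peaks))  # [('chr1', 1000, 22000)]
```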
(19) Additional description in the "Inferred myometrial PGR activities and the correlation analysis" method section should be included to enhance clarity and understanding.
The description has been updated:
“The inferred PGR activities were represented by the T-score, which was derived by inputting the mouse myometrial Pgr gene signature, based on the differentially expressed genes between control and myometrial Pgr knockout groups at mid-pregnancy (Wu, Wang et al., 2022), into the SEMIPs application (Li, Bushel et al., 2021). The T-scores were computed using this signature alongside the normalized gene expression counts (FPKM) from 43 human myometrial biopsy specimens.”
(20) How was the qPCR analysis performed? Was the ddCT method utilized, and was a reference gene used for control? Additional information would be beneficial.
Quantifying relative mRNA levels was performed via the standard curve method.
The following details were added: “Relative levels of genes of interest were normalized to the 18S rRNA.”
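For readers unfamiliar with the standard curve method, the sketch below shows the general idea: fit Ct against log10 of the known input quantity for a dilution series, convert sample Ct values to quantities using that fit, and divide the target quantity by the 18S rRNA quantity. All Ct values and dilution points are hypothetical, and this generic outline is an assumption about the workflow, not the authors' exact procedure.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to obtain a relative quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series (log10 of relative input amount)
dilutions = [0, 1, 2, 3]
target_curve = fit_standard_curve(dilutions, [30.0, 26.7, 23.4, 20.1])
rrna_curve = fit_standard_curve(dilutions, [20.0, 16.7, 13.4, 10.1])

# Hypothetical sample Ct values for the gene of interest and 18S rRNA
target_qty = quantity_from_ct(26.7, *target_curve)
rrna_qty = quantity_from_ct(13.4, *rrna_curve)

relative_level = target_qty / rrna_qty
print(round(relative_level, 3))
```

Unlike the ddCt approach the reviewer asked about, this method does not assume equal amplification efficiency for the target and the reference, because each gene is read off its own curve.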
(21) Regarding the RNA-Seq analysis of Provera-treated human Myometrial Specimens, the continued use of FPKM is not ideal due to potential differences in RNA composition between libraries. Additionally, clarification is needed on why Cufflinks 2.0.2 was used, considering it is no longer supported.
FPKM (Fragments Per Kilobase of transcript per Million mapped reads) is used in RNA-Seq analysis, because it allows for the normalization of gene expression data, accounting for differences in gene length and sequencing depth, and facilitates comparability across different genes and libraries. This makes it one of the essential tools for accurately measuring and comparing gene expression levels in various biological and clinical research contexts.
Cufflinks was once a popular tool for RNA-seq data analysis, transcriptome assembly, and identification of differentially expressed genes (DEGs). Its usage has declined in recent years due to the emergence of newer and more advanced tools. We used it because the RNA-seq analysis at the early stage of this study was performed with Cufflinks a few years ago, and we continued using it for later RNA-seq analyses for the purpose of comparison and consistency. If we were starting a new project now, we would choose newer tools, such as HISAT2, Salmon, and DESeq2.
(22) Overall, sentence structure and typos need to be corrected across the text. Here are some examples:
Line 17: at term, emerging studies.
Line 20-22: Here we investigated the human term pregnant nonlabor myometrial biopsies for transcriptome, enhancer histone mark cistrome, and chromatin conformation pattern mapping.
Line 30-32: PGR overexpression facilitated PLCL2 gene expression in myometrial cells Using CRISPR activation the functionality of a PGR putative enhancer 35-kilobases upstream of the contractile-restrictive gene PLCL2.
Line 66-70: However, the role of differential myometrial DNA methylation at contractility-driving gene promoter CpG islands in preterm birth is not thought to be major (Mitsuya, Singh et al. 2014), but given that DNA methylation-mediated gene regulation often occurs outside of CpG islands (Irizarry, Ladd-Acosta et al. 2009), there is still work to be done at this interface.
Line 80-83: Putative enhancers upstream of the PLCL2, a gene encoding for the protein PLCL2 which has been implicated in the modulation of calcium signaling (Uji, Matsuda et al. 2002) and maintenance of myometrial quiescence (Peavey, Wu et al. 2021), transcriptional start site were subject to functional assessment using CRISPR activation based assays.
Line 290 : sSpecimens
We appreciate the reviewer’s kind efforts and have made changes accordingly.
-
eLife Assessment
This important study confirms the molecular function of putative components of the N-glycan-dependent endoplasmic reticulum protein quality control (ERQC) system in the pathogen Cryptococcus neoformans. The study demonstrates an involvement in fitness, virulence, and the secretion and composition of extracellular vesicles, albeit in ways that are not yet fully understood. The evidence provided is convincing, with rigorous, well-controlled assays and the use of complemented strains.
-
Reviewer #2 (Public review):
Summary:
This study investigates the molecular function of the N-glycan-dependent endoplasmic reticulum protein quality control system (ERQC) in Cryptococcus neoformans and correlates this pathway with key features of C. neoformans virulence, especially those mediated by extracellular vesicle transport. The findings provide valuable insights into the connection between this pathway and the biogenesis of C. neoformans extracellular vesicles.
Strengths:
The strength of this study lies primarily in the careful selection of appropriate and current methodologies, which provide a solid foundation for the authors' results and conclusions across all presented data. All experiments are supported by well-designed and established controls in the study of C. neoformans, further strengthening the validity of the results and conclusions drawn from them. The study presents novel data on this important pathway in C. neoformans, establishing its connection with C. neoformans virulence. Interestingly, the findings led the authors to understand the relationship between this pathway and the transport of key fungal virulence factors via extracellular vesicles. This was demonstrated in the study, paving the way for a deeper understanding of extracellular vesicle biogenesis, a field still filled with gaps but one that this study contributes to with solid data, helping to clarify aspects of this process.
-
Reviewer #3 (Public review):
Summary:
Cryptococcus neoformans is a global critical threat pathogen and the manuscript by Mota et al demonstrates that the pathogen's N-glycan-dependent protein quality control system regulates the capacity of the fungus to cause disease. The system ensures that glycoproteins are folded correctly. The system is involved in fitness and virulence of the fungus by regulating aspects of cellular robustness and the trafficking of virulence-associated compounds outside of the cell via transport in extracellular vesicles.
Strengths:
The investigators use multiple modalities to demonstrate that the system is involved in cryptococcal pathogenesis. The investigators generated mutant C. neoformans to explore the role of genes involved in the protein folding system. Basic microbiology, genetic analyses, proteomics, fluorescence and transmission microscopy, nanotracking analyses, and murine studies were performed. The validity of the findings is thus very high. Hypotheses are robustly demonstrated.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer #1 (Recommendations for the authors):
Major comments
(1) The section on page 20 describing the proteomic analysis of EVs is poorly written and confusing, with a lot of data in the supplement. It is not clear what the proteomics data actually means.
We appreciate your feedback on the clarity of the proteomic analysis section. We have rewritten the section on page 20 with more detailed information to provide a clearer explanation of the proteomics data and its biological significance. Additionally, we have incorporated a comparative analysis of the EV and total cell lysate proteomes (Fig. 8E, Supplementary Fig. S7A, Supplementary Tables 3 and 4) to aid interpretation of the supplemental data.
(2) The order of the data could be improved.
We appreciate your feedback regarding the data organization. We have reorganized the order and position of some data in a more structured and coherent manner, as suggested by the reviewers.
- Reorganization of the qPCR data (previously Fig. 1C) as Fig. 3A
- Removal of the data on the growth analysis on raffinose media (previously Fig. 7H).
- Reorganization of the spotting data of the double mutant (previously Fig 3B) to Supplementary Fig. S3B
- Reorganization of the subcellular localization data (previously Fig 3E) to Supplementary Fig. S4A
(3) The discussion is repetitive with the introduction and merely summarizes the results and speculates on the mechanism of how the absence of UGGT, leading to ERQC defects, results in defective EV biogenesis/cargo loading in C. neoformans.
We removed several repetitive sentences in the discussion and provided additional information on proteome analysis.
Other questions and comments
(1) Instead of comprehensively analyzing EVs from the UGG1 mutant, a more informative approach to better understanding how defects in N-linked glycosylation impact secretion, would be to do a proteomic analysis on the total secretions (including beta glucanase-treated cells to release classically secreted proteins from the cell wall) and EVs.
We agree that a comprehensive proteomic analysis of total secretions and classically secreted proteins would provide deeper insights into how defects in N-glycosylation impact secretion in C. neoformans. To address this concern, we performed additional proteomic analyses of total cell lysates and the secretome of C. neoformans cultivated in SD broth, and present the results in Supplementary Table S5 and Supplementary Fig. S7B. These additional analyses provide further insights into the impact of UGG1 deletion on both conventional and unconventional secretion pathways, supporting a more pronounced effect of the UGG1 defect on EV-mediated trafficking. The discussion has been updated accordingly (Page 22, lines 509-514).
(2) The melanization defect in Ugg1 mutant is not strong. Could the reduction be due to partially compromised Ugg1 mutant growth at 30°C as indicated in the spot tests. Were photos of the spot dilution assays taken at 1 and 2 days to investigate slower growth? Or alternatively were growth curves taken in a liquid culture?
To assess the melanin synthesis defect more accurately, in addition to the analysis on L-DOPA plates, we measured melanin production in liquid L-DOPA medium following a 3-day incubation and normalized it to cell density (OD600). The data on normalized melanin production are now included as Fig. 4B in the revised manuscript. The defective laccase activity in the ugg1Δ mutant (Fig. 7C) further corroborates our melanization assay results, which is now also mentioned in the text (Page 18, lines 393-395).
(3) Is it accurate to say that some virulence factors (i.e. melanin, capsule and phosphatases) are predominantly trafficked through EV's in C. neoformans? Have studies been done to determine the proportion of virulence factors trafficked via EV's versus traditional secretion?
We thank you for the thoughtful comments. Some virulence factors, such as urease, melanin and capsule polysaccharides, lack the signal peptide required for targeting to the conventional ER/Golgi secretion pathway. It is generally assumed that the trafficking of these factors in C. neoformans is predominantly mediated by non-conventional secretion via EVs. Additionally, some virulence factors with signal peptides, such as laccase and phosphatases, are also transported via EVs in addition to conventional secretion. A quantitative analysis comparing the proportion of virulence factors secreted via EVs versus the conventional pathway has not yet been reported, although genetic evidence suggests that conventional secretion also plays a significant role in the export of capsule polysaccharides. Thus, we were careful not to present EVs as the main route of virulence factor secretion in the manuscript.
(4) There is insufficient background in the introduction linking what is known about the ERQC process to secretion in general. The topic changes from the ERQC process to fungal virulence factor, with a primary focus on non-classical (EV-based) secretion. Classical secretion should also be discussed without assuming that non classical (EV) secretion is the major pathway contributing to fungal virulence.
We appreciate your insightful comments highlighting the need for more background on the ERQC process and its relationship with secretion. To address the reviewer’s concerns, we have added sentences to describe the key roles of ERQC in conventional protein secretion in the Introduction (Page 5, lines 102-106).
(5) Figure 1A. What does the blue filled circle with the red outline signify? Fig 1 A legend is not well explained. A summary using material provided in the intro/discussion should be included to briefly explain the process and the differences between fungal species. Please also be aware that the intro starts describing the human ERQC process and then switches to what happens in S. cerevisiae.
We have revised Figure 1A by removing the red circle and updated the figure legend in the revised manuscript to include more detailed information about the ERQC differences across higher eukaryotes and fungal species.
(6) Figure 2A. There are no units on the Y-axis. Presumably, the scale is the same for all 3 strains.
Thank you for your comments. The Y-axis scale is the same for all three strains and, as in Fig. 2C, represents the relative fluorescence intensity obtained from the HPLC analysis. We added the units on the Y-axis in Fig. 2A.
(7) If Mnl1 and 2 have proposed roles in proteasomal degradation, wouldn't they be expected to have ER retention signals, like Ugg1?
We appreciate your valuable insights regarding the absence of ER retention signals in Mnl1 and Mnl2. Previous studies have shown that Saccharomyces cerevisiae Mnl1/Htm1 does not possess canonical KDEL/HDEL-like ER retention signals. Instead, its retention in the ER lumen is facilitated through its interaction with protein disulfide isomerase Pdi1, which contains an HDEL sequence (Gauss et al. 2011). Thus, it is expected that non-canonical retention mechanisms—such as interactions with other ER proteins—could contribute to the retention of Mnl1 and Mnl2 within the ER. We added this information to the revised manuscript (Page 8, lines 154-159).
(8) Figure 1 C qPCR showing change in mRNA in response to ER stress should not be grouped in this figure. It could be standalone or discussed when the spot dilution assays are performed. Anyway, spot tests are more convincing of a role in stress response than qPCR as the ugg1 mutant is sensitive to tunicamycin, DTT and cell wall stressing agents.
As suggested by the reviewer, we have reorganized the qPCR data as a part of Figure 3 (Figure 3A) in the revised manuscript.
(9) It is odd that mns1/101 mutants are not sensitive to ER and CW stress given their proposed differing location/function in the pathway (Figure 1) determined from the N-linked profiling. Any explanation? Could there be redundancy?
We appreciate the reviewer’s observation regarding the lack of ER and CW stress sensitivity in the _mns1_Δ and _mns101_Δ mutants, despite their proposed roles in N-glycan processing. We had previously reported that the C. neoformans _alg3_Δ mutant, lacking a critical enzyme responsible for the synthesis of Dol-PP-Man<sub>6</sub>GlcNAc<sub>2</sub> in the N-glycosylation pathway, exhibited clearly impaired N-glycan elongation, but showed no detectable growth defects even under stress conditions in vitro. However, _alg3_Δ is avirulent in in vivo pathogenicity (Thak et al., 2020). Similarly, the _mns1_Δ_101_Δ double mutant shows glycan-processing defects that do not compromise cellular fitness under stress conditions but result in attenuated virulence in animal models. These findings suggest that some glycosylation-related defects may impact in vivo pathogenicity more severely than in vitro stress sensitivity.
(10) Although the Silver-stained gels of the ugg1 mutant are not particularly informative, why weren't they (and Con A blots) performed for the other mutants?
The overall decrease of hypermannosylated glycans observed in the _ugg1_Δ mutant allowed us to detect clear alterations in protein glycosylation patterns in the lectin blot using Galanthus nivalis agglutinin, which recognizes terminal α1,2-, α1,3-, and α1,6-linked mannose residues. In contrast, the limited changes of a few glycan species in other mutants, including _mns1_Δ, _mns101_Δ, and _mns1_Δ_101_Δ, are too subtle to be detected in the lectin blot, due to only minor differences in the average lengths of their N-glycans compared to the WT. Therefore, we presented the lectin blotting data only for the _ugg1_Δ mutant.
(11) If there is ER stress under normal conditions in the Ugg1 mutant then technically this mutant should be growing more slowly under normal conditions. This is difficult to predict in a spot dilution assay where growth is only visualized at day three when any growth defect may have been corrected. The slower growth rather than the reduced secretion of GXM specifically is therefore more likely to be responsible for the reduced virulence.
We appreciate the reviewer’s insightful comment regarding the interplay between ER stress, growth defects, and virulence attenuation in the _ugg1_Δ mutant. While retarded growth in C. neoformans is often associated with reduced virulence, there are a few exceptions. For instance, disruptions in cell cycle progression in C. neoformans have been reported to result in larger capsule sizes, which rather enhance in vivo virulence when analyzed in Galleria mellonella infection models (García-Rodas et al., 2014). This highlights that a growth defect alone is not sufficient for virulence attenuation. In the case of the _ugg1_Δ mutant, we speculate that the almost complete loss of virulence is attributed not only to its growth retardation but also to its impaired secretion of key virulence factors, including the polysaccharide capsule.
(12) The rationale for using leucine analogue 5',5',5'-trifluoroleucine (TFL), in a growth assay (Fig. 3C) to determine whether the defective ugg1Δ phenotypes are induced by ER stress caused by misfolded protein accumulation is not explained.
The leucine analogue 5',5',5'-trifluoroleucine (TFL) can be incorporated into newly synthesized proteins, disrupting normal folding and thus leading to the generation of misfolded proteins (Trotter et al., 2002; Cowie et al., 1959). In the context of a defective ERQC pathway, these misfolded proteins cannot be adequately repaired, resulting in their accumulation and triggering ER stress. Excessive ER stress may ultimately inhibit cell growth in the presence of TFL. This explanation has been incorporated into the revised manuscript (Page 11, lines 236–241).
(13) I would argue that only the Ugg1 and double Mns mutant were defective in virulence. For the single mutants, it looks like no difference was found relative to WT. The longer median survival of these mutants (if significant) is most likely due to poor infection technique.
We agree with the reviewer’s opinion that the _mns1_Δ and _mns101_Δ single mutants show no significant difference in in vivo virulence compared to the WT strain, unlike the _mns1_Δ_101_Δ double mutant, which showed significantly attenuated virulence. We had previously addressed that in the manuscript (Page 13, lines 267-269).
(14) The authors conclude that the ugg1Δ strain specifically is impaired in extracellular secretion of capsular polysaccharides but is this via classical (SAV1) secretion or EVs?
In addition to EV-mediated transport, capsular polysaccharide secretion can occur via the Sav1 (Sec4p)-mediated classical secretion pathway. However, our proteome data of total cell lysates indicated that the protein levels of Sav1 were comparable between the WT and _ugg1_Δ strains, suggesting that Sav1p function itself might not be impaired. Given that the _ugg1_Δ mutant exhibits altered vesicular structures (Supplementary Fig. S6) and loss of microvesicles (Fig. 8A), we speculate that a defect might occur at a post-Sav1p step, such as vesicle fusion with the plasma membrane. Such a defect likely contributes to the complete defect in secretion of capsular polysaccharides in the _ugg1_Δ strain, in which EV biogenesis and cargo loading are severely impaired, producing EVs that lack capsular polysaccharides (Figure 8F). However, further studies should be carried out to define the contribution of SAV1 to the secretion of capsular polysaccharides in the _ugg1_Δ strain.
(15) The rationale for doing 7 H is very confusing.
The experiment assessing raffinose utilization as a carbon source was inspired by the previous work of Garcia-Rivera et al., reporting that the _cap59_Δ mutant is unable to utilize raffinose due to a defect in the secretion of raffinose-hydrolyzing enzymes. As another way to investigate potential defects in the conventional secretion pathway, we investigated the growth of the _ugg1_Δ mutant in the presence of raffinose. Due to our extensive data length, we have decided to remove this complementary data from the manuscript.
(16) It is speculated in the discussion that ER stress impacts lipid/sterol synthesis and that LDs (lipid droplets?) aid the UPR and ERAD in degrading misfolded proteins during ER stress in S. cerevisiae. The authors mention that they observed a drastic increase in LDs in the ugg1Δ mutant. Where is this data? Even with the data, this is all speculation. The authors also speculate that increased numbers of vacuoles in ugg1 (where is the data?) could be the cause of the altered vesicular structures observed in the mutants, which may indicate abnormal lipid homeostasis caused by the ERQC defects, which could, in turn, affect EV biogenesis. Again, this is speculative.
The data on lipid droplets (LDs) and vacuole staining are presented in Supplementary Figure S6, showing a drastic increase in LDs and an increase in vacuolar size in the _ugg1_Δ mutant compared to the wild-type strain, especially in capsule-inducing conditions. In addition to such changes in vesicular structures, our preliminary data on sphingolipids and sterol analysis in the surface lipid fraction of the _ugg1_Δ mutant led us to propose the hypothesis that ERQC defects may impact lipid metabolism, which in turn could influence EV biogenesis and membrane properties. We expect that these findings will provide a strong foundation for future studies exploring the link between ERQC, lipid homeostasis, and EV biogenesis. We have revised our speculation on the association between abnormal lipid homeostasis caused by ERQC defects and EV biogenesis by adding the information on our preliminary lipid profile data and mentioning that the _ugg1_Δ mutant lacks microvesicles, which are derived from the plasma membrane (Page 24, lines 554-559).
Reviewer #2 (Recommendations for the authors):
(1) My suggestions for the authors are the same as those presented in the public review: (1) reducing the text in certain sections of the paper to improve readability for the audience, and (2) reconsidering the figures to reduce the amount of information in each one, moving some of the content to the supplementary material.
We thank the reviewer for their constructive suggestions regarding the organization and readability of the manuscript. As suggested, we addressed your concerns as follows:
(1) Reducing the text in the Introduction, Results, and Discussion sections by removing repetitive statements and simplifying complex descriptions where possible.
(2) Changing the presentation of figures: we have also reorganized the presentation of some data by moving non-essential data to the supplementary material. The updated figures and supplementary materials have been clearly referenced in the text to guide readers.
(3) Reorganization of materials and methods: some parts of methods were moved to Supplementary Information
(4) Removal of Figure 7H and the sentences describing the result
More detailed explanations on the reduction and reorganization are also described in the response to the major comments (2) and (3) made by Reviewer #1.
(2) Figure 3, for example, shows no difference in fungal growth under different cultivation conditions. This information is valuable but could be mentioned in the text, with the image provided as supplementary material, focusing the figure only on images that show significant growth differences among the strains. I suggest a similar approach for other figures so that the authors can include only the most relevant results in the main body of the article and move some figures to the supplementary materials.
For Fig. 3, the spotting data of the double mutant (previously Fig. 3B) is now presented in the supplementary information (Supplementary Fig. S3B). Additionally, the subcellular localization data (previously Fig 3E) was also moved to the supplementary material (Supplementary Fig. S4A).
Reviewer #3 (Recommendations for the authors):
(1) Line 43 "EV-mediated transport of virulence bags" doesn't make sense. EVs have been described as "virulence bags" (and are in this work later in the introduction) but this should here be "transport of virulence factors" or "compounds associated with virulence" but only if you have confirmed that the "cargo" is consistent with this- which is not evident in the abstract.
Thank you for your insightful comment. We have revised this to "EV-mediated transport of virulence factors" in line with your suggestion.
(2) Line 49 "secretory pathway" - is there not more than one secretion pathway?
Thank you for pointing this out. The term "secretory pathway" has been updated to "secretory pathways" to acknowledge the presence of both conventional and unconventional secretion mechanisms.
(3) Line 53 "recognizes folding defects, repairs them, and ensures the translocation of irreparable misfolded proteins" should be "recognizes folding defects and repairs them or ensures the translocation of irreparable misfolded proteins".
Thank you for pointing this out. We have revised the sentence as you suggested.
(4) Lines 88-90 ALG needs to be written out the first time - Asn-linked glycans. Also, consider adding that ALG genes are present in most eukaryotes as it is unclear what you are comparing C. neoformans to.
Thank you for your helpful comment. We have revised the text to write out "ALG" as "Asn-linked glycosylation" and added the sentence “ALG genes are evolutionary conserved in most eukaryotes” in the revised manuscript (Page 4, line 84).
(5) Line 99 Cryptococcus has already been abbreviated to C. so don't write it out again.
We have corrected "Cryptococcus" to “C.” throughout the manuscript after its first mention.
(6) Line 152- tunicamycin and DTT are not described yet, which may make it challenging for some readers to understand what these drugs are doing/why they were used. What is on lines 156 and 157 for these drugs should go up with the first mention of these drugs.
Thank you for your helpful suggestion. We have revised the manuscript to include the descriptions and purpose of using tunicamycin (TM) and dithiothreitol (DTT) immediately following their first mention, as recommended (Page 10, lines 208-210).
(7) The text for Figure 1 C is inaccurate. High temperature also induced KAR2, as noted above, but inaccurately stated in line 160. There is no comment on the significant UGG1 increase with tunicamycin or that KAR2 was highest in this condition.
Thank you for your thoughtful comment. We have better clarified the significant increase of UGG1 expression following tunicamycin treatment and KAR2 induction upon heat stress in the revised manuscript (Page 10, lines 216-217). Please note that Fig. 1C was revised and is now referred to as Fig. 3A.
(8) Figure 2B is not well explored/explained. There appears to be more protein in the mutant, including of higher weight in the intracellular compartment. It is difficult to ascertain if there is more too in the secretion phase with this gel. The methods do not specifically describe the concentration of protein added - just volume. Is what we are seeing a loading issue vs real differences?
Thank you for your insightful comments regarding Figure 2B. We added information on amounts of protein (30 µg per lane) in the legend of Figure 2B.
The main purpose of Fig. 2B is to examine the altered glycosylation pattern caused by ERQC defects by detecting glycoproteins using the Galanthus nivalis agglutinin, which specifically binds terminal α1,2-, α1,3-, and α1,6-linked mannose residues. The result of lectin blotting indicated that glycoproteins are more abundantly detected in the secretion fraction than in the soluble intracellular fraction, consistent with the general notion that more than 50% of secretory proteins are glycoproteins. Also, the more abundant proteins with decreased molecular weight in the secretion fraction of the _ugg1_Δ mutant supported the N-glycan profiles with decreased hypermannosylation in the _ugg1_Δ mutant. We added the purpose and a more detailed interpretation of Figure 2B in the revised manuscript (Page 9, lines 174-179).
(9) Line 242 "melanin pigment" is redundant as melanin is a pigment.
We thank the reviewer for pointing out the redundancy in the phrase. We revised the text to simply state "melanin".
(10) Line 250 drops "completely" especially as the mutant did colonize the lungs of mice.
To avoid any possible misunderstanding, we removed the term "completely" in the revised manuscript.
(11) Line 275- need to reference 18B7 as it is first introduced here.
We added the reference on the antibody 18B7 in the revised manuscript.
(12) Line 308- there are specific techniques to measure GXM size that could validate or refute the statement on "incomplete" polysaccharides. For example, DOI:10.1128/EC.00268-09.
We appreciate the valuable suggestion on specific techniques to measure GXM size, which will be one of the key experiments in our future study. In the revised manuscript we cited the suggested reference to indicate the need for validation of our statement (Page 14, lines 316-318).
(13) Line 496 "mammals" - why is this used when the study is on a fungus, not a mammal? The structure of the first 2 paragraphs can be clearer to focus more on fungal biology.
We have compared mammals and fungi to emphasize that the ERQC system is conserved among eukaryotes but has diverged with a few species-specific features. This comparison is relevant in the context of understanding the evolutionarily unique features of ERQC pathways in C. neoformans. We modified the first 2 paragraphs to clarify the main issue of our present study (Page 21, lines 472-483).
(14) Line 525- the ugg mutant was not avirulent as CFU was present and histopathology in the supplementary figures shows the tissue with ugg1 deletion was not normal (although the images are not especially easy to review). Yes, the mutant did not kill under your test conditions, but it was not avirulent (incapable of causing disease). Significantly attenuated or other descriptors should be utilized. Line 548 is also thus incorrect "complete loss of virulence").
We appreciate the reviewer’s concern regarding the description of the _ugg1_Δ mutant as avirulent. We agree that the term “avirulent” alone may not fully capture the observed phenotypes in the CFU and histopathological data, since we cannot exclude the possibility that the _ugg1_Δ mutant retains the ability to establish an infection. Thus, we have revised the text by describing the _ugg1_Δ mutant as "almost avirulent".
(15) Line 597- the study by Fukuoka used kidney cells. It is misleading to not clearly state that this finding of ER stress was NOT done in fungi as the way it is presented makes it read as if this work was performed in C. neoformans. This should be clarified. This should also be double-checked and clarified for other statements, such as the reference to Harada in line 606, as this study used melanoma cells. These cell types are very different from cryptococcus- though I absolutely concur that lessons can be learned from comparative assessments.
We thank the reviewer for pointing out the need to clarify the experimental context of the cited studies. We explicitly stated the host cell types used in the referenced studies by Fukuoka et al. and by Harada et al., respectively, in the revised manuscript (Page 25, lines 560 and 568).
-
-
www.biorxiv.org
-
eLife Assessment
This valuable study presents a theoretical framework in which spatial periodicity in grid cell firing emerges as the optimal solution for encoding two-dimensional spatial trajectories via sequential neural activation. The idea is supported by solid evidence, though it rests on several key assumptions that merit careful consideration. This work will be of interest to neuroscientists investigating the neural mechanisms underlying spatial navigation.
-
Reviewer #1 (Public review):
Summary:
This manuscript aims to explain the emergence of grid-like spatial firing patterns. Rather than taking the existence of grid cells as a given and asking what their properties say about their function, the authors reverse the approach: they begin with a proposed computational function that the brain may need to perform (coding of 2D spatial trajectories using sequences of neural activity) and ask what type of neural code would optimally support this function. They show that, under a set of formal assumptions, such a code leads to the emergence of spatial periodicity and a hexagonal grid pattern. The aim is to provide a normative explanation for the existence of grid cells grounded in functional constraints.
Strengths:
The manuscript presents a mathematically well-defined framework that is internally consistent. The derivation is structured and leads to a hexagonal lattice as the most efficient solution for representing directional trajectories. The authors provide comparisons to experimental observations and extend the model to explain several findings in the grid cell literature. In the revised version, the discussion of foundational assumptions is expanded, and the manuscript better situates itself in relation to prior theoretical work. Overall, this work adds a very interesting view to the broader conversation about the role and origin of grid cells by offering a theoretical alternative grounded in trajectory coding.
Weaknesses:
The model depends on assumptions that, while plausible, should be treated as chosen assumptions. These include the premise that (1) grid function is trajectory coding, (2) that trajectory coding is implemented through sequences of neural activity, and (3) that such sequences are largely independent of spatial position. In the revised manuscript, the authors provide more literature to support these assumptions.
-
Reviewer #2 (Public review):
Summary:
In this work, the authors consider the required functional properties of neurons that encode trajectories in 2D space using cell sequences, ultimately linking the required properties to those found in grid cells. In their argument, the authors first introduce a set of definitions and axioms, which then lead to their conclusion that a hexagonal pattern is the most efficient or parsimonious pattern one could use to uniquely label different 2D trajectories using sequences of cells. The authors then go through a set of classic experimental results in the grid cell literature - e.g. that the grid modules exhibit a multiplicative scaling, that the grid pattern expands with novelty or is warped by reward, etc. - and describe how these results are either consistent with or predicted by their theory. Overall, this paper asks a very interesting question and provides an intriguing answer.
Major strengths:
The general idea behind the paper is very interesting - why *does* the grid pattern take the form of a hexagonal grid? This is a question that has been raised many times; finding a truly satisfying answer is difficult but of great interest. The authors' main assertion that the answer to this question has to do with the ability of a hexagonal arrangement of neurons to uniquely encode 2D trajectories is an intriguing suggestion. It is also impressive that the authors considered such a wide range of experimental results in relation to their theory.
Major weaknesses:
One weakness I perceive is that the paper overstates what it delivers. In the introduction, the authors claim to provide "mathematical proof that ... the nature of the problem being solved by grid cells is coding of trajectories in 2-D space using cell sequences. By doing so, we offer a specific answer to the question of why grid cell firing patterns are observed in the mammalian brain." By virtue of the fact that the authors make assumptions about biological function in their claims, this paper does not provide proof of what grid cells are doing to support behavior nor provide the true answer as to why grid patterns are found in the brain. Although I find this study both intriguing and important - and I respect the authors' perspective - as an experimentalist guided by the principle that biological theories are never proven but instead continually supported by data, suggestions of a proof of grid cell function are hard for me to get behind. Regardless, the paper presents a compelling line of reasoning that enhances our understanding of grid cells.
-
Reviewer #3 (Public review):
Concerning the revised manuscript, the authors are to be commended for carefully addressing the reviewers' comments and updating the references cited.
I will differ with the authors' argument that Gardner et al's paper supports the idea of sequences, except in the most trivial sense, namely that topology implies continuity, and hence movement along the manifold will be continuous, and if one discretizes this movement, one would get a sequence.
This has very little to do with the idea that the authors propose in their manuscript. If the authors were to so choose, their idea will produce an embedded graph, and they could study the topology of this construct---but this would mean going off on a tangent.
Let us not hold up the authors to an impossible standard, namely that their theory should explain everything in the grid cell field. Not every finding under the sun needs be addressed in the discussion.
My own take on the manuscript had been that the authors' interesting idea might be fairly straightforwardly provable. The authors have decided to put another student on this particular project. That is perfectly OK. Just one final note: mathematically, the main focus ought to be on sequences of prime length (many other results will likely follow).
-
Author response:
The following is the authors’ response to the original reviews
eLife assessment
This valuable study aims to present a mathematical theory for why the periodicity of the hexagonal pattern of grid cell firing would be helpful for encoding 2D spatial trajectories. The idea is supported by solid evidence, but some of the comparisons of theory to the experimental data seem incomplete, and the reasoning supporting some of the assumptions made should be strengthened. The work would be of interest to neuroscientists studying neural mechanisms of spatial navigation.
We thank the reviewers for this assessment. We have addressed the comments made by reviewers and believe that the revised manuscript has theoretical and practical implications beyond the subfield of neuroscience concerned with mechanisms underpinning spatial memory and spatial navigation. Specifically, the demonstration that four simple axioms beget the spatial firing pattern of grid cells is highly relevant for the field of artificial intelligence and neuromorphic computing. This relevance stems from the fact that the four axioms define a set of four simple computational algorithms that can be implemented in future work in grid cell-inspired computational algorithms. Such algorithms will be impactful because they can perform path integration, a function that is independent of an animal’s or agent’s location and therefore generalizable. Moreover, because of the functional organization of grid cells into modules, the algorithm is also scalable. Generalizability and scalability are two highly sought-after properties of brain-inspired computational frameworks. We also believe that the question why grid cells emerge in the brain is a fundamental one. This manuscript is, to our knowledge, the first one that provides an interpretable and intuitive answer to why grid cells are observed in the brain.
Before addressing each comment, we would like to point out that the first sentence of the assessment appears misphrased. The study does not aim to present a theory for why the periodicity in grid cell firing would be helpful for encoding 2D spatial trajectories. To present a theory “for why grid cell firing would be helpful for encoding 2D trajectories”, one assumes the existence of grid cells a priori. Instead of assuming the existence of grid cells and deriving a computational function from grid cells, our study derives grid cells from a computational function, as correctly summarized by reviewers #1 and #3 in their individual statements. In contrast to previous normative models, we prove mathematically that spatial periodicity in grid cell firing is implied by a sequence code of trajectories. If the brain uses cell sequences to code for trajectories, spatially periodic firing must emerge. As correctly pointed out by reviewer #1, the underlying assumptions of this study are that the brain codes for trajectories and that it does so using cell sequences. In response to comments by reviewer #1, we now discuss these two assumptions more rigorously.
Public Reviews:
Reviewer #1 (Public Review):
Rebecca R.G. et al. set out to determine the function of grid cells. They present an interesting case claiming that the spatial periodicity seen in the grid pattern provides a parsimonious solution to the task of coding 2D trajectories using sequential cell activation. Thus, this work defines a probable function grid cells may serve (here, the function is coding 2D trajectories), and proves that the grid pattern is a solution to that function. This approach is somewhat reminiscent in concept of previous works that defined a probable function of grid cells (e.g., path integration) and constructed normative models for that function that yield a grid pattern. However, the model presented here gives clear geometric reasoning to its case.
Stemming from 4 axioms, the authors present a concise demonstration of the mathematical reasoning underlying their case. The argument is interesting and the reasoning is valid, and this work is a valuable addition to the ongoing body of work discussing the function of grid cells.
However, the case uses several assumptions that need to be clearly stated as assumptions, clarified, and elaborated on: Most importantly, the choice of grid function is grounded in two assumptions:
(1) that the grid function relies on the activation of cell sequences, and
(2) that the grid function is related to the coding of trajectories. While these are interesting and valid suggestions, since they are used as the basis of the argument, the current justification could be strengthened (references 28-30 deal with the hippocampus, reference 31 is interesting but cannot hold the whole case).
We thank this reviewer for the overall positive and constructive criticism. We agree with this reviewer that our study rests on two premises, namely that 1) a code for trajectories exists, and 2) this code is implemented by cell sequences. We now discuss and elaborate on the data in the literature supporting the two premises.
In addition to the work by Zutshi et al. (reference 31 in the original manuscript), we have now cited additional work presenting experimental evidence for sequential activity of neurons in the medial entorhinal cortex, including sequential activity of grid cells.
We have added the following paragraph to the Discussion section:
“Recent studies provided compelling evidence for sequential activity of neurons representing spatial trajectories. In particular, Gardner et al. (2022) demonstrated that the sequential activity of hundreds of simultaneously recorded grid cells in freely foraging rats represented spatial trajectories. Complementary preliminary results indicate that grid cells exhibit left-right-alternating “theta sweeps,” characterized by temporally compressed sequences of spiking activity that encode outwardly oriented trajectories from the current location (Vollan et al., 2024).
The concept of sequential grid cell activity extends beyond spatial coding. In various experimental contexts, grid cells have been shown to encode non-spatial variables. For instance, in a stationary auditory task, grid cells fired at specific sounds along a continuous frequency axis (Aronov et al., 2017). Further studies revealed that grid cell sequences also represent elapsed time and distance traversed, such as during a delay period in a spatial alternation task (Kraus et al., 2015). Similar findings were reported for elapsed time encoded by grid cell sequences in mice performing a virtual “Door Stop” task (Heys and Dombeck, 2018).
Additionally, spatial trajectories represented by temporally compressed grid cell sequences have been observed during sleep as replay events (Ólafsdóttir et al., 2016; O’Neill et al., 2017). Collectively, these studies demonstrate that sequential activity of neurons within the MEC, particularly grid cells, consistently encodes ordered experiences, suggesting a fundamental role for temporal structure in neuronal representations.
The theoretical underpinnings of grid cell activity coding for ordered experiences have been explored previously by Rueckemann et al. (2021) who argued that the temporal order in grid cell activation allows for the construction of topologically meaningful representations, or neural codes, grounded in the sequential experience of events or spatial locations. However, while Rueckemann et al. argue that the MEC supports temporally ordered representations through grid cell activity, our findings suggest an inverse relationship: namely, that grid cell activity emerges from temporally ordered spatial experiences. Additional studies demonstrate that hippocampal place cells may derive their spatial coding properties from higher-order sequence learning that integrates sensory and motor inputs (Raju et al., 2024) and that hexagonal grids, if assumed a priori, optimally encode transitions in spatiotemporal sequences (Waniek, 2018).
Together, experimental and theoretical evidence demonstrate the significance of sequential neuronal activity within the hippocampus and entorhinal cortex as a core mechanism for representing both spatial and temporal information and experiences.”
The work further leans on the assumption that sequences in the same direction should be similar regardless of their position in space, it is not clear why that should necessarily be the case, and how the position is extracted for similar sequences in different positions.
We thank this reviewer for giving us the opportunity to clarify this point. We define a trajectory as a path taken in space (Definition 6). By this definition, a code for trajectories is independent of the animal’s spatial location. This is consistent with the definition of path integration, which is also independent of an animal’s spatial location. If the number of neurons is finite (Axiom #4) and the space is large, sequences must eventually repeat in different locations. This results in neural sequences coding for the same directions being identical at different locations. We have clarified this point under new Remark 6.1. in the Results section of the revised manuscript:
“Remark 6.1. Note that a code for trajectories is independent of the animal’s spatial location, consistent with the definition of path integration. This implies that, if the number of neurons is finite (Axiom #4) and the space is large, sequences must eventually repeat in different locations, resulting in neural sequences coding for the same trajectories at different locations.”
The formal proof was already included in the original manuscript: “Generally speaking, starting in a firing field of element i and going along any set of firing fields, some element must eventually become active again since the total number of elements is finite by axiom 4. Once there is a repeat of one element’s firing field, the whole sequence of firing fields of all elements must repeat by axiom 1. More specifically, if we had a sequence 1,2, … , k, 1, t of elements, then 1,2 and 1, t both would code for traveling in the same direction from element 1, contradicting axiom 1.”
Further: “More explicitly, assuming axioms 1 and 4, the firing fields of trajectory-coding elements must be spatially periodic, in the sense that starting at any point and continuing in a single direction, the initial sequence of locally active elements must eventually repeat with a repeat length of at least 3”.
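The periodicity argument quoted above can be illustrated with a small sketch. It assumes a hypothetical labeling of a hexagonal lattice by seven elements in axial coordinates, `label(q, r) = (q + 3r) mod 7`; this rule and all function names are chosen here for illustration and are not taken from the manuscript. Under that assumption, each of the six directions of travel changes the label by a distinct nonzero amount, so an ordered pair of successive elements identifies one direction, and any straight path repeats its sequence of elements.

```python
# Hypothetical illustration: label a hexagonal lattice with 7 elements.
# Axial coordinates (q, r); the labeling rule is an assumption of this
# sketch, not the construction used in the manuscript.
N_ELEMENTS = 7
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def label(q, r):
    """Element whose firing field sits at lattice site (q, r)."""
    return (q + 3 * r) % N_ELEMENTS


def step_offsets():
    """Label change (mod 7) produced by one step in each direction."""
    return {d: label(*d) for d in DIRECTIONS}


def walk(start, direction, n_steps):
    """Sequence of active elements along a straight path."""
    q, r = start
    dq, dr = direction
    return [label(q + k * dq, r + k * dr) for k in range(n_steps)]


def repeat_length(seq):
    """Smallest p > 0 with seq[i] == seq[i + p] for all valid i, else None."""
    for p in range(1, len(seq)):
        if all(seq[i] == seq[i + p] for i in range(len(seq) - p)):
            return p
    return None


if __name__ == "__main__":
    # Six distinct, nonzero offsets: one per direction of travel.
    print(sorted(step_offsets().values()))
    for d in DIRECTIONS:
        seq = walk((0, 0), d, 21)
        print(d, seq[:8], "repeat length:", repeat_length(seq))
```

With a finite number of elements the repeat is forced by the pigeonhole principle; the sketch only shows one labeling in which the repeat length is the same in every direction.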
Regarding the question how an animal’s position is extracted for similar sequences in different positions, we agree with this reviewer that this is an important question when investigating the contributions of grid cells to the coding of space. However, since a code for trajectories is independent of spatial location, the question of how to extract an animal’s position from a trajectory code is irrelevant for this study.
While a trajectory code by neural sequences begets grid cells, a spatial code by neural sequences does not. Nevertheless, grid cells could contribute to the coding of space (in addition to providing a trajectory code). However, while experimental evidence from studies with rodents and human subjects and theoretical work demonstrated the importance of grid cells for path integration (Fuhs and Touretzky, 2006; McNaughton et al., 2006; Moser et al., 2017), experimental studies have shown that grid cells contribute little to the coding of space by place cells (Hales et al., 2014). Yet, theoretical work (Mathis et al., 2012) showed that coherent activity of grid cells across different modules can provide a code for spatial location that is more accurate than spatial coding by place cells in the hippocampus. Importantly, such a spatial code by coherent activity across grid cell modules does not require location-dependent differences in neural sequences.
The authors also strengthen their model with the requirement that grid cells should code for infinite space. However, the grid pattern anchors to borders and might be used to code navigated areas locally. Finally, referencing ref. 14, the authors claim that there is no existing theory for the emergence of grid cell firing that unifies the experimental observations on periodic firing patterns and their distortions under a single framework. However, that same reference presents exactly that - a mathematical model of pairwise interactions that unifies experimental observations. The authors should clarify this point.
We thank this reviewer for this valuable feedback. We agree that grid cells anchor to borders and may be used to code navigated areas locally. In fact, the trajectory code performs a local function, namely path integration, and the global grid pattern can only emerge from performing this local computation if the activity of at least one grid unit or element (we changed the wording from unit to element based on feedback from reviewer #3) is anchored to either a spatial location or a border. Yet, the trajectory code itself does not require anchoring to a reference frame to perform local path integration. Because of the local nature of the trajectory code, path integration can be performed locally without the emergence of a global grid pattern. This has been shown experimentally in mice performing a path integration task where changes in the location of a task-relevant object resulted in translations of grid patterns in single trials. Although no global grid pattern was observed, grid cells performed path integration locally within the multiple reference frames defined by the task-relevant object, and grid patterns were visible when the changes in the reference frames were accounted for in computing the rate maps (Peng et al., 2023). The data by Peng et al. (2023) confirm that the anchoring of the grid pattern to borders and the emergence of the global pattern are not required for local coding of trajectories. The global pattern emerges only when the reference frame does not change. However, this global pattern itself might not serve any function. According to the trajectory code model, the beguiling grid pattern is merely a byproduct of a local path integration function that is independent of the animal’s current location (which makes the code generalizable across space).
The reviewer is correct that, if the reference frame used to anchor the grid pattern did not change in infinite space, the trajectory code model of grid cell firing would predict an infinite global pattern. But does the proof implicitly assume that space is infinite? The trajectory code model makes the quantitative prediction that the field size increases linearly with an increase in grid spacing (the distance between two fields). If the field size remains fixed, periodicity will emerge in finite spaces that are larger than the grid spacing. We have clarified these points in the revised manuscript:
“Notably, the trajectory code itself does not require anchoring to a reference frame to perform local path integration. Because of the local nature of the trajectory code, path integration can be performed locally without the emergence of a global grid pattern. This has been shown experimentally in mice performing a path integration task where changes in the location of a task-relevant object resulted in translations of grid patterns in single trials (Peng et al., 2023). Although no global grid pattern was observed because the reference frame was not fixed in space, grid cells performed path integration locally within the reference frame defined by the moving task-relevant object, and grid patterns were visible when the changes in the reference frames were accounted for in computing the rate maps”.
Regarding how the emergence of grid cells from a trajectory code relates to the theory of a local code by grid cells brought forward by Ginosar et al. (ref. 14), we argue that the local computational function suggested by Ginosar et al. is to provide a code for trajectories. The perspective article by Ginosar et al. provides an excellent review of the experimental data on grid cells that point to grid cells performing a local function (see also Kate Jeffery’s excellent review article (Jeffery, 2024) on the mosaic structure of the mammalian cognitive map.) Assuming the existence of grid cells a priori, Ginosar et al. then propose three possible functions of grid cells, all of which are consistent with the trajectory code model of grid cell firing. Yet, the perspective article remains agnostic, in our opinion, on the exact nature of the local computation that is carried out by grid cells. But without knowing the local computation underlying grid cell function, a unifying theory explaining the emergence of grid cells cannot be considered complete. In contrast, our manuscript identifies the local computational function as a trajectory code by cell sequences. We have clarified these points in the revised manuscript:
“The influential hypothesis that grid cells provide a universal map for space is challenged by experimental data suggesting a yet to be identified local computational function of grid cells (Ginosar et al., 2023; Jeffery, 2024). Here, we identify this local computational function as a trajectory code.”
The mathematical model of pairwise interactions described by Ginosar et al. is fundamentally different from the mathematical framework developed in our manuscript. The mathematical model by Ginosar et al. describes how pairwise interactions between already existent grid fields can explain distortions in the grid pattern caused by the environment’s geometry, reward zones, and dimensionality. However, the model does not explain why there is a grid pattern in the first place. In contrast, our trajectory model provides an explanation for why grid cells may exist by demonstrating that a grid pattern emerges from a trajectory code by cell sequences. We stand by our assessment that a unifying theory of grid cells is not complete if it takes the existence of the grid pattern for granted.
Reviewer #2 (Public Review):
Summary:
In this work, the authors consider why grid cells might exhibit hexagonal symmetry - i.e., for what behavioral function might this hexagonal pattern be uniquely suited? The authors propose that this function is the encoding of spatial trajectories in 2D space. To support their argument, the authors first introduce a set of definitions and axioms, which then lead to their conclusion that a hexagonal pattern is the most efficient or parsimonious pattern one could use to uniquely label different 2D trajectories using sequences of cells. The authors then go through a set of classic experimental results in the grid cell literature - e.g. that the grid modules exhibit a multiplicative scaling, that the grid pattern expands with novelty or is warped by reward, etc. - and describe how these results are either consistent with or predicted by their theory. Overall, this paper asks a very interesting question and provides an intriguing answer. However, the theory appears to be extremely flexible and very similar to ideas that have been previously proposed regarding grid cell function.
We thank this reviewer for carefully reading the manuscript and their valuable feedback which helps us clarify major points of the study. One major clarification is that the theoretical/axiomatic framework we put forward does not assume grid cells a priori. In contrast, we start by hypothesizing a computational function that a brain region shown to be important for path integration likely needs to solve, namely coding for spatial trajectories. We go on to show that this computational function begets spatially periodic firing (grid maps). By doing so, we provide mathematical proof that grid maps emerge from solving a local computational function, namely spatial coding of trajectories. Showing the emergence of grid maps from solving a local computational function is fundamentally different from many previous studies on grid cell function, which assign potential functions to the existing grid pattern. As we discuss in the manuscript, our work is similar to using normative models of grid cell function. However, in contrast to normative models, we provide a rigorous and interpretable mathematical framework which provides geometric reasoning to its case.
Major strengths:
The general idea behind the paper is very interesting - why *does* the grid pattern take the form of a hexagonal grid? This is a question that has been raised many times; finding a truly satisfying answer is difficult but of great interest to many in the field. The authors' main assertion that the answer to this question has to do with the ability of a hexagonal arrangement of neurons to uniquely encode 2D trajectories is an intriguing suggestion. It is also impressive that the authors considered such a wide range of experimental results in relation to their theory.
We thank this reviewer for pointing out the significance of the question addressed by our manuscript.
Major weaknesses:
One major weakness I perceive is that the paper overstates what it delivers, to an extent that I think it can be a bit confusing to determine what the contributions of the paper are. In the introduction, the authors claim to provide "mathematical proof that ... the nature of the problem being solved by grid cells is coding of trajectories in 2-D space using cell sequences. By doing so, we offer a specific answer to the question of why grid cell firing patterns are observed in the mammalian brain." This paper does not provide proof of what grid cells are doing to support behavior or provide the true answer as to why grid patterns are found in the brain. The authors offer some intriguing suggestions or proposals as to why this might be based on what hexagonal patterns could be good for, but I believe that the language should be clarified to be more in line with what the authors present and what the strength of their evidence is.
We thank this reviewer for this assessment. While there is ample experimental evidence demonstrating the importance of grid cells for path integration, we agree with this reviewer that there may be other computational functions that may require or largely benefit from the existence of grid cells. We now acknowledge the fact that we have provided a likely teleological cause for the emergence of grid cells and that there might be other causes for the emergence of grid cells. We have changed the wording in the abstract and discussion sections to acknowledge that our study does provide a likely teleological cause. We choose “likely” because the computational function – trajectory coding – from which grid maps emerge is very closely associated with path integration, which numerous experimental and theoretical studies associate with grid cell function.
Relatedly, the authors claim that they find a teleological reason for the existence of grid cells - that is, discover the function that they are used for. However, in the paper, they seem to instead assume a function based on what is known and generally predicted for grid cells (encode position), and then show that for this specific function, grid cells have several attractive properties.
We agree with this reviewer that we leveraged what is known about grid cells, in particular their importance for path integration, in finding a likely teleological cause. However, the major significance of our work is that we demonstrate that coding for spatial trajectories requires spatially periodic firing (grid cells). This is very different from assuming the existence of grid cells a priori and then showing that grid cells have attractive, if not optimal, properties for this function. If we had shown that grid cells optimized a code for trajectories, this reviewer would be correct: we would have suggested just another potential function of grid cells. Instead, we provide both proof and intuition that trajectory coding by cell sequences begets grid cells (not the other way around), thereby providing a likely teleological cause for the emergence of grid cells. As stated above, we clarified in the revised manuscript that we provide a likely teleological cause which requires additional experimental verification.
There is also some other work that seems very relevant, as it discusses specific computational advantages of a grid cell code but was not cited here: https://www.nature.com/articles/nn.2901.
We thank this reviewer for pointing us toward this article by Sreenivasan and Fiete (2011). The revised manuscript now cites this article in the Introduction and Discussion sections. We agree that the article by Sreenivasan and Fiete (2011) discusses a specific computational advantage of a population code by grid cells, namely unprecedented robustness to noise in estimating the location from the spiking information of noisy neurons. However, the work by Sreenivasan and Fiete (2011) differs from our work in that the authors assume the existence of grid cells a priori.
In addition, we now discuss other relevant work, namely work on the conformal isometry hypothesis by Schøyen et al. (2024) and Xu et al. (2024), published as pre-prints after publication of the first version of our manuscript, as well as work on transition scale-spaces by Nicolai Waniek. Xu et al. (2024) and Schøyen et al. (2024) investigate conformal isometry in the coding of space by grid cells. Conformal isometry means that trajectories in neural space map onto trajectories in physical space. Xu et al. (2024) show that the conformal isometry hypothesis can explain the spatially periodic firing pattern of grid cells. Schøyen et al. (2024) further show that a module of seven grid cells emerges if space is encoded as a conformal isometry, ensuring equal representation in all directions. While the work by Xu et al. (2024) and Schøyen et al. (2024) arrives at conclusions very similar to those stated in the current manuscript, the conformal isometry hypothesis provides only a partial answer to why grid cells exist because it doesn’t explain why conformal isometry is important or required. In contrast, a sequence code of trajectories provides an intuitive answer to why such a code is important for animal behavior. Furthermore, we included the work by Nicolai Waniek (2018, 2020) in the Discussion, who demonstrated that the hexagonal arrangement of grid fields is optimal for coding transitions in space.
The paragraph added to the Discussion reads as follows:
“As part of the proof that a trajectory code by cell sequences begets spatially periodic firing fields, we proved that the centers of the firing fields must be arranged in a hexagonal lattice. This arrangement implies that the neural space is a conformally isometric embedding of physical space, so that local displacements in neural space are proportional to local displacements of an animal or agent in physical space, as illustrated in Figure 5. This property has recently been introduced in the grid cell literature as the conformal isometry hypothesis (Schøyen et al., 2024; Xu et al., 2024). Strikingly, Schøyen et al. (2024) arrive at similar if not identical conclusions regarding the geometric principles in the neural representations of space by grid cells.”
A second major weakness was that some of the claims in the section in which they compared their theory to data seemed either confusing or a bit weak. I am not a mathematician, so I was not able to follow all of the logic of the various axioms, remarks, or definitions to understand how the authors got to their final conclusion, so perhaps that is part of the problem. But below I list some specific examples where I could not follow why their theory predicted the experimental result, or how their theory ultimately operated any differently from the conventional understanding of grid cell coding. In some cases, it also seemed that the general idea was so flexible that it perhaps didn't hold much predictive power, as extra details seemed to be added as necessary to make the theory fit with the data.
I don't quite follow how, for at least some of their model predictions, the 'sequence code of trajectories' theory differs from the general attractor network theory. It seems from the introduction that these theories are meant to serve different purposes, but the section of the paper in which the authors claim that various experimental results are predicted by their theory makes this comparison difficult for me to understand. For example, in the section describing the effect of environmental manipulations in a familiar environment, the authors state that the experimental results make sense if one assumes that sequences are anchored to landmarks. But this sounds just like the classic attractor network interpretation of grid cell activity - that it's a spatial metric that becomes anchored to landmarks.
We thank this reviewer for giving us the opportunity to clarify in what aspects the ‘sequence code of trajectories’ theory of grid cell firing differs from the classic attractor network models, in particular the continuous attractor network (CAN) model. First of all, the CAN model is a mechanistic model of grid cell firing that is specifically designed to simulate spatially periodic firing of grid cells in response to velocity inputs. In contrast, the sequence code of trajectories theory of grid cell firing resembles a normative model showing that grid cells emerge from performing a specific function. However, in contrast to previous normative models, the sequence code of trajectories model grounds the emergence of grid cell firing in a mathematical proof and both geometric reasoning and intuition. The proof demonstrates that the emergence of grid cells is the only solution to coding for trajectories using cell sequences. The sequence code of trajectories model of grid cell firing is agnostic about the neural mechanisms that implement the sequence code in a population of neurons. One plausible implementation of the sequence code of trajectories is in fact a CAN. Moreover, the sequence code of trajectories theory predicts conformal isometry in the CAN, i.e., a trajectory in neural space is proportional to a trajectory of an animal in physical space. However, other mechanistic implementations are possible. We have clarified how the sequence code of trajectories theory of grid cells relates to the mechanistic CAN models of grid cells.
We added the following text to the Discussion section:
“While the sequence code of trajectories-model of grid cell firing is agnostic about the neural mechanisms that implement the sequence code, one plausible implementation is a continuous attractor network (McNaughton et al., 2006; Burak and Fiete, 2009). Interestingly, a sequence code of trajectories begets conformal isometry in the attractor network, i.e., a trajectory in neural space is proportional to a trajectory of an animal in physical space.”
It was not clear to me why their theory predicted the field size/spacing ratio or the orientation of the grid pattern to the wall.
We thank this reviewer for bringing to our attention that we lacked a proper explanation for why the sequence code of trajectories theory predicts the field size/spacing ratio in grid maps. We have modified/added the following text to the Results section of the manuscript to clarify this point:
“Because the sequence code of trajectories model of grid cell firing implies a dense packing of firing fields, the spacing between two adjacent grid fields must change linearly with a change in field size. It follows that the ratio between grid spacing and field size is fixed. When using the distance between the centers of two adjacent grid fields to measure grid spacing and a diameter-like metric to measure grid field size, we can compute the ratio of grid spacing to grid field size as √7≈2.65 (see Methods).”
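The fixed ratio quoted above can be reproduced numerically under explicit assumptions. The sketch below assumes seven elements whose fields tile a hexagonal lattice with the hypothetical labeling `(q + 3r) mod 7`, and it takes the field "diameter" to be the center-to-center distance between adjacent fields (dense packing). Both choices are assumptions made here for illustration; the manuscript's Methods define the actual metric.

```python
import math

N_ELEMENTS = 7  # assumed number of elements in one repeating sequence


def label(q, r):
    # Hypothetical labeling of lattice site (q, r); an assumption of this sketch.
    return (q + 3 * r) % N_ELEMENTS


def distance(q, r):
    # Euclidean length of q*e1 + r*e2 for unit vectors e1, e2 at 60 degrees.
    return math.sqrt(q * q + q * r + r * r)


def spacing_to_field_ratio(search_radius=10):
    # Grid spacing: distance from the origin to the nearest other field
    # belonging to the same element as the field at the origin.
    # Field "diameter": 1.0, the distance between adjacent field centers.
    same_element = [
        distance(q, r)
        for q in range(-search_radius, search_radius + 1)
        for r in range(-search_radius, search_radius + 1)
        if (q, r) != (0, 0) and label(q, r) == label(0, 0)
    ]
    return min(same_element)


if __name__ == "__main__":
    print(spacing_to_field_ratio(), math.sqrt(7))
```

Because both spacing and field size scale with the same lattice unit in this construction, rescaling the lattice leaves the ratio unchanged.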
We are also grateful to this reviewer for correctly pointing out that our explanation as to why the sequence code of trajectories predicts a rotation of the grid pattern relative to a set of parallel walls in a rectangular environment was unclear. We have now made explicit the underlying premise that a sequence of firing fields from multiple grid cells is aligned in parallel to a nearby wall of the environment. We cite additional experimental evidence supporting this premise. Concretely, we quote Stensola and Moser summarizing results reported in (Stensola et al. 2015): “A surprising observation, however, was that modules typically assumed one of only four distinct orientation configurations relative to the environment” (Stensola and Moser, 2016). Importantly, all of the four distinct orientations show the characteristic angular rotation. Intriguingly, this is predicted by the sequence code of trajectories-model under the premise that a sequence of firing fields aligns with one of the geometric boundaries of the environment, as shown in Author response image 1 below.
Author response image 1.
Under the premise that a sequence of firing fields aligns with one of the geometric boundaries (walls) of a square arena, there are precisely four possible distinct configurations of orientations. This is precisely what has been observed in experiments (Stensola et al., 2015; Stensola and Moser, 2016).
We added clarifying language to the Results section: “Under the premise that a sequence of firing fields aligns with one of the geometric boundaries of the environment, the sequence code model explains that the grid pattern typically assumes one of only four distinct orientation configurations relative to the environment (refs. 41, 46). Concretely, the four orientation configurations arise when one row of grid fields aligns with one of the two sets of parallel walls in a rectangular environment, and each arrangement can result in two distinct orientations (Figure 3B).”
I don't understand how repeated advancement of one unit to the next, as shown in Figure 4E, would cause the change in grid spacing near a reward.
In familiar environments, spatial firing fields of place cells in hippocampal CA1 and CA3 tend to shift backwards with experience (Mehta et al., 2000; Lee et al., 2004; Roth et al., 2012; Geiller et al., 2017; Dong et al., 2021). This implies that the centers of place fields move closer to each other. A potential mechanism has been suggested, namely NMDA receptor-dependent long-term synaptic plasticity (Ekstrom et al., 2001). When we apply the same principle observed for place fields on a linear track to grid fields anchored to a reward zone, grid fields will “gravitate” towards the reward side. A similar idea has been presented by Ginosar et al. (2023), who use the analogy of reward locations as “black holes”. In contrast to Ginosar et al. (2023), whom we cite multiple times, our idea unifies observations on place cells and grid cells in 1-D and 2-D environments and suggests a potential mechanism. We changed the wording in the revised manuscript and clarified the underlying premises.
I don't follow how this theory predicts the finding that the grid pattern expands with novelty. The authors propose that this occurs because the animals are not paying attention to fine spatial details, and thus only need a low-resolution spatial map that eventually turns into a higher-resolution one. But it's not clear to me why one needs to invoke the sequence coding hypothesis to make this point.
We agree with this reviewer that this point needs clarification. The sequence code model adds explanatory power to the hypothesis that the grid pattern in a novel environment reflects a low-resolution mapping of space or spatial trajectories because it directly links spatial resolution to both field size and spacing of a grid map. Concretely, the spatial resolution of the trajectory code is equivalent to the spacing between two adjacent spatial fields, and the spatial resolution is directly proportional to the grid spacing and field size. If one did not invoke the sequence coding hypothesis, one would need to explain how and why both spacing and field size are related to the spatial resolution of the grid map. Lastly, as written in the manuscript text, we point out that, while the experimentally observed expansion of grid maps is consistent with the sequence code of trajectories, it is not predicted by the theory without making further assumptions.
The last section, which describes that the grid spacing of different modules is scaled by the square root of 2, says that this is predicted if the resolution is doubled or halved. I am not sure if this is specifically a prediction of the sequence coding theory the authors put forth though since it's unclear why the resolution should be doubled or halved across modules (as opposed to changed by another factor).
We agree with reviewer #2 that the exact value of the scaling factor is not predicted by the sequence coding theory. E.g., the sequence code theory does not explain why the spatial resolution doesn’t change by a factor of 3 or 1.5 (resulting in changes in grid spacing by the square root of 3 or the square root of 1.5, respectively). We have changed the wording to reflect this important point. We further clarified in the revised manuscript that future work on multiscale representations using modules of grid cells needs to show why changing the spatial resolution across modules by a factor of 2 is optimal. Interestingly, a scale ratio of 2 is commonly used in computer vision, specifically in the context of mipmapping and Gaussian pyramids, to render images across different scales. Literature in the computer vision field describes why a scaling factor of 2 and the use of Gaussian filter kernels (compare with Gaussian firing fields) are useful in allowing a smooth and balanced transition between successive levels of an image pyramid (Burt and Adelson, 1983; Lindeberg, 2008). Briefly, larger factors (like 3) could result in excessive loss of detail between levels, while smaller factors (like 1.5) would not reduce the image size enough to justify additional levels of computation (that would come with the structural cost of having more grid cell modules in the brain). We have clarified these points in the Discussion section.
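The link between a change in resolution and the resulting change in grid spacing, as used in this reply, can be written out in one line. The symbols $\rho$ (resolution, counted here as fields per unit area), $s$ (grid spacing), and $k$ (resolution factor between two modules) are notation assumed for this illustration and do not come from the manuscript:

```latex
\rho \propto \frac{1}{s^{2}}
\quad\Longrightarrow\quad
\frac{s_{1}}{s_{2}} = \sqrt{\frac{\rho_{2}}{\rho_{1}}} = \sqrt{k},
\qquad
k = 2 \;\Rightarrow\; \frac{s_{1}}{s_{2}} = \sqrt{2} \approx 1.41,
\quad
k = 3 \;\Rightarrow\; \sqrt{3} \approx 1.73,
\quad
k = 1.5 \;\Rightarrow\; \sqrt{1.5} \approx 1.22 .
```

This only restates the square-root relation; it does not explain why $k = 2$ would be preferred over other factors, which is the point left open above.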
Reviewer #3 (Public Review):
The manuscript presents an intriguing explanation for why grid cell firing fields do not lie on a lattice whose axes are aligned to the walls of a square arena. This observation, by itself, merits the manuscript's dissemination to eLife's audience.
We thank this reviewer for their positive assessment.
The presentation is quirky (but keep the quirkiness!).
We kept the quirkiness.
But let me recast the problem presented by the authors as one of combinatorics. Given repeating, spatially separated firing fields across cells, one obtains temporal sequences of grid cells firing. Label these cells by integers from $[n]$. Any two cells firing in succession should uniquely identify one of six directions (from the hexagonal lattice) in which the agent is currently moving.
Now, take the symmetric group $\Sigma$ of cyclic permutations on $n$ elements. We ask whether there are cyclic permutations of $[n]$ such that
$$\left(\pi_{i+1} - \pi_i\right) \bmod n \neq \pm 1 \bmod n, \quad \forall i.$$
So, for instance, $(4,2,3,1)$ would not be counted as a valid permutation of $(1,2,3,4)$, as $(2,3)$ and $(1,4)$ are adjacent.
Furthermore, given $[n]$, are there two distinct cyclic permutations such that {\em no} adjacencies are preserved when considering any pair of permutations (among the triple of the original ordered sequence and the two permutations)? In other words, if we consider the permutation required to take the first permutation into the second, that permutation should not preserve any adjacencies.
{\bf Key question}: is there any difference between the solution to the combinatorics problem sketched above and the result in the manuscript? Specifically, the text argues that for $n=7$ there is only {\em one} solution.
Ideally, one would strive to obtain a closed-form solution for the number of such permutations as a function of $n$.
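For small $n$, the counting question above can be explored by brute force. The Python sketch below is one possible reading of the stated condition, not the construction used in the manuscript. It assumes zero-based labels $0,\dots,n-1$, wrap-around between the last and first positions, and rotations factored out by fixing the first label to 0. The function names are hypothetical.

```python
from itertools import permutations

def has_no_adjacent_neighbors(seq):
    """True if no two cyclically consecutive entries differ by +-1 mod n."""
    n = len(seq)
    for i in range(n):
        diff = (seq[(i + 1) % n] - seq[i]) % n
        if diff in (1, n - 1):
            return False
    return True

def count_valid_cyclic_orders(n):
    """Count orderings of 0..n-1 (first label fixed to 0) that satisfy the condition.

    Enumeration is factorial in n, so this is only practical for small n.
    """
    count = 0
    for rest in permutations(range(1, n)):
        if has_no_adjacent_neighbors((0,) + rest):
            count += 1
    return count

if __name__ == "__main__":
    for n in range(4, 9):
        print(n, count_valid_cyclic_orders(n))
```

Because the condition is symmetric in the sign of the difference, each valid order and its reversal are both counted, so the number of undirected arrangements is half the printed value. The second question in the comment (pairs of permutations that preserve no adjacencies) is not covered by this sketch.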
This is a great question! We currently have a student working on describing all possible arrangements of firing fields (essentially labelings of the hexagonal lattice) that satisfy the axioms in 2D, and we expect that results on the number of such arrangements will come out of his work. We plan to publish those results separately, possibly targeting a more mathematical audience.
The argument above appears to only apply in the case that every row (and every diagonal) contains all of the elements 1,...,n. However, when n is not prime, there are often arrangements where rows and/or diagonals do not contain every element from 1,...,n. For example, some admissible patterns with 9 neurons have a repeat length of 3 in all directions (horizontally and both diagonals). As a result the construction listed here will not give a full count of all possible arrangements.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
I think the concise style of mathematical proof is both a curse and a blessing. While it delivers the message, I think the fluency and readability of the mathematical proof could be improved with longer paragraphs and some more editing.
We have added some clarifications in the text that we hope improve the readability.
Reviewer #3 (Recommendations For The Authors):
A minor qualm I have with the nomenclature:
On page 7:
“To prove this statement, suppose that row A consists of units $1, \dots , k$ repeating in this order. Then any row that contains any unit from $1, \dots, k$ must contain the full repeat $1, \dots , k$ by axiom 1. So any row containing any unit from $1,\dots , k$ is a translation of row A, and any unit that does not contain them is disjoint from row A.”
The last use of `unit' at the end of this paragraph instead of `row' is confusing. Technically, the authors have given themselves license to use this term by defining a unit to be “either to a single cell or a cell assembly”. Yet modern algebra tends to use `unit' as meaning a ring element that has an inverse.
We have renamed “unit” to “element” to avoid confusion with the terminology in modern algebra.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable investigation provides new and solid evidence for a specific cognitive deficit in cerebellar degeneration patients. The authors use three tasks that modulate complexity and error presence to show specific slowing of reaction times in the presence of errors but not with task complexity. While the authors interpret these findings as indicating that the cerebellum is required for the processing of violations of expectations, the exact patterns of results may suggest alternative interpretations. Nonetheless, the work provides a new, invaluable data point in describing the cognitive contribution of cerebellar processing.
-
Joint Public Review:
Summary:
In this study, Daniel et al. used three cognitive tasks to investigate behavioural signatures of cerebellar degeneration. In the first two tasks, the authors found that if an equation was incorrect, reaction times slowed significantly more for cerebellar patients than for healthy controls. In comparison, the slowing in the reaction times when the task required more operations was comparable to normal controls. In the third task, the authors show increased errors in cerebellar patients when they had to judge whether a letter string corresponded to an artificial grammar.
Strengths:
Overall, the work is methodologically sound and the manuscript well written. The data do show some evidence for specific cognitive deficits in cerebellar degeneration patients.
Weaknesses:
The current version has some weaknesses in the visual presentation of results. Overall, the study lacks a more precise discussion on how the patterns of deficits relate to the hypothesized cerebellar function.
The reviewers and the editor agreed that the data are interesting and point to a specific cognitive deficit in cerebellar patients. However, in the discussion, we were somewhat confused about the interpretation of the result:
If the cerebellum (as proposed in the introduction) is involved in forming expectations in a cognitive task, should the patients not show problems both in the expected (1+3=4) and unexpected (1+3=2) conditions? Without having formed the correct expectation, how can you correctly say "yes" in the expected condition? No increase in error rate is observed, just slowing in the unexpected condition. If the patients make up for the lack of prediction by using some other strategy, why are they only slowing in the unexpected case?
If the cerebellum is NOT involved in making the prediction, but only involved in detecting the mismatch between predicted and real outcome, why would the patients not show specifically more errors in the unexpected condition?
-
Author response:
Joint Public Review:
Summary:
In this study, Daniel et al. used three cognitive tasks to investigate behavioral signatures of cerebellar degeneration. In the first two tasks, the authors found that if an equation was incorrect, reaction times slowed significantly more for cerebellar patients than for healthy controls. In comparison, the slowing in the reaction times when the task required more operations was comparable to normal controls. In the third task, the authors show increased errors in cerebellar patients when they had to judge whether a letter string corresponded to an artificial grammar.
Strengths:
Overall, the work is methodologically sound and the manuscript well written. The data do show some evidence for specific cognitive deficits in cerebellar degeneration patients.
Thank you for the thoughtful summary and constructive feedback. We are pleased that the methodological rigor and clarity of the manuscript were appreciated, and that the data were recognized as providing meaningful evidence regarding cognitive deficits in cerebellar degeneration.
Weaknesses:
The current version has some weaknesses in the visual presentation of results. Overall, the study lacks a more precise discussion on how the patterns of deficits relate to the hypothesized cerebellar function. The reviewers and the editor agreed that the data are interesting and point to a specific cognitive deficit in cerebellar patients. However, in the discussion, we were somewhat confused about the interpretation of the result: If the cerebellum (as proposed in the introduction) is involved in forming expectations in a cognitive task, should the patients not show problems both in the expected (1+3=4) and unexpected (1+3=2) conditions? Without having formed the correct expectation, how can you correctly say "yes" in the expected condition? No increase in error rate is observed, just slowing in the unexpected condition. If the patients make up for the lack of prediction by using some other strategy, why are they only slowing in the unexpected case? If the cerebellum is NOT involved in making the prediction, but only involved in detecting the mismatch between predicted and real outcome, why would the patients not show specifically more errors in the unexpected condition?
Thank you for asking these important questions and initiating an interesting discussion. While decision errors and processing efficiency are not fully orthogonal and are likely related, they are not necessarily the same internal construct. The data from Experiments 1 and 2 suggest impaired processing efficiency rather than increased decision error. Reaction time slowing without increased error rates suggests that the CA group can form expectations but respond more slowly, possibly due to reduced processing efficiency. Thus, this analysis of our data can indicate that the cerebellum is not essential for forming expectations, but it plays a critical role in processing their violations.
Relatedly, two important questions remain open in the literature concerning the cerebellum’s role in expectation-related processes. The first is whether the cerebellum contributes to the formation of expectations or the processing of their violations. In Experiments 1 and 2, the CA group did not show impairments in the complexity manipulation. As mentioned by the editors, solving these problems requires the formation of expectations during the reasoning process. Given the intact performance of the CA group, these results suggest that they are not impaired in forming expectations. However, in both Experiments 1 and 2, patients exhibited selective impairments in solving incorrect problems compared to correct problems. Since expectation formation is required in both conditions, but only incorrect problems involve a violation of expectation (VE), we hypothesize that the cerebellum is involved in VE processes. We suggest that the CA group can form expectations in familiar tasks, but are impaired in processing unexpected compared to expected outcomes. This supports the notion that the cerebellum contributes to VE, rather than to forming expectations.
Importantly, while previous experimental manipulations(1–6) have provided important insights, some may have confounded these two internal constructs due to task design limitations (e.g., lack of baseline conditions). Notably, some of these previous studies did not include control conditions (e.g., correct trials) where there was no VE. In addition, other studies did not include a control measure (e.g., complexity effect), which limits their ability to infer the specific cerebellar role in expectation manipulation.
In addition to the editors’ question, we would like to raise a second important question regarding cerebellar contributions to expectation-related processes. While our findings point to a cerebellar role in VE processes in sequential tasks that is both unique and consistent, we do not aim to generalize this role to all forms of expectations(2,7,8). Another interesting process is how expectations are formed. Expectations can be formed by different processes(2,7,8), and this should be taken into account when defining cerebellar function. For instance, previous experimental paradigms(1–6), aiming to assess VE, utilized tasks that manipulated rule-based errors or probability-based errors, but did not fully dissociate these constructs. In our Experiments 1 and 2, we specifically manipulated error signals derived from previous top-down effects. However, in Experiment 3, the participant’s VE was derived from within-task processes. In Experiment 3, expectations were formed either by statistical learning or by rule-based learning. During the test stage, when evaluating sensitivity to correct and incorrect problems, the CA group showed deficits only when expectations were formed based on rules. These findings suggest that cerebellar patients may retain a general ability to form expectations. However, their deficit appears to be specific to processing rule-based VE, but not statistically derived VE. This pattern of results aligns with the results of Experiments 1 and 2 where the rules are known and based on pre-task knowledge.
We suggest that these two key questions are relevant to both motor and non-motor domains and were not fully addressed even in the previous, well-studied motor domain. Thus, the current experimental design used in three different experiments provides a valuable novel experimental perspective, allowing us to distinguish between some, but not all, of the processes involved in the formation of expectations and their violations. For instance, to our knowledge, this is the first study to demonstrate a selective impairment in rule-based VE processing in cerebellar patients across both numerical reasoning and artificial grammar tasks.
If feasible, we propose that future studies should disentangle different forms of VE by operationalizing them in experimental tasks in an orthogonal manner. This will allow us, as a scientific community, to achieve a more detailed, well-defined cerebellar motor and non-motor mechanistic account.
References
(1) Butcher, P. A. et al. The cerebellum does more than sensory prediction error-based learning in sensorimotor adaptation tasks. J. Neurophysiol. 118, 1622–1636 (2017).
(2) Moberget, T., Gullesen, E. H., Andersson, S., Ivry, R. B. & Endestad, T. Generalized role for the cerebellum in encoding internal models: Evidence from semantic processing. J. Neurosci. 34, 2871–2878 (2014).
(3) Riva, D. The cerebellar contribution to language and sequential functions: evidence from a child with cerebellitis. Cortex. 34, 279–287 (1998).
(4) Sokolov, A. A., Miall, R. C. & Ivry, R. B. The Cerebellum: Adaptive Prediction for Movement and Cognition. Trends Cogn. Sci. 21, 313–332 (2017).
(5) Fiez, J. A., Petersen, S. E., Cheney, M. K. & Raichle, M. E. Impaired non-motor learning and error detection associated with cerebellar damage. A single case study. Brain 115 Pt 1, 155–178 (1992).
(6) Taylor, J. A., Krakauer, J. W. & Ivry, R. B. Explicit and Implicit Contributions to Learning in a Sensorimotor Adaptation Task. J. Neurosci. 34, 3023–3032 (2014).
(7) Sokolov, A. A., Miall, R. C. & Ivry, R. B. The Cerebellum: Adaptive Prediction for Movement and Cognition. Trends Cogn. Sci. 21, 313–332 (2017).
(8) Fiez, J. A., Petersen, S. E., Cheney, M. K. & Raichle, M. E. Impaired non-motor learning and error detection associated with cerebellar damage. A single case study. Brain 115, 155–178 (1992).
(9) Picciotto, Y. De, Algon, A. L., Amit, I., Vakil, E. & Saban, W. Large-scale evidence for the validity of remote MoCA administration among people with cerebellar ataxia. Clin. Neuropsychol. 0, 1–17 (2024).
(10) Binoy, S., Monstaser-Kouhsari, L., Ponger, P. & Saban, W. Remote Assessment of Cognition in Parkinson's Disease and Cerebellar Ataxia: The MoCA Test in English and Hebrew. Front. Hum. Neurosci. 17, (2023).
(11) Saban, W. & Ivry, R. B. Pont: A protocol for online neuropsychological testing. J. Cogn. Neurosci. 33, 2413–2425 (2021).
(12) Algon, A. L. et al. Scale for the assessment and rating of ataxia: a live e-version. J. Neurol. (2025). doi:10.1007/s00415-025-13071-7
(13) McDougle, S. D. et al. Continuous manipulation of mental representations is compromised in cerebellar degeneration. Brain 145, 4246–4263 (2022).
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study uses an innovative task design combined with eye tracking and fMRI to distinguish brain regions that encode the value of individual items from those that accumulate those values for value-based choices. It shows that distinct brain regions carry signals for currently evaluated and previously accumulated evidence. The study provides solid evidence in support of most of its claims, albeit with current minor weaknesses concerning the evidence in favour of gaze-modulation of the fMRI signal. The work will be of interest to neuroscientists working on attention and decision-making.
-
Reviewer #1 (Public review):
Summary:
This study builds upon a major theoretical account of value-based choice, the 'attentional drift diffusion model' (aDDM), and examines whether and how this might be implemented in the human brain using functional magnetic resonance imaging (fMRI). The aDDM states that the process of internal evidence accumulation across time should be weighted by the decision maker's gaze, with more weight being assigned to the currently fixated item. The present study aims to test whether there are (a) regions of the brain where signals related to the currently presented value are affected by the participant's gaze; (b) regions of the brain where previously accumulated information is weighted by gaze.
To examine this, the authors developed a novel paradigm that allowed them to dissociate currently and previously presented evidence, at a timescale amenable to measuring neural responses with fMRI. They asked participants to choose between bundles or 'lotteries' of food items, which they revealed sequentially and slowly to the participant across time. This allowed modelling of the haemodynamic response to each new observation in the lottery, separately for previously accumulated and currently presented evidence.
Using this approach, they find that regions of the brain supporting valuation (vmPFC and ventral striatum) have responses reflecting gaze-weighted valuation of the currently presented item, whereas regions previously associated with evidence accumulation (preSMA and IPS) have responses reflecting gaze-weighted modulation of previously accumulated evidence.
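The gaze-weighting idea summarised above can be illustrated with a minimal Python sketch of a noise-free aDDM-style update. It assumes the commonly used formulation in which the value of the unfixated item is discounted by a factor `theta` between 0 and 1. The parameter values and function names are illustrative assumptions, not the models or estimates used in the study, and the Gaussian noise term of the full model is omitted so that the gaze effect is easy to see.

```python
def addm_increment(value_left, value_right, fixating_left, d=0.002, theta=0.3):
    """Mean change in the relative decision value (left minus right) for one step.

    d is the drift scaling and theta discounts the unfixated item;
    both values are illustrative, not fitted parameters.
    """
    if fixating_left:
        return d * (value_left - theta * value_right)
    return d * (theta * value_left - value_right)

def accumulate(value_left, value_right, fixation_sequence, d=0.002, theta=0.3):
    """Sum the noise-free increments over a sequence of fixations.

    fixation_sequence holds booleans: True while the left item is fixated.
    """
    rdv = 0.0
    for fixating_left in fixation_sequence:
        rdv += addm_increment(value_left, value_right, fixating_left, d, theta)
    return rdv

if __name__ == "__main__":
    # Two equally valued items: only gaze allocation separates the outcomes.
    print(accumulate(5.0, 5.0, [True] * 10))   # evidence drifts toward left
    print(accumulate(5.0, 5.0, [False] * 10))  # evidence drifts toward right
```

With `theta=1.0` the two branches coincide and gaze has no effect, which is the baseline against which a gaze-weighted signal would need to be distinguished.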
Strengths:
A major strength of the current paper is the design of the task, nicely allowing the researchers to examine evidence accumulation across time despite using a technique with poor temporal resolution. The dissociation between currently presented and previously accumulated evidence in different brain regions in GLM1 (before gaze-weighting), as presented in Figure 5, is already compelling. The result that regions such as preSMA respond positively to |AV| (absolute difference in accumulated value) is particularly interesting, as it would seem that the 'decision conflict' account of this region's activity might predict the exact opposite result. Additionally, the behaviour has been well modelled at the end of the paper when examining temporal weighting functions across the multiple samples.
Weaknesses:
The results relating to gaze-weighting in the fMRI signal could do with some further explication to become more complete. A major concern with GLM2, which looks at the same effects as GLM1 but now with gaze-weighting, is that these gaze-weighted regressors may be (at least partially) correlated with their non-gaze-weighted counterparts (e.g., SVgaze will correlate with SV). But the non-gaze-weighted regressors have been excluded from this model. In other words, the authors are not testing for effects of gaze-weighting of value signals *over and above* the base effects of value in this model. In my mind, this means that the GLM2 results could simply be a replication of the findings from GLM1 at present. GLM3 is potentially a stronger test, as it includes the value signals and the interaction with gaze in the same model. But here, while the link to the currently attended item is quite clear (and a replication of Lim et al, 2011), the link to previously accumulated evidence is a bit contorted, depending upon the interpretation of a behavioural regression to interpret the fMRI evidence. The results from GLM3 are also, by the authors' own admission, marginal in places.
-
Reviewer #2 (Public review):
Summary:
In this paper, the authors seek to disentangle brain areas that encode the subjective value of individual stimuli/items (input regions) from those that accumulate those values into decision variables (integrators) for value-based choice. The authors used a novel task in which stimulus presentation was slowed down to ensure that such a dissociation was possible using fMRI despite its relatively low temporal resolution. In addition, the authors leveraged the fact that gaze increases item value, providing a means of distinguishing brain regions that encode decision variables from those that encode other quantities such as conflict or time-on-task. The authors adopt a region-of-interest approach based on an extensive previous literature and found that the ventral striatum and vmPFC correlated with the item values and not their accumulation, whereas the pre-SMA, IPS, and dlPFC correlated more strongly with their accumulation. Further analysis revealed that the pre-SMA was the only one of the three integrator regions to also exhibit gaze modulation.
Strengths:
The study uses a highly innovative design and addresses an important and timely topic. The manuscript is well-written and engaging, while the data analysis appears highly rigorous.
Weaknesses:
With 23 subjects, the study has relatively low statistical power for fMRI.
-
Author response:
eLife Assessment
This important study uses an innovative task design combined with eye tracking and fMRI to distinguish brain regions that encode the value of individual items from those that accumulate those values for value-based choices. It shows that distinct brain regions carry signals for currently evaluated and previously accumulated evidence. The study provides solid evidence in support of most of its claims, albeit with current minor weaknesses concerning the evidence in favour of gaze-modulation of the fMRI signal. The work will be of interest to neuroscientists working on attention and decision-making.
We thank the Editor and Reviewers for their summary of the strengths of our study, and for their thoughtful review and feedback on our manuscript. We plan to undertake some additional analyses suggested by the Reviewers to bolster the evidence in favor of gaze-modulation of the fMRI signal.
Reviewer #1 (Public review):
Summary:
This study builds upon a major theoretical account of value-based choice, the 'attentional drift diffusion model' (aDDM), and examines whether and how this might be implemented in the human brain using functional magnetic resonance imaging (fMRI). The aDDM states that the process of internal evidence accumulation across time should be weighted by the decision maker's gaze, with more weight being assigned to the currently fixated item. The present study aims to test whether there are (a) regions of the brain where signals related to the currently presented value are affected by the participant's gaze; (b) regions of the brain where previously accumulated information is weighted by gaze.
To examine this, the authors developed a novel paradigm that allowed them to dissociate currently and previously presented evidence, at a timescale amenable to measuring neural responses with fMRI. They asked participants to choose between bundles or 'lotteries' of food items, which they revealed sequentially and slowly to the participant across time. This allowed modelling of the haemodynamic response to each new observation in the lottery, separately for previously accumulated and currently presented evidence.
Using this approach, they find that regions of the brain supporting valuation (vmPFC and ventral striatum) have responses reflecting gaze-weighted valuation of the currently presented item, whereas regions previously associated with evidence accumulation (preSMA and IPS) have responses reflecting gaze-weighted modulation of previously accumulated evidence.
Strengths:
A major strength of the current paper is the design of the task, nicely allowing the researchers to examine evidence accumulation across time despite using a technique with poor temporal resolution. The dissociation between currently presented and previously accumulated evidence in different brain regions in GLM1 (before gaze-weighting), as presented in Figure 5, is already compelling. The result that regions such as preSMA respond positively to |AV| (absolute difference in accumulated value) is particularly interesting, as it would seem that the 'decision conflict' account of this region's activity might predict the exact opposite result. Additionally, the behaviour has been well modelled at the end of the paper when examining temporal weighting functions across the multiple samples.
Thank you!
Weaknesses:
The results relating to gaze-weighting in the fMRI signal could do with some further explication to become more complete. A major concern with GLM2, which looks at the same effects as GLM1 but now with gaze-weighting, is that these gaze-weighted regressors may be (at least partially) correlated with their non-gaze-weighted counterparts (e.g., SVgaze will correlate with SV). But the non-gaze-weighted regressors have been excluded from this model. In other words, the authors are not testing for effects of gaze-weighting of value signals *over and above* the base effects of value in this model. In my mind, this means that the GLM2 results could simply be a replication of the findings from GLM1 at present. GLM3 is potentially a stronger test, as it includes the value signals and the interaction with gaze in the same model. But here, while the link to the currently attended item is quite clear (and a replication of Lim et al, 2011), the link to previously accumulated evidence is a bit contorted, depending upon the interpretation of a behavioural regression to interpret the fMRI evidence. The results from GLM3 are also, by the authors' own admission, marginal in places.
We thank the Reviewer for their thoughtful critique. We acknowledge that our formulation of GLM2 does not test for the effects of gaze-weighted value signals beyond the base effects of value, only in place of the base effects of value. In our revision, we plan to examine alternative ways of quantifying the relative importance of gaze in these results.
Reviewer #2 (Public review):
Summary:
In this paper, the authors seek to disentangle brain areas that encode the subjective value of individual stimuli/items (input regions) from those that accumulate those values into decision variables (integrators) for value-based choice. The authors used a novel task in which stimulus presentation was slowed down to ensure that such a dissociation was possible using fMRI despite its relatively low temporal resolution. In addition, the authors leveraged the fact that gaze increases item value, providing a means of distinguishing brain regions that encode decision variables from those that encode other quantities such as conflict or time-on-task. The authors adopt a region-of-interest approach based on an extensive previous literature and found that the ventral striatum and vmPFC correlated with the item values and not their accumulation, whereas the pre-SMA, IPS, and dlPFC correlated more strongly with their accumulation. Further analysis revealed that the pre-SMA was the only one of the three integrator regions to also exhibit gaze modulation.
Strengths:
The study uses a highly innovative design and addresses an important and timely topic. The manuscript is well-written and engaging, while the data analysis appears highly rigorous.
Weaknesses:
With 23 subjects, the study has relatively low statistical power for fMRI.
We thank the Reviewer for their comments on the strengths of the manuscript, and for highlighting an important limitation. We agree that the number of participants in the study, after exclusions, was lower than in a typical fMRI study. However, it is important to note that we do have a lot of data for each subject. Due to our relatively fast, event-related design, we have on average 65 trials per subject (SD = 18) and 5.95 samples per trial (SD = 4.03), for an average of 387 observations per subject (SD = 18). Our model-based analysis looks for very specific neural time courses across these ~387 observations, giving us substantial power to detect our effects of interest. Still, we acknowledge that our small number of subjects limits our power and our ability to generalize to other subjects. We plan to add the following disclaimer to the Discussion section:
“Together with our limited sample size (n = 23), we may not have had adequate statistical power required to observe consistent effects. Additional research with larger sample sizes is needed to resolve this issue.”
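The per-subject observation count quoted in the response can be checked with a short calculation. The sketch below simply multiplies the two reported means; treating the product of means as the mean number of observations is an approximation that assumes trial count and samples per trial are roughly independent across subjects. The variable and function names are hypothetical.

```python
# Reported means from the author response (SDs are not used here).
MEAN_TRIALS_PER_SUBJECT = 65
MEAN_SAMPLES_PER_TRIAL = 5.95

def approx_observations(trials, samples_per_trial):
    """Approximate observations per subject as trials x samples per trial."""
    return trials * samples_per_trial

observations = approx_observations(MEAN_TRIALS_PER_SUBJECT, MEAN_SAMPLES_PER_TRIAL)

if __name__ == "__main__":
    print(f"approximately {observations:.2f} observations per subject")
```

The product is close to the ~387 observations per subject stated in the response.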
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides important insights into the CO₂-dependent activation of Cx43 hemichannels through a well-defined carbamylation motif, supported by multiple independent assays and validation in hippocampal tissue. The evidence convincingly demonstrates that increased pCO₂ enhances Cx43 hemichannel activity, which has potential implications for cellular signaling in cardiomyocytes and astrocytes. While further investigation is needed to fully elucidate the structural mechanisms, the findings offer a foundation for future research in gap junction biology and CO₂ regulation of proteins.
-
Reviewer #1 (Public review):
Summary:
This study builds on previous work demonstrating that several beta connexins (Cx26, Cx30, and Cx32) have a carbamylation motif which renders them sensitive to CO2. In response to CO2, hemichannels composed of these connexins open, enabling diffusion of small molecules (such as ATP) between the cytosol and extracellular environment. Here, the authors have identified that an alpha connexin, Cx43, also contains a carbamylation motif, and they demonstrate that CO2 opens Cx43 hemichannels. Most of the study involves using transfected cells expressing wild-type and mutant Cx43 to define amino acids required for CO2 sensitivity. Hippocampal tissue slices in culture were used to show that CO2-induced synaptic transmission was affected by Cx43 hemichannels, providing a physiological context. The authors point out that the Cx43 gene significantly diverges from the beta connexins that are CO2 sensitive, suggesting that the conserved carbamylation motif was present before the alpha and beta connexin genes diverged.
Strengths:
(1) The molecular analysis defining the amino acids that contribute to the CO2 sensitivity of Cx43 is a major strength of the study. The rigor of analysis was strengthened by using three independent assays for hemichannel opening: dye uptake, patch clamp channel measurements, and ATP secretion. The resulting analysis identified key lysines in Cx43 that were required for CO2-mediated hemichannel opening. A double K-to-E Cx43 mutant yielded hemichannels that were constitutively open, which further strengthened the analysis.
(2) Using hippocampal tissue sections to demonstrate that CO2 can influence field excitatory postsynaptic potentials (fEPSPs) provides a native context for CO2 regulation of Cx43 hemichannels. Cx43 mutations associated with Oculodentodigital Dysplasia (ODDD) inhibited CO2-induced hemichannel opening, although the mechanism by which this occurs was not elucidated.
Weaknesses:
(1) Cx43 channels are sensitive to cytosolic pH, which will be affected by CO2. Cytosolic pH was not measured, and how this affects CO2-induced Cx43 hemichannel activity was not addressed.
(2) Cultured cells are typically grown in incubators containing 5% CO2, which is ~40 mmHg. It is unclear how cells would be viable if Cx43 hemichannels are open at this PCO2.
(3) Experiments using Gap26 to inhibit Cx43 hemichannels in fEPSP measurements used a scrambled peptide as a control. Analysis should also include Gap peptides specifically targeting Cx26, Cx30, and Cx32 as additional controls.
(4) The mechanism by which ODDD mutations impair CO2-mediated hemichannel opening was not addressed. Also, the potential roles for inhibiting Cx43 hemichannels in the pathology of ODDD are unclear.
(5) CO2 has no effect on Cx43-mediated gap junctional communication as opposed to Cx26 gap junctions, which are inhibited by CO2. The molecular basis for this difference was not determined.
(6) Whether there are other non-beta connexins that have a putative carbamylation motif was not addressed. Additional discussion/analysis of how the evolutionary trajectory for Cx43 maintaining a carbamylation motif is unique for non-beta connexins would strengthen the study.
-
Reviewer #2 (Public review):
Summary:
This paper examines the CO2 sensitivity of Cx43 hemichannels and gap junctional channels in transiently transfected HeLa cells using several different assays, including ethidium dye uptake, ATP release, whole cell patch clamp recordings, and an imaging assay of gap junctional dye transfer. The results show that raising pCO2 from 20 to 70 mmHg (at a constant pH of 7.3) causes an increase in opening of Cx43 hemichannels but does not block Cx43 gap junctions. This study also showed that raising pCO2 from 20 to 35 mmHg resulted in an increase in synaptic strength in hippocampal rat brain slices, presumably due to downstream ATP release, suggesting that the CO2 sensitivity of Cx43 may be physiologically relevant. As a further test of the physiological relevance of the CO2 sensitivity of Cx43, it was shown that two pathological mutations of Cx43 that are associated with ODDD caused loss of Cx43 CO2-sensitivity. Cx43 has a potential carbamylation motif that is homologous to the motif in Cx26. To understand the structural changes involved in CO2 sensitivity, a number of mutations were made in Cx43 sites thought to be the equivalent of those known to be involved in the CO2 sensitivity of Cx26, and the CO2 sensitivity of these mutants was investigated.
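The phrase "at a constant pH of 7.3" implies that bicarbonate must rise in proportion to pCO2. A minimal sketch of that arithmetic follows, using the Henderson-Hasselbalch relation. The pKa (6.1) and CO2 solubility (0.03 mmol/L per mmHg) are assumed textbook values for 37 °C, not values from the manuscript, and the function name is hypothetical.

```python
# Assumed textbook constants for the CO2/bicarbonate buffer at 37 °C.
# They are illustrative and are not taken from the manuscript under review.
PKA = 6.1               # apparent pKa of the CO2/HCO3- system
CO2_SOLUBILITY = 0.03   # dissolved CO2 in mmol/L per mmHg of PCO2


def bicarbonate_for_ph(pco2_mmhg: float, ph: float) -> float:
    """Return the [HCO3-] (mM) that gives `ph` at the given PCO2.

    Rearranged Henderson-Hasselbalch:
        pH = pKa + log10([HCO3-] / (solubility * PCO2))
    """
    dissolved_co2 = CO2_SOLUBILITY * pco2_mmhg
    return dissolved_co2 * 10 ** (ph - PKA)


if __name__ == "__main__":
    # PCO2 levels mentioned in the reviews: 20, 35, 40 and 70 mmHg.
    for pco2 in (20, 35, 40, 70):
        hco3 = bicarbonate_for_ph(pco2, ph=7.3)
        print(f"PCO2 {pco2:>2} mmHg -> [HCO3-] = {hco3:.1f} mM")
```

Under these assumptions the required bicarbonate scales linearly with pCO2, so going from 20 to 70 mmHg at fixed pH means a 3.5-fold increase in bicarbonate. The solutions actually used in the study may differ.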
Strengths:
This study shows that the apparent lack of functional Cx43 hemichannels observed in a number of previous in vitro function studies may be due to the use of HEPES to buffer the external pH. When Cx43 hemichannels were studied in external solutions in which CO2/bicarbonate was used to buffer pH instead of HEPES, Cx43 hemichannels showed significantly higher levels of dye uptake, ATP release, and ionic conductance. These findings may have major physiological implications since Cx43 hemichannels are found in many organs throughout the body, including the brain, heart, and immune system.
Weaknesses:
(1) Interpretation of the site-directed mutation studies is complicated. Although Cx43 has a potential carbamylation motif that is homologous to the motif in Cx26, the results of site-directed mutation studies were inconsistent with a simple model in which K144 and K105 interact following carbamylation to cause the opening of Cx43 hemichannels.
(2) Secondly, although it is shown that two Cx43 ODDD-associated mutations show a loss of CO2 sensitivity, there is no evidence that the absence of CO2 sensitivity is involved in the pathology of ODDD.
-
Reviewer #3 (Public review):
In this paper, the authors aimed to investigate carbamylation effects on the function of Cx43-based hemichannels. Such effects have previously been characterized for other connexins, e.g., for Cx26, which display increased hemichannel (HC) opening and closure of gap junction channels upon exposure to increased CO2 partial pressure (accompanied by increased bicarbonate to keep pH constant).
The authors used HeLa cells transiently transfected with Cx43 to investigate CO2-dependent carbamylation effects on Cx43 HC function. In contrast to Cx43-based gap junction channels that are reported here to be insensitive to PCO2 alterations, they provide evidence that Cx43 HC opening is highly dependent on the PCO2 pressure in the bath solution, over a range of 20 up to 70 mmHg encompassing the physiologically normal resting level of around 40 mmHg. They furthermore identified several Cx43 residues involved in Cx43 HC sensitivity to PCO2: K105, K109, K144 & K234; mutation of 2 or more of these AAs is necessary to abolish CO2 sensitivity. The subject is interesting and the results indicate that a fraction of HCs is open at a physiological 40 mmHg PCO2, which differs from the situation under HEPES buffered solutions where HCs are mostly closed under resting conditions. The mechanism of HC opening with CO2 gassing is linked to carbamylation, and the authors pinpointed several Lys residues involved in this process.
Overall, the work is interesting as it shows that Cx43 HCs have a significant open probability under resting conditions of physiological levels of CO2 gassing, probably applicable to the brain, heart, and other Cx43 expressing organs. The paper gives a detailed account of various experiments performed (dye uptake, electrophysiology, ATP release to assess HC function) and results concluded from those. They further consider many candidate carbamylation sites by mutating them to negatively charged Glu residues. The paper ends with hippocampal slice work showing evidence for connexin-dependent increases of the EPSP amplitude that could be inhibited by HC inhibition with Gap26 (Figure 10). Another line of evidence comes from the Cx43-linked ODDD genetic disease, whereby L90V as well as the A44V mutations of Cx43 prevented the CO2-induced hemichannel opening response (Figure 11). Although the paper is interesting, in its present state, it suffers from (i) a problematic Figure 3, precluding interpretation of the data shown, and (ii) the poor use of hemichannel inhibitors that are necessary to strengthen the evidence in the crucial experiment of Figure 2 and others.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study presents a valuable finding that KDM5 inhibition activates the interferon response and antigen presentation genes in breast cancer cells through R-loops. The evidence supporting the claims of the authors is solid, although the inclusion of further in vivo studies displaying the effects of KDM5 inhibitors on the immunotherapy responses of breast tumors would have strengthened the study. The work will be of interest to scientists working in the field of breast cancer immunotherapy.
-
Reviewer #1 (Public review):
Summary:
In this manuscript, Lau et al reported that KDM5 inhibition in luminal breast cancer cells results in R-loop-mediated DNA damage, reduced cell fitness and an increase in ISG and AP signatures as well as cell surface Major Histocompatibility Complex (MHC) class I, mediated by RNA:DNA hybrid activation of the CGAS/STING pathway.
Strengths:
Importantly, they have shown that KDM5 inhibition does not result in DNA damage or activation of the CGAS/STING pathway in normal breast epithelial cells. This suggests that KDM5 inhibitors may enable a wide therapeutic window in this setting, as compared to STING agonists or Type I Interferons. Their findings provide new insights into the interplay between epigenetic regulation of genomic repeats, R-loop formation, innate immunity, and cell fitness in the context of cancer evolution and therapeutic vulnerability.
Weaknesses:
More thorough analyses would be appreciated.
-
Reviewer #2 (Public review):
Summary:
In this manuscript, the authors investigated how the type-I interferon response (ISG) and antigen presentation (AP) pathways are repressed in luminal breast cancer cells and how this repression can be overcome. They found that a STING agonist can reactivate these pathways in breast cancer cells, but it also does so in normal cells, suggesting that this is not a good way to create a therapeutic window. Depletion of ADAR and inhibition of KDM5 also activate ISG and AP genes. The activation of ISG and AP genes is dependent on cGAS/STING and the JAK kinase. Interestingly, although both ADAR depletion and KDM5 inhibition activate ISG and AP genes, their effects on cell fitness are different. Furthermore, KDM5 inhibitor selectively activates ISG and AP genes in tumor cells but not normal cells, arguing that it may create a larger therapeutic window than the STING agonist. These results also suggest that KDM5 inhibition may activate ISG and AP genes in a way different from ADAR loss, and this process may affect tumor cell fitness independently of the activation of ISG and AP genes.
The authors further showed that KDM5 inhibition increases R-loops and DNA damage in tumor cells, and XPF, a nuclease that cuts R-loops, is required for the activation of ISG and AP genes. Using H3K4me3 CUT&RUN, they found that KDM5 inhibition results in increased H3K4me3 not only at genes, but also at repetitive elements including SINE, LINE, LTR, telomeres, and centromeres. Using S9.6 CUT&TAG, they confirmed that R-loops are increased at SINE, LINE, and LTR repeats with increased H3K4me3. Together, the results of this study suggest that KDM5 inhibition leads to H3K4me3 and R-loop accumulation in repetitive elements, which induces DNA damage and cGAS/STING activation and subsequently activates AP genes. This provides an exciting approach to stimulate the anti-tumor immunity against breast tumors.
KDM5 inhibition activates interferon and antigen presentation genes through R-loops.
Strengths:
Overall, this study was carefully designed and executed. This is a new approach to make breast tumors "hot" for anti-tumor immunity.
Weaknesses:
Future in vivo studies are needed to show the effects of KDM5 inhibitors on the immunotherapy responses of breast tumors.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study shows that nutrient stress emanating from the microenvironment induces metabolic vulnerabilities in pancreatic ductal adenocarcinoma (PDAC). Using a combination of cell-based and mouse models, the authors provide compelling evidence showing that arginine restriction in the microenvironment disrupts lipid homeostasis in PDAC, resulting in the induction of ferroptosis upon exposure of tumors to polyunsaturated fatty acids. This report is likely to be of broad interest to researchers interested in studying cancer biology, tumor microenvironment, metabolism, and stress adaptation mechanisms.
-
Reviewer #1 (Public review):
Summary:
In this study, the authors set out to define how arginine availability regulates lipid metabolism and to explore the implications of this relationship in pancreatic ductal adenocarcinoma (PDAC), a tumor type known to exist in an arginine-poor microenvironment. Using a combination of rigorous genetic and metabolomic approaches, they uncover a previously underappreciated role for arginine in maintaining lipid homeostasis. Importantly, they demonstrate that arginine deprivation sensitizes PDAC cells to ferroptosis through lipidome perturbations, which can be exploited therapeutically via co-treatment with aESA and ferroptosis inducers (FINs). These findings have meaningful implications for the field. They not only shed light on the metabolic vulnerabilities created by nutrient restriction in PDAC, but also suggest a practical avenue for combination therapies that exploit ferroptosis sensitivity. This is particularly relevant in the context of pancreatic cancer, which is notoriously resistant to conventional treatments. The methods employed are broadly applicable to other nutrient-stress contexts and may inspire similar investigations in other solid tumor types.
Strengths:
One of the major strengths of the study is the use of complementary and well-controlled approaches, including metabolomic profiling, genetic perturbations, and in vivo models, to support the central hypothesis. The experiments are thoughtfully designed and clearly presented, and the conclusions are, for the most part, well supported by the data. The findings provide mechanistic insight into nutrient-lipid crosstalk and identify a potential therapeutic strategy for targeting arginine-deprived tumors.
Weaknesses:
A key weakness of the study lies in the mechanistic connection between arginine levels and SREBP1 activation. While the authors show that arginine restriction leads to reduced SREBP1 expression, the magnitude of this effect appears modest relative to the substantial changes observed in the lipidome. The study would benefit from a deeper analysis of SREBP1 regulation, particularly whether nuclear translocation or activation is affected. This could be addressed by examining the nuclear pool of SREBP1, using either subcellular fractionation or improved immunofluorescence imaging in both cell lines and tissue samples.
Another area where additional context would strengthen the manuscript is in the transcriptomic profiling of PDAC cells cultured in a tumor interstitial fluid mimic (TIFM). While the study emphasizes lipid-related pathways, highlighting the most significantly upregulated and downregulated pathways in Figure 1B would give readers a broader perspective on how arginine restriction reprograms the PDAC transcriptome. For instance, because polyamines are downstream of arginine and are known to influence lipid metabolism, it would be worth discussing whether these metabolites contribute to the phenotypes observed. Similarly, an evaluation of whether Dgat1/2 expression is altered could help delineate the full scope of lipid metabolic rewiring.
Finally, it is worth noting that the KPC mouse model used in this study is based on conditional deletion of p53, which leads to faster-growing tumors and a distinct tumor microenvironment compared to models harboring the p53^R172H point mutation. Including a brief discussion of this distinction would help readers contextualize the translational relevance of the findings.
-
Reviewer #2 (Public review):
This study by Jonker et al. examines how the metabolic adaptations to the microenvironment by pancreatic ductal adenocarcinomas (PDAC) present vulnerabilities that could be used for therapeutic purposes. The evidence supporting the claims of the authors is mostly solid, and the multiplicity of models used, as well as the combination of in vitro and in vivo work, are appreciated, but some conclusions would benefit from additional substantiation. This work would be of interest to biologists working on the impact of microenvironment and metabolism in cancer, and especially those investigating pancreatic cancer.
In this study, the authors use mostly "doublings per day" as an indicator of cell death, notably for Figures 4 to 6. However, proliferative arrest (or a decrease in the proliferative rate) is not necessarily synonymous with cell death. It might be nice to complement these experiments with a true measure of cell death (e.g., PI uptake).
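The reviewer's point can be illustrated with a short sketch: under a simple exponential model, very different combinations of division and death give the same net doublings per day. The rates, starting count, and function names below are hypothetical and are not taken from the manuscript.

```python
import math


def doublings_per_day(n_start: float, n_end: float, days: float) -> float:
    """Net population doublings per day from two live-cell counts."""
    return math.log2(n_end / n_start) / days


def simulate_count(n_start: float, division_rate: float,
                   death_rate: float, days: float) -> float:
    """Live-cell count under exponential growth with constant per-day rates.

    This is an assumed toy model: N(t) = N0 * exp((division - death) * t).
    """
    return n_start * math.exp((division_rate - death_rate) * days)


if __name__ == "__main__":
    n0, days = 1e5, 3.0
    # Scenario A: slow division, no cell death (proliferative slowdown).
    slow_no_death = simulate_count(n0, division_rate=0.35, death_rate=0.0, days=days)
    # Scenario B: fast division offset by substantial cell death.
    fast_with_death = simulate_count(n0, division_rate=0.70, death_rate=0.35, days=days)

    print("A:", doublings_per_day(n0, slow_no_death, days))
    print("B:", doublings_per_day(n0, fast_with_death, days))
```

Both scenarios yield the same doublings per day because the metric depends only on the difference between the division and death rates. This is why a direct death readout such as PI uptake is needed to tell them apart.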
The composition of Tumor Interstitial Fluid Medium (TIFM) was published previously, but nonetheless a reminder of the composition of this medium in a Supplemental file of this study might be helpful. In particular, at the start of the Results section, the nature of serum/lipids in the different media should be specifically noted, especially given that the subsequent focus of the work is on lipids/SREBP. It is known that differences in the extracellular availability of lipids can profoundly alter de novo lipid biosynthesis pathways.
-
Reviewer #3 (Public review):
This important study investigates the impact of nutrient stress in the tumor microenvironment (TME), focusing on lipid metabolism in pancreatic ductal adenocarcinoma (PDAC).
Understanding TME composition is crucial, as it highlights cancer vulnerabilities independent of intracellular mutations, particularly because PDAC tumors are often exposed to limited nutrient availability due to reduced perfusion.
By utilizing a medium that mimics the nutrient conditions of PDAC tumors, the authors convincingly show that TME nutrient stress suppresses SREBP1, leading to reduced lipid synthesis, with low arginine levels identified as a key driver of this suppression. Importantly, mice with arginine-starved pancreatic tumors respond to a polyunsaturated fatty acid-rich diet. This discovery uncovers a synthetic lethal interaction in the tumor microenvironment that could be leveraged through dietary interventions.
The conclusions of this paper are mostly well supported by data; however, below are some aspects that could be further clarified.
This study uses PDAC cells from the LSL-Kras G12D/+ ; Trp53 ; Pdx-1-Cre PDAC model. The authors convincingly demonstrate that the cell-extrinsic stimulus of low arginine availability suppresses lipid synthesis and thus exerts a dominant effect over the cell-intrinsic oncogenic Ras mutation, which is known to enhance fatty acid synthesis. Could the effect of low arginine on lipid synthesis be specific for certain mutations in PDAC? It would be interesting to investigate or discuss whether different mutations show the same SREBP1 reduction caused by low arginine levels, and whether these low SREBP1 levels can be ameliorated by arginine re-supplementation. Here, Jonker et al. show that human PDAC cells cultured in TIFM have reduced SREBP1 levels (Figure 1 - Figure supplement 1C). It would be further supportive of their conclusions if the authors could show that arginine re-supplementation is sufficient to restore SREBP1 levels in human PDAC cells.
The authors demonstrate that mPDAC cells cultured in RPMI and subsequently implanted into an orthotopic mouse model exhibit reduced expression of SREBP target genes when compared to in vitro cultured mPDAC-RPMI cells. This finding is in line with the observation that culturing PDAC cells in TIFM downregulates SREBP target genes compared to PDAC cells cultured in RPMI. However, caution is needed when directly comparing mPDAC-RPMI cultured cells to those in the orthotopic model, as the latter may include non-tumor cells and additional factors that could confound the results. The authors should explicitly acknowledge this limitation in their study.
The in vivo evidence demonstrating that PUFA-rich tung oil reduces tumor size is compelling. However, the specific in vitro findings regarding its impact on doubling rates per day, particularly in the context of arginine-dependent PUFA supplementation, require further explanation. To enhance the robustness of their data and conclusions, the authors could consider conducting additional cell viability and proliferation assays. Moreover, it would be valuable to assess whether the observed effects on doubling rates per day remain significant after normalizing the data to the initial doubling time prior to PUFA supplementation. This is particularly important regarding the statement that "Addition of arginine significantly decreases sensitivity to a-ESA" as these cells already start with a higher doubling rate prior to a-ESA treatment.
Overall, this paper presents a compelling study that significantly enhances our understanding of the PDAC tumor microenvironment and its complex interactions with the tumor lipid metabolism.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
Using highly sophisticated switching linear dynamical systems (SLDS) analyses applied to functional MRI data, this study provides important insights into network dynamics underlying threat processing. After identifying distinct neural network states associated with varying levels of threat proximity, the paper provides compelling evidence of intrinsically and extrinsically driven contributions to these within-state dynamics and between-state transitions. Although the findings could be made more biologically meaningful, this work will be of interest to a wider functional neuroimaging and systems neuroscience community.
-
Reviewer #1 (Public review):
Summary:
The manuscript uses state-of-the-art analysis technology to document the spatio-temporal dynamics of brain activity during the processing of threats. The authors offer convincing evidence that complex spatio-temporal aspects of brain dynamics are essential to describe brain operations during threat processing.
Strengths:
Rigorous complex analyses well suited to the data.
Weaknesses:
Lack of a simple take-home message about discovery of a new brain operation.
Comments on revisions:
The authors have improved the presentation of the work. The overstatements about existing models of brain functions ignoring exogenous components remain largely unaddressed. The clarifications and improvements provided in revision confirm my assessment of the paper.
-
Reviewer #2 (Public review):
Summary:
This paper by Misra and Pessoa uses switching linear dynamical systems (SLDS) to investigate the neural network dynamics underlying threat processing at varying levels of proximity. Using an existing dataset from a threat-of-shock paradigm in which threat proximity is manipulated in a continuous fashion, the authors first show that they can identify states that each have their own linear dynamical system and are consistently associated with distinct phases of the threat-of-shock task (e.g., "peri-shock", "not near", etc). They then show how activity maps associated with these states are in agreement with existing literature on neural mechanisms of threat processing, and how activity in underlying brain regions alters around state transitions. The central novelty of the paper lies in its analyses of how intrinsic and extrinsic factors contribute to within-state trajectories and between-state transitions. Additional analyses furthermore show how individual brain regions contribute to state dynamics. Finally, the authors show how their findings generalize to another (related) threat paradigm.
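For readers unfamiliar with the formalism, a generic textbook SLDS can be sketched in a few lines. Each discrete state has its own linear dynamics, and the one-step update splits into an endogenous part (A x) and an exogenous, input-driven part (B u). Everything below is an illustrative assumption: the two-state, two-dimensional setup, all parameter values, and the function names are hypothetical and do not reproduce the authors' model or fitting procedure.

```python
import random

# Hypothetical parameters for a two-state SLDS with a 2-D latent variable.
# None of these numbers come from the manuscript; they only illustrate the form
#   x[t+1] = A[z] x[t] + B[z] u[t] + noise,   z[t+1] ~ P[z[t]]
A = {
    0: [[0.95, 0.05], [-0.05, 0.95]],  # state 0: slowly decaying rotation
    1: [[0.70, 0.00], [0.00, 0.70]],   # state 1: faster decay toward the origin
}
B = {0: [0.0, 0.0], 1: [1.0, 0.5]}     # per-state gain on the external input u
P = {0: [0.9, 0.1], 1: [0.2, 0.8]}     # rows: P(next state | current state)


def contributions(x, z, u):
    """Split the one-step update into endogenous (A x) and exogenous (B u) parts."""
    endogenous = [A[z][i][0] * x[0] + A[z][i][1] * x[1] for i in range(2)]
    exogenous = [B[z][i] * u for i in range(2)]
    return endogenous, exogenous


def simulate(inputs, seed=0, noise_sd=0.05):
    """Return a list of (state, latent vector) pairs, one per input sample."""
    rng = random.Random(seed)
    x, z = [1.0, 0.0], 0
    trajectory = []
    for u in inputs:
        trajectory.append((z, list(x)))
        endo, exo = contributions(x, z, u)
        # Next latent vector: endogenous + exogenous + Gaussian noise.
        x = [endo[i] + exo[i] + rng.gauss(0.0, noise_sd) for i in range(2)]
        # Next discrete state: sample from the Markov transition row.
        z = 0 if rng.random() < P[z][0] else 1
    return trajectory


if __name__ == "__main__":
    inputs = [0.0] * 10 + [1.0] * 10   # external input switches on halfway
    for t, (z, x) in enumerate(simulate(inputs)):
        print(t, z, [round(v, 3) for v in x])
```

In this toy version the discrete state follows a fixed Markov chain. Richer variants, in which transitions also depend on the latent vector or the input, are outside the scope of the sketch.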
Strengths:
The analyses for this study are conducted at a very high level of mathematical and theoretical sophistication. The paper is very well written and effectively communicates complex concepts from dynamical systems. The paper provides valuable neuroscientific insights into threat processing, and the methodology has potential to deepen our understanding at a neurobiological level in future work.
Weaknesses:
I was somewhat disappointed initially by the level of inferences made by the authors based on their analyses at the level of systems neuroscience. After revision this has improved, for instance with inclusion of analyses on the importance of individual brain regions to state dynamics, but I still believe the findings can be made more biologically meaningful, for instance by focusing on what we learn from these sophisticated analyses beyond what is already known from more conventional methodologies. However, the paper as it stands is solid scientific work and such efforts may also be left to future work.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer 1 (Public Review):
Summary
The manuscript uses state-of-the-art analysis technology to document the spatio-temporal dynamics of brain activity during the processing of threats. The authors offer convincing evidence that complex spatio-temporal aspects of brain dynamics are essential to describe brain operations during threat processing.
Strengths
Rigorous complex analyses well suited to the data.
Weaknesses
Lack of a simple take-home message about discovery of a new brain operation.
We have addressed the concern under response to item 1 in Recommendations for the authors of Reviewer 2 below.
Reviewer 1 (Recommendations for the authors):
The paper presents sophisticated analyses of how the spatiotemporal activity of the brain processes threats. While the study is elegant and relevant to the threat processing literature, it could be improved by better clarification of novelty, scope, assumptions and implications. Suggestions are reported below.
(1) Introduction: It is difficult to understand what is unsatisfactory in the present literature and why we need this study. For example, lines 57-64 report what works well in the work of Anderson and Fincham but do not really describe what this approach lacks, either in failing to explain real data or in conceptual terms.
We have edited the corresponding lines to better describe what such approaches generally lack:
Introduction; Lines 63-66: However, the mapping between brain signals and putative mental states (e.g., “encoding”) remained speculative. More generally, state-based modeling of fMRI data would benefit from evaluation in contexts where the experimental paradigm affords a clearer mapping between discovered states and experimental manipulation.
(2) Also, based on the introduction it is unclear if the focus is on understanding the processing of threat or in the methodological development of experimental design and analysis paradigms for more ecologically valid situations.
In our present work, we tried to focus on understanding dynamics of threat processing while also contributing to methodological development of analysis of dynamic/ecologically inspired experiments. To that end, we have added a new paragraph at the end of Introduction to clarify the principal focus of our work:
Introduction; Lines 111-118: Is the present contribution focused on threat processing or methodological developments for the analysis of more continuous/ecologically valid paradigms? Our answer is “both”. One goal was to contribute to the development of a framework that considers brain processing to be inherently dynamic and multivariate. In particular, our goal was to provide the formal basis for conceptualizing threat processing as a dynamic process (see (Fanselow and Lester, 1987)) subject to endogenous and exogenous contributions. At the same time, our study revealed how regions studied individually in the past (e.g., anterior insula, cingulate cortex) contribute to brain states with multi-region dynamics.
(3) The repeated statement, based on the Fiete paper, that most analyses or models of brain activity do not include an exogenous drive seems an overstatement. There is plenty of literature that not only includes exogenous drives but also studies and documents them in detail. There are many examples, but a prominent one is the study of auditory processing. Essentially all human brain areas related to hearing (not only the activity of individual areas but also their communication) are entrained by the exogenous drive of speech (e.g. J. Gross et al, PLoS Biology 11 e1001752, 2013).
We have altered the original phrasing, which now reads as:
Introduction; Lines 93-95: Importantly, we estimated both endogenous and exogenous components of the dynamics, whereas some past work has not modeled both contributions (see discussion in (Khona and Fiete, 2022)).
Discussion; Lines 454-455: Work on dynamics of neural circuits in systems neuroscience at times assumes that the target circuit is driven only by endogenous processes (Khona and Fiete, 2022).
(4) Attractor dynamics is used as a prominent descriptor of fMRI activity, yet the discussion of how this may emerge from the interaction between areas is limited. Is it related to the way attractors emerge from physical systems or neural networks (e.g. Hopfield?).
This is an important question that we believe will benefit from computational and mathematical modeling, but we consider it beyond the scope of the present paper.
(5) Fig 4 shows activity of 4 regions, not 2 as stated in lines 201-202. Correct?
Fig. 4 shows activity of two regions and also the average activity of regions belonging to two resting-state networks engaged during threat processing (discussed shortly after lines 201-202). To clarify the above concern, we have changed the following line:
Results; Lines 228-230: In Fig. 4, we probed the average signals from two resting-state networks engaged during threat-related processing, the salience network which is particularly engaged during higher threat, and the default network which is engaged during conditions of relative safety.
(6) It would be useful to state more clearly how Fig 7B, C differs from Fig 2A, B (my understanding it is that in the former they are isolating the stimulus-driven processes)
We have clarified this by adding the following line in the Results:
Results; Lines 290-292: Note that in Fig. 7B/C we evaluated exogenous contributions only for stimuli associated with each state/state transition reported in Fig. 2A/B (see also Methods).
Reviewer 2 (Public Review):
Summary
This paper by Misra and Pessoa uses switching linear dynamical systems (SLDS) to investigate the neural network dynamics underlying threat processing at varying levels of proximity. Using an existing dataset from a threat-of-shock paradigm in which threat proximity is manipulated in a continuous fashion, the authors first show that they can identify states that each have their own linear dynamical system and are consistently associated with distinct phases of the threat-of-shock task (e.g., “peri-shock”, “not near”, etc). They then show how activity maps associated with these states are in agreement with existing literature on neural mechanisms of threat processing, and how activity in underlying brain regions alters around state transitions. The central novelty of the paper lies in its analyses of how intrinsic and extrinsic factors contribute to within-state trajectories and between-state transitions. A final set of analyses shows how the findings generalize to another (related) threat paradigm.
Strengths
The analyses for this study are conducted at a very high level of mathematical and theoretical sophistication. The paper is very well written and effectively communicates complex concepts from dynamical systems. I am enthusiastic about this paper, but I think the authors have not yet exploited the full potential of their analyses in making this work meaningful toward increasing our neuroscientific understanding of threat processing, as explained below.
Weaknesses
(1) I appreciate the sophistication of the analyses applied and/or developed by the authors. These methods have many potential use cases for investigating the network dynamics underlying various cognitive and affective processes. However, I am somewhat disappointed by the level of inferences made by the authors based on these analyses at the level of systems neuroscience. As an illustration consider the following citations from the abstract: “The results revealed that threat processing benefits from being viewed in terms of dynamic multivariate patterns whose trajectories are a combination of intrinsic and extrinsic factors that jointly determine how the brain temporally evolves during dynamic threat” and “We propose that viewing threat processing through the lens of dynamical systems offers important avenues to uncover properties of the dynamics of threat that are not unveiled with standard experimental designs and analyses”. I can agree to the claim that we may be able to better describe the intrinsic and extrinsic dynamics of threat processing using this method, but what is now the contribution that this makes toward understanding these processes?
We have addressed the concern under response to item 1 in Recommendations for the authors below.
(2) How sure can we be that it is possible to separate extrinsically and intrinsically driven dynamics?
We have addressed the concern under response to item 2 in Recommendations for the authors below.
Reviewer 2 (Recommendations for the authors):
(1) To address the first point under weaknesses above: I would challenge the authors to make their results more biologically/neuroscientifically meaningful, in particular in the sections (in results and/or discussion) on how intrinsic and extrinsic factors contribute to within-state trajectories and between-state transitions, and make those explicit in both the abstract and the discussion (what exactly are the properties of the dynamics of threat that are uncovered?). The authors may also argue that the current approach lays the groundwork for such efforts, but does not currently provide such insights. If they would take this position, that should be made explicit throughout (which would make it more of a methodological paper).
The SLDS approach provides, we believe, a powerful framework to describe system-level dynamics (of threat processing in the the present case). A complementary type of information can be obtained by studying the contribution of individual components (brain regions) within the larger system (brain), an approach that helps connect our approach to studies that typically focus on the contributions of individual regions, and contributes to providing more neurobiological interpretability to the results. Accordingly, we developed a new measure of region importance that captured the extent to which individual brain regions contributed to driving system dynamics during a given state.
Abstract; Lines 22-25: Furthermore, we developed a measure of region importance that quantifies the contributions of an individual brain region to system dynamics, which complements the system-level characterization that is obtained with the state-space SLDS formalism.
Introduction; Lines 95-99: A considerable challenge in state-based modeling, including SLDS, is linking estimated states and dynamics to interpretable processes. Here, we developed a measure of region importance that provides a biologically meaningful way to bridge this gap, as it quantifies how individual brain regions contribute to steering state trajectories.
Results; Lines 302-321: Region importance and steering of dynamics: Based on time series data and input information, the SLDS approach identifies a set of states and their dynamics. While these states are determined in the latent space, they can be readily mapped back to the brain, allowing for the characterization of spatiotemporal properties across the entire brain. Since not all regions contribute equally to state properties, we propose that a region’s impact on state dynamics serves as a measure of its importance.
We illustrate the concept for STATE 5 (“near miss”) in Fig. 8 (see Fig. S17 for all states). Fig. 8A shows importance in the top row and activity below as a function of time from state entry. The dynamics of importance and activity can be further visualized (Fig. 8B), where some regions of particularly high importance are illustrated together with the ventromedial PFC, a region that is typically not engaged during high-threat conditions. Notably, the importance of the dorsal anterior insula increased quickly in the first time points, and later decreased. In contrast, the importance of the periaqueductal gray was relatively high from the beginning of the state and decreased moderately later.
Fig. 8C depicts the correlation between these measures as a function of time. For all but STATE 1, the correlation increased over time. Interestingly, for STATES 4-5, the correlation was low at the first and second time points of the state (and for STATE 2 at the first time point), and for STATE 3 the measures were actually anticorrelated; both cases indicate a dissociation between activity and importance. In summary, our results illustrate that univariate region activity can differ from multivariate importance, providing a fruitful path to understand how individual brain regions contribute to collective dynamic properties.
Discussion; Lines 466-487: In the Introduction, we motivated our study in terms of determining multivariate and distributed patterns of activity with shared dynamics. At one end of the spectrum, it is possible to conceptualize the whole brain as dynamically evolving during a state; at the other end, we could focus on just a few “key” regions, or possibly a single one (at which point the description would be univariate). Here, we addressed this gap by studying the importance of regions to state dynamics: To what extent does a region steer the trajectory of the system? From a mathematical standpoint, our proposed measure is not merely a function of activity of a region but also of the coefficients of the dynamics matrix capturing its effect on across-region dynamics (Eichler, 2005; Smith et al., 2010).
How distributed should the dynamics of threat be considered? One answer to this question is to consider the distribution of importance values for all states. For STATE 1 (“post shock”), a few regions displayed the highest importance values for a few time points. However, for the other states the distribution of importance values tended to be more uniform at each time point. Thus, based on our proposed importance measure, we conclude that threat-related processing is profitably viewed as substantially distributed. Furthermore, we found that while activity and importance were relatively correlated, they could also diverge substantially. Together, we believe that the proposed importance measure provides a valuable tool for understanding the rich dynamics of threat processing. For example, we discovered that the dorsal anterior insula is important not only during high-anxiety states (such as STATE 5; “near miss”) but also, surprisingly, for a state that followed the aversive shock event (STATE 1; “post shock”). Additionally, we noted that the posterior cingulate cortex, widely known to play a central role in the default mode network, had the highest importance among all regions in driving dynamics of low-anxiety states (such as STATE 3 and STATE 4; “not near”).
Methods; Lines 840-866: Region importance: We performed a “lesion study”, where we quantified how brain regions contribute to state dynamics by eliminating (zeroing) model parameters corresponding to a given region, and observing the resulting changes in system dynamics. According to our approach, the most important regions are those that cause the greatest change in system dynamics when eliminated.
The SLDS model represents dynamics in a low-dimensional latent space, and model parameters are not readily available at the level of individual regions. Thus, the first step was to project the dynamics equation onto the brain data prior to computing importance values: the linear dynamics equation in the latent space (Eq. 2) was mapped to the original data space of N = 85 ROIs using the emissions model (Eq. 1):
$$y_{t+1} = \hat{A}\, y_t + \hat{B}\, u_t + \hat{b},$$
where the hatted terms are built from the latent-space parameters using the emissions matrix $C$ and $C^{\dagger}$, the Moore-Penrose pseudoinverse of $C$, and $\hat{A}$, $\hat{B}$, and $\hat{b}$ denote the corresponding dynamics matrix, input matrix, and bias terms in the original data space.
Based on the above, we defined the importance of the $i$th ROI at time $t$ based on quantifying the impact of “lesioning” the $i$th ROI, i.e., by setting the $i$th column of $\hat{A}$, the $i$th row of $\hat{B}$, and the $i$th element of $\hat{b}$ to 0, denoted $\hat{A}^{(-i)}$, $\hat{B}^{(-i)}$, and $\hat{b}^{(-i)}$ respectively. Formally, the importance of the $i$th ROI, $I_t^i$, was defined as:
$$I_t^i = \left\lVert \left(\hat{A}\, y_t + \hat{B}\, u_t + \hat{b}\right) - \left(\hat{A}^{(-i)} y_t + \hat{B}^{(-i)} u_t + \hat{b}^{(-i)}\right) \right\rVert = \left\lVert y_t^i \ast \hat{A}_{:,i} + \left(\hat{B}_{i,:}\, u_t + \hat{b}_i\right) \ast e_i \right\rVert,$$
where ‘∗’ indicates element-wise multiplication of a scalar with a vector, $y_t^i$ is the activity of the $i$th ROI at time $t$, $\hat{A}_{:,i}$ corresponds to the $i$th column of $\hat{A}$, $\hat{B}_{i,:}\, u_t$ is the inner product between the $i$th row of $\hat{B}$ and input $u_t$, $\hat{b}_i$ corresponds to the $i$th element of $\hat{b}$, and $e_i$ represents an indicator vector corresponding to the $i$th ROI. Note that the term $y_t^i \ast \hat{A}_{:,i}$ is a function of both the $i$th ROI’s activity as well as the coefficients of the dynamics matrix capturing the effect of region $i$ on the one-step dynamics of the entire system (Eichler, 2005; Smith et al., 2010); the remaining terms capture the effect of the external inputs and the bias term on the one-step dynamics of the $i$th ROI.
After computing $I_t^i$ for a given run, the resultant importance time series was normalized to zero mean and unit variance.
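As a rough illustration of the lesioning procedure described in this Methods excerpt, the sketch below zeroes one region's column of the dynamics matrix, its row of the input matrix, and its bias element, then measures how much the one-step prediction changes. It is not the authors' code: the function names (`predict`, `lesion_importance`, `zscore`), the use of the Euclidean norm, the toy two-region parameters, and the choice to pair activity and input at the same time point are all assumptions made for illustration.

```python
import math


def predict(A, B, b, y, u):
    """One-step prediction A @ y + B @ u + b, written with plain lists."""
    n = len(A)
    return [
        sum(A[r][c] * y[c] for c in range(n))
        + sum(B[r][k] * u[k] for k in range(len(u)))
        + b[r]
        for r in range(n)
    ]


def lesion_importance(A, B, b, y, u):
    """Importance of each region at one time point (hypothetical helper).

    Region i is "lesioned" by zeroing column i of A, row i of B, and
    element i of b. Importance is the Euclidean norm (an assumption) of
    the difference between the intact and lesioned predictions.
    """
    n = len(A)
    full = predict(A, B, b, y, u)
    scores = []
    for i in range(n):
        A_les = [[0.0 if c == i else A[r][c] for c in range(n)] for r in range(n)]
        B_les = [[0.0] * len(u) if r == i else list(B[r]) for r in range(n)]
        b_les = [0.0 if r == i else b[r] for r in range(n)]
        lesioned = predict(A_les, B_les, b_les, y, u)
        diff_sq = sum((f - l) ** 2 for f, l in zip(full, lesioned))
        scores.append(math.sqrt(diff_sq))
    return scores


def zscore(series):
    """Normalize a time series to zero mean and unit variance.

    Assumes the series is not constant (nonzero variance).
    """
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    return [(x - mean) / math.sqrt(var) for x in series]


# Toy two-region system with one external input (made-up numbers).
A = [[0.5, 0.2], [0.0, 0.9]]
B = [[1.0], [0.0]]
b = [0.1, 0.0]
y = [1.0, 2.0]
u = [1.0]
scores = lesion_importance(A, B, b, y, u)
```

In this toy system, lesioning a region removes both its outgoing influence on every region (its column of the dynamics matrix) and the input and bias drive it receives, which is why a region with modest activity can still score high if its column carries large coefficients.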
(2) To address the second point under the weaknesses above: Given that the distinction between intrinsic and extrinsic dynamics appears central to the novelty of the paper, I would suggest the authors explicitly address this issue in the introduction and/or discussion sections.
The distinction between intrinsic and extrinsic dynamics is a modeling assumption of SLDS. We adopted this assumption because, in experimental designs with experimenter-manipulated inputs, one can profitably investigate both types of contribution to dynamics. While we should not reify the model’s assumption, we can gain confidence in our separation of extrinsically and intrinsically driven dynamics through controlled experiments in which external inputs are manipulated, or by demonstrating timescale separation of intrinsic and extrinsic dynamics (i.e., that they operate at different frequencies). This is an important question that requires additional computational/mathematical modeling, but we consider it beyond the scope of the current paper. We have added the following lines in the discussion section:
Discussion; Lines 521-528: A further issue that we wish to discuss is related to the distinction between intrinsic and extrinsic dynamics, which is explicitly modeled in our SLDS approach (see Methods, equation 2). We believe this is a powerful approach because in experimental designs with experimenter manipulated inputs, one can profitably investigate both types of contribution to dynamics. However, complete separation between intrinsic and extrinsic dynamics is challenging to ascertain. More generally, one can gain confidence in their separation through controlled experiments where external inputs are manipulated, or by demonstrating timescale separation of intrinsic and extrinsic dynamics.
(3) In the abstract, the statement “.. studies in systems neuroscience that frequently assume that systems are decoupled from external inputs” sounds paradoxical after first introducing how threat processing is almost exclusively studied using blocked and event-related task designs (which obviously rely on external inputs only). Please clarify this.
In this work, we wished to state that the SLDS framework characterizes both endogenous and exogenous contributions to dynamics, whereas some past work has not modeled both contributions. To clarify, we have changed the corresponding line:
Abstract; Lines 19-20: Importantly, we characterized both endogenous and exogenous contributions to dynamics.
(4) In the abstract, the first mention of circles comes out of the blue; the paradigm needs to be introduced first to make this understandable.
We have rephrased the corresponding text:
Abstract; Lines 14-17: First, we demonstrated that the SLDS model learned the regularities of the experimental paradigm, such that states and state transitions estimated from fMRI time series data from 85 regions of interest reflected threat proximity and threat approach vs. retreat.
(5) In Figure 3, the legend shows z-scores representing BOLD changes associated with states. However, the z-scores are extremely low (ranging between -.4 and .4). Can this be correct, given that maps are thresholded at p < .001 (i.e., z > 3.09)? A similar small range of z-scores is shown in the legend of Fig 5. Please check the z-score ranges.
The p-value threshold used in Fig. 3 is based on the voxelwise t-test conducted between the participant-based bootstrapped maps and null maps (see Methods: State spatial maps: “To identify statistically significant voxels, we performed a paired t-test between the participant-based bootstrapped maps and the null maps.”). Thus, the p-value threshold in the figure does not correspond to the z-scores of the group-averaged state-activation maps. Similarly in Fig. 5, we only visualized the state-wise attractors on a brain surface map without any thresholding. The purpose of using a z-score color bar was to provide a scale comparable to that of BOLD activity.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important study reports human single-neuron recordings in subcortical structures while participants performed a tactile detection task around the perceptual threshold. The study and the analyses are well conducted and provide convincing evidence that the thalamus and the subthalamic nucleus contain neurons whose activity correlates with the task, with stimulus presentation, and even with whether the stimulation is consciously detected or not. The study will be relevant for researchers interested in the role of subcortical structures in tactile perception and the neural correlates of consciousness.
-
Reviewer #1 (Public review):
Summary:
A cortico-centric view is dominant in the study of the neural mechanisms of consciousness. This investigation represents the growing interest in understanding how subcortical regions are involved in conscious perception. To achieve this, the authors engaged in an ambitious and rare procedure in humans of directly recording from neurons in the subthalamic nucleus and thalamus. While participants were in surgery for the placement of deep brain stimulation devices for the treatment of essential tremor and Parkinson's disease, they were awakened and completed a perceptual-threshold tactile detection task. The authors identified individual neurons and analyzed single-unit activity corresponding with the task phases and tactile detection/perception. Among the neurons that were perception-responsive, the authors report changes in firing rate beginning ~150 milliseconds from the onset of the tactile stimulation. Curiously, the majority of the perception-responsive neurons had a higher firing rate for missed/not perceived trials. In summary, this investigation is a valuable addition to the growing literature on the role of subcortical regions in conscious perception.
Strengths:
The authors achieve the challenging task of recording human single-unit activity while participants performed a tactile perception task. The methods and statistics are clearly explained and rigorous, particularly for managing false positives and non-normal distributions. The results offer new detail at the level of individual neurons in the emerging recognition of the role of subcortical regions in conscious perception. Also, this study highlights the timing of neural activity linked to conscious perception (approximately 150 milliseconds).
Weaknesses:
Due to constraints of testing with this patient population, a standard report-based detection task was administered. This type of task cannot fully exclude motor preparatory and post-perceptual processing as a factor that contributes to distinguishing between perceived versus not perceived stimuli. The authors show sensitivity to this issue by identifying task-selective neurons and their discussion of the results that refers to the confound of post-perceptual processing. Despite this limitation, the results are valuable for contributing to a growing body of literature on the subcortical neural mechanisms of consciousness.
-
Reviewer #2 (Public review):
The authors have examined subpopulations of individual neurons recorded in the thalamus and subthalamic nucleus (STN) of awake humans performing a simple cognitive task. They have carefully designed their task structure to minimize motor components that could confound their analyses in these subcortical structures, particularly given that the data was collected from patients with Parkinson's disease (PD) and essential tremor (ET). The recorded data represents a valuable contribution to the field. Pereira et al. conclude that their single-neuron recordings indicate task-related activity that is purportedly distinct from previously identified sensory signals.
Despite the significance of the dataset, important limitations arise due to the small number of recorded neurons relative to the high number of participants. That raises concerns about the generalizability of the conclusions drawn from the study.
(1) While I support the work conducted by the authors and their efforts to improve the manuscript, the number of significant neurons is considerably lower than the number of participants studied: approximately 8 neurons, compared to 32 participants. This low number of neurons involved in encoding raises concerns about the strength of the conclusions drawn.
(2) Additionally, the authors state that participants do not need to perform a motor execution, yet they are required to communicate their response verbally. This presents a contradiction, as speech involves the activation of facial muscles, and previous studies have shown that neuronal activity in the ventral premotor cortex can encode such movements in humans (Willett et al., Nature 2023). Clarifying this point would strengthen the argument and ensure consistency in the interpretation of results.
(3) One way to improve the study is to analyze the local field potentials (LFPs) recorded alongside the spikes. By examining different LFP components, particularly the beta band (Haegens et al., PNAS 2011), it may be possible to identify consistent modulation across the 32 recorded participants. This approach could provide additional support for the study's conclusions and help clarify the role of neural activity in the observed phenomena.
-
Reviewer #3 (Public review):
Summary:
This important study relies on a rare dataset: intracranial recordings within the thalamus and the subthalamic nucleus in awake humans, while they were performing a tactile detection task. This procedure allowed the authors to identify a small but significant proportion of individual neurons, in both structures, whose activity correlated with the task (e.g. their firing rate changed following the audio cue signalling the start of a trial) and/or with the stimulus presentation (change in firing rate around 200 ms following tactile stimulation) and/or with participants' reported subjective perception of the stimulus (difference between hits and misses around 200 ms following tactile stimulation). Whereas most studies interested in the neural underpinnings of conscious perception focus on cortical areas, these results suggest that subcortical structures might also play a role in conscious perception, notably tactile detection.
Strengths:
There are two strongly valuable aspects in this study that make the evidence convincing and even compelling. First, this type of data is exceptional: the authors had access to subcortical recordings in awake and behaving humans during surgery. Additionally, the methods are solid. The behavioral study meets the best standards of the domain, with a careful calibration of the stimulation levels (staircase) to maintain them around detection threshold, and additional selection of time intervals where the behavior was stable. The authors also checked that stimulus intensity was the same on average for hits and misses within these selected periods, which warrants that the effects of detection that are observed here are not confounded by stimulus intensity. The neural data analysis is also very sound and well conducted. The statistical approach complies with current best practices, although I found that, in some instances, it was not entirely clear which type of permutations had been performed, and I would advocate for more clarity in these instances. Globally, the figures are nice, clear and well presented. I appreciated the fact that the precise anatomical location of the neurons was directly shown in each figure.
Weaknesses:
The results rely on a small number of neurons; it is only the beginning of this exploration! Figure S5 is important for observing the variety of ways the neurons' activity correlated with either stimulus presence, or perception, or both. Interpretations are still very open on these different profiles.
-
Public Reviews: Reviewer #1 (Public Review): Summary: A cortico-centric view is dominant in the study of the neural mechanisms of consciousness. This investigation represents the growing interest in understanding how subcortical regions are involved in conscious perception. To achieve this, the authors engaged in an ambitious and rare procedure in humans of directly recording from neurons in the subthalamic nucleus and thalamus. While participants were in surgery for the placement of deep brain stimulation devices for the treatment of essential tremor and Parkinson's disease, they were awakened and completed a perceptual-threshold tactile detection task. The authors identified individual neurons and analyzed single-unit activity corresponding with the task phases and tactile detection/perception. Among the neurons that were perception-responsive, the authors report changes in firing rate beginning ~150 milliseconds from the onset of the tactile stimulation. Curiously, the majority of the perception-responsive neurons had a higher firing rate for missed/not perceived trials. In summary, this investigation is a valuable addition to the growing literature on the role of subcortical regions in conscious perception. Strengths: The authors achieved the challenging task of recording human single-unit activity while participants performed a tactile perception task. The methods and statistics are clearly explained and rigorous, particularly for managing false positives and non-normal distributions. The results offer new detail at the level of individual neurons in the emerging recognition of the role of subcortical regions in conscious perception. We thank the reviewer for their positive comments. Weaknesses: "Nonetheless, it remains unknown how the firing rate of subcortical neurons changes when a stimulus is consciously perceived." 
(lines 76-77) The authors could be more specific about what exactly single-unit recordings offer for interrogating the role of subcortical regions in conscious perception that is unique from alternative neural activity recordings (e.g., local field potential) or recordings that are used as proxies of neural activity (e.g., fMRI). We agree with the reviewer that the contribution of micro-electrode recordings was not sufficiently put forward in our manuscript. We added the following sentences to the discussion, when discussing the multiple types of neurons we found: Single-unit recordings provide a much higher temporal resolution than functional imaging, which helps assess how the neural correlates of consciousness unfold over time. Contrary to local field potentials, single-unit recordings can expose the variety of functional roles of neurons within subcortical regions, thereby offering a potential for a better mechanistic understanding of perceptual consciousness. Related comment for the following excerpts: "After a random delay ranging from 0.5 to 1 s, a "respond" cue was played, prompting participants to verbally report whether they felt a vibration or not. Therefore, none of the reported analyses are confounded by motor responses." (lines 97-99). "These results show that subthalamic and thalamic neurons are modulated by stimulus onset, irrespective of whether it was reported or not, even though no immediate motor response was required." (lines 188190). "By imposing a delay between the end of the tactile stimulation window and the subjective report, we ensured that neuronal responses reflected stimulus detection and not mere motor responses." (lines 245247). It is a valuable feature of the paradigm that the reporting period was initiated hundreds of milliseconds after the stimulus presentation so that the neural responses should not represent "mere motor responses". 
However, verbal report of having perceived or not perceived a stimulus is a motor response and because the participants anticipate having to make these reports before the onset of the response period, there may be motor preparatory activity from the time of the perceived stimulus that is absent for the not perceived stimulus. The authors show sensitivity to this issue by identifying task-selective neurons and their discussion of the results that refer to the confound of post-perceptual processing. Still, direct treatment of this possible confound would help the rigor of the interpretation of the results. We agree with the reviewer that direct treatment would have provided the best control. One way to avoid motor preparation is to only provide the stimulus-effector mapping after the stimulus presentation (Bennur & Gold, 2011; Twomey et al., 2016; Fang et al., 2024). Other controls to avoid post-perceptual processing used in consciousness research consist of using no-report paradigms (Tsuchiya et al., 2015) as we did in previous studies (Pereira et al., 2021; Stockart et al., 2024). Unfortunately, neither of these procedures was feasible during the 10 minutes allotted for the research task in an intraoperative setting with auditory cues and vocal responses. We would like to highlight nonetheless that the effects we report are shortlived and incompatible with sustained motor preparation activity. We added the following sentence to the discussion: Future studies ruling out the presence of motor preparation triggered by perceived stimuli (Bennur & Gold, 2011; Fang et al., 2024; Twomey et al., 2016) and verifying that similar neuronal activity occurs in the absence of task-demands (no-reports; Tsuchiya et al., 2015) or attention (Wyart & Tallon-Baudry, 2008) will be useful to support that subcortical neurons contribute specifically to perceptual consciousness. 
"When analyzing tactile perception, we ensured that our results were not contaminated with spurious behavior (e.g. fluctuation of attention and arousal due to the surgical procedure)." (lines 118-117). Confidence in the results would be improved if the authors clarified exactly what behaviors were considered as contaminating the results (e.g., eye closure, saccades, and bodily movements) and how they were determined. This sentence was indeed unclear. It introduced the trial selection procedure we used to compensate for drifts in the perceptual threshold, which can result from fluctuations in attention or arousal. We modified the sentence, which now reads: When analyzing tactile perception, we ensured that our results were not contaminated by fluctuating attention and arousal due to the surgical procedure. Based on objective criteria, we excluded specific series of trials from analyses and focused on time windows for which hits and misses occurred in commensurate proportions (see methods). During the recordings, the experimenter stood next to the patients and monitored their bodily movements, ensuring they did not close their eyes or produce any other bodily movements synchronous with stimulus presentation. The authors' discussion of the thalamic neurons could be more precise. The authors show that only certain areas of the thalamus were recorded (in or near the ventral lateral nucleus, according to Figure S3C). The ventral lateral nucleus has a unique relationship to tactile and motor systems, so do the authors hypothesize these same perception-selective neurons would be active in the same way for visual, auditory, olfactory, and taste perception? Moreover, the authors minimally interpret the location of the task, sensory, and perception-responsive neurons. Figure S3 suggests these neurons are overlapping. 
Did the authors expect this overlap and what does it mean for the functional organization of the ventral lateral nucleus and subthalamic nucleus in conscious perception? These are excellent questions, the answers to which we can only speculate. In rodents, the LT is known as a hub for multisensory processing, as over 90% of LT neurons respond to at least two sensory modalities (for a review, see Yang et al., 2024). Yet, no study has compared how LT neurons in rodents encode perceived and nonperceived stimuli across modalities. Evidence in humans is scarce, with only a few studies documenting supramodal neural correlates of consciousness at the cortical level with noninvsasive methods (Noel et al., 2018; Sanchez et al., 2020; Filimonov et al., 2022). We now refer to these studies in the revised discussion: Moreover, given the prominent role of the thalamus in multisensory processing, it will be interesting to assess if it is specifically involved in tactile consciousness or if it has a supramodal contribution, akin to what is found in the cortex (Noel et al., 2018; Sanchez et al., 2020; Filimonov et al., 2022). Concerning the anatomical overlap of neurons, we could not reconstruct the exact locations of the DBS tracts for all participants. Because of the limited number of recorded neurons, we preferred to refrain from drawing strong conclusions about the functional organization of the ventral lateral nucleus. "We note that, 6 out of 8 neurons had higher firing rates for missed trials than hit trials, although this proportion was not significant (binomial test: p = 0.145)." (lines 215-216). It appears that in the three example neurons shown in Figure 4, 2 out of 3 (#001 and #068) show a change in firing rate predominantly for the missed stimulations. Meanwhile, #034 shows a clear hit response (although there is an early missed response - decreased firing rate - around 150 ms that is not statistically significant). 
This is a counterintuitive finding when compared to previous results from the thalamus (e.g., local field potentials and fMRI) that show the opposite response profile (i.e., missed/not perceived trials display no change or reduced response relative to hit/perceived trials). The discussion of the results should address this, including if these seemingly competing findings can be rectified. We thank the reviewer for pointing out this limitation of the discussion. We avoided putting too much emphasis on these aspects due to the limited number of perception-selective neurons. Although subcortical connectivity models would predict that neurons in the thalamus should increase their firing rate for perceived stimuli, we were not surprised to see this heterogeneity as we had previously found neurons decreasing their firing rates for missed stimuli in the posterior parietal cortex (Pereira et al., 2021). We answer these points in response to the reviewer’s last comment below on the latencies of the effects. The authors report 8 perception-responsive neurons, but there are only 5 recording sites highlighted (i.e., filled-in squares and circles) in Figures S3C and 4D. Was this an omission or were three neurons removed from the perception-responsive analysis? Unfortunately, we could not obtain anatomical images for all participants. This information was present in the methods section, although not clearly enough: For 34 / 50 neurons, preoperative MRI and postoperative CT scans (co-registered in patient native space using CranialSuite) were available to precisely reconstruct surgical trajectories and recording locations (for the remaining 16 neurons, localizations were based on neurosurgical planning and confirmed by electrophysiological recordings at various depths). Therefore, we added the following sentence in Figures 2, 3, 4 and S3. [...] for patients for which we could obtain anatomical images. Could the authors speak to the timing of the responses reported in Figure 4? 
The statistically significant intervals suggested both early (~160-200ms) to late responses (~300ms). Some have hypothesized that subcortical regions are early - ahead of cortical activation that may be linked with conscious perception. Do these results say anything about this temporal model for when subcortical regions are active in conscious perception? We agree that response timing could have been better described. We performed a new analysis of the latencies at which our main effects were observed. This analysis revealed the existence of the two clusters mentioned by the reviewer very clearly. We now include this analysis in a new Figure 5 in the revised manuscript. We also performed a new analysis to support the existence of bimodal distributions and quantified the latencies. We added this text to the result section: We note that the timings of sensory and perception effects in Figures 3 and 4 showed a bimodal distribution with an early cluster (149 ms for sensory neurons; 121 ms for perception neurons; c.f. methods) and a later cluster (330 ms for sensory neurons; 315 ms for perception neurons; Figure 5). and this section to the methods: To measure bimodal timings of effect latencies, we fitted a two-component Gaussian mixture distribution to the data in Figure 5 by minimizing the mean square error with an interior-point method. We took the best of 20 runs with random initialization points and verified that the resulting mean square error was markedly (> 4 times) better than using a single component. We updated the discussion, including the points made in the comment about higher activity for missed stimuli (above): The early cluster’s average timing around 150 ms post-stimulus corresponds to the onset of a putative cortical correlate of tactile consciousness, the somatosensory awareness negativity (Dembski et al., 2021). Similar electroencephalographic markers are found in the visual and auditory modality. 
It is unclear, however, whether these markers are related to perceptual consciousness or selective attention (Dembski et al., 2021). The later cluster is centered around 300 ms and could correspond to a well-known electroencephalographic marker, the P3b (Polich, 2007) whose association with perceptual consciousness has been questioned (Pitts et al., 2014; Dembski et al., 2021) although brain activity related to consciousness has been observed at similar timing even in the absence of report demands (Sergent et al., 2021; Stockart et al., 2024). It is also important to note that these clusters contain neurons with both increased and decreased firing rates following stimulus onset, similar to what was observed previously in the posterior parietal cortex (Pereira et al., 2021). Reviewer #2 (Public Review): The authors have studied subpopulations of individual neurons recorded in the thalamus and subthalamic nucleus (STN) of awake humans performing a simple cognitive task. They have carefully designed their task structure to eliminate motor components that could confound their analyses in these subcortical structures, given that the data was recorded in patients with Parkinson's Disease (PD) and diagnosed with an Essential Tremor (ET). The recorded data represents a promising addition to the field. The analyses that the authors have applied can serve as a strong starting point for exploring the kinds of complex signals that can emerge within a single neuron's activity. Pereira et al. conclude that their results from single neurons indicate that task-related activity occurs, purportedly separate from previously identified sensory signals. These conclusions are a promising and novel perspective for how the field thinks about the emergence of decisions and sensory perception across the entire brain as a unit. We thank the reviewer for these positive comments. 
Despite the strength of the data that was obtained and the relevant nature of the conclusions that were drawn, there are certain limitations that must be taken into consideration: (1) The authors make several claims that their findings are direct representations of consciousness identifiable in subcortical structures. The current context for consciousness does not sufficiently define how the consciousness is related to the perceptual task. This is indeed a complex issue in all studies concerned with perceptual consciousness and we were careful not to make such “direct” claims. Instead, we used the state-of-the-art tools available to study consciousness (see below) and only interpreted our findings with respect to consciousness in the discussion. For example, in the abstract, our claim is that “Our results provide direct neurophysiological evidence of the involvement of the subthalamic nucleus and the thalamus for the detection of vibrotactile stimuli, thereby calling for a less cortico-centric view of the neural correlates of consciousness.” In brief, first, we used near-threshold stimuli which allowed us to contrast reported vs. unreported trials while keeping the physical properties of the stimulus comparable. Second, we used subjective reports without incentive for participants to be more conservative or liberal in their response (e.g. through reward). Third, we introduced a random delay before the responses to limit confounding effects due to the report. We also acknowledged that “... it will be important in future studies to examine if similar subcortical responses are obtained when stimuli are unattended (Wyart & Tallon-Baudry, 2008), task-irrelevant (Shafto & Pitts, 2015), or when participants passively experience stimuli without the instruction to report them (i.e., no-report paradigms) (Tsuchiya et al., 2015)”. 
This last sentence now reads (to address a point made by Reviewer 1 about motor preparation): Future studies ruling out the presence of motor preparation triggered by perceived stimuli (Bennur & Gold, 2011; Fang et al., 2024; Twomey et al., 2016) and verifying that similar neuronal activity occurs in the absence of task-demands (no-reports; Tsuchiya et al., 2015) or attention (Wyart & Tallon-Baudry, 2008) will be useful to support that subcortical neurons contribute specifically to perceptual consciousness. (2) The current work would benefit greatly from a description and clarification of what all the neurons that have been recorded are doing. The authors' criteria for selecting subpopulations with task-relevant activity are appropriate, but understanding the heterogeneity in a population of single neurons is important for broader considerations that are being studied within the field. We followed the reviewer’s suggestions and added new results regarding the latencies of the reported effects (new Figure 5). We also now show firing rates for hits, misses and overall sensory activity (hits and misses combined) for all perception-selective or sensory-selective (when behavior was good enough; Figure S5). Although a more detailed characterization of the heterogeneity of the neurons identified would have been relevant, it seems beyond the scope of the present study, especially given the relatively small number of neurons we identified, as well as the relative simplicity of the paradigm imposed by the clinical context in which we worked. (3) The authors have omitted a proper set of controls for comparison against the active trials, for example, where a response was not necessary. Please explain why this choice was made and what implications are necessary to consider. 
We had mentioned this limitation in the discussion: Nevertheless, it will be important in future studies to examine if similar subcortical responses are obtained when stimuli are unattended (Wyart & Tallon-Baudry, 2008), task-irrelevant (Shafto & Pitts, 2015), or when participants passively experience stimuli without the instruction to report them (i.e., no-report paradigms) (Tsuchiya et al., 2015). We agree that such a control would have been relevant, but this was not feasible during the 10 minutes allotted for the research task in an intraoperative setting. These constraints are both clinical, to minimize discomfort for patients, and practical, as it is difficult to track neurons in an intraoperative setting for more than 10 minutes. We added a sentence to this effect in the discussion. Reviewer #3 (Public Review): Summary: This important study relies on a rare dataset: intracranial recordings within the thalamus and the subthalamic nucleus in awake humans, while they were performing a tactile detection task. This procedure allowed the authors to identify a small but significant proportion of individual neurons, in both structures, whose activity correlated with the task (e.g. their firing rate changed following the audio cue signalling the start of a trial) and/or with the stimulus presentation (change in firing rate around 200 ms following tactile stimulation) and/or with participant's reported subjective perception of the stimulus (difference between hits and misses around 200 ms following tactile stimulation). Whereas most studies interested in the neural underpinnings of conscious perception focus on cortical areas, these results suggest that subcortical structures might also play a role in conscious perception, notably tactile detection. Strengths: There are two strongly valuable aspects in this study that make the evidence convincing and even compelling. 
First, these types of data are exceptional, the authors could have access to subcortical recordings in awake and behaving humans during surgery. Additionally, the methods are solid. The behavioral study meets the best standards of the domain, with a careful calibration of the stimulation levels (staircase) to maintain them around the detection threshold, and an additional selection of time intervals where the behavior was stable. The authors also checked that stimulus intensity was the same on average for hits and misses within these selected periods, which warrants that the effects of detection that are observed here are not confounded by stimulus intensity. The neural data analysis is also very sound and well-conducted. The statistical approach complies with current best practices, although I found that, in some instances, it was not entirely clear which type of permutations had been performed, and I would advocate for more clarity in these instances. Globally the figures are nice, clear, and well presented. I appreciated the fact that the precise anatomical location of the neurons was directly shown in each figure. We thank the reviewer for this positive evaluation. Weaknesses: Some clarification is needed for interpreting Figure 3, top rows: in my understanding the black curve is already the result of a subtraction between stimulus present trials and catch trials, to remove potential drifts; if so, it does not make sense to compare it with the firing rate recorded for catch trials. The black curve represents the firing rate without any subtraction. We only subtracted the firing rates of catch trials in the statistical procedure, as the reviewer noted, to remove potential drift. We added (before baseline correction) to the legend of Figure 3. I also think that the article could benefit from a more thorough presentation of the data and that this could help refine the interpretation which seems to be a bit incomplete in the current version. 
There are 8 stimulus-responsive neurons and 8 perception-selective neurons, with only one showing both effects, resulting in a total of 15 individual neurons being in either category or 13 neurons if we exclude those in which the behavior is not good enough for the hit versus miss analysis (Figure S4A). In my opinion, it should be feasible to show the data for all of them (either in a main figure, or at least in supplementary), but in the present version, we get to see the data for only 3 neurons for each analysis. This very small selection includes the only neuron that shows both effects (neuron #001; which is also cue selective), but this is not highlighted in the text. It would be interesting to see both the stimulus-response data and the hit versus miss data for all 13 neurons as it could help develop the interpretation of exactly how these neurons might be involved in stimulus processing and conscious perception. This should give rise to distinct interpretations for the three possible categories. Neurons that are stimulus-responsive but not perception-selective should show the same response for both hits and misses and hence carry out indifferently conscious and unconscious responses. The fact that some neurons show the opposite pattern is particularly intriguing and might give rise to a very specific interpretation: if the neuron really doesn't tend to respond to the stimulus when hits and misses are put together, it might be a neuron that does not directly respond to the stimulus, but whose spontaneous fluctuations across trials affect how the stimulus is perceived when they occur in a specific time window after the stimulus. Finally, neuron #001 responds with what looks like a real burst of evoked activity to stimulation and also shows a difference between hits and misses, but intriguingly, the response is strongest for misses. 
In the discussion, the interesting interpretation in terms of a specific gating of information by subcortical structures seems to apply well to this last example, but not necessarily to the other categories. We now provide a supplementary Figure showing firing rates for hits, misses and the combination of both. The reviewer’s analysis about whether a perception-selective neuron also has to respond to the stimulus to be involved in gating is interesting. With more data, a finer characterization of these neurons would have been possible. In our study, it is possible that more neurons have similar characteristics as #001 (e.g. #032, #062, #068) but do not show a significant difference with respect to baseline when both hits and misses are considered. We now avoid interpreting null effects, especially considering the low number of trials with near-threshold detection behavior we could collect in 10 minutes. We also realized that we had not updated Figure S7 after the last revision in which we had corrected for possible drifts to obtain sensory-selective neurons. The corrected panel A is provided below. Recommendations for the authors: Reviewer #1 (Recommendations For The Authors): It appears that the correct rejection was low for most participants. It would improve interpretation of the behavioral results if correct rejection was shown as a rate (i.e., # of correct rejection trials / total number of no stimulus/blank trials) rather than or in addition to reporting the number of correct rejection trials (Figure 1C). We added the following figure to the supplementary information. The axis tick marks in Figure 5A late versus early are incorrect (appears the axis was duplicated). Thank you for spotting this, it has been corrected. Reviewer #2 (Recommendations For The Authors): We would like to congratulate the authors on this strongly supported contribution to the field. The manuscript is well-written, although a little bit too concise in sections. 
See the following comments for the methods that could benefit the present conclusions: Thank you for these suggestions that we believe improved our interpretations. Major Points (1) The subpopulations of neurons that are considered are small, but it is not a confounding issue for the conclusions drawn. However, the behavior of the neurons that were excluded should be considered by calculating the percentage of neurons that are selective for the distinct parameters, as a function of time. This would greatly strengthen the understanding of what can be observed in the two subcortical structures. We thank the reviewer for this suggestion. We performed a new analysis of the latencies at which our main effects were observed. This analysis revealed the existence of two clusters, as shown in the new Figure 5 copied below We also performed a new analysis to support the existence of bimodal distributions and quantified the latencies. We added this text to the result section: We note that the timings of sensory and perception effects in Figures 3 and 4 showed a bimodal distribution with an early cluster (149 ms for sensory neurons; 121 ms for perception neurons; c.f. methods) and a later cluster (330 ms for sensory neurons; 315 ms for perception neurons; Figure 5). and this section to the methods: To measure bimodal timings of effect latencies, we fitted a two-component Gaussian mixture distribution to the data in Figure 5 by minimizing the mean square error with an interior-point method. We took the best of 20 runs with random initialization points and verified that the resulting mean square error was markedly (> 4 times) better than using a single component. We also updated the discussion: The early cluster’s average timing around 150 ms post-stimulus corresponds to the onset of a putative cortical correlate of tactile consciousness, the somatosensory awareness negativity (Dembski et al., 2021). 
Similar electroencephalographic markers are found in the visual and auditory modality. It is unclear, however, whether these markers are related to perceptual consciousness or selective attention (Dembski et al., 2021). The later cluster is centered around 300 ms and could correspond to a well known electroencephalographic marker, the P3b (Polich, 2007) whose association with perceptual consciousness has been questioned (Pitts et al., 2014; Dembski et al., 2021) although brain activity related to consciousness has been observed at similar timing even in the absence of report demands (Sergent et al., 2021; Stockart et al., 2024). It is also important to note that these clusters contain neurons with both increased and decreased firing rates following stimulus onset, similar to what was observed previously in the posterior parietal cortex (Pereira et al., 2021). (2) We highly recommend that the authors consider employing some analysis that decodes the representations observable in the activity of individual neurons as a function of time (e.g. Shannon's Mutual Information). This would reinforce and emphasize the most relevant conclusions. We thank the reviewers for this suggestion. Unfortunately, such methods would require many more trials than what we were able to collect in the 10-minute slots available in the operating room. (3) Although there are small populations recorded in each of the two subcortical structures, they are sufficient to attempt a study using population dynamics (primarily, PCA can still work with smaller populations). Given the broad range of dynamics that are observed in a population of single units typically involved in decision-making, it would be interesting to consider whether heterogeneity is a hallmark of decision-making, and trying to summarize the variance in the activity of the entire population should provide a certain understanding of the cue-selective versus the perception-selective qualities, as an example. 
We now present all 13 neurons that were sensory- or perception-selective for which we had good enough behavior to show hit vs. miss differences in Supplementary Figure S5. Although population-level analyses would be relevant, they are not compatible with the number of neurons we identified. (4) A stronger presentation of what the expectations are for the results would also benefit the interpretability of the manuscript when added to the introduction and discussion sections. Due to the scarcity of single-neuron data related to perceptual consciousness, especially in the subcortical structures we explored, our prior expectations did not exceed finding perception-selective neurons. We would prefer to avoid refining these expectations post-hoc. Minor Comments (1) Add the shared overlap between differently selective neurons explicitly in the manuscript. We added this information at the end of the results section. (2) Add a consideration in the methods of why the Wilcoxon test or permutation test was selected for separate uses. How do the results compare? Sorry for this misunderstanding. We clarified this in revised methods: To deal with possibly non-parametric distributions, we used Wilcoxon rank sum test or sign test instead of t-tests to test differences between distributions. We used permutation tests instead of Binomial tests to test whether a reported number of neurons could have been obtained by chance. Reviewer #3 (Recommendations For The Authors): Suggestions for improved or additional experiments, data or analysis: As suggested already in the public review, it might be worth showing all 13 neurons with either stimulus-responsive or perception-selective behaviour and, based on that, deepen the potential interpretation of the results for the different categories. We agree that this information improves the understanding of the underlying data and this addition was also proposed by reviewer 2. We added it in a new supplementary Figure S5. 
Recommendations for improving the writing and presentation As mentioned in the public review, I think Figure 3 needs clarification. I found that, in some instances, it was not entirely clear which type of analyses or permutation tests had been performed, and I would advocate for more clarity in these instances. For example: Page 6 line 146 "permuting trial labels 1000 times": do you mean randomly attributing a trial to a neuron? Or something else? We agree that this was somewhat unclear. We modified the sentence to: permuting the sign of the trial-wise differences. We now define a sign permutation test for paired tests and a trial permutation test for two-sample tests in the methods and specify which test was used in the main text. Page 7, neurons which have their firing rate modulated by the stimulus: I think you ought to be more explicit about the analysis so that we grasp it on the first read. To understand what is shown in Figure 3 I had to go back and forth between the main text and the method, and I am still not sure I completely understood. You compare the firing rate in sliding windows following stimulus onset with the mean firing rate during the 300ms baseline. Sliding windows are between 0 and 400 ms post-stim (according to methods ?) and a neuron is deemed responsive if you find at least one temporal cluster that shows a significant difference with baseline activity (using cluster permutation). Is that correct? Either way, I would recommend being a bit more precise about the analysis that was carried out in the main text, so that we only need to refer to methods when we need specialized information. We agree that the methods section was unclear. 
We re-wrote the following two paragraphs: To identify sensory-selective neurons, we assumed that subcortical signatures of stimulus detection ought to be found early following its onset and looked for differences in the firing rates during the first 400 ms post-stimulus onset compared to a 300 ms pre-stimulus baseline. To correct for possible drifts occurring during the trial, we subtracted the average cue-locked activity of catch trials from the cue-locked activity of each stimulus-present trial before realigning to stimulus onset. We defined a cluster as a set of adjacent time points for which the firing rates were significantly different between hits and misses, as assessed by a non-parametric sign rank test. A putative neuron was considered sensory-selective when the length of a cluster was above 80 ms, corresponding to twice the standard deviation of the smoothing kernel used to compute the firing rate. Whether for the shuffled data or the observed data, if more than one cluster was obtained, we discarded all but the longest cluster. This permutation test allowed us to control for multiple comparisons across time and participants. For perception-selective neurons, we looked for differences in the firing rates between hit and miss trials during the first 400 ms post-stimulus onset. We defined a cluster as a set of adjacent time points for which the firing rates were significantly different between hits and misses as assessed by a nonparametric Wilcoxon rank sum test. As for sensory-selective neurons, a putative neuron was considered perception-selective when the length of a cluster was above 80 ms, corresponding to twice the standard deviation of the smoothing kernel used to compute the firing rate and we discarded all but the longest cluster. 
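The cluster-length criterion described in the two rewritten paragraphs can be sketched as follows. This is only an illustration of the "longest run of adjacent significant time points above 80 ms" rule, not the authors' code: the function names, the 10 ms time step, and the 0.05 significance level are assumptions made for the example, and the permutation step that controls for multiple comparisons is not reproduced here.

```python
from typing import Sequence


def longest_cluster_ms(significant: Sequence[bool], step_ms: float) -> float:
    """Duration (ms) of the longest run of adjacent significant time points."""
    longest = current = 0
    for flag in significant:
        # Extend the current run on a significant point, otherwise reset it.
        current = current + 1 if flag else 0
        longest = max(longest, current)
    return longest * step_ms


def is_selective(
    p_values: Sequence[float],
    step_ms: float = 10.0,         # assumed spacing between tested time points
    alpha: float = 0.05,           # assumed per-time-point significance level
    min_cluster_ms: float = 80.0,  # threshold stated in the methods text
) -> bool:
    """Keep only the longest cluster and test whether it exceeds the threshold."""
    mask = [p < alpha for p in p_values]
    return longest_cluster_ms(mask, step_ms) > min_cluster_ms


# Hypothetical p-values over 0-400 ms in 10 ms steps: one 90 ms cluster.
example = [0.5] * 12 + [0.01] * 9 + [0.5] * 19
print(is_selective(example))
```

With the assumed 10 ms step, nine adjacent significant points form a 90 ms cluster and pass the criterion, whereas eight points (exactly 80 ms) would not, because the text requires the length to be above 80 ms.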
Minor points: Figure 3: inset showing action potentials, please also provide the time scale (in the legend for example), so that it's clear that it is not commensurate with the firing rate curve below, but rather corresponds to the dots of the raster plot. We added the text ”[...], duration: 2.5 ms” in Figures 2, 3, and 4. Line 210: I recommend: “we found 8 neurons [...] showing a significant difference *between hits and misses* after stimulus onset." We made the change. Top of page 9, the following sentence is misleading: “This result suggests that neurons in these two subcortical structures have mostly different functional roles”; this could read as meaning that functional roles are different between the two structures. Probably what you mean is rather something along this line: “these two subcortical structures both contain neurons displaying several different functional roles”. Changed. Line 329: remove double “when”. We made the change, thank you for spotting this.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
Summary:
A cortico-centric view is dominant in the study of the neural mechanisms of consciousness. This investigation represents the growing interest in understanding how subcortical regions are involved in conscious perception. To achieve this, the authors engaged in an ambitious and rare procedure in humans of directly recording from neurons in the subthalamic nucleus and thalamus. While participants were in surgery for the placement of deep brain stimulation devices for the treatment of essential tremor and Parkinson's disease, they were awakened and completed a perceptual-threshold tactile detection task. The authors identified individual neurons and analyzed single-unit activity corresponding with the task phases and tactile detection/perception. Among the neurons that were perception-responsive, the authors report changes in firing rate beginning ~150 milliseconds from the onset of the tactile stimulation. Curiously, the majority of the perception-responsive neurons had a higher firing rate for missed/not perceived trials. In summary, this investigation is a valuable addition to the growing literature on the role of subcortical regions in conscious perception.
Strengths:
The authors achieved the challenging task of recording human single-unit activity while participants performed a tactile perception task. The methods and statistics are clearly explained and rigorous, particularly for managing false positives and non-normal distributions. The results offer new detail at the level of individual neurons in the emerging recognition of the role of subcortical regions in conscious perception.
We thank the reviewer for their positive comments.
Weaknesses:
"Nonetheless, it remains unknown how the firing rate of subcortical neurons changes when a stimulus is consciously perceived." (lines 76-77) The authors could be more specific about what exactly single-unit recordings offer for interrogating the role of subcortical regions in conscious perception that is unique from alternative neural activity recordings (e.g., local field potential) or recordings that are used as proxies of neural activity (e.g., fMRI).
We agree with the reviewer that the contribution of micro-electrode recordings was not sufficiently put forward in our manuscript. We added the following sentences to the discussion, when discussing the multiple types of neurons we found:
Single-unit recordings provide a much higher temporal resolution than functional imaging, which helps assess how the neural correlates of consciousness unfold over time. Contrary to local field potentials, single-unit recordings can expose the variety of functional roles of neurons within subcortical regions, thereby offering a potential for a better mechanistic understanding of perceptual consciousness.
Related comment for the following excerpts:
"After a random delay ranging from 0.5 to 1 s, a "respond" cue was played, prompting participants to verbally report whether they felt a vibration or not. Therefore, none of the reported analyses are confounded by motor responses." (lines 97-99).
"These results show that subthalamic and thalamic neurons are modulated by stimulus onset, irrespective of whether it was reported or not, even though no immediate motor response was required." (lines 188-190).
"By imposing a delay between the end of the tactile stimulation window and the subjective report, we ensured that neuronal responses reflected stimulus detection and not mere motor responses." (lines 245-247).
It is a valuable feature of the paradigm that the reporting period was initiated hundreds of milliseconds after the stimulus presentation so that the neural responses should not represent "mere motor responses". However, verbal report of having perceived or not perceived a stimulus is a motor response and because the participants anticipate having to make these reports before the onset of the response period, there may be motor preparatory activity from the time of the perceived stimulus that is absent for the not perceived stimulus. The authors show sensitivity to this issue by identifying task-selective neurons and their discussion of the results that refer to the confound of post-perceptual processing. Still, direct treatment of this possible confound would help the rigor of the interpretation of the results.
We agree with the reviewer that direct treatment would have provided the best control. One way to avoid motor preparation is to only provide the stimulus-effector mapping after the stimulus presentation (Bennur & Gold, 2011; Twomey et al., 2016; Fang et al., 2024). Other controls to avoid post-perceptual processing used in consciousness research consist of using no-report paradigms (Tsuchiya et al., 2015) as we did in previous studies (Pereira et al., 2021; Stockart et al., 2024). Unfortunately, neither of these procedures was feasible during the 10 minutes allotted for the research task in an intraoperative setting with auditory cues and vocal responses. We would like to highlight nonetheless that the effects we report are short-lived and incompatible with sustained motor preparation activity.
We added the following sentence to the discussion:
Future studies ruling out the presence of motor preparation triggered by perceived stimuli (Bennur & Gold, 2011; Fang et al., 2024; Twomey et al., 2016) and verifying that similar neuronal activity occurs in the absence of task-demands (no-reports; Tsuchiya et al., 2015) or attention (Wyart & Tallon-Baudry, 2008) will be useful to support that subcortical neurons contribute specifically to perceptual consciousness.
"When analyzing tactile perception, we ensured that our results were not contaminated with spurious behavior (e.g. fluctuation of attention and arousal due to the surgical procedure)." (lines 118-117).
Confidence in the results would be improved if the authors clarified exactly what behaviors were considered as contaminating the results (e.g., eye closure, saccades, and bodily movements) and how they were determined.
This sentence was indeed unclear. It introduced the trial selection procedure we used to compensate for drifts in the perceptual threshold, which can result from fluctuations in attention or arousal. We modified the sentence, which now reads:
When analyzing tactile perception, we ensured that our results were not contaminated by fluctuating attention and arousal due to the surgical procedure. Based on objective criteria, we excluded specific series of trials from analyses and focused on time windows for which hits and misses occurred in commensurate proportions (see methods).
During the recordings, the experimenter stood next to the patients and monitored their bodily movements, ensuring they did not close their eyes or produce any other bodily movements synchronous with stimulus presentation.
The authors' discussion of the thalamic neurons could be more precise. The authors show that only certain areas of the thalamus were recorded (in or near the ventral lateral nucleus, according to Figure S3C). The ventral lateral nucleus has a unique relationship to tactile and motor systems, so do the authors hypothesize these same perception-selective neurons would be active in the same way for visual, auditory, olfactory, and taste perception? Moreover, the authors minimally interpret the location of the task, sensory, and perception-responsive neurons. Figure S3 suggests these neurons are overlapping. Did the authors expect this overlap and what does it mean for the functional organization of the ventral lateral nucleus and subthalamic nucleus in conscious perception?
These are excellent questions, the answers to which we can only speculate. In rodents, the LT is known as a hub for multisensory processing, as over 90% of LT neurons respond to at least two sensory modalities (for a review, see Yang et al., 2024). Yet, no study has compared how LT neurons in rodents encode perceived and nonperceived stimuli across modalities. Evidence in humans is scarce, with only a few studies documenting supramodal neural correlates of consciousness at the cortical level with noninvasive methods (Noel et al., 2018; Sanchez et al., 2020; Filimonov et al., 2022). We now refer to these studies in the revised discussion: Moreover, given the prominent role of the thalamus in multisensory processing, it will be interesting to assess if it is specifically involved in tactile consciousness or if it has a supramodal contribution, akin to what is found in the cortex (Noel et al., 2018; Sanchez et al., 2020; Filimonov et al., 2022).
Concerning the anatomical overlap of neurons, we could not reconstruct the exact locations of the DBS tracts for all participants. Because of the limited number of recorded neurons, we preferred to refrain from drawing strong conclusions about the functional organization of the ventral lateral nucleus.
"We note that, 6 out of 8 neurons had higher firing rates for missed trials than hit trials, although this proportion was not significant (binomial test: p = 0.145)." (lines 215-216).
It appears that in the three example neurons shown in Figure 4, 2 out of 3 (#001 and #068) show a change in firing rate predominantly for the missed stimulations. Meanwhile, #034 shows a clear hit response (although there is an early missed response - decreased firing rate - around 150 ms that is not statistically significant). This is a counterintuitive finding when compared to previous results from the thalamus (e.g., local field potentials and fMRI) that show the opposite response profile (i.e., missed/not perceived trials display no change or reduced response relative to hit/perceived trials). The discussion of the results should address this, including if these seemingly competing findings can be rectified.
We thank the reviewer for pointing out this limitation of the discussion. We avoided putting too much emphasis on these aspects due to the limited number of perception-selective neurons. Although subcortical connectivity models would predict that neurons in the thalamus should increase their firing rate for perceived stimuli, we were not surprised to see this heterogeneity as we had previously found neurons decreasing their firing rates for missed stimuli in the posterior parietal cortex (Pereira et al., 2021). We answer these points in response to the reviewer’s last comment below on the latencies of the effects.
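As a reading aid for the binomial test quoted in this exchange (6 of 8 perception-selective neurons firing more for misses), the following minimal sketch computes the exact binomial tail under a chance level of 0.5. The chance level and the sidedness of the test are assumptions, since neither is stated in the quoted passage; the helper name `binomial_tail` is hypothetical.

```python
from math import comb

def binomial_tail(k: int, n: int, p: float = 0.5) -> float:
    """Exact probability of observing k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Assumption: chance level p = 0.5 (higher firing for misses vs. hits is equally likely).
one_sided = binomial_tail(6, 8)        # P(X >= 6) = 37/256
two_sided = min(1.0, 2 * one_sided)    # the null is symmetric, so doubling is exact here

print(round(one_sided, 3), round(two_sided, 3))  # prints 0.145 0.289
```

Under these assumptions the reported p = 0.145 matches the one-sided tail; a two-sided test would give roughly twice that value. Either way the proportion is not significant at the usual 0.05 level.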
The authors report 8 perception-responsive neurons, but there are only 5 recording sites highlighted (i.e., filled-in squares and circles) in Figures S3C and 4D. Was this an omission or were three neurons removed from the perception-responsive analysis?
Unfortunately, we could not obtain anatomical images for all participants. This information was present in the methods section, although not clearly enough:
For 34 / 50 neurons, preoperative MRI and postoperative CT scans (co-registered in patient native space using CranialSuite) were available to precisely reconstruct surgical trajectories and recording locations (for the remaining 16 neurons, localizations were based on neurosurgical planning and confirmed by electrophysiological recordings at various depths).
Therefore, we added the following sentence in Figures 2, 3, 4 and S3.
[...] for patients for which we could obtain anatomical images.
Could the authors speak to the timing of the responses reported in Figure 4? The statistically significant intervals suggested both early (~160-200ms) to late responses (~300ms). Some have hypothesized that subcortical regions are early - ahead of cortical activation that may be linked with conscious perception. Do these results say anything about this temporal model for when subcortical regions are active in conscious perception?
We agree that response timing could have been better described. We performed a new analysis of the latencies at which our main effects were observed. This analysis revealed the existence of the two clusters mentioned by the reviewer very clearly. We now include this analysis in a new Figure 5 in the revised manuscript.
We also performed a new analysis to support the existence of bimodal distributions and quantified the latencies. We added this text to the result section:
We note that the timings of sensory and perception effects in Figures 3 and 4 showed a bimodal distribution with an early cluster (149 ms for sensory neurons; 121 ms for perception neurons; cf. methods) and a later cluster (330 ms for sensory neurons; 315 ms for perception neurons; Figure 5).
We also added this section to the methods:
To measure bimodal timings of effect latencies, we fitted a two-component Gaussian mixture distribution to the data in Figure 5 by minimizing the mean square error with an interior-point method. We took the best of 20 runs with random initialization points and verified that the resulting mean square error was markedly (> 4 times) better than using a single component.
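The fitting procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are taken to be a latency density sampled on a time grid, and a simple compass (pattern) search stands in for the interior-point method named in the methods. All function names and the synthetic example values are hypothetical.

```python
import math
import random

def gauss(x, mu, sigma):
    """Gaussian probability density."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture(x, params):
    """Two-component mixture; params = [weight, mu1, sigma1, mu2, sigma2]."""
    w, mu1, s1, mu2, s2 = params
    return w * gauss(x, mu1, s1) + (1 - w) * gauss(x, mu2, s2)

def mse(params, xs, ys):
    """Mean square error between the mixture density and the observed density."""
    return sum((mixture(x, params) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def clamp(params, lo, hi):
    """Keep the weight in [0, 1], the means inside the grid, and sigmas >= 1 ms."""
    w, mu1, s1, mu2, s2 = params
    return [min(max(w, 0.0), 1.0), min(max(mu1, lo), hi), max(s1, 1.0),
            min(max(mu2, lo), hi), max(s2, 1.0)]

def compass_search(start, xs, ys, sweeps=200):
    """Derivative-free local search (assumption: replaces the interior-point solver)."""
    params = clamp(start, xs[0], xs[-1])
    steps = [0.1, 20.0, 10.0, 20.0, 10.0]
    best = mse(params, xs, ys)
    for _ in range(sweeps):
        improved = False
        for i in range(len(params)):
            for sign in (1, -1):
                trial = list(params)
                trial[i] += sign * steps[i]
                trial = clamp(trial, xs[0], xs[-1])
                loss = mse(trial, xs, ys)
                if loss < best:
                    params, best, improved = trial, loss, True
        if not improved:
            steps = [s / 2 for s in steps]  # refine the step size when stuck
    return params, best

def fit_mixture(xs, ys, runs=20, seed=0):
    """Best of `runs` random initializations, as described in the methods."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        start = [rng.uniform(0.2, 0.8),
                 rng.uniform(xs[0], xs[-1]), rng.uniform(10, 60),
                 rng.uniform(xs[0], xs[-1]), rng.uniform(10, 60)]
        results.append(compass_search(start, xs, ys))
    return min(results, key=lambda r: r[1])

# Synthetic example (values chosen for illustration only, not taken from Figure 5).
xs = list(range(0, 401, 10))
ys = [mixture(x, [0.5, 150.0, 20.0, 320.0, 30.0]) for x in xs]
params, loss = fit_mixture(xs, ys)
print(sorted([params[1], params[3]]), loss)
```

The single-component comparison mentioned in the methods (mean square error more than 4 times larger) is not reproduced here; it could be approximated by fixing the weight at 1 and refitting.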
We updated the discussion, including the points made in the comment about higher activity for missed stimuli (above):
The early cluster’s average timing around 150 ms post-stimulus corresponds to the onset of a putative cortical correlate of tactile consciousness, the somatosensory awareness negativity (Dembski et al., 2021). Similar electroencephalographic markers are found in the visual and auditory modality. It is unclear, however, whether these markers are related to perceptual consciousness or selective attention (Dembski et al., 2021). The later cluster is centered around 300 ms and could correspond to a well known electroencephalographic marker, the P3b (Polich, 2007) whose association with perceptual consciousness has been questioned (Pitts et al., 2014; Dembski et al., 2021) although brain activity related to consciousness has been observed at similar timing even in the absence of report demands (Sergent et al., 2021; Stockart et al., 2024). It is also important to note that these clusters contain neurons with both increased and decreased firing rates following stimulus onset, similar to what was observed previously in the posterior parietal cortex (Pereira et al., 2021).
Reviewer #2 (Public Review):
The authors have studied subpopulations of individual neurons recorded in the thalamus and subthalamic nucleus (STN) of awake humans performing a simple cognitive task. They have carefully designed their task structure to eliminate motor components that could confound their analyses in these subcortical structures, given that the data was recorded in patients with Parkinson's Disease (PD) and diagnosed with an Essential Tremor (ET). The recorded data represents a promising addition to the field. The analyses that the authors have applied can serve as a strong starting point for exploring the kinds of complex signals that can emerge within a single neuron's activity. Pereira et al. conclude that their results from single neurons indicate that task-related activity occurs, purportedly separate from previously identified sensory signals. These conclusions are a promising and novel perspective for how the field thinks about the emergence of decisions and sensory perception across the entire brain as a unit.
We thank the reviewer for these positive comments.
Despite the strength of the data that was obtained and the relevant nature of the conclusions that were drawn, there are certain limitations that must be taken into consideration:
(1) The authors make several claims that their findings are direct representations of consciousness identifiable in subcortical structures. The current context for consciousness does not sufficiently define how the consciousness is related to the perceptual task.
This is indeed a complex issue in all studies concerned with perceptual consciousness and we were careful not to make such “direct” claims. Instead, we used the state-of-the-art tools available to study consciousness (see below) and only interpreted our findings with respect to consciousness in the discussion. For example, in the abstract, our claim is that “Our results provide direct neurophysiological evidence of the involvement of the subthalamic nucleus and the thalamus for the detection of vibrotactile stimuli, thereby calling for a less cortico-centric view of the neural correlates of consciousness.”
In brief, first, we used near-threshold stimuli which allowed us to contrast reported vs. unreported trials while keeping the physical properties of the stimulus comparable. Second, we used subjective reports without incentive for participants to be more conservative or liberal in their response (e.g. through reward). Third, we introduced a random delay before the responses to limit confounding effects due to the report. We also acknowledged that “... it will be important in future studies to examine if similar subcortical responses are obtained when stimuli are unattended (Wyart & Tallon-Baudry, 2008), task-irrelevant (Shafto & Pitts, 2015), or when participants passively experience stimuli without the instruction to report them (i.e., no-report paradigms) (Tsuchiya et al., 2015)”. This last sentence now reads (to address a point made by Reviewer 1 about motor preparation):
Future studies ruling out the presence of motor preparation triggered by perceived stimuli (Bennur & Gold, 2011; Fang et al., 2024; Twomey et al., 2016) and verifying that similar neuronal activity occurs in the absence of task-demands (no-reports; Tsuchiya et al., 2015) or attention (Wyart & Tallon-Baudry, 2008) will be useful to support that subcortical neurons contribute specifically to perceptual consciousness.
(2) The current work would benefit greatly from a description and clarification of what all the neurons that have been recorded are doing. The authors' criteria for selecting subpopulations with task-relevant activity are appropriate, but understanding the heterogeneity in a population of single neurons is important for broader considerations that are being studied within the field.
We followed the reviewer’s suggestions and added new results regarding the latencies of the reported effects (new Figure 5). We also now show firing rates for hits, misses and overall sensory activity (hits and misses combined) for all perception-selective or sensory-selective (when behavior was good enough; Figure S5). Although a more detailed characterization of the heterogeneity of the neurons identified would have been relevant, it seems beyond the scope of the present study, especially given the relatively small number of neurons we identified, as well as the relative simplicity of the paradigm imposed by the clinical context in which we worked.
(3) The authors have omitted a proper set of controls for comparison against the active trials, for example, where a response was not necessary. Please explain why this choice was made and what implications are necessary to consider.
We had mentioned this limitation in the discussion: Nevertheless, it will be important in future studies to examine if similar subcortical responses are obtained when stimuli are unattended (Wyart & Tallon-Baudry, 2008), task-irrelevant (Shafto & Pitts, 2015), or when participants passively experience stimuli without the instruction to report them (i.e., no-report paradigms) (Tsuchiya et al., 2015). We agree that such a control would have been relevant, but this was not feasible during the 10 minutes allotted for the research task in an intraoperative setting. These constraints are both clinical, to minimize discomfort for patients, and practical, as it is difficult to track neurons in an intraoperative setting for more than 10 minutes.
We added a sentence to this effect in the discussion.
Reviewer #3 (Public Review):
Summary:
This important study relies on a rare dataset: intracranial recordings within the thalamus and the subthalamic nucleus in awake humans, while they were performing a tactile detection task. This procedure allowed the authors to identify a small but significant proportion of individual neurons, in both structures, whose activity correlated with the task (e.g. their firing rate changed following the audio cue signalling the start of a trial) and/or with the stimulus presentation (change in firing rate around 200 ms following tactile stimulation) and/or with participant's reported subjective perception of the stimulus (difference between hits and misses around 200 ms following tactile stimulation). Whereas most studies interested in the neural underpinnings of conscious perception focus on cortical areas, these results suggest that subcortical structures might also play a role in conscious perception, notably tactile detection.
Strengths:
There are two strongly valuable aspects in this study that make the evidence convincing and even compelling. First, these types of data are exceptional, the authors could have access to subcortical recordings in awake and behaving humans during surgery. Additionally, the methods are solid. The behavioral study meets the best standards of the domain, with a careful calibration of the stimulation levels (staircase) to maintain them around the detection threshold, and an additional selection of time intervals where the behavior was stable. The authors also checked that stimulus intensity was the same on average for hits and misses within these selected periods, which warrants that the effects of detection that are observed here are not confounded by stimulus intensity. The neural data analysis is also very sound and well-conducted. The statistical approach complies with current best practices, although I found that, in some instances, it was not entirely clear which type of permutations had been performed, and I would advocate for more clarity in these instances. Globally the figures are nice, clear, and well presented. I appreciated the fact that the precise anatomical location of the neurons was directly shown in each figure.
We thank the reviewer for this positive evaluation.
Weaknesses:
Some clarification is needed for interpreting Figure 3, top rows: in my understanding the black curve is already the result of a subtraction between stimulus present trials and catch trials, to remove potential drifts; if so, it does not make sense to compare it with the firing rate recorded for catch trials.
The black curve represents the firing rate without any subtraction. We only subtracted the firing rates of catch trials in the statistical procedure, as the reviewer noted, to remove potential drift. We added (before baseline correction) to the legend of Figure 3.
I also think that the article could benefit from a more thorough presentation of the data and that this could help refine the interpretation which seems to be a bit incomplete in the current version. There are 8 stimulus-responsive neurons and 8 perception-selective neurons, with only one showing both effects, resulting in a total of 15 individual neurons being in either category or 13 neurons if we exclude those in which the behavior is not good enough for the hit versus miss analysis (Figure S4A). In my opinion, it should be feasible to show the data for all of them (either in a main figure, or at least in supplementary), but in the present version, we get to see the data for only 3 neurons for each analysis. This very small selection includes the only neuron that shows both effects (neuron #001; which is also cue selective), but this is not highlighted in the text. It would be interesting to see both the stimulus-response data and the hit versus miss data for all 13 neurons as it could help develop the interpretation of exactly how these neurons might be involved in stimulus processing and conscious perception. This should give rise to distinct interpretations for the three possible categories. Neurons that are stimulus-responsive but not perception-selective should show the same response for both hits and misses and hence carry out indifferently conscious and unconscious responses. The fact that some neurons show the opposite pattern is particularly intriguing and might give rise to a very specific interpretation: if the neuron really doesn't tend to respond to the stimulus when hits and misses are put together, it might be a neuron that does not directly respond to the stimulus, but whose spontaneous fluctuations across trials affect how the stimulus is perceived when they occur in a specific time window after the stimulus. 
Finally, neuron #001 responds with what looks like a real burst of evoked activity to stimulation and also shows a difference between hits and misses, but intriguingly, the response is strongest for misses. In the discussion, the interesting interpretation in terms of a specific gating of information by subcortical structures seems to apply well to this last example, but not necessarily to the other categories.
We now provide a supplementary Figure showing firing rates for hits, misses and the combination of both. The reviewer’s analysis about whether a perception-selective neuron also has to respond to the stimulus to be involved in gating is interesting. With more data, a finer characterization of these neurons would have been possible. In our study, it is possible that more neurons have similar characteristics as #001 (e.g. #032, #062, #068) but do not show a significant difference with respect to baseline when both hits and misses are considered. We now avoid interpreting null effects, especially considering the low number of trials with near-threshold detection behavior we could collect in 10 minutes.
We also realized that we had not updated Figure S7 after the last revision in which we had corrected for possible drifts to obtain sensory-selective neurons. The corrected panel A is provided below.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
It appears that the correct rejection was low for most participants. It would improve interpretation of the behavioral results if correct rejection was shown as a rate (i.e., # of correct rejection trials / total number of no stimulus/blank trials) rather than or in addition to reporting the number of correct rejection trials (Figure 1C).
We added the following figure to the supplementary information.
The axis tick marks in Figure 5A late versus early are incorrect (appears the axis was duplicated).
Thank you for spotting this, it has been corrected.
Reviewer #2 (Recommendations For The Authors):
We would like to congratulate the authors on this strongly supported contribution to the field. The manuscript is well-written, although a little bit too concise in sections. See the following comments for the methods that could benefit the present conclusions:
Thank you for these suggestions that we believe improved our interpretations.
Major Points
(1) The subpopulations of neurons that are considered are small, but it is not a confounding issue for the conclusions drawn. However, the behavior of the neurons that were excluded should be considered by calculating the percentage of neurons that are selective for the distinct parameters, as a function of time. This would greatly strengthen the understanding of what can be observed in the two subcortical structures.
We thank the reviewer for this suggestion. We performed a new analysis of the latencies at which our main effects were observed. This analysis revealed the existence of two clusters, as shown in the new Figure 5 copied below
We also performed a new analysis to support the existence of bimodal distributions and quantified the latencies. We added this text to the result section:
We note that the timings of sensory and perception effects in Figures 3 and 4 showed a bimodal distribution with an early cluster (149 ms for sensory neurons; 121 ms for perception neurons; cf. methods) and a later cluster (330 ms for sensory neurons; 315 ms for perception neurons; Figure 5).
We also added this section to the methods:
To measure bimodal timings of effect latencies, we fitted a two-component Gaussian mixture distribution to the data in Figure 5 by minimizing the mean square error with an interior-point method. We took the best of 20 runs with random initialization points and verified that the resulting mean square error was markedly (> 4 times) better than using a single component.
We also updated the discussion:
The early cluster’s average timing around 150 ms post-stimulus corresponds to the onset of a putative cortical correlate of tactile consciousness, the somatosensory awareness negativity (Dembski et al., 2021). Similar electroencephalographic markers are found in the visual and auditory modality. It is unclear, however, whether these markers are related to perceptual consciousness or selective attention (Dembski et al., 2021). The later cluster is centered around 300 ms and could correspond to a well known electroencephalographic marker, the P3b (Polich, 2007) whose association with perceptual consciousness has been questioned (Pitts et al., 2014; Dembski et al., 2021) although brain activity related to consciousness has been observed at similar timing even in the absence of report demands (Sergent et al., 2021; Stockart et al., 2024). It is also important to note that these clusters contain neurons with both increased and decreased firing rates following stimulus onset, similar to what was observed previously in the posterior parietal cortex (Pereira et al., 2021).
(2) We highly recommend that the authors consider employing some analysis that decodes the representations observable in the activity of individual neurons as a function of time (e.g. Shannon's Mutual Information). This would reinforce and emphasize the most relevant conclusions.
We thank the reviewers for this suggestion. Unfortunately, such methods would require many more trials than what we were able to collect in the 10-minute slots available in the operating room.
(3) Although there are small populations recorded in each of the two subcortical structures, they are sufficient to attempt a study using population dynamics (primarily, PCA can still work with smaller populations). Given the broad range of dynamics that are observed in a population of single units typically involved in decision-making, it would be interesting to consider whether heterogeneity is a hallmark of decision-making, and trying to summarize the variance in the activity of the entire population should provide a certain understanding of the cue-selective versus the perception-selective qualities, as an example.
We now present all 13 neurons that were sensory- or perception-selective for which we had good enough behavior to show hit vs. miss differences in Supplementary Figure S5. Although population-level analyses would be relevant, they are not compatible with the number of neurons we identified.
(4) A stronger presentation of what the expectations are for the results would also benefit the interpretability of the manuscript when added to the introduction and discussion sections.
Due to the scarcity of single-neuron data related to perceptual consciousness, especially in the subcortical structures we explored, our prior expectations did not exceed finding perception-selective neurons. We would prefer to avoid refining these expectations post-hoc.
Minor Comments
(1) Add the shared overlap between differently selective neurons explicitly in the manuscript.
We added this information at the end of the results section.
(2) Add a consideration in the methods of why the Wilcoxon test or permutation test was selected for separate uses. How do the results compare?
Sorry for this misunderstanding. We clarified this in revised methods:
To deal with possibly non-parametric distributions, we used Wilcoxon rank sum test or sign test instead of t-tests to test differences between distributions. We used permutation tests instead of Binomial tests to test whether a reported number of neurons could have been obtained by chance.
Reviewer #3 (Recommendations For The Authors):
Suggestions for improved or additional experiments, data or analysis:
As suggested already in the public review, it might be worth showing all 13 neurons with either stimulus-responsive or perception-selective behaviour and, based on that, deepen the potential interpretation of the results for the different categories.
We agree that this information improves the understanding of the underlying data and this addition was also proposed by reviewer 2. We added it in a new supplementary Figure S5.
Recommendations for improving the writing and presentation
As mentioned in the public review, I think Figure 3 needs clarification. I found that, in some instances, it was not entirely clear which type of analyses or permutation tests had been performed, and I would advocate for more clarity in these instances. For example:
Page 6 line 146 "permuting trial labels 1000 times": do you mean randomly attributing a trial to a neuron? Or something else?
We agree that this was somewhat unclear. We modified the sentence to:
permuting the sign of the trial-wise differences
We now define a sign permutation test for paired tests and a trial permutation test for two-sample tests in the methods and specify which test was used in the main text.
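The distinction between the two tests can be illustrated with a minimal sketch. It assumes the mean as test statistic and a two-sided comparison, neither of which is specified in the response; the function names and the example values are hypothetical, and the default of 1000 permutations follows the number quoted by the reviewer.

```python
import random
from statistics import mean

def sign_permutation_test(diffs, n_perm=1000, seed=0):
    """Paired test: randomly flip the sign of each trial-wise difference."""
    rng = random.Random(seed)
    observed = abs(mean(diffs))
    count = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= observed:
            count += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return (count + 1) / (n_perm + 1)

def trial_permutation_test(a, b, n_perm=1000, seed=0):
    """Two-sample test: shuffle trial labels between the two conditions."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[:len(a)]) - mean(pooled[len(a):])) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Illustrative values only: post-stimulus minus baseline firing rates (paired),
# and firing rates on hit vs. miss trials (two independent samples).
paired_diffs = [2.0, 2.5, 1.8, 2.2, 2.1, 1.9, 2.4, 2.3, 2.0, 2.2]
hits = [5.0, 5.1, 4.9, 5.2, 5.0, 4.8]
misses = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8]
print(sign_permutation_test(paired_diffs), trial_permutation_test(hits, misses))
```

The paired version suits comparisons within the same trials (for example against a pre-stimulus baseline), whereas label shuffling suits comparisons between different sets of trials such as hits and misses.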
Page 7, neurons which have their firing rate modulated by the stimulus: I think you ought to be more explicit about the analysis so that we grasp it on the first read. To understand what is shown in Figure 3 I had to go back and forth between the main text and the method, and I am still not sure I completely understood. You compare the firing rate in sliding windows following stimulus onset with the mean firing rate during the 300ms baseline. Sliding windows are between 0 and 400 ms post-stim (according to methods ?) and a neuron is deemed responsive if you find at least one temporal cluster that shows a significant difference with baseline activity (using cluster permutation). Is that correct? Either way, I would recommend being a bit more precise about the analysis that was carried out in the main text, so that we only need to refer to methods when we need specialized information.
We agree that the methods section was unclear. We re-wrote the following two paragraphs:
To identify sensory-selective neurons, we assumed that subcortical signatures of stimulus detection ought to be found early following its onset and looked for differences in the firing rates during the first 400 ms post-stimulus onset compared to a 300 ms pre-stimulus baseline. To correct for possible drifts occurring during the trial, we subtracted the average cue-locked activity of catch trials from the cue-locked activity of each stimulus-present trial before realigning to stimulus onset. We defined a cluster as a set of adjacent time points for which the firing rates were significantly different between hits and misses, as assessed by a non-parametric sign rank test. A putative neuron was considered sensory-selective when the length of a cluster was above 80 ms, corresponding to twice the standard deviation of the smoothing kernel used to compute the firing rate. Whether for the shuffled data or the observed data, if more than one cluster was obtained, we discarded all but the longest cluster. This permutation test allowed us to control for multiple comparisons across time and participants.
For perception-selective neurons, we looked for differences in the firing rates between hit and miss trials during the first 400 ms post-stimulus onset. We defined a cluster as a set of adjacent time points for which the firing rates were significantly different between hits and misses as assessed by a nonparametric Wilcoxon rank sum test. As for sensory-selective neurons, a putative neuron was considered perception-selective when the length of a cluster was above 80 ms, corresponding to twice the standard deviation of the smoothing kernel used to compute the firing rate and we discarded all but the longest cluster.
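The cluster criterion shared by both paragraphs (keep only the longest run of adjacent significant time points and require it to exceed 80 ms) can be sketched as below. The time resolution `dt_ms` and the function names are assumptions, and the per-time-point significance flags are taken as given rather than computed with the rank tests described above.

```python
def longest_cluster(significant, dt_ms):
    """Return (start_index, length_ms) of the longest run of significant time points."""
    best_start, best_len = None, 0
    run_start = None
    # A trailing False closes any run that extends to the end of the window.
    for i, flag in enumerate(list(significant) + [False]):
        if flag and run_start is None:
            run_start = i
        elif not flag and run_start is not None:
            if i - run_start > best_len:
                best_start, best_len = run_start, i - run_start
            run_start = None
    return best_start, best_len * dt_ms

def is_selective(significant, dt_ms, min_length_ms=80.0):
    """Apply the 80 ms criterion (twice the smoothing-kernel standard deviation)."""
    _, length_ms = longest_cluster(significant, dt_ms)
    return length_ms > min_length_ms

# Illustrative flags at an assumed 10 ms resolution: one 100 ms cluster, one 40 ms cluster.
flags = [False] * 5 + [True] * 10 + [False] * 3 + [True] * 4
print(longest_cluster(flags, dt_ms=10), is_selective(flags, dt_ms=10))  # prints (5, 100) True
```

Because shorter clusters are discarded for both observed and shuffled data, the same function could be applied to each permutation to build a null distribution of longest-cluster lengths.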
Minor points:
Figure 3: inset showing action potentials, please also provide the time scale (in the legend for example), so that it's clear that it is not commensurate with the firing rate curve below, but rather corresponds to the dots of the raster plot.
We added the text ”[...], duration: 2.5 ms” in Figures 2, 3, and 4.
Line 210: I recommend: “we found 8 neurons [...] showing a significant difference *between hits and misses* after stimulus onset."
We made the change.
Top of page 9, the following sentence is misleading: “This result suggests that neurons in these two subcortical structures have mostly different functional roles”; this could read as meaning that functional roles are different between the two structures. Probably what you mean is rather something along this line: “these two subcortical structures both contain neurons displaying several different functional roles”.
Changed.
Line 329: remove double “when”
We made the change, thank you for spotting this.
-
eLife Assessment
This useful study provides the first assessment of potentially interactive effects of seasonality and blood source on mosquito fitness, together in one study. During revision, the manuscript has been substantively improved, providing additional solid data to support the robustness of observations. Overall, this interesting study will advance our current understanding of mosquito biology.
-
Reviewer #1 (Public review):
This study examines the role of host blood meal source, temperature, and photoperiod on the reproductive traits of Cx. quinquefasciatus, an important vector of numerous pathogens of medical importance. The host use pattern of Cx. quinquefasciatus is interesting in that it feeds on birds during spring and shifts to feeding on mammals towards fall. Various hypotheses have been proposed to explain the seasonal shift in host use in this species but have provided limited evidence. This study examines whether the shifting of host classes from birds to mammals towards autumn offers any reproductive advantages to Cx. quinquefasciatus in terms of enhanced fecundity, fertility, and hatchability of the offspring. The authors found no evidence of this, suggesting that alternate mechanisms may drive the seasonal shift in host use in Cx. quinquefasciatus.
-
Reviewer #2 (Public review):
Conceptually, this study is interesting and is the first attempt to account for the potentially interactive effects of seasonality and blood source on mosquito fitness, which the authors frame as a possible explanation for previously observed host-switching of Culex quinquefasciatus from birds to mammals in the fall. The authors hypothesize that if changes in fitness by blood source change between seasons, higher fitness on birds in the summer and on mammals in the autumn could drive observed host switching. To test this, the authors fed individuals from a colony of Cx. quinquefasciatus on chickens (bird model) and mice (mammal model) and subjected each of these two groups to two different environmental conditions reflecting the high and low temperatures and photoperiod experienced in summer and autumn in Córdoba, Argentina (aka seasonality). They measured fecundity, fertility, and hatchability over two gonotrophic cycles. The authors then used generalized linear mixed models to evaluate the impact of host species, seasonality, and gonotrophic cycle on fecundity, fertility, and hatchability. The authors were trying to test their hypothesis by determining whether there was an interactive effect of season and host species on mosquito fitness. This is an interesting hypothesis; if it had been supported, it would provide support for a new mechanism driving host switching. While the authors did report an interactive impact of seasonality and host species, the directionality of the effect was the opposite from that hypothesized. The authors have done a very good job of addressing many of the reviewer's concerns, especially by adding two additional replicates.
-
Author response:
The following is the authors’ response to the previous reviews
We would like to thank you for your valuable comments and suggestions, which have greatly contributed to improving our manuscript.
We have carefully addressed all the reviewers' suggestions, and detailed responses for each Reviewer are provided at the end of this letter. In summary:
• The Introduction has been revised to provide a more focused discussion on results, toning down the speculative discussion on seasonal host shifts.
• The methodology section has been clarified, particularly the power analysis, which now includes a clearer explanation. The random effects in the models have been better described to ensure transparency.
• The Results section was reorganized to highlight the key findings more effectively.
• The Discussion has been restructured for clarity and conciseness, ensuring the interpretation of the results is clearer and better aligned with the study objectives.
• Minor edits throughout the manuscript were made to improve readability and accuracy.
We hope you find this revised version of the manuscript satisfactory.
Reviewer #1 (Public review):
Summary:
This study examines the role of host blood meal source, temperature, and photoperiod on the reproductive traits of Cx. quinquefasciatus, an important vector of numerous pathogens of medical importance. The host use pattern of Cx. quinquefasciatus is interesting in that it feeds on birds during spring and shifts to feeding on mammals towards fall. Various hypotheses have been proposed to explain the seasonal shift in host use in this species but have provided limited evidence. This study examines whether the shifting of host classes from birds to mammals towards autumn offers any reproductive advantages to Cx. quinquefasciatus in terms of enhanced fecundity, fertility, and hatchability of the offspring. The authors found no evidence of this, suggesting that alternate mechanisms may drive the seasonal shift in host use in Cx. quinquefasciatus.
Strengths:
Host blood meal source, temperature, and photoperiod were all examined together.
Weaknesses:
The study was conducted in laboratory conditions with a local population of Cx. quinquefasciatus from Argentina. I'm not sure if there is any evidence for a seasonal shift in the host use pattern in Cx. quinquefasciatus populations from the southern latitudes.
Comments on the revision:
Overall, the manuscript is much improved. However, the introduction and parts of the discussion that talk about addressing the question of seasonal shift in host use pattern of Cx. quin are still way too strong and must be toned down. There is no strong evidence to show this host shift in Argentinian mosquito populations. Therefore, it is just misleading. I suggest removing all this and sticking to discussing only the effects of blood meal source and seasonality on the reproductive outcomes of Cx. quin.
The Introduction and Discussion have been modified and toned down, and now stick to discussing the results, as suggested.
Reviewer #1 (Recommendations for the authors):
Some more minor comments are mentioned below.
Line 51: Because 'of' this,
Changed as suggested.
Line 56: specialists 'or' generalists
Changed as suggested.
Line 56: primarily
Changed as suggested.
Line 98: Because 'of' this,
Changed as suggested.
Reviewer #2 (Public review):
Summary:
Conceptually, this study is interesting and is the first attempt to account for the potentially interactive effects of seasonality and blood source on mosquito fitness, which the authors frame as a possible explanation for previously observed host-switching of Culex quinquefasciatus from birds to mammals in the fall. The authors hypothesize that if changes in fitness by blood source change between seasons, higher fitness on birds in the summer and on mammals in the autumn could drive observed host switching. To test this, the authors fed individuals from a colony of Cx. quinquefasciatus on chickens (bird model) and mice (mammal model) and subjected each of these two groups to two different environmental conditions reflecting the high and low temperatures and photoperiod experienced in summer and autumn in Córdoba, Argentina (aka seasonality). They measured fecundity, fertility, and hatchability over two gonotrophic cycles. The authors then used generalized linear mixed models to evaluate the impact of host species, seasonality, and gonotrophic cycle on fecundity, fertility, and hatchability. The authors were trying to test their hypothesis by determining whether there was an interactive effect of season and host species on mosquito fitness. This is an interesting hypothesis; if it had been supported, it would provide support for a new mechanism driving host switching. While the authors did report an interactive impact of seasonality and host species, the directionality of the effect was the opposite from that hypothesized. The authors have done a very good job of addressing many of the reviewer's concerns, especially by adding two additional replicates. Several minor concerns remain, especially regarding unclear statements in the discussion.
Strengths:
(1) Using a combination of laboratory feedings and incubators to simulate seasonal environmental conditions is a good, controlled way to assess the potentially interactive impact of host species and seasonality on the fitness of Culex quinquefasciatus in the lab.
(2) The driving hypothesis is an interesting and creative way to think about a potential driver of host switching observed in the field.
Weaknesses:
(1) The methods would be improved by some additional details. For example, clarifying the number of generations for which mosquitoes were maintained in colony (which was changed from 20 to several) and whether replicates were conducted at different time points.
Changed as suggested.
(2) The statistical analysis requires some additional explanation. For example, you suggest that the power analysis was conducted a priori, but this was not mentioned in your first two drafts, so I wonder if it was actually conducted after the first replicate. It would be helpful to include further detail, such as how the parameters were estimated. Also, it would be helpful to clarify why replicate was included as a random effect for fecundity and fertility but as a fixed effect for hatchability. This might explain why there were no significant differences for hatchability given that you were estimating for more parameters.
The power analysis was conducted a posteriori, as you correctly inferred. While I did not indicate that it was performed a priori, you are right in noting that this was not explicitly mentioned. As you suggested, the methodology for the power analysis has been revised to clarify any potential doubts.
Regarding the model for hatchability, a model without a random effect variable was used, as all attempts to fit models with random effects resulted in poor validation. These points have now been clarified and explained in the corresponding section.
(3) A number of statements in the discussion are not clear. For example, what do you mean by a mixed perspective in the first paragraph? Also, why is the expectation mentioned in the second paragraph different from the hypothesis you described in your introduction?
Changed as suggested.
(4) According to eLife policy, data must be made freely available (not just upon request).
Data and code will be publicly available. The corresponding section was modified.
Reviewer #2 (Recommendations for the authors):
Your manuscript is much improved by the inclusion of two additional replicates! The results are much more robust when we can see that the trends that you report are replicable across 3 iterations of the experiment. Congratulations on a greatly improved study and paper! I have several minor concerns and suggestions, listed below:
38-39: I think it is clearer to say "no statistically significant effect of season on hatchability of eggs" ... or specify if you are referring to blood or the interaction of blood and season. It isn't clear which treatment you are referring to here.
Changed as suggested.
54-57: This could be stated more succinctly. Instead of citing papers that deal with specific examples of patterns, I would suggest citing a review paper that defines these terms.
Changed as suggested.
83-84: What if another migratory bird is the preferred host in Argentina? I would state this more cautiously (e.g. "may not be applicable...").
Changed as suggested.
95-96: I don't understand what you mean by this. These hypotheses are specifically meant to understand mosquitoes that DO have a distinct seasonal phenology, so I'm not sure why this caveat is relevant. And naturally this hypothesis is host dependent, since it is based on specific host reproductive investments. I think that the strongest caveat to this hypothesis is simply that it hasn't been proven.
Changed as suggested.
97-115: This is a great paragraph! Very clear and compelling.
Thanks for your words!
118: Do you have an exact or estimated number of rafts collected?
Sorry, I do not have the exact number of rafts, but it was more than 20-30.
135: "over twenty" was changed to "several"; several would imply about 3 generations, so this is misleading. If the colony was actually maintained for over twenty generations, then you should keep that wording.
Changed as suggested.
163-164: Can you please clarify whether the replicates were conducted at separate time points?
Changed as suggested.
Note: the track changes did not capture all of the changes made; e.g. 163-164 should show as new text but does not.
You are absolutely right; when I uploaded the last version, I unfortunately deleted all tracked changes and cannot recover them. In this new version, I will ensure that all minimal changes are included as tracked changes.
186 - 189: the terms should be "fixed effect" and "random effect"
Changed as suggested.
191: Edit: linear
Changed as suggested.
194: why was replicate not included as a random effect here when it was above? Also, can you please clarify "interaction effects"? Which interactions did you include?
Changed as suggested. This is explained above and in the methodology. Hatchability models with a random-effect variable fitted poorly and did not validate well. The interaction for hatchability was four-way (season, blood source, cycle and replicate).
207-208: I'm not sure what you mean by "aimed to achieve"? Weren't you doing this after you conducted the experiments, so wouldn't this be determining the power of your model (post-hoc power analysis)? Also, I think you should provide the parameter estimates that were used (e.g. effect size - did you use the effect size you estimated across the 3 replicates?).
Changed as suggested.
214-215: this should be reworded to acknowledge that this is estimated for the given effect size; for example, something like "This sample size was sufficient to detect the observed effect with a statistical power of 0.8" or something along those lines (unless I am misunderstanding how you conducted this test).
Changed as suggested.
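As an illustration of the reviewer's point that power is always stated for a given effect size, the following minimal sketch computes the approximate power of a two-sided, two-sample comparison under a normal approximation. It is not the authors' analysis: their models are generalized linear mixed models, and the function names, the standardized effect size, and the sample sizes used here are hypothetical placeholders.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is a standardized mean difference (Cohen's d); both groups
    are assumed to have n_per_group observations and equal variance.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic falls in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def smallest_n_for_power(effect_size: float, target: float = 0.8,
                         alpha: float = 0.05, max_n: int = 100_000):
    """Smallest per-group n whose approximate power reaches the target."""
    for n in range(2, max_n + 1):
        if two_sample_power(effect_size, n, alpha) >= target:
            return n
    return None  # target not reachable within max_n


if __name__ == "__main__":
    # Hypothetical inputs: a medium standardized effect and 64 per group.
    print(round(two_sample_power(0.5, 64), 3))
    print(smallest_n_for_power(0.5))
```

Because the effect size enters the calculation directly, a post-hoc power statement only makes sense alongside the effect-size estimate that was plugged in, which is why the reviewer asks for that estimate to be reported.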
246. Abbreviate Culex
Changed as suggested.
253-255: This sentence isn't clear. What do you mean by mixed? Also, the season really seemed to mainly impact the fitness of mosquitoes fed on mouse blood and here the way it is phrased seems to indicate that season has an impact on the fitness of those fed with chicken blood.
Changed as suggested.
258-260: You stated your hypothesis as the relative fitness shifting between seasons, but this statement about the expectation is different from your hypothesis stated earlier. Please clarify.
You are right. Thank you for noting this. It was changed as suggested.
263-266: I also don't understand this sentence; what does the first half of the sentence have to do with the second?
Changed as suggested.
269-270: This doesn't align with your observation exactly; you say first AND second are generally most productive, but you observed a drop in the second. Please clarify this.
Changed as suggested.
280: I suggest removing "as same as other studies"; your caveats are distinct because your experimental design was unique
Changed as suggested.
287: you shouldn't be looking for a "desired" effect; I suggest removing this word
Changed as suggested.
288: It wasn't really a priori though, since you conducted it after your first replicate (unless you didn't use the results from the first replicate you reported in the original drafts?)
It was a posteriori. Changed as suggested.
290: Why is 290 written here?
It was a mistype. Deleted as suggested.
291-298: The meaning of this section of your paragraph is not clear.
Improved as suggested.
304-313: This list of 3 explanations are directed at different underlying questions. Explanations 1 and 2 are alternative explanations for why host switching occurs if not due to differences in fitness. This isn't really an explanation of your results so much as alternative explanations for a previously reported phenomenon. And the third is an explanation for why you may not have observed the expected effect. I suggest restructuring this to include the fact that Argentinian quinqs may not host switch as part of your previous list of caveats. Then you can include your two alternative explanations for host switching as a possible future direction (although I would say that it is really just one explanation because "vector biology" is too broad of a statement to be testable). Also, you haven't discussed possible explanations for your actual result, which showed that mosquito fitness decreased when feeding on mouse blood in autumn conditions and in the second gonotrophic, while those that fed on chicken did not experience these changes. Why might that be?
The discussion was restructured to include all these suggested changes. Additionally, some possible explanations of our results are now discussed.
315-317: This statement is vague without a direct explanation of how this will provide insight. I suggest removing or providing an explanation of how this provides insight to transmission and forecasting.
Changed as suggested.
319-320: According to eLife policy, all data should be publicly available. From guidelines: "Media Policy FAQs Data Availability Purpose and General Principles To maintain high standards of research reproducibility, and to promote the reuse of new findings, eLife requires all data associated with an article to be made freely and widely available. These must be in the most useful formats and according to the relevant reporting standards, unless there are compelling legal or ethical reasons to restrict access. The provision of data should comply with FAIR principles (Findable, Accessible, Interoperable, Reusable). Specifically, authors must make all original data used to support the claims of the paper, or that is required to reproduce them, available in the manuscript text, tables, figures or supplementary materials, or at a trusted digital repository (the latter is recommended). This must include all variables, treatment conditions, and observations described in the manuscript. The authors must also provide a full account of the materials and procedures used to collect, pre-process, clean, generate and analyze the data that would enable it to be independently reproduced by other researchers."
- so you need to make your data available online; I also understand the last sentence to indicate that code should be made available.
Data and code will be publicly available.
Table 1: it is notable that in replicate 2, the autumn:mouse:gonotrophic cycle II fecundity and fertility are actually higher than in the summer, which is the opposite of reps 1 and 3 and the overall effect you reported from the model. This might be worth mentioning in the discussion.
Mentioned in the discussion as suggested.
Tables 1 and 2: shouldn't this just be 8 treatments? You included replicate as a random effect, so it isn't really a separate set of treatments.
This table reflects the output of the whole experiment, which is why it presents all 24 experiments.
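The difference between the reviewer's 8 treatments and the authors' 24 experiments can be sketched by enumerating the design. The factor levels below are taken from the reviews (two seasons, two blood sources, two gonotrophic cycles, three replicates); the variable and function names are illustrative assumptions, and the coefficient counts describe a generic full-factorial model rather than the authors' fitted models.

```python
from itertools import product

seasons = ["summer", "autumn"]
blood_sources = ["chicken", "mouse"]
gonotrophic_cycles = ["I", "II"]
replicates = [1, 2, 3]

# Treatment combinations, ignoring replicate.
treatments = list(product(seasons, blood_sources, gonotrophic_cycles))
# Every treatment crossed with every replicate of the experiment.
experiments = list(product(treatments, replicates))


def count_fixed_coefficients(*factor_levels: int) -> int:
    """Coefficients in a full-factorial fixed-effects model, intercept included."""
    total = 1
    for levels in factor_levels:
        total *= levels
    return total


# Replicate as a random intercept: fixed coefficients for the treatment cells,
# plus a single variance component for replicate (not counted here).
fixed_with_random_replicate = count_fixed_coefficients(
    len(seasons), len(blood_sources), len(gonotrophic_cycles)
)
# Replicate as a fourth, fully crossed fixed factor (four-way interaction).
fixed_with_fixed_replicate = count_fixed_coefficients(
    len(seasons), len(blood_sources), len(gonotrophic_cycles), len(replicates)
)

if __name__ == "__main__":
    print(len(treatments), len(experiments))
    print(fixed_with_random_replicate, fixed_with_fixed_replicate)
```

Under these assumptions, treating replicate as a fixed factor in a four-way interaction triples the number of fixed coefficients, which is consistent with the reviewer's remark that more parameters were being estimated for hatchability.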
Figure 3: Can you please clarify if this is showing raw data?
Changed as suggested.
Note: grammatical copy editing would be beneficial throughout
Grammar was improved as suggested.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This important work advances our understanding of how the SARS-CoV-2 Nsp16 protein is regulated by host E3 ligases to promote viral mRNA capping. Support for the overall claims in the revised manuscript is convincing. This work will be of interest to those working in host-viral interactions and the role of the ubiquitin-proteasome system in viral replication.
-
Reviewer #1 (Public review):
In this study, Tian et al. explore the role of ubiquitination of non-structural protein 16 (nsp16) in the SARS-CoV-2 life cycle. nsp16, in conjunction with nsp10, performs the final step of viral mRNA capping through its 2'-O-methylase activity. This modification allows the virus to evade host immune responses and protects its mRNA from degradation. The authors demonstrate that nsp16 undergoes ubiquitination and subsequent degradation by the host E3 ubiquitin ligases UBR5 and MARCHF7 via the ubiquitin-proteasome system (UPS). Specifically, UBR5 and MARCHF7 mediate nsp16 degradation through K48- and K27-linked ubiquitination, respectively. Notably, degradation of nsp16 by either UBR5 or MARCHF7 operates independently, with both mechanisms effectively inhibiting SARS-CoV-2 replication in vitro and in vivo. Furthermore, UBR5 and MARCHF7 exhibit broad-spectrum antiviral activity by targeting nsp16 variants from various SARS-CoV-2 strains. This research advances our understanding of how nsp16 ubiquitination impacts viral replication and highlights potential targets for developing broadly effective antiviral therapies.
Strengths:
The proposed study is of significant interest to the virology community because it aims to elucidate the biological role of ubiquitination in coronavirus proteins and its impact on the viral life cycle. Understanding these mechanisms will address broadly applicable questions about coronavirus biology and enhance our overall knowledge of ubiquitination's diverse functions in cell biology. Employing in vivo studies is a strength.
Weaknesses:
Minor comments:
Figure 5A- The authors should ensure that the figure is properly labeled to clearly distinguish between the IP (Immunoprecipitation) panel and the input panel.
-
Reviewer #3 (Public review):
Summary:
The manuscript "SARS-CoV-2 nsp16 is regulated by host E3 ubiquitin ligases, UBR5 and MARCHF7" is an interesting work by Tian et al. describing the degradation/stability of NSP16 of SARS-CoV-2 via K48- and K27-linked ubiquitination and proteasomal degradation. The authors have demonstrated that UBR5 and MARCHF7, both E3 ubiquitin ligases, bring about the ubiquitination of NSP16. The concept and the experimental approach to prove the hypothesis look ok. The in vivo data look ok with the controls. Overall, the manuscript is good.
Strengths:
The study identified important E3 ligases (MARCHF7 and UBR5) that can ubiquitinate NSP16, an important viral factor.
Comments on revisions:
I have gone through the revised form of the manuscript thoroughly. The authors have addressed all of my concerns. To me, the experimental approach looks convincing that the host E3 ubiquitin ligases (UBR5 and MARCHF7) ubiquitinate NSP16 and mark it for proteasomal degradation via K48- and K27-linkage. The authors have represented the final figure (Fig.8) in a convincing manner, opening a new window to explore the mechanism of capping the vRNA by NSP16.
-
Author response:
The following is the authors’ response to the previous reviews
Public Reviews:
Reviewer #1 (Public review):
In this study, Tian et al. explore the role of ubiquitination of non-structural protein 16 (nsp16) in the SARS-CoV-2 life cycle. nsp16, in conjunction with nsp10, performs the final step of viral mRNA capping through its 2'-O-methylase activity. This modification allows the virus to evade host immune responses and protects its mRNA from degradation. The authors demonstrate that nsp16 undergoes ubiquitination and subsequent degradation by the host E3 ubiquitin ligases UBR5 and MARCHF7 via the ubiquitin-proteasome system (UPS). Specifically, UBR5 and MARCHF7 mediate nsp16 degradation through K48- and K27-linked ubiquitination, respectively. Notably, degradation of nsp16 by either UBR5 or MARCHF7 operates independently, with both mechanisms effectively inhibiting SARS-CoV-2 replication in vitro and in vivo. Furthermore, UBR5 and MARCHF7 exhibit broad-spectrum antiviral activity by targeting nsp16 variants from various SARS-CoV-2 strains. This research advances our understanding of how nsp16 ubiquitination impacts viral replication and highlights potential targets for developing broadly effective antiviral therapies.
Strengths:
The proposed study is of significant interest to the virology community because it aims to elucidate the biological role of ubiquitination in coronavirus proteins and its impact on the viral life cycle. Understanding these mechanisms will address broadly applicable questions about coronavirus biology and enhance our overall knowledge of ubiquitination's diverse functions in cell biology. Employing in vivo studies is a strength.
Weaknesses:
Minor comments:
Figure 5A- The authors should ensure that the figure is properly labeled to clearly distinguish between the IP (Immunoprecipitation) panel and the input panel.
Thank you for your suggestion. We have replaced Figure 5 in this version.
Reviewer #3 (Public review):
Summary:
The manuscript "SARS-CoV-2 nsp16 is regulated by host E3 ubiquitin ligases, UBR5 and MARCHF7" is an interesting work by Tian et al. describing the degradation/stability of NSP16 of SARS-CoV-2 via K48- and K27-linked ubiquitination and proteasomal degradation. The authors have demonstrated that UBR5 and MARCHF7, both E3 ubiquitin ligases, bring about the ubiquitination of NSP16. The concept and the experimental approach to prove the hypothesis look ok. The in vivo data look ok with the controls. Overall, the manuscript is good.
Strengths:
The study identified important E3 ligases (MARCHF7 and UBR5) that can ubiquitinate NSP16, an important viral factor.
Comments on revisions:
I have gone through the revised form of the manuscript thoroughly. The authors have addressed all of my concerns. To me, the experimental approach looks convincing that the host E3 ubiquitin ligases (UBR5 and MARCHF7) ubiquitinate NSP16 and mark it for proteasomal degradation via K48- and K27-linkage. The authors have represented the final figure (Fig.8) in a convincing manner, opening a new window to explore the mechanism of capping the vRNA by NSP16.
Thank you for your recognition.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This manuscript shows that chronic chemogenetic excitation of dopaminergic neurons in the mouse midbrain results in differential degeneration of axons and somas across distinct regions (SNc vs VTA). These findings are important for two reasons. This approach can be used as a mouse model for Parkinson's Disease without the need for the infusion of toxins (e.g. 6-OHDA or MPTP) — this mouse model also has the advantage of showing axon-first degeneration over a time course (2–4 weeks) that is suitable for experimental investigation. Also, the findings that direct excitation of dopaminergic neurons causes differential degeneration sheds light on the mechanisms of dopaminergic neuron selective vulnerability. The evidence that activation of dopaminergic neurons causes degeneration, alters motor behavior, and alters mRNA expression is convincing. This is an exciting paper that will have an impact on the Parkinson's Disease field.
-
Reviewer #1 (Public review):
Summary:
In this manuscript, the authors investigated the effect of chronic activation of dopamine neurons using chemogenetics. Using Gq-DREADDs, the authors chronically activated midbrain dopamine neurons and observed that these neurons, particularly their axons, exhibit increased vulnerability and degeneration, resembling the pathological symptoms of Parkinson's disease. Baseline calcium levels in midbrain dopamine neurons were also significantly elevated following the chronic activation. Lastly, to identify cellular and circuit-level changes in response to dopaminergic neuronal degeneration caused by chronic activation, the authors employed spatial genomics (Visium) and revealed comprehensive changes in gene expression in the mouse model subjected to chronic activation. In conclusion, this study presents novel data on the consequences of chronic hyperactivation of midbrain dopamine neurons.
Strengths:
This study provides direct evidence that the chronic activation of dopamine neurons is toxic and gives rise to neurodegeneration. In addition, the authors achieved the chronic activation of dopamine neurons using water application of clozapine-N-oxide (CNO), a method not commonly employed by researchers. This approach may offer new insights into pathophysiological alterations of dopamine neurons in Parkinson's disease. The authors also utilized state-of-the-art spatial gene expression analysis, which can provide valuable information for other researchers studying dopamine neurons. They also presented a substantial number of intriguing ideas in their discussion, which are worth further investigation.
Weaknesses:
Although not fully supported by data, the authors provided a well-explained rationale and proposed possible mechanisms for dopamine neuron degeneration due to chronic activation in their results and discussion.
Comments on revised version:
The authors have adequately addressed most of my comments, and I have no further concerns.
-
Reviewer #2 (Public review):
Rademacher et al. present a paper showing that chronic chemogenetic excitation of dopaminergic neurons in the mouse midbrain results in differential degeneration of axons and somas across distinct regions (SNc vs VTA). These findings are important for two reasons: 1. This approach can be used as a mouse model for Parkinson's Disease without the need for the infusion of toxins (e.g. 6-OHDA or MPTP). This mouse model also has the advantage of showing an axon-first degeneration over an experimentally-useful time course (2-4 weeks). 2. The findings that direct excitation of dopaminergic neurons causes differential degeneration shed light on the mechanisms of dopaminergic neuron selective vulnerability. The evidence that activation of dopaminergic neurons causes degeneration, alters motor behavior, and alters mRNA expression is convincing. This is an exciting and important paper and will have an impact on the Parkinson's Disease field.
Strengths:
This is an exciting and important paper and will have an impact on the Parkinson's Disease field.
It presents a new highly useful mouse model of PD.
The paper compares mouse transcriptomics with human patient data.
It shows that selective degeneration can occur across the midbrain dopaminergic neurons even in the absence of a genetic, prion, or toxin neurodegeneration mechanism.
Weaknesses:
The authors have addressed all my concerns. This is an interesting, important, and carefully-controlled study.
-
Reviewer #3 (Public review):
Summary:
In this manuscript, Rademacher and colleagues examined the effect on the integrity of the dopamine system in mice of chronically stimulating dopamine neurons using a chemogenetic approach. They find that one to two weeks of constant exposure to the chemogenetic activator CNO leads to a decrease in the density of tyrosine hydroxylase staining in striatal brain sections and to a small reduction of the global population of tyrosine hydroxylase positive neurons in the ventral midbrain. They also report alterations in gene expression in both regions using a spatial transcriptomics approach. Globally, the work is well done and valuable and some of the conclusions are interesting. However, the conceptual advance is perhaps a bit limited in the sense that there is extensive previous work in the literature showing that excessive depolarization of multiple types of neurons associated with intracellular calcium elevations promotes neuronal degeneration. The present work adds to this by showing evidence of a similar phenomenon in dopamine neurons. In terms of the mechanisms explaining the neuronal loss observed after 2 to 4 weeks of chemogenetic activation, it would be important to consider that dopamine neurons are known from a lot of previous literature to undergo a decrease in firing through a depolarization-block mechanism when chronically depolarized. Is it possible that such a phenomenon explains much of the results observed in the present study? It would be important to consider this in the manuscript. The relevance to Parkinson's disease (PD) is also not totally clear because there is not a lot of previous solid evidence showing that the firing of dopamine neurons is increased in PD, either in human subjects or in mouse models of the disease.
Comments on revisions:
The authors have done a good job at revising the manuscript. The revised manuscript better frames the results in the context of previous literature.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer #1 (Public Review):
Summary:
In this manuscript, the authors investigated the effect of chronic activation of dopamine neurons using chemogenetics. Using Gq-DREADDs, the authors chronically activated midbrain dopamine neurons and observed that these neurons, particularly their axons, exhibit increased vulnerability and degeneration, resembling the pathological symptoms of Parkinson's disease. Baseline calcium levels in midbrain dopamine neurons were also significantly elevated following the chronic activation. Lastly, to identify cellular and circuit-level changes in response to dopaminergic neuronal degeneration caused by chronic activation, the authors employed spatial genomics (Visium) and revealed comprehensive changes in gene expression in the mouse model subjected to chronic activation. In conclusion, this study presents novel data on the consequences of chronic hyperactivation of midbrain dopamine neurons.
Strengths:
This study provides direct evidence that the chronic activation of dopamine neurons is toxic and gives rise to neurodegeneration. In addition, the authors achieved the chronic activation of dopamine neurons using water application of clozapine-N-oxide (CNO), a method not commonly employed by researchers. This approach may offer new insights into pathophysiological alterations of dopamine neurons in Parkinson's disease. The authors also utilized state-of-the-art spatial gene expression analysis, which can provide valuable information for other researchers studying dopamine neurons. Although the authors did not elucidate the mechanisms underlying dopaminergic neuronal and axonal death, they presented a substantial number of intriguing ideas in their discussion, which are worth further investigation.
We thank the reviewer for these positive comments.
Weaknesses:
Many claims raised in this paper are only partially supported by the experimental results. So, additional data are necessary to strengthen the claims. The effects of chronic activation of dopamine neurons are intriguing; however, this paper does not go beyond reporting phenomena. It lacks a comprehensive explanation for the degeneration of dopamine neurons and their axons. While the authors proposed possible mechanisms for the degeneration in their discussion, such as differentially expressed genes, these remain experimentally unexplored.
We thank the reviewer for this review. We do believe that the manuscript has a substantial mechanistic component, as the central experiments involve direct manipulation of neuronal activity, and we show an increase in calcium levels and gene expression changes in dopamine neurons that coincide with the degeneration. However, we agree that deeper mechanistic investigation would strengthen the conclusions of the paper. We have executed several important revisions, including the addition of CNO behavioral controls, manipulation of intracellular calcium using isradipine, additional transcriptomics experiments and further validation of findings. We believe that these additions significantly bolster the conclusions of the paper.
Reviewer #2 (Public Review):
Summary:
Rademacher et al. present a paper showing that chronic chemogenetic excitation of dopaminergic neurons in the mouse midbrain results in differential degeneration of axons and somas across distinct regions (SNc vs VTA). These findings are important. This mouse model also has the advantage of showing an axon-first degeneration over an experimentally-useful time course (2-4 weeks). 2. The findings that direct excitation of dopaminergic neurons causes differential degeneration shed light on the mechanisms of dopaminergic neuron selective vulnerability. The evidence that activation of dopaminergic neurons causes degeneration and alters mRNA expression is convincing, as the authors use both vehicle and CNO control groups, but the evidence that chronic dopaminergic activation alters circadian rhythm and motor behavior is incomplete as the authors did not run a CNO-control condition in these experiments.
Strengths:
This is an exciting and important paper.
The paper compares mouse transcriptomics with human patient data.
It shows that selective degeneration can occur across the midbrain dopaminergic neurons even in the absence of a genetic, prion, or toxin neurodegeneration mechanism.
We thank the reviewer for these comments.
Weaknesses:
Major concerns:
(1) The lack of a CNO-positive, DREADD-negative control group in the behavioral experiments is the main limitation in interpreting the behavioral data. Without knowing whether CNO on its own has an impact on circadian rhythm or motor activity, the certainty that dopaminergic hyperactivity is causing these effects is lacking.
We thank the reviewer for this important recommendation. Although the initial version showed that CNO does not produce degeneration of DA neuron terminals, it did not exclude a contribution to the behavioral changes. To address this, we now include a cohort of DREADD-free, non-injected mice treated with either vehicle or CNO (Figure S1C). We found that, on its own, CNO did not significantly impact either light cycle or dark cycle running. Together, these results, along with the lack of degeneration observed with CNO treatment in non-DREADD mice (Figure 2D), support the conclusion that our behavioral and histological results are the result of dopamine neuron activation.
(2) One of the most exciting things about this paper is that the SNc degenerates more strongly than the VTA when both regions are, in theory, excited to the same extent. However, it is not perfectly clear that both regions respond to CNO to the same extent. The electrophysiological data showing CNO responsiveness is only conducted in the SNc. If the VTA response is significantly reduced vs the SNc response, then the selectivity of the SNc degeneration could just be because the SNc was more hyperactive than the VTA. Electrophysiology experiments comparing the VTA and SNc response to CNO could support the idea that the SNc has substantial intrinsic vulnerability factors compared to the VTA.
We agree that additional electrophysiology conducted in the VTA dopamine neurons would meaningfully add to our understanding of the selective vulnerability in this model, and have completed these experiments in the revision (Figure 1, Figure S2). We now show that in vivo treatment with CNO causes some of the same physiological changes in VTA dopamine neurons as we found in SNc dopamine neurons, including an increased spontaneous firing rate, and a similar decrease in responsiveness to CNO in the slice recordings. Together these observations support the conclusion that SNc axons are intrinsically more vulnerable to increased activity than VTA dopamine axons.
(3) The mice have access to a running wheel for the circadian rhythm experiments. Running has been shown to alter the dopaminergic system (Bastioli et al., 2022) and so the authors should clarify whether the histology, electrophysiology, fiber photometry, and transcriptomics data are conducted on mice that have been running or sedentary.
We have clarified which mice had access to a running wheel in the methods of our revision. Briefly, mice for histology, electrophysiology, and transcriptomics all had access to a running wheel during their treatment. The mice used for photometry underwent about 7 days of running wheel access approximately 3 weeks prior to the beginning of the experiment. The photometry headcaps prevented mice from having access to a running wheel in their home cage. Mice used for non-responder and non-hM3Dq (CNO alone) experiments also had access to a running wheel during their treatment. Mice used for the isradipine experiment did not have access to a running wheel, as the number of mice was too large and, while unilateral hM3Dq expression allows for within-animal controls, it does not lend itself to clear interpretation of running wheel data.
Reviewer #3 (Public Review):
Summary:
In this manuscript, Rademacher and colleagues examined the effect on the integrity of the dopamine system in mice of chronically stimulating dopamine neurons using a chemogenetic approach. They find that one to two weeks of constant exposure to the chemogenetic activator CNO leads to a decrease in the density of tyrosine hydroxylase staining in striatal brain sections and to a small reduction of the global population of tyrosine hydroxylase positive neurons in the ventral midbrain. They also report alterations in gene expression in both regions using a spatial transcriptomics approach. Globally, the work is well done and valuable and some of the conclusions are interesting. However, the conceptual advance is perhaps a bit limited in the sense that there is extensive previous work in the literature showing that excessive depolarization of multiple types of neurons associated with intracellular calcium elevations promotes neuronal degeneration. The present work adds to this by showing evidence of a similar phenomenon in dopamine neurons.
We thank the reviewer for the careful and thoughtful review of our manuscript.
While extensive depolarization and associated intracellular calcium elevations promote degeneration generally, we emphasize that the process we describe is novel. Indeed, prior studies delivering chronic DREADDs to vulnerable neurons in models of Alzheimer’s disease did not detect an increase in neurodegeneration, despite seeing changes in protein aggregation (e.g. Yuan and Grutzendler, J Neurosci 2016, PMID: 26758850; Hussaini et al., PLOS Bio 2020, PMID: 32822389). Further, a critical finding from our study is that in our paradigm, this stressor does not impact all dopamine neurons equally, as the SNc DA neurons are more vulnerable than VTA DA neurons, mirroring selective vulnerability characteristic of Parkinson’s disease. This is consistent with a large body of literature that SNc dopamine neurons are less capable of handling large energetic and calcium loads compared to neighboring VTA neurons, and the finding that chronically altered activity is sufficient to drive this preferential loss is novel. In addition, we are not aware of prior studies that have chronically activated DREADDs over several weeks to produce neurodegeneration.
In terms of the mechanisms explaining the neuronal loss observed after 2 to 4 weeks of chemogenetic activation, it would be important to consider that dopamine neurons are known from a lot of previous literature to undergo a decrease in firing through a depolarization-block mechanism when chronically depolarized. Is it possible that such a phenomenon explains much of the results observed in the present study? It would be important to consider this in the manuscript.
Thank you for this comment. As discussed in greater detail in the “comments on results section” below, our data suggests this isn’t a prominent feature in our model. However, we cannot rule out a contribution of depolarization block, and have expanded on the discussion of this possibility in the revised manuscript.
The relevance to Parkinson's disease (PD) is also not totally clear because there is not a lot of previous solid evidence showing that the firing of dopamine neurons is increased in PD, either in human subjects or in mouse models of the disease. As such, it is not clear if the present work is really modelling something that could happen in PD in humans.
We completely agree that evidence of increased dopamine neuron activity from human PD patients is lacking, and the little data that exists is difficult to interpret without human controls. However, as we outline in the manuscript, multiple lines of evidence suggest that the activity level of dopamine neurons almost certainly does change in PD. Therefore, it is very important that we understand how changes in the level of neural activity influence the degeneration of DA neurons. In this paper we examine the impact of increased activity. Increased activity may be compensatory after initial dopamine neuron loss, or may be an initial driver of death (Rademacher & Nakamura, Exp Neurol 2024, PMID: 38092187). In addition to the human and rodent data already discussed in the manuscript, additional support for increased activity in PD models includes:
• Elevated firing rates in asymptomatic MitoPark mice (Good et al., FASEB J 2011, PMID: 21233488)
• Increased frequency of spontaneous firing in patient-derived iPSC dopamine neurons and primary mouse dopamine neurons that overexpress synuclein (Lin et al., Acta Neuropath Comm 2021, PMID: 34099060)
• Increased spontaneous firing in dopamine neurons of rats injected with synuclein preformed fibrils compared to sham (Tozzi et al., Brain 2021, PMID: 34297092)
We have included citation of these important examples in our revision. In our model, we have found that chronic hyperactivity causes a substantial loss of nigral DA terminals while mesolimbic terminals are relatively spared (Figure 2), and that striatal DA levels are markedly decreased (Figure S6), phenomena that are hallmarks of Parkinson’s disease.
There are additional levels of complexity to accurately model changes in PD, which may differ between subtypes of the disease, the disease stage, and the subtype of dopamine neuron. Our study models a form of increased intrinsic activity, and interpretation of our results will be facilitated as we learn more about how the activity of DA neurons changes in humans in PD. Similarly, in future studies, it will also be important to study the impact of decreasing DA neuron activity.
Comments on the introduction:
The introduction cites a 1990 paper from the lab of Anthony Grace as support of the fact that DA neurons increase their firing rate in PD models. However, in this 1990 paper, the authors stated that: "With respect to DA cell activity, depletions of up to 96% of striatal DA did not result in substantial alterations in the proportion of DA neurons active, their mean firing rate, or their firing pattern. Increases in these parameters only occurred when striatal DA depletions exceeded 96%." Such results argue that an increase in firing rate is most likely to be a consequence of the almost complete loss of dopamine neurons rather than an initial driver of neuronal loss. The present introduction would thus benefit from being revised to clarify the overriding hypothesis and rationale in relation to PD and better represent the findings of the paper by Hollerman and Grace.
We agree that the findings of Hollerman and Grace support compensatory changes in dopamine neuron activity in response to loss of dopamine neurons, rather than informing whether dopamine neuron loss can also be an initial driver of activity. Importantly, while significant changes to burst firing were not seen until almost complete loss of dopamine neurons, these recordings were made in anesthetized rats which may not be representative of neural activity in awake animals. We adjusted the text so that this is no longer referred to as ‘partial’ loss. At the same time, we point out that the results of other studies on this point are mixed: a 50% reduction in dopamine neurons didn’t alter firing rate or bursting (Harden and Grace, J Neurosci 1995, PMID: 7666198; Bilbao et al., Brain Res 2006, PMID: 16574080), while a 40% loss was found to increase firing rate and bursting (Chen et al., Brain Res 2009, PMID: 19545547) and larger reductions alter burst firing (Hollerman & Grace, Brain Res 1990, PMID: 2126975; Stachowiak et al., J Neurosci 1987, PMID: 3110381). Importantly, even if compensatory, such late-stage increases in dopamine neuron activity may contribute to disease progression and drive a vicious cycle of degeneration in surviving neurons. In addition, we also don’t know how the threshold of dopamine neuron loss and altered activity may differ between mice and humans, and PD patients do not present with clinical symptoms until ~30-60% of nigral neurons are lost (Burke & O’Malley, Exp Neurol 2013, PMID: 22285449; Shulman et al., Annu Rev Pathol 2011, PMID: 21034221).
Other lines of evidence support the potential role of hyperactivity in disease initiation, including increased activity before dopamine neuron loss in MitoPark mice (Good et al., FASEB J 2011, PMID: 21233488), increased spontaneous firing in patient-derived iPSC dopamine neurons (Lin et al., Acta Neuropath Comm 2021, PMID: 34099060), and increased activity observed in genetic models of PD (Bishop et al., J Neurophysiol 2010, PMID: 20926611; Regoni et al., Cell Death Dis 2020, PMID: 33173027).
It would be good for the introduction to refer to some of the literature on the links between excessive neuronal activity, calcium, and neurodegeneration. There is a large literature on this and referring to it would help frame the work and its novelty in a broader context.
We agree that a discussion of hyperactivity, calcium, and neurodegeneration would benefit the introduction. Accordingly, we have expanded on our citation of this literature in both the introduction and discussion sections. However, we believe that the novelty of our study lies in: 1) a chronic chemogenetic activation paradigm via drinking water, 2) demonstrating selective vulnerability of dopamine neurons as a result of altering their activity/excitability alone, and 3) comparing mouse and human spatial transcriptomics.
Comments on the results section:
The running wheel results of Figure 1 suggest that the CNO treatment caused a brief increase in running on the first day after which there was a strong decrease during the subsequent days in the active phase. This observation is also in line with the appearance of a depolarization block.
The authors examined many basic electrophysiological parameters of recorded dopamine neurons in acute brain slices. However, it is surprising that they did not report the resting membrane potential, or the input resistance. It would be important that this be added because these two parameters provide key information on the basal excitability of the recorded neurons. They would also allow us to obtain insight into the possibility that the neurons are chronically depolarized and thus in depolarization block.
We do report the input resistance in Figure S1C (now Figure S2A, S2B), which was unchanged in CNO-treated animals compared to controls. We did not previously report the resting membrane potential because many of the DA neurons were spontaneously firing. In the revision, we now report the initial membrane potential on first breaking into the cell for the whole cell recordings, which did not vary between groups (Figure S2). This is still influenced by action potential activity, but is the timepoint in the recording least impacted by dialyzing the neuron with the internal solution, which might alter the intracellular concentrations of ions. We observed increased spontaneous action potential activity ex vivo in slices from CNO-treated mice (Figure 1D), thus at least under these conditions these dopamine neurons are not in depolarization block. We also did not see strong evidence of changes in other intrinsic properties of the neurons with whole cell recordings (e.g. Figure S2). Overall, our electrophysiology experiments are not consistent with the depolarization block model, at least not due to changes in the intrinsic properties of the neurons. Although our ex vivo findings cannot exclude a contribution of depolarization block in vivo, we do show that CNO-treated mice removed from their cages for open field testing continue to have a strong trend for increased activity for approximately 10 days (Figure S4B). This finding is also consistent with increased activity of the DA neurons. We have added discussion of these important considerations in the revision.
It is great that the authors quantified not only TH levels but also the levels of mCherry, coexpressed with the chemogenetic receptor. This could in principle help to distinguish between TH downregulation and true loss of dopamine neuron cell bodies. However, the approach used here has a major caveat in that the number of mCherry-positive dopamine neurons depends on the proportion of dopamine neurons that were infected and expressed the DREADD, and this could very well vary between different mice. It is very unlikely that the virus injection infected 100% of the neurons in the VTA and SNc. This could, for example, explain in part the mismatch between the number of VTA dopamine neurons counted in panel 2G when comparing TH and mCherry counts. Also, I see that the mCherry counts were not provided at the 2-week time point. If the mCherry had been expressed genetically by crossing the DAT-Cre mice with floxed fluorescent reporter mice, the interpretation would have been simpler. In this context, I am not convinced of the benefit of the mCherry quantifications. The authors should consider either removing these results from the final manuscript or discussing this important limitation.
We thank the reviewer for this comment, and we agree that this is a caveat of our mCherry quantification. Quantitation of the number of mCherry+ DA neurons specifically informs the impact on transduced DA neurons, and mCherry appears to be less susceptible to downregulation versus TH. As the reviewer points out, it carries the caveat that there is some variability between injections. Our control animals give us an indicator of injection variability, which is likely substantial and prevents us from detecting more subtle changes. Nonetheless, we believe that it conveys useful complementary data. We discuss this caveat in our revision. Note that mCherry was not quantified at the two-week timepoint because there is no loss of TH+ cells at that time.
Although the authors conclude that there is a global decrease in the number of dopamine neurons after 4 weeks of CNO treatment, the post-hoc tests failed to confirm that the decrease in dopamine number was significant in the SNc, the region most relevant to Parkinson's. This could be due to the fact that only a small number of mice were tested. An "n" of just 4 or 5 mice is very small for a stereological counting experiment. As such, this experiment was clearly underpowered at the statistical level. Also, the choice of the image used to illustrate this in panel 2G should be reconsidered: the image suggests that a very large loss of dopamine neurons occurred in the SNc and this is not what the numbers show. A more representative image should be used.
We agree that the stereology experiments were performed on relatively small numbers of animals, such that only robust effects would be detected. Combined with the small effect size, this may have contributed to the post-hoc tests showing a trend of p=0.1 for both the TH and mCherry dopamine cell counts in the SN at 4 weeks. Given this small effect size, we would indeed need much larger groups to better discern these changes. Stereology is an intensive technique, and we have therefore elected to focus on terminal loss. We have also replaced panel 2G with a more representative CNO image.
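As a rough illustration of why small effects call for much larger groups, the sketch below applies the standard normal-approximation formula for a two-sided, two-group comparison, n ≈ 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardized effect size. The function name and the effect sizes in the loop are hypothetical and are not estimates from this study; the normal approximation also slightly underestimates the group size a t-test would need at small n.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size_d, alpha=0.05, power=0.8):
    """Approximate animals needed per group for a two-sided, two-group comparison.

    Uses the normal approximation n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    effect_size_d is the standardized difference between group means (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size_d ** 2)


# Hypothetical effect sizes, chosen only to show how the requirement scales.
for d in (0.5, 1.0, 2.0):
    print(f"d = {d}: about {animals_per_group(d)} animals per group")
```

Halving the effect size roughly quadruples the required group size, which is why groups of 4 or 5 animals can only detect very large differences.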
In Figure 3, the authors attempt to compare intracellular calcium levels in dopamine neurons using GCaMP6 fluorescence. Because this calcium indicator is not quantitative (unlike ratiometric sensors such as Fura2), it is usually used to quantify relative changes in intracellular calcium. The present use of this probe to compare absolute values is unusual and the validity of this approach is unclear. This limitation needs to be discussed. The authors also need to refer in the text to the difference between panels D and E of this figure. It is surprising that the fluctuations in calcium levels were not quantified. I guess the hypothesis was that there should be more or larger fluctuations in the mice treated with CNO if the CNO treatment led to increased firing. This needs to be clarified.
We thank the reviewer for this comment. We understand that this method of comparing absolute values is unconventional. However, these animals were tested concurrently on the same system, and a clear effect on the absolute baseline was observed. We have included this caveat in our discussion. Panel D of this figure shows the raw, uncorrected photometry traces, whereas panel E shows the isosbestic-corrected traces for the same recording. In panel E, the traces follow time in ascending order. We have also included frequency and amplitude data for these recordings (Figure S4A), along with discussion of the significance of these findings.
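For readers unfamiliar with isosbestic correction, a minimal sketch of one common approach follows. It is an assumption about the general technique, not a description of the pipeline used in the manuscript: the isosbestic (calcium-independent) channel is fitted to the signal channel by ordinary least squares, and the signal is then expressed relative to that fit. The function name and the example traces are hypothetical.

```python
def isosbestic_correct(signal, isosbestic):
    """Return (signal - fitted) / fitted, where fitted is the isosbestic
    channel scaled to the signal channel by ordinary least squares."""
    n = len(signal)
    mean_x = sum(isosbestic) / n
    mean_y = sum(signal) / n
    # Least-squares slope and intercept of signal regressed on isosbestic.
    sxx = sum((x - mean_x) ** 2 for x in isosbestic)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(isosbestic, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [slope * x + intercept for x in isosbestic]
    return [(y - f) / f for y, f in zip(signal, fitted)]


# Hypothetical traces: shared artifacts appear in both channels,
# and a calcium-dependent rise is added to the signal at index 2 only.
iso = [1.0, 1.1, 0.9, 1.05, 0.95]
sig = [2 * x + 0.5 for x in iso]
sig[2] += 0.3
print(isosbestic_correct(sig, iso))
```

Because this correction normalizes each recording to its own fitted baseline, it is suited to relative changes within a recording rather than to comparing absolute fluorescence across animals.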
Although the spatial transcriptomic results are intriguing and certainly a great way to start thinking about how the CNO treatment could lead to the loss of dopamine neurons, the presented results, focusing on some broad classes of differentially expressed genes and on some specific examples, do not really suggest any clear mechanism of neurodegeneration. It would perhaps be useful for the authors to use the obtained data to validate that a state of chronic depolarization was indeed induced by the chronic CNO treatment. Were genes classically linked to increased activity like cfos or bdnf elevated in the SNc or VTA dopamine neurons? In the striatum, the authors report that the levels of DARPP32, a gene whose levels are linked to dopamine levels, are unchanged. Does this mean that there were no major changes in dopamine levels in the striatum of these mice?
While levels of DARPP32 mRNA were unchanged, our additional HPLC data show strong decreases in striatal dopamine in hyperactivated mice. We do not see strong changes in classic activity-related genes (data not shown); however, these genes may behave differently in the context of chronic hyperactivity and ongoing degeneration. Instead, we employed NEUROeSTIMator (Bahl et al., Nature Comm. 2024, PMID: 38278804), a deep learning method to predict neural activation based on transcriptomic data. We found that predicted activity scores were significantly higher in GqCNO dopaminergic regions compared to controls (Figure X). Indeed, some of the genes used within the model to predict activity are immediate early genes, e.g., c-fos.
The usefulness of comparing the transcriptome of human PD SNc or VTA sections to that of the present mouse model should be better explained. In the human tissues, the transcriptome reflects the state of the tissue many years after extensive loss of dopamine neurons. It is expected that there will be few if any SNc neurons left in such sections. In comparison, the mice after 7 days of CNO treatment do not appear to have lost any dopamine neurons. As such, how can the two extremely different conditions be reasonably compared?
Our mouse model and human PD progress over distinct timescales, as is the case with essentially all mouse models of neurodegenerative diseases. Nonetheless, in our view there is still great value in comparing gene expression changes in mouse models with those in human disease. It seems very likely that the same pathologic processes that drive degeneration early in the disease continue to drive degeneration later in the disease. Note that we have tried to address the discrepancy in time scales in part by comparing our mouse model to early PD samples when there is more limited SNc DA neuron loss (see the proportion of DA neurons within the areas of human tissues we selected for sampling in Author response image 1). Therefore, we can indeed use spatial transcriptomics to compare dopamine neurons from mice with initial degeneration to those in patients where degeneration is ongoing.
Author response image 1.
Violin plot of DA neuron proportions sampled within the vulnerable SNV (deconvoluted RCTD method used in unmasked tissue sections of the SNV). Control and early PD subjects.
Comments on the discussion:
In the discussion, the authors state that their calcium photometry results support a central role of calcium in activity-induced neurodegeneration. This conclusion, although plausible because of the very broad pre-existing literature linking calcium elevation (such as in excitotoxicity) to neuronal loss, should be toned down a bit as no causal relationship was established in the experiments that were carried out in the present study.
Our model utilizes hM3Dq-DREADDs that function by activating Gq pathways that are classically expected to increase intracellular calcium to increase neuronal excitability. Indeed, in slices from mice that were not treated with CNO, acute CNO application caused depolarizations (Figure 1E) that can be due to an increase in intracellular calcium and also cause increases in intracellular calcium. Additionally, our results show increased calcium by fiber photometry and changes to calcium-related genes, suggesting a causal relation and crucial role of calcium in the mechanism of degeneration. However, we agree that we have not experimentally proven this point. Indeed, a small preliminary experiment with chronic isradipine failed to show protection, although it lacked power to detect a partial effect. We have acknowledged this in the text, and also briefly consider other mechanisms such as increased dopamine levels that could also mediate the toxicity.
In the discussion, the authors discuss some of the parallel changes in gene expression detected in the mouse model and in the human tissues. Because few if any dopamine neurons are expected to remain in the SNc of the human tissues used, this sort of comparison has important conceptual limitations and these need to be clearly addressed.
As discussed, we sampled SN DA neurons in early PD (see Author response image 1), and in our view there is great value for such comparisons.
A major limitation of the present discussion is that it does not discuss the possibility that the observed phenotypes are caused by the induction of a chronic state of depolarization block by the chronic CNO treatment. I encourage the authors to consider and discuss this hypothesis.
As discussed above, our analyses of DA neuron firing in slices and open field testing to date do not support a prominent contribution of depolarization block with chronic CNO treatment. However, we cannot rule out this hypothesis, therefore we have included additional electrophysiology experiments and have added discussion of this important consideration.
Also, the authors need to discuss the fact that previous work was only able to detect an increase in the firing rate of dopamine neurons after more than 95% loss of dopamine neurons. As such, the authors need to clearly discuss the relevance of the present model to PD. Are changes in firing rate a driver of neuronal loss in PD, as the authors try to make the case here, or are such changes only a secondary consequence of extensive neuronal loss (for example because a major loss of dopamine would lead to reduced D2 autoreceptor activation in the remaining neurons, and to reduced autoreceptor-mediated negative feedback on firing). This needs to be discussed.
As discussed above, while increases in dopamine neuron activity may be compensatory after loss of neurons, the precise percentage required to induce such compensatory changes is not defined in mice and varies between paradigms, and the threshold level is not known in humans. We also reiterate that a compensatory increase in activity could still promote the degeneration of critical surviving DA neurons, whose loss underlies the substantial decline in motor function that typically occurs over the course of PD. Moreover, there are also multiple lines of evidence to suggest that changes in activity can initiate and drive dopamine neuron degeneration (Rademacher & Nakamura, Exp Neurol 2024). For example, overexpression of synuclein can increase firing in cultured dopamine neurons (Dagra et al., NPJ Parkinsons Dis 2021, PMID: 34408150), while mice expressing mutant Parkin have higher mean firing rates (Regoni et al., Cell Death Dis 2020, PMID: 33173027). Similarly, an increased firing rate has been reported in the MitoPark mouse model of PD at a time preceding DA neuron degeneration (Good et al., FASEB J 2011, PMID: 21233488). We also acknowledge that alterations to dopamine neuron activity are likely complex in PD, and that dopamine neuron health and function can be impacted not just by simple increases in activity, but also by changes in activity patterns and regularity. We have amended our discussion to include the important caveat of changes in activity occurring as compensation, as well as further evidence of changes in activity preceding dopamine neuron death.
There is a very large, multi-decade literature on calcium elevation and its effects on neuronal loss in many different types of neurons. The authors should discuss their findings in this context and refer to some of this previous work. In a nutshell, the observations of the present manuscript could be summarized by stating that the chronic membrane depolarization induced by the CNO treatment is likely to induce a chronic elevation of intracellular calcium and this is then likely to activate some of the well-known calcium-dependent cell death mechanisms. Whether such cell death is linked in any way to PD is not really demonstrated by the present results. The authors are encouraged to perform a thorough revision of the discussion to address all of these issues, discuss the major limitations of the present model, and refer to the broad pre-existing literature linking membrane depolarization, calcium, and neuronal loss in many neuronal cell types.
While our model demonstrates classic excitotoxic cell death pathways, we would like to emphasize both the chronic nature of our manipulation and the progressive changes observed, with increasing degeneration seen at 1, 2, and 4 weeks of hyperactivity in an axon-first manner. This is a unique aspect of our study, in contrast to much of the previous literature which has focused on shorter timescales. Thus, while we have revised the discussion to more comprehensively acknowledge previous studies of calcium-dependent neuron cell death, we believe we have made several new contributions that are not predicted by existing literature. We have shown that this chronic manipulation is specifically toxic to nigral dopamine neurons, and the data that VTA dopamine neurons continue to be resilient even at 4 weeks is interesting and disease-relevant. We therefore do not want to use findings from other neuron types to draw assumptions about DA neurons, which are a unique and very diverse population. We acknowledge that as with all preclinical models of PD, we cannot draw definitive conclusions about PD with this data. However, we reiterate that we strongly believe that drawing connections to human disease is important, as dopamine neuron activity is very likely altered in PD and a clearer understanding of how dopamine neuron survival is impacted by activity will provide insight into the mechanisms of PD.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
(1) The temporal design of the experiments is quite confusing. For instance, Figures 1 and 3 illustrate the daily changes of the mice and suggest some critical time points within 2 weeks of CNO administration, whereas Figure 2 presents data at 2 and 4 weeks, which are much later than the proposed critical time points. Furthermore, Figure 4 includes only 1 week data, and lacks subsequent data from 2 and 4 weeks, at which significant changes such as calcium levels and neuronal/axonal degeneration are observed.
While interesting behavior and calcium phenotypes were detected within 2 and 4 weeks of CNO administration (Figures 1 and 3), we only collected tissues for histology at the 2 and 4 week time points (Figure 2). Observing degeneration of DA neuron axons but not cell bodies at 2 weeks served as a rationale to extend to the 4 week time point to determine whether degeneration was progressive. At the same time, our primary focus is on identifying early changes that may drive or contribute to the degeneration. As such, we recorded calcium changes over a 2-week treatment period, capturing the period during which almost all of the dopamine axons are lost. Similarly, we had the capacity to perform spatial transcriptomics at only one time point, and the 1 week time point was selected to capture transcriptomic changes that precede and potentially contribute to the mild and severe degeneration that occurs at 2 and 4 weeks, respectively. We have added text clarifying the rationale for the time points chosen.
(2) The authors showed the changes in neuronal firing in dopamine neurons by the administration of CNO. However, one of the most important features of dopaminergic neuronal activity is dopamine release at its axon terminals in the striatum. Thus, the claims raised in this paper would be better supported if the authors further show any alterations in dopamine release (by FSCV or fluorescent dopamine sensors) at some critical time points during or after CNO application.
While we are confident that DA release is altered due to the significant changes in behavior when hM3Dq DREADDs are activated specifically in DA neurons, the current manuscript does not quantify this, or distinguish between axonal and somatodendritic DA release. Interestingly, we did find significantly decreased striatal dopamine by HPLC after chronic activation (Figure S6). We believe that resolving these questions is beyond the scope of this manuscript, but have added text indicating the importance of these experiments.
(3) The authors used 2% sucrose as a vehicle via drinking water. Please explain the rationale behind this choice.
We used 2% sucrose as the vehicle because it is also added to the CNO water to counteract the bitterness of CNO (Kumar et al., J Neurotrauma 2024, PMID: 37905504). We have clarified this in the manuscript.
(4) As we know, mRNA levels of some genes do not always predict their protein levels; there is sometimes a huge discrepancy between mRNA and protein abundance. In this paper, the mechanistic interpretation of the results by the authors heavily relies on the spatial transcriptomics of the midbrain and striatum. Thus, the authors need to provide additional data proving that the gene expression of some genes in the CNO group is also changed at the level of protein.
We agree that validating hits at the protein level is valuable, but we were limited in our ability to assess these changes for the revision. We have, however, performed additional transcriptomics with the high-resolution Xenium platform to increase confidence in a subset of hits of interest for follow-up in future work, and we have included data on genes related to DA metabolism and markers of DA neurons.
(5) The authors provided spatial transcriptomics data only for mice with one week of chronic activation. However, other data also indicate significant differences when the activation period extends beyond 10 to 12 days (Figure 1C, Figure 3D-F). While a 7-day chronic activation time point might be crucial, additional transcriptomics data from later time points would be beneficial to confirm the persistence of these changes in gene expression. Furthermore, differential gene expression (DEG) analysis at these later time points could identify novel pathways or genes influenced by the chronic activation of dopamine neurons.
This is an interesting point, and such data would be valuable for understanding how chronic activity influences gene expression; however, additional transcriptomics at later time points is beyond the scope of this paper. In future studies we will assess the changes observed in this manuscript at other time points.
(6) Figure 1D, Figure S1C:
The authors should present the sample recording traces to demonstrate that the electrophysiological recordings were appropriately made.
These data have been provided in Figure S2.
(7) Figure S1C:
AP thresholds in SNc dopamine neurons from both groups look quite high. In addition, considering the data from the previous reports, AP peak amplitudes in SNc dopamine neurons from both groups seem to be very low. Are these values correct?
The thresholds and peaks are correct, including the AP (threshold to peak), which is typical in our (Dr. Margolis’s) experience. AP thresholds are measured from an average of at least 10 APs, as the voltage at which the derivative of the trace first exceeds 10 V/s. As mentioned in the methods section, junction potentials were not corrected, which can result in values that are somewhat depolarized relative to ground truth. This junction potential would be consistent across all recordings and thus would not impede detection of a difference in AP thresholds between groups of animals.
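The threshold criterion described above can be illustrated with a minimal Python sketch. This is not the authors' analysis code: the function names, the finite-difference derivative, the choice to report the voltage at the start of the crossing interval, and the reading of "an average of at least 10 APs" as a mean of per-AP thresholds are all assumptions made for illustration.

```python
import statistics


def ap_threshold_mv(trace_mv, sample_rate_hz, criterion_v_per_s=10.0):
    """Voltage (mV) at the first sample where dV/dt exceeds the criterion.

    Returns None if the criterion is never reached.
    """
    dt_s = 1.0 / sample_rate_hz
    for i in range(1, len(trace_mv)):
        # Finite-difference slope, converted from mV/s to V/s.
        dvdt_v_per_s = (trace_mv[i] - trace_mv[i - 1]) / 1000.0 / dt_s
        if dvdt_v_per_s > criterion_v_per_s:
            # Assumption: report the voltage at the start of the crossing interval.
            return trace_mv[i - 1]
    return None


def mean_ap_threshold_mv(ap_traces_mv, sample_rate_hz):
    """Mean threshold across several AP waveforms (hypothetical helper)."""
    thresholds = [ap_threshold_mv(t, sample_rate_hz) for t in ap_traces_mv]
    detected = [v for v in thresholds if v is not None]
    return statistics.fmean(detected) if detected else None
```

At a 10 kHz sampling rate, a 10 V/s criterion corresponds to a rise of 1 mV between consecutive samples. Because an uncorrected junction potential shifts every sample by the same offset, it changes the reported threshold but not the derivative, so a between-group difference in thresholds would be preserved.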
(8) Figure 1E:
It would be better if the statistical significance is depicted in the graph.
We don’t perform repeated measures statistics across data like these, as the data are continuous, collected at 10 kHz. For ease of displaying the data, the data for each neuron is binned and then these traces are averaged together. We display SEM to give a sense of the variance across neurons. We have provided sample traces of individual neurons to better demonstrate the variability and significance of this data (Figure S2).
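The binning and averaging procedure described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the use of non-overlapping bins, and the dropping of a trailing partial bin are assumptions.

```python
import math
import statistics


def bin_trace(trace, bin_size):
    """Average consecutive, non-overlapping bins; a trailing partial bin is dropped."""
    n_bins = len(trace) // bin_size
    return [
        statistics.fmean(trace[b * bin_size:(b + 1) * bin_size])
        for b in range(n_bins)
    ]


def mean_and_sem(binned_traces):
    """Per-bin mean and SEM across neurons (one binned trace per neuron)."""
    n = len(binned_traces)
    means, sems = [], []
    for values in zip(*binned_traces):
        means.append(statistics.fmean(values))
        # SEM = sample standard deviation / sqrt(n); undefined for a single neuron.
        sems.append(statistics.stdev(values) / math.sqrt(n) if n > 1 else float("nan"))
    return means, sems
```

Each neuron's trace is binned first and the binned traces are then averaged, so the SEM reflects variability across neurons rather than across samples within one recording.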
(9) Figure 2C:
The representative staining images appear to be taken from coronal slices at anatomically different positions along the rostral-to-caudal axis. Although the total numbers of TH+ cells are comparable between vehicle and CNO groups in the graph, the sample images do not reflect this result. The authors should replace the current images with the better ones.
We have replaced this image in the manuscript.
Reviewer #2 (Recommendations For The Authors):
Minor concerns:
(1) The authors claim that their transcriptomics experiments are conducted 'before any degeneration has occurred', and they do not see significant differences in the TH expression in the striatum. However, the n for these mice at 1 week is lower than the n used at 2 weeks (n=5 vs n=8-9) and the images used to show 'no degeneration' really look like there is some degeneration going on. Also, throughout the paper, there is a stronger effect when degeneration is measured with mCherry compared to when it is measured with TH. The 'no change' claim is made only with the TH comparison. It seems possible (and almost likely) that there would be significant axonal degeneration at one week with either a higher sample size or using the mCherry comparison. The authors should simply claim that their transcriptomics data is collected before any 'somatic' degeneration occurs.
Thank you, we have included data that shows partial terminal loss after one week of activation (Figure S3B, Figure S5A) and have corrected this language in the manuscript to reflect transcriptomics occurring before somatic degeneration.
(2) While selective degeneration is one of the most interesting findings in the paper, that finding is not emphasized and why it would be interesting to compare the VTA vs SNc is not discussed in the introduction.
Emphasis for comparing the VTA vs the SNc has been added to the introduction, along with additional electrophysiology data in VTA dopamine neurons in Figure 1 and Figure S2.
(3) In a similar direction, the vulnerability of dopaminergic neurons has been shown to be differential even within the SNc, with the ventral tier neurons degenerating more severely and the dorsal tier neurons remaining resilient. Is there any evidence for a ventral-dorsal degeneration gradient in the SNc in these experiments?
This is a really interesting point, and changes to dopamine neuron subtypes along the ventral-dorsal axis may be occurring in this model, particularly as there is more selective loss of SNc neurons. However, the cell type involved would be difficult to determine at this stage, since single cell transcriptomic resolution is necessary across the entire SNc to identify cell subtypes. Transcriptomic identification is further complicated given that transcriptome change has recently been shown with genetic manipulation (Gaertner et al., bioRxiv 2024, PMID: 38895448), and we would think could similarly change with increased activity. Assessing these issues is beyond the scope of this paper.
(4) The running data is very interesting and the circadian rhythm alterations are compelling. However, it is unclear whether the CNO mice run more total compared with the vehicle mice. The authors should show the combined total running data to evaluate this.
We now show total running data in Figure 1C.
(5) The finding that acute CNO has no effect on the membrane potential of SNc neurons after chronic CNO exposure is very peculiar! Especially because the fiber photometry data suggests that CNO continues to have an effect in vivo. Is there any explanation for this?
While there is no acute electrophysiological response to CNO detected in this group, there may be intracellular pathways activated by the DREADD that do not acutely impact membrane potential in current clamp (I = 0 pA) mode.
(6) The terminology of chronic CNO is sometimes confusing as it refers to both 2-week and 4-week administration. Using additional terminology such as 'early' and 'late' might help with clarity.
We have decreased usage of ‘chronic,’ and increased usage of more specific treatment times in order to increase clarity throughout the manuscript.
(7) In Figure 2C, the SNc image looks binarized.
This image has been updated.
(8) Also in Figure 2, why are TH and mCherry measured for the 4-week time point, but only TH measured for the 2-week time point?
mCherry quantification was performed to further support the finding of DA neuron death, and was therefore not assessed at 2 weeks given that there was no change in the TH stereology.
(9) Additional scale bars and labeling are needed in Figure 3. In addition, there is such a strong reduction in noise after chronic CNO in the fiber photometry recordings, and the noise does not return upon CNO washout. What is the explanation for this?
Additional scale bars were added to Figure 3. Traces are not getting less noisy with chronic CNO treatment, rather, there is less bursting activity in the dopamine cells. Our interpretation is that the baseline activity is rescued during washout but this bursting activity is not.
(10) While not necessary to support the claims in this paper, it would be very interesting to see if chronic inhibition of dopaminergic neurons had a similar or different effect, as too little dopaminergic activity may also cause degeneration in some cases.
We agree that assessing chronic inhibition is valuable, and this is an important area for future research.
Reviewer #3 (Recommendations For The Authors):
Not all the mice used in the study are listed in the methods section. For example, the GCaMP6f floxed mice discussed in the results section are not listed in the methods. Also, the breeding scheme used for the different mouse lines needs to be described. For example, did the DAT-Cre mice carry one or two alleles?
Both the DAT-IRES-Cre and GCaMP6f floxed (Ai148) Jax mouse line numbers and RRIDs are included in the methods. DAT-IRES-Cre mice carried two alleles.
In the methods section, the amount of virus injected needs to be mentioned.
This information has been added to the methods section.
In all result graphs, please include the individual data points so that the readers can see the distribution of the data and quickly see the sample size.
Graphs have been updated to include all individual data points. For line graphs, the distribution is communicated by the error bars, while the n is in the legends.
The authors provide running wheel data in supplementary figure 1A to validate that chemogenetic activation of dopamine neurons leads to increased locomotor activity. The results shown in the figure appear to be qualitative as no average data is presented. The authors should provide average data from all mice tested.
Average IP response data for all mice assessed for running wheel activity has been included in Figure S1.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife assessment
In this manuscript, Rademacher and colleagues examined the effect of chronically stimulating dopamine neurons, using a chemogenetic approach, on the integrity of the dopamine system in mice. These findings are important: (1) This approach led to an axon-first degeneration over an experimentally useful time course (2-4 weeks); (2) the finding that direct excitation of dopaminergic neurons causes differential degeneration sheds light on dopaminergic neuron selective vulnerability mechanisms. Overall, the strength of the evidence is solid, but the behavior experiments that do not include a CNO control provide incomplete support for the findings.
-
Reviewer #1 (Public Review):
Summary:
In this manuscript, the authors investigated the effect of chronic activation of dopamine neurons using chemogenetics. Using Gq-DREADDs, the authors chronically activated midbrain dopamine neurons and observed that these neurons, particularly their axons, exhibit increased vulnerability and degeneration, resembling the pathological symptoms of Parkinson's disease. Baseline calcium levels in midbrain dopamine neurons were also significantly elevated following the chronic activation. Lastly, to identify cellular and circuit-level changes in response to dopaminergic neuronal degeneration caused by chronic activation, the authors employed spatial genomics (Visium) and revealed comprehensive changes in gene expression in the mouse model subjected to chronic activation. In conclusion, this study presents novel data on the consequences of chronic hyperactivation of midbrain dopamine neurons.
Strengths:
This study provides direct evidence that the chronic activation of dopamine neurons is toxic and gives rise to neurodegeneration. In addition, the authors achieved the chronic activation of dopamine neurons using water application of clozapine-N-oxide (CNO), a method not commonly employed by researchers. This approach may offer new insights into pathophysiological alterations of dopamine neurons in Parkinson's disease. The authors also utilized state-of-the-art spatial gene expression analysis, which can provide valuable information for other researchers studying dopamine neurons. Although the authors did not elucidate the mechanisms underlying dopaminergic neuronal and axonal death, they presented a substantial number of intriguing ideas in their discussion, which are worth further investigation.
Weaknesses:
Many claims raised in this paper are only partially supported by the experimental results. So, additional data are necessary to strengthen the claims. The effects of chronic activation of dopamine neurons are intriguing; however, this paper does not go beyond reporting phenomena. It lacks a comprehensive explanation for the degeneration of dopamine neurons and their axons. While the authors proposed possible mechanisms for the degeneration in their discussion, such as differentially expressed genes, these remain experimentally unexplored.
-
Reviewer #2 (Public Review):
Summary:
Rademacher et al. present a paper showing that chronic chemogenetic excitation of dopaminergic neurons in the mouse midbrain results in differential degeneration of axons and somas across distinct regions (SNc vs VTA). These findings are important. This mouse model also has the advantage of showing an axon-first degeneration over an experimentally useful time course (2-4 weeks). The finding that direct excitation of dopaminergic neurons causes differential degeneration sheds light on the mechanisms of dopaminergic neuron selective vulnerability. The evidence that activation of dopaminergic neurons causes degeneration and alters mRNA expression is convincing, as the authors use both vehicle and CNO control groups, but the evidence that chronic dopaminergic activation alters circadian rhythm and motor behavior is incomplete as the authors did not run a CNO-control condition in these experiments.
Strengths:
This is an exciting and important paper.
The paper compares mouse transcriptomics with human patient data.
It shows that selective degeneration can occur across the midbrain dopaminergic neurons even in the absence of a genetic, prion, or toxin neurodegeneration mechanism.
Weaknesses:
Major concerns:
(1) The lack of a CNO-positive, DREADD-negative control group in the behavioral experiments is the main limitation in interpreting the behavioral data. Without knowing whether CNO on its own has an impact on circadian rhythm or motor activity, the certainty that dopaminergic hyperactivity is causing these effects is lacking.
(2) One of the most exciting things about this paper is that the SNc degenerates more strongly than the VTA when both regions are, in theory, excited to the same extent. However, it is not perfectly clear that both regions respond to CNO to the same extent. The electrophysiological data showing CNO responsiveness is only conducted in the SNc. If the VTA response is significantly reduced vs the SNc response, then the selectivity of the SNc degeneration could just be because the SNc was more hyperactive than the VTA. Electrophysiology experiments comparing the VTA and SNc response to CNO could support the idea that the SNc has substantial intrinsic vulnerability factors compared to the VTA.
(3) The mice have access to a running wheel for the circadian rhythm experiments. Running has been shown to alter the dopaminergic system (Bastioli et al., 2022) and so the authors should clarify whether the histology, electrophysiology, fiber photometry, and transcriptomics data are conducted on mice that have been running or sedentary.
-
Reviewer #3 (Public Review):
Summary:
In this manuscript, Rademacher and colleagues examined the effect on the integrity of the dopamine system in mice of chronically stimulating dopamine neurons using a chemogenetic approach. They find that one to two weeks of constant exposure to the chemogenetic activator CNO leads to a decrease in the density of tyrosine hydroxylase staining in striatal brain sections and to a small reduction of the global population of tyrosine hydroxylase positive neurons in the ventral midbrain. They also report alterations in gene expression in both regions using a spatial transcriptomics approach. Globally, the work is well done and valuable and some of the conclusions are interesting. However, the conceptual advance is perhaps a bit limited in the sense that there is extensive previous work in the literature showing that excessive depolarization of multiple types of neurons associated with intracellular calcium elevations promotes neuronal degeneration. The present work adds to this by showing evidence of a similar phenomenon in dopamine neurons. In terms of the mechanisms explaining the neuronal loss observed after 2 to 4 weeks of chemogenetic activation, it would be important to consider that dopamine neurons are known from a lot of previous literature to undergo a decrease in firing through a depolarization-block mechanism when chronically depolarized. Is it possible that such a phenomenon explains much of the results observed in the present study? It would be important to consider this in the manuscript. The relevance to Parkinson's disease (PD) is also not totally clear because there is not a lot of previous solid evidence showing that the firing of dopamine neurons is increased in PD, either in human subjects or in mouse models of the disease. As such, it is not clear if the present work is really modelling something that could happen in PD in humans.
Comments on the introduction:
The introduction cites a 1990 paper from the lab of Anthony Grace as support of the fact that DA neurons increase their firing rate in PD models. However, in this 1990 paper, the authors stated that: "With respect to DA cell activity, depletions of up to 96% of striatal DA did not result in substantial alterations in the proportion of DA neurons active, their mean firing rate, or their firing pattern. Increases in these parameters only occurred when striatal DA depletions exceeded 96%." Such results argue that an increase in firing rate is most likely to be a consequence of the almost complete loss of dopamine neurons rather than an initial driver of neuronal loss. The present introduction would thus benefit from being revised to clarify the overriding hypothesis and rationale in relation to PD and better represent the findings of the paper by Hollerman and Grace.
It would be good that the introduction refers to some of the literature on the links between excessive neuronal activity, calcium, and neurodegeneration. There is a large literature on this and referring to it would help frame the work and its novelty in a broader context.
Comments on the results section:
The running wheel results of Figure 1 suggest that the CNO treatment caused a brief increase in running on the first day after which there was a strong decrease during the subsequent days in the active phase. This observation is also in line with the appearance of a depolarization block.
The authors examined many basic electrophysiological parameters of recorded dopamine neurons in acute brain slices. However, it is surprising that they did not report the resting membrane potential, or the input resistance. It would be important that this be added because these two parameters provide key information on the basal excitability of the recorded neurons. They would also allow us to obtain insight into the possibility that the neurons are chronically depolarized and thus in depolarization block.
It is great that the authors quantified not only TH levels but also the levels of mCherry, co-expressed with the chemogenetic receptor. This could in principle help to distinguish between TH downregulation and true loss of dopamine neuron cell bodies. However, the approach used here has a major caveat in that the number of mCherry-positive dopamine neurons depends on the proportion of dopamine neurons that were infected and expressed the DREADD, and this could very well vary between different mice. It is very unlikely that the virus injection infected 100% of the neurons in the VTA and SNc. This could for example explain in part the mismatch between the number of VTA dopamine neurons counted in panel 2G when comparing TH and mCherry counts. Also, I see that the mCherry counts were not provided at the 2-week time point. If the mCherry had been expressed genetically by crossing the DAT-Cre mice with a floxed fluorescent reporter mouse line, the interpretation would have been simpler. In this context, I am not convinced of the benefit of the mCherry quantifications. The authors should consider either removing these results from the final manuscript or discussing this important limitation.
Although the authors conclude that there is a global decrease in the number of dopamine neurons after 4 weeks of CNO treatment, the post-hoc tests failed to confirm that the decrease in dopamine neuron number was significant in the SNc, the region most relevant to Parkinson's. This could be due to the fact that only a small number of mice were tested. An "n" of just 4 or 5 mice is very small for a stereological counting experiment. As such, this experiment was clearly underpowered at the statistical level. Also, the choice of the image used to illustrate this in panel 2G should be reconsidered: the image suggests that a very large loss of dopamine neurons occurred in the SNc and this is not what the numbers show. A more representative image should be used.
In Figure 3, the authors attempt to compare intracellular calcium levels in dopamine neurons using GCaMP6 fluorescence. Because this calcium indicator is not quantitative (unlike ratiometric sensors such as Fura2), it is usually used to quantify relative changes in intracellular calcium. The present use of this probe to compare absolute values is unusual and the validity of this approach is unclear. This limitation needs to be discussed. The authors also need to refer in the text to the difference between panels D and E of this figure. It is surprising that the fluctuations in calcium levels were not quantified. I guess the hypothesis was that there should be more or larger fluctuations in the mice treated with CNO if the CNO treatment led to increased firing. This needs to be clarified.
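The distinction between relative and absolute readouts can be made concrete with a minimal Python sketch of the conventional ΔF/F normalization for a single-wavelength indicator. The function name and the baseline-window convention are assumptions for illustration, not the analysis used in the manuscript.

```python
import statistics


def delta_f_over_f(fluorescence, baseline_window):
    """Relative fluorescence change (F - F0) / F0.

    F0 is the mean of the samples in the half-open index range baseline_window.
    """
    start, stop = baseline_window
    f0 = statistics.fmean(fluorescence[start:stop])
    return [(f - f0) / f0 for f in fluorescence]
```

Because each recording is divided by its own baseline, factors such as indicator expression level and fiber placement largely cancel, but any difference in absolute baseline between animals or sessions is removed as well. Comparing baseline levels across groups therefore relies on the raw signal, which is where the caveat about a non-ratiometric probe applies.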
Although the spatial transcriptomic results are intriguing and certainly a great way to start thinking about how the CNO treatment could lead to the loss of dopamine neurons, the presented results, which focus on some broad classes of differentially expressed genes and on some specific examples, do not really suggest any clear mechanism of neurodegeneration. It would perhaps be useful for the authors to use the obtained data to validate that a state of chronic depolarization was indeed induced by the chronic CNO treatment. Were genes classically linked to increased activity like cfos or bdnf elevated in the SNc or VTA dopamine neurons? In the striatum, the authors report that the levels of DARPP-32, a gene whose levels are linked to dopamine levels, are unchanged. Does this mean that there were no major changes in dopamine levels in the striatum of these mice?
The usefulness of comparing the transcriptome of human PD SNc or VTA sections to that of the present mouse model should be better explained. In the human tissues, the transcriptome reflects the state of the tissue many years after extensive loss of dopamine neurons. It is expected that there will be few if any SNc neurons left in such sections. In comparison, the mice after 7 days of CNO treatment do not appear to have lost any dopamine neurons. As such, how can the two extremely different conditions be reasonably compared?
Comments on the discussion:
In the discussion, the authors state that their calcium photometry results support a central role of calcium in activity-induced neurodegeneration. This conclusion, although plausible because of the very broad pre-existing literature linking calcium elevation (such as in excitotoxicity) to neuronal loss, should be toned down a bit as no causal relationship was established in the experiments that were carried out in the present study.
In the discussion, the authors discuss some of the parallel changes in gene expression detected in the mouse model and in the human tissues. Because few if any dopamine neurons are expected to remain in the SNc of the human tissues used, this sort of comparison has important conceptual limitations and these need to be clearly addressed.
A major limitation of the present discussion is that it does not discuss the possibility that the observed phenotypes are caused by the induction of a chronic state of depolarization block by the chronic CNO treatment. I encourage the authors to consider and discuss this hypothesis. Also, the authors need to discuss the fact that previous work was only able to detect an increase in the firing rate of dopamine neurons after more than 95% loss of dopamine neurons. As such, the authors need to clearly discuss the relevance of the present model to PD. Are changes in firing rate a driver of neuronal loss in PD, as the authors try to make the case here, or are such changes only a secondary consequence of extensive neuronal loss (for example because a major loss of dopamine would lead to reduced D2 autoreceptor activation in the remaining neurons, and to reduced autoreceptor-mediated negative feedback on firing). This needs to be discussed.
There is a very large, multi-decade literature on calcium elevation and its effects on neuronal loss in many different types of neurons. The authors should discuss their findings in this context and refer to some of this previous work. In a nutshell, the observations of the present manuscript could be summarized by stating that the chronic membrane depolarization induced by the CNO treatment is likely to induce a chronic elevation of intracellular calcium and this is then likely to activate some of the well-known calcium-dependent cell death mechanisms. Whether such cell death is linked in any way to PD is not really demonstrated by the present results.
The authors are encouraged to perform a thorough revision of the discussion to address all of these issues, discuss the major limitations of the present model, and refer to the broad pre-existing literature linking membrane depolarization, calcium, and neuronal loss in many neuronal cell types.
-
Author response:
Reviewer #1 (Public Review):
Summary:
In this manuscript, the authors investigated the effect of chronic activation of dopamine neurons using chemogenetics. Using Gq-DREADDs, the authors chronically activated midbrain dopamine neurons and observed that these neurons, particularly their axons, exhibit increased vulnerability and degeneration, resembling the pathological symptoms of Parkinson's disease. Baseline calcium levels in midbrain dopamine neurons were also significantly elevated following the chronic activation. Lastly, to identify cellular and circuit-level changes in response to dopaminergic neuronal degeneration caused by chronic activation, the authors employed spatial genomics (Visium) and revealed comprehensive changes in gene expression in the mouse model subjected to chronic activation. In conclusion, this study presents novel data on the consequences of chronic hyperactivation of midbrain dopamine neurons.
Strengths:
This study provides direct evidence that the chronic activation of dopamine neurons is toxic and gives rise to neurodegeneration. In addition, the authors achieved the chronic activation of dopamine neurons using water application of clozapine-N-oxide (CNO), a method not commonly employed by researchers. This approach may offer new insights into pathophysiological alterations of dopamine neurons in Parkinson's disease. The authors also utilized state-of-the-art spatial gene expression analysis, which can provide valuable information for other researchers studying dopamine neurons. Although the authors did not elucidate the mechanisms underlying dopaminergic neuronal and axonal death, they presented a substantial number of intriguing ideas in their discussion, which are worth further investigation.
We thank the reviewer for these positive comments.
Weaknesses:
Many claims raised in this paper are only partially supported by the experimental results. So, additional data are necessary to strengthen the claims. The effects of chronic activation of dopamine neurons are intriguing; however, this paper does not go beyond reporting phenomena. It lacks a comprehensive explanation for the degeneration of dopamine neurons and their axons. While the authors proposed possible mechanisms for the degeneration in their discussion, such as differentially expressed genes, these remain experimentally unexplored.
We thank the reviewer for this review. We do believe that the manuscript has a mechanistic component, as the central experiments involve direct manipulation of neuronal activity, and we show an increase in calcium levels and gene expression changes in dopamine neurons that coincide with the degeneration. However, we agree that deeper mechanistic investigation would strengthen the conclusions of the paper. We have planned several important revisions, including the addition of CNO behavioral controls, manipulation of intracellular calcium using isradipine, additional transcriptomics experiments and further validation of findings. We anticipate that these additions will significantly bolster the conclusions of the paper.
Reviewer #2 (Public Review):
Summary:
Rademacher et al. present a paper showing that chronic chemogenetic excitation of dopaminergic neurons in the mouse midbrain results in differential degeneration of axons and somas across distinct regions (SNc vs VTA). These findings are important. This mouse model also has the advantage of showing an axon-first degeneration over an experimentally useful time course (2-4 weeks). The finding that direct excitation of dopaminergic neurons causes differential degeneration sheds light on the mechanisms of dopaminergic neuron selective vulnerability. The evidence that activation of dopaminergic neurons causes degeneration and alters mRNA expression is convincing, as the authors use both vehicle and CNO control groups, but the evidence that chronic dopaminergic activation alters circadian rhythm and motor behavior is incomplete as the authors did not run a CNO-control condition in these experiments.
Strengths:
This is an exciting and important paper.
The paper compares mouse transcriptomics with human patient data.
It shows that selective degeneration can occur across the midbrain dopaminergic neurons even in the absence of a genetic, prion, or toxin neurodegeneration mechanism.
We thank the reviewer for these insightful comments.
Weaknesses:
Major concerns:
(1) The lack of a CNO-positive, DREADD-negative control group in the behavioral experiments is the main limitation in interpreting the behavioral data. Without knowing whether CNO on its own has an impact on circadian rhythm or motor activity, the certainty that dopaminergic hyperactivity is causing these effects is lacking.
This is an important point. Although we show that CNO does not produce degeneration of DA neuron terminals, we do not exclude a contribution to the behavioral changes. We agree that this behavioral control is necessary, and will address it in revision with a CNO-only running wheel cohort.
(2) One of the most exciting things about this paper is that the SNc degenerates more strongly than the VTA when both regions are, in theory, excited to the same extent. However, it is not perfectly clear that both regions respond to CNO to the same extent. The electrophysiological data showing CNO responsiveness is only conducted in the SNc. If the VTA response is significantly reduced vs the SNc response, then the selectivity of the SNc degeneration could just be because the SNc was more hyperactive than the VTA. Electrophysiology experiments comparing the VTA and SNc response to CNO could support the idea that the SNc has substantial intrinsic vulnerability factors compared to the VTA.
We agree that additional electrophysiology conducted in the VTA dopamine neurons would meaningfully add to our understanding of the selective vulnerability in this model, and will complete these experiments in revision.
(3) The mice have access to a running wheel for the circadian rhythm experiments. Running has been shown to alter the dopaminergic system (Bastioli et al., 2022) and so the authors should clarify whether the histology, electrophysiology, fiber photometry, and transcriptomics data are conducted on mice that have been running or sedentary.
We will explicitly clarify which mice had access to a running wheel in our revision. Briefly, mice for histology, electrophysiology, and transcriptomics all had access to a running wheel during their treatment. The mice used for photometry underwent about 7 days of running wheel access approximately 3 weeks prior to the beginning of the experiment. The photometry headcaps sterically prevented mice from having access to a running wheel in their home cage.
Reviewer #3 (Public Review):
Summary:
In this manuscript, Rademacher and colleagues examined the effect on the integrity of the dopamine system in mice of chronically stimulating dopamine neurons using a chemogenetic approach. They find that one to two weeks of constant exposure to the chemogenetic activator CNO leads to a decrease in the density of tyrosine hydroxylase staining in striatal brain sections and to a small reduction of the global population of tyrosine hydroxylase positive neurons in the ventral midbrain. They also report alterations in gene expression in both regions using a spatial transcriptomics approach. Globally, the work is well done and valuable and some of the conclusions are interesting. However, the conceptual advance is perhaps a bit limited in the sense that there is extensive previous work in the literature showing that excessive depolarization of multiple types of neurons associated with intracellular calcium elevations promotes neuronal degeneration. The present work adds to this by showing evidence of a similar phenomenon in dopamine neurons.
We thank the reviewer for the careful and thoughtful review of our manuscript.
While extensive depolarization and the associated intracellular calcium elevations promote degeneration generally, we emphasize that the process we describe is novel. Indeed, prior studies delivering chronic DREADDs to vulnerable neurons in models of Alzheimer’s disease did not report an increase in neurodegeneration, despite seeing changes in protein aggregation (e.g. Yuan and Grutzendler, J Neurosci 2016, PMID: 26758850; Hussaini et al., PLOS Bio 2020, PMID: 32822389). Further, a critical finding from our study is that in our paradigm, this stressor does not impact all dopamine neurons equally, as the SNc DA neurons are more vulnerable than the VTA, mirroring selective vulnerability characteristic of Parkinson’s disease. This is consistent with a large body of literature showing that SNc dopamine neurons are less capable of handling large energetic and calcium loads compared to neighboring VTA neurons, and the finding that chronically altered activity is sufficient to drive this preferential loss is novel.
In addition, we are not aware of prior studies that have chronically activated DREADDs to produce neurodegeneration. Other studies have shown that acute excitotoxic stressors can produce neuronal degeneration, but the chronic increase in activity is central to our approach.
In terms of the mechanisms explaining the neuronal loss observed after 2 to 4 weeks of chemogenetic activation, it would be important to consider that dopamine neurons are known from a lot of previous literature to undergo a decrease in firing through a depolarization-block mechanism when chronically depolarized. Is it possible that such a phenomenon explains much of the results observed in the present study? It would be important to consider this in the manuscript.
As discussed in greater detail in the results section below, our data suggests this may not be a prominent feature in our model. However, we cannot rule out a contribution of depolarization block, and will expand on the discussion of this possibility in the revised manuscript.
The relevance to Parkinson's disease (PD) is also not totally clear because there is not a lot of previous solid evidence showing that the firing of dopamine neurons is increased in PD, either in human subjects or in mouse models of the disease. As such, it is not clear if the present work is really modelling something that could happen in PD in humans.
We completely agree that evidence of increased dopamine neuron activity from human PD patients is lacking and the existing data are difficult to interpret without human controls. However, as we outline in the manuscript, multiple lines of evidence suggest that the activity level of dopamine neurons almost certainly does change in PD. Therefore, it is very important that we understand how changes in the level of neural activity influence the degeneration of DA neurons. In this paper we examine the impact of increased activity. Increased activity may be compensatory after initial dopamine neuron loss, or may be an initial driver of death (Rademacher & Nakamura, Exp Neurol 2024, PMID: 38092187). Beyond what is already discussed in the manuscript, additional support for increased activity in PD models includes:
- Elevated firing rates in asymptomatic MitoPark mice (Good et al., FASEB J 2011, PMID: 21233488)
- Increased frequency of spontaneous firing in patient-derived iPSC dopamine neurons and primary mouse dopamine neurons that overexpress synuclein (Lin et al., Acta Neuropath Comm 2021, PMID: 34099060)
- Increased spontaneous firing in dopamine neurons of rats injected with synuclein preformed fibrils compared to sham (Tozzi et al., Brain 2021, PMID: 34297092)
We will include and further discuss these important examples in our revision.
Similarly, in future studies, it will also be important to study the impact of decreasing DA neuron activity. There will be additional levels of complexity to accurately model changes in PD, which may differ between subtypes of the disease, the disease stage, and the subtype of dopamine neuron. Our study models the possibility of chronically increased pacemaking, and interpretation of our results will be informed as we learn more about how the activity of DA neurons changes in humans in PD. We will discuss and elaborate on these important points in the revision.
Comments on the introduction:
The introduction cites a 1990 paper from the lab of Anthony Grace as support of the fact that DA neurons increase their firing rate in PD models. However, in this 1990 paper, the authors stated that: "With respect to DA cell activity, depletions of up to 96% of striatal DA did not result in substantial alterations in the proportion of DA neurons active, their mean firing rate, or their firing pattern. Increases in these parameters only occurred when striatal DA depletions exceeded 96%." Such results argue that an increase in firing rate is most likely to be a consequence of the almost complete loss of dopamine neurons rather than an initial driver of neuronal loss. The present introduction would thus benefit from being revised to clarify the overriding hypothesis and rationale in relation to PD and better represent the findings of the paper by Hollerman and Grace.
We agree that the findings of Hollerman and Grace support compensatory changes in dopamine neuron activity in response to loss of dopamine neurons, rather than informing whether increased activity can also be an initial driver of dopamine neuron loss. We will clarify this point in our revision. In addition, the results of other studies on this point are mixed: a 50% reduction in dopamine neurons didn’t alter firing rate or bursting (Harden and Grace, J Neurosci 1995, PMID: 7666198; Bilbao et al, Brain Res 2006, PMID: 16574080), while a 40% loss was found to increase firing rate and bursting (Chen et al, Brain Res 2009, PMID: 19545547) and larger reductions alter burst firing (Hollerman & Grace, Brain Res 1990, PMID: 2126975; Stachowiak et al, J Neurosci 1987, PMID: 3110381). Importantly, even if compensatory, such late-stage increases in dopamine neuron activity may contribute to disease progression and drive a vicious cycle of degeneration in surviving neurons. In addition, we also don’t know how the threshold of dopamine neuron loss and altered activity may differ between mice and humans, and PD patients do not present with clinical symptoms until ~30-60% of nigral neurons are lost (Burke & O’Malley, Exp Neurol 2013, PMID: 22285449; Shulman et al, Annu Rev Pathol 2011, PMID: 21034221).
Other lines of evidence support the potential role of hyperactivity in disease initiation, including increased activity before dopamine neuron loss in MitoPark mice (Good et al., FASEB J 2011, PMID: 21233488), increased spontaneous firing in patient-derived iPSC dopamine neurons (Lin et al., Acta Neuropath Comm 2021, PMID: 34099060), and increased activity observed in genetic models of PD (Bishop et al., J Neurophysiol 2010, PMID: 20926611; Regoni et al., Cell Death Dis 2020, PMID: 33173027).
It would be good that the introduction refers to some of the literature on the links between excessive neuronal activity, calcium, and neurodegeneration. There is a large literature on this and referring to it would help frame the work and its novelty in a broader context.
We agree that a discussion of hyperactivity, calcium, and neurodegeneration would benefit the introduction. While we briefly discuss calcium and neurodegeneration in the discussion, we will expand on this literature in both the introduction and discussion sections. We will carefully review and contextualize our work within existing frameworks of calcium and neurodegeneration (e.g. Surmeier & Schumacker, J Biol Chem 2013, PMID: 23086948; Verma et al., Transl Neurodegener 2022, PMID: 35078537). We believe that the novelty of our study lies in 1) a chronic chemogenetic activation paradigm via drinking water, 2) demonstrating selective vulnerability of dopamine neurons as a result of altering their activity/excitability alone, and 3) comparing mouse and human spatial transcriptomics.
Comments on the results section:
The running wheel results of Figure 1 suggest that the CNO treatment caused a brief increase in running on the first day after which there was a strong decrease during the subsequent days in the active phase. This observation is also in line with the appearance of a depolarization block.
The authors examined many basic electrophysiological parameters of recorded dopamine neurons in acute brain slices. However, it is surprising that they did not report the resting membrane potential, or the input resistance. It would be important that this be added because these two parameters provide key information on the basal excitability of the recorded neurons. They would also allow us to obtain insight into the possibility that the neurons are chronically depolarized and thus in depolarization block.
We do report the input resistance in Supplemental Figure 1C, which was unchanged in CNO-treated animals compared to controls. We did not report the resting membrane potential because many of the DA neurons were spontaneously firing. However, we will report the initial membrane potential on first breaking into the cell for the whole cell recordings in the revision, which did not vary between groups. This is still influenced by action potential activity, but is the timepoint in the recording least impacted by dialyzing of the neuron by the internal solution. We observed increased spontaneous action potential activity ex vivo in slices from CNO-treated mice (Figure 1D), thus at least under these conditions these dopamine neurons are not in depolarization block. We also did not see strong evidence of changes in other intrinsic properties of the neurons with whole cell recordings (e.g. Figure S1C). Overall, our electrophysiology experiments are not consistent with the depolarization block model, at least not due to changes in the intrinsic properties of the neurons. Although our ex vivo findings cannot exclude a contribution of depolarization block in vivo, we do show that CNO-treated mice removed from their cages for open field testing continue to have a strong trend for increased activity for approximately 10 days (S1E). This finding is also consistent with increased activity of the DA neurons. We will add discussion of these important considerations in the revision.
It is great that the authors quantified not only TH levels but also the levels of mCherry, co-expressed with the chemogenetic receptor. This could in principle help to distinguish between TH downregulation and true loss of dopamine neuron cell bodies. However, the approach used here has a major caveat in that the number of mCherry-positive dopamine neurons depends on the proportion of dopamine neurons that were infected and expressed the DREADD, and this could very well vary between different mice. It is very unlikely that the virus injection infected 100% of the neurons in the VTA and SNc. This could, for example, explain in part the mismatch between the number of VTA dopamine neurons counted in panel 2G when comparing TH and mCherry counts. Also, I see that the mCherry counts were not provided at the 2-week time point. If the mCherry had been expressed genetically by crossing the DAT-Cre mice with floxed fluorescent reporter mice, the interpretation would have been simpler. In this context, I am not convinced of the benefit of the mCherry quantifications. The authors should consider either removing these results from the final manuscript or discussing this important limitation.
We thank the reviewer for this insightful comment, and we agree that this is a caveat of our mCherry quantification. Quantitation of the number of mCherry+ DA neurons specifically informs the impact on transduced DA neurons, and mCherry appears to be less susceptible to downregulation versus TH. As the reviewer points out, it carries the caveat that there is some variability between injections. Nonetheless, we believe that it conveys useful complementary data. As suggested, we will discuss this caveat in our revision. Note that mCherry was not quantified at the two-week timepoint because there is no loss of TH+ cells at that time.
Although the authors conclude that there is a global decrease in the number of dopamine neurons after 4 weeks of CNO treatment, the post-hoc tests failed to confirm that the decrease in dopamine number was significant in the SNc, the region most relevant to Parkinson's. This could be due to the fact that only a small number of mice were tested. A "n" of just 4 or 5 mice is very small for a stereological counting experiment. As such, this experiment was clearly underpowered at the statistical level. Also, the choice of the image used to illustrate this in panel 2G should be reconsidered: the image suggests that a very large loss of dopamine neurons occurred in the SNc and this is not what the numbers show. A more representative image should be used.
We agree that the stereology experiments were performed on relatively small numbers of animals. Combined with the small effect size, this may have contributed to the post-hoc tests showing a trend of p=0.1 for both the TH and mCherry dopamine cell counts in the SN at 4 weeks. As part of the planned experiments for our revision, we will perform an additional stereologic analysis to further assess the loss of SNc dopamine neurons. We will also review and ensure the images are representative.
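To make the reviewer's concern about group size concrete, the sketch below uses the standard normal approximation for the power of a two-sided, two-group comparison. It is only an illustration: the standardized effect size (`HYPOTHETICAL_D`) is an assumed placeholder, not a value reported in the manuscript, and the function names are invented for this example. The normal approximation is also optimistic at very small n, so a t-based or simulation-based calculation would be preferable for actual planning.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Normal-approximation power of a two-sided comparison of two equal groups."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected z statistic for a standardized mean difference d.
    shift = effect_size_d * math.sqrt(n_per_group / 2)
    # Probability of exceeding the critical value in either tail.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def approx_n_per_group(effect_size_d, power=0.8, alpha=0.05):
    """Smallest n per group reaching the target power under the same approximation."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) / effect_size_d) ** 2)


# Hypothetical effect size (Cohen's d); NOT taken from the study.
HYPOTHETICAL_D = 1.0

for n in (4, 5, 10, 16):
    print(f"n per group = {n:2d}, approximate power = {approx_power(HYPOTHETICAL_D, n):.2f}")
print("n per group for 80% power:", approx_n_per_group(HYPOTHETICAL_D))
```

Under these assumptions, groups of 4 or 5 animals leave power well below the conventional 80% target even for a large standardized effect, which is in line with the reviewer's description of the experiment as underpowered.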
In Figure 3, the authors attempt to compare intracellular calcium levels in dopamine neurons using GCaMP6 fluorescence. Because this calcium indicator is not quantitative (unlike ratiometric sensors such as Fura2), it is usually used to quantify relative changes in intracellular calcium. The present use of this probe to compare absolute values is unusual and the validity of this approach is unclear. This limitation needs to be discussed. The authors also need to refer in the text to the difference between panels D and E of this figure. It is surprising that the fluctuations in calcium levels were not quantified. I guess the hypothesis was that there should be more or larger fluctuations in the mice treated with CNO if the CNO treatment led to increased firing. This needs to be clarified.
We thank the reviewer for this comment. We understand that this method of comparing absolute values is unconventional. However, these animals were tested concurrently on the same system, and a clear effect on the absolute baseline was observed. We will include this caveat in our discussion. Panel D of this figure shows the raw, uncorrected photometry traces, whereas panel E shows the isosbestic-corrected traces for the same recording. In panel E, the traces follow time in ascending order. We will also include frequency and amplitude data for these recordings.
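For readers unfamiliar with the distinction between the raw traces in panel D and the corrected traces in panel E, the following is a minimal sketch of one common form of isosbestic correction: the isosbestic channel is fit to the calcium-dependent channel by least squares, and the fitted trace serves as the baseline for dF/F. The response does not state which procedure was used, so the function names, the linear fit, and the synthetic traces are illustrative assumptions rather than the authors' pipeline.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def isosbestic_correct(signal, isosbestic):
    """Return dF/F = (signal - fitted isosbestic) / fitted isosbestic."""
    slope, intercept = fit_line(isosbestic, signal)
    fitted = [slope * v + intercept for v in isosbestic]
    return [(s - f) / f for s, f in zip(signal, fitted)]


# Synthetic demonstration with made-up numbers (not data from the study).
random.seed(0)
t = [i / 20 for i in range(2000)]                  # 100 s sampled at 20 Hz
bleach = [math.exp(-ti / 200) for ti in t]         # slow decay shared by both channels
calcium = [0.05 * math.sin(2 * math.pi * 0.1 * ti) for ti in t]  # 5% oscillation
isosbestic = [b + random.gauss(0, 0.001) for b in bleach]        # calcium-independent channel
signal = [2.0 * b * (1 + c) for b, c in zip(bleach, calcium)]    # calcium-dependent channel

dff = isosbestic_correct(signal, isosbestic)
print(f"peak dF/F after correction: {max(dff):.3f}")
```

Because this normalization expresses the signal relative to a fitted baseline, it is best suited to relative changes. That is why a comparison of absolute baseline fluorescence between animals, as debated here, rests on the uncorrected traces and needs the caveat the authors mention.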
Although the spatial transcriptomic results are intriguing and certainly a great way to start thinking about how the CNO treatment could lead to the loss of dopamine neurons, the presented results, which focus on some broad classes of differentially expressed genes and on some specific examples, do not really suggest any clear mechanism of neurodegeneration. It would perhaps be useful for the authors to use the obtained data to validate that a state of chronic depolarization was indeed induced by the chronic CNO treatment. Were genes classically linked to increased activity like cfos or bdnf elevated in the SNc or VTA dopamine neurons? In the striatum, the authors report that the levels of DARPP-32, a gene whose levels are linked to dopamine levels, are unchanged. Does this mean that there were no major changes in dopamine levels in the striatum of these mice?
We will review the expression of activity-related genes in our dataset, although we must keep in mind that these genes may behave differently in the context of chronic activation as opposed to acutely increased activity. We will also include experiments assessing striatal dopamine levels by HPLC in the revision.
The usefulness of comparing the transcriptome of human PD SNc or VTA sections to that of the present mouse model should be better explained. In the human tissues, the transcriptome reflects the state of the tissue many years after extensive loss of dopamine neurons. It is expected that there will be few if any SNc neurons left in such sections. In comparison, the mice after 7 days of CNO treatment do not appear to have lost any dopamine neurons. As such, how can the two extremely different conditions be reasonably compared?
Our mouse model and human PD progress over distinct timescales, as is the case with essentially all mouse models of neurodegenerative diseases. Nonetheless, in our view there is still great value in comparing gene expression changes in mouse models with those in human disease. It seems very likely that the same pathologic processes that drive degeneration early in the disease continue to drive degeneration later in the disease. Note that we have tried to address the discrepancy in time scales in part by comparing to early PD samples when there is more limited SNc DA neuron loss. Please note the numbers of DA neurons within the areas we have selected for sampling (Author response image 1). Therefore, we can indeed use spatial transcriptomics to compare dopamine neurons from mice with initial degeneration and patients where degeneration is ongoing during their disease.
Author response image 1.
Violin plot of DA neuron proportions sampled within the vulnerable SNV (deconvoluted RCTD method used in unmasked tissue sections of the SNV).
Control and early PD subjects.
Comments on the discussion:
In the discussion, the authors state that their calcium photometry results support a central role of calcium in activity-induced neurodegeneration. This conclusion, although plausible because of the very broad pre-existing literature linking calcium elevation (such as in excitotoxicity) to neuronal loss, should be toned down a bit as no causal relationship was established in the experiments that were carried out in the present study.
Our model utilizes hM3Dq-DREADDs that function by increasing intracellular calcium to increase neuronal excitability, and our results show increased Ca2+ by fiber photometry and changes to Ca2+-related genes, strongly suggesting a causal relation and crucial role of calcium in the mechanism of degeneration. However, we agree that we have not experimentally proven this point, as we acknowledged in the text. Additionally, we have planned revision experiments involving chronic isradipine treatment to further test the role of calcium in the mechanism of degeneration in this model.
In the discussion, the authors discuss some of the parallel changes in gene expression detected in the mouse model and in the human tissues. Because few if any dopamine neurons are expected to remain in the SNc of the human tissues used, this sort of comparison has important conceptual limitations and these need to be clearly addressed.
As discussed, we can sample SN DA neurons in early PD (see figure above), and in our view there is great value for such comparisons. We agree that discussion of appropriate caveats is warranted and this will be clearly addressed in the revision.
A major limitation of the present discussion is that it does not discuss the possibility that the observed phenotypes are caused by the induction of a chronic state of depolarization block by the chronic CNO treatment. I encourage the authors to consider and discuss this hypothesis.
As discussed above, our analyses of DA neuron firing in slices and open field testing to date do not support a prominent contribution of depolarization block with chronic CNO treatment. However, we cannot rule out this hypothesis, therefore we will include additional electrophysiology experiments and add discussion of this important consideration.
Also, the authors need to discuss the fact that previous work was only able to detect an increase in the firing rate of dopamine neurons after more than 95% loss of dopamine neurons. As such, the authors need to clearly discuss the relevance of the present model to PD. Are changes in firing rate a driver of neuronal loss in PD, as the authors try to make the case here, or are such changes only a secondary consequence of extensive neuronal loss (for example because a major loss of dopamine would lead to reduced D2 autoreceptor activation in the remaining neurons, and to reduced autoreceptor-mediated negative feedback on firing). This needs to be discussed.
As discussed above, while increases in dopamine neuron activity may be compensatory after loss of neurons, the precise percentage required to induce such compensatory changes is not defined in mice and varies between paradigms, and the threshold level is not known in humans. We also reiterate that a compensatory increase in activity could still promote the degeneration of critical surviving DA neurons, whose loss underlies the substantial decline in motor function that typically occurs over the course of PD. Moreover, there are also multiple lines of evidence to suggest that changes in activity can initiate and drive dopamine neuron degeneration (Rademacher & Nakamura, Exp Neurol 2024). For example, overexpression of synuclein can increase firing in cultured dopamine neurons (Dagra et al., NPJ Parkinsons Dis 2021, PMID: 34408150) while mice expressing mutant Parkin have higher mean firing rates (Regoni et al., Cell Death Dis 2020, PMID: 33173027). Similarly, an increased firing rate has been reported in the MitoPark mouse model of PD at a time preceding DA neuron degeneration (Good et al., FASEB J 2011, PMID: 21233488). We also acknowledge that alterations to dopamine neuron activity are likely complex in PD, and that dopamine neuron health and function can be impacted not just by simple increases in activity, but also by changes in activity patterns and regularity. We will amend our discussion to include the important caveat of changes in activity occurring as compensation, as well as further evidence of changes in activity preceding dopamine neuron death.
There is a very large, multi-decade literature on calcium elevation and its effects on neuronal loss in many different types of neurons. The authors should discuss their findings in this context and refer to some of this previous work. In a nutshell, the observations of the present manuscript could be summarized by stating that the chronic membrane depolarization induced by the CNO treatment is likely to induce a chronic elevation of intracellular calcium and this is then likely to activate some of the well-known calcium-dependent cell death mechanisms. Whether such cell death is linked in any way to PD is not really demonstrated by the present results. The authors are encouraged to perform a thorough revision of the discussion to address all of these issues, discuss the major limitations of the present model, and refer to the broad pre-existing literature linking membrane depolarization, calcium, and neuronal loss in many neuronal cell types.
While our model demonstrates classic excitotoxic cell death pathways, we would like to emphasize both the chronic nature of our manipulation and the progressive changes observed, with increasing degeneration seen at 1, 2, and 4 weeks of hyperactivity in an axon-first manner. This is a unique aspect of our study, in contrast to much of the previous literature which has focused on shorter timescales. Thus, while we will revise the discussion to more comprehensively acknowledge previous studies of calcium-dependent neuron cell death, we believe we have made several new contributions that are not predicted by existing literature. We have shown that this chronic manipulation is specifically toxic to nigral dopamine neurons, and the data that VTA dopamine neurons continue to be resilient even at 4 weeks is interesting and disease-relevant. We therefore do not want to use findings from other neuron types to draw assumptions about DA neurons, which are a unique and very diverse population. We acknowledge that as with all preclinical models of PD, we cannot draw definitive conclusions about PD with this data. However, we reiterate that we strongly believe that drawing connections to human disease is important, as dopamine neuron activity is very likely altered in PD and a clearer understanding of how dopamine neuron survival is impacted by activity will provide insight into the mechanisms of PD.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study reveals that the structural protein vimentin promotes the epithelial-mesenchymal transition in breast cancer cells. Utilizing robust and validated methodologies, the data collected provide a solid foundation for further investigation into metastasis models. This work will be of significant interest to researchers in the field of breast cancer.
-
Reviewer #2 (Public review):
The aim of the investigation was to find out more about the mechanism(s) by which the structural protein vimentin can facilitate the epithelial-mesenchymal transition in breast cancer cells.
The authors focused on a key amino acid of vimentin, C238, its role in the interaction between vimentin and actin microfilaments, and the downstream molecular and cellular consequences. They model the binding between vimentin and actin in silico to demonstrate the potential involvement of C238, due to its location in a rod domain known to bind beta-actin. The phenotype of a non-metastatic breast cancer cell line MCF7, which doesn't express vimentin, could be changed to a metastatic phenotype when mutant C238S vimentin, but not wild-type vimentin, was expressed in the cells. Expression of vimentin was confirmed at the level of mRNA, protein and microscopically. Patterns of expression of vimentin and actin reflected the distinct morphology of the two cell lines. Phenotypic changes were assessed through assay of cell adhesion, proliferation, migration and morphology and were consistent with greater metastatic potential in the C238S MCF7 cells. Changes in the transcriptome of MCF7 cells expressing wild-type and C238S vimentins were compared, and Xist long ncRNA was found to be the transcript most markedly increased in the metastatic cells expressing C238S vimentin. Moreover, changes in expression of many other genes in the C238S cells are consistent with an epithelial-mesenchymal transition. Tumourigenic potential of MCF7 cells carrying C238S, but not wild-type, vimentin was confirmed by inoculation of cells into nude mice. This assay is a measure of stem-cell quality of the cells and not a measure of metastasis. It does demonstrate phenotypic changes that could be linked to metastasis.
shRNA was used to down-regulate vimentin or Xist in the MCF7 C238S cells. The description of the data is limited in parts and data sets require careful scrutiny to understand the full picture. Down-regulation of vimentin reversed the morphological changes to some degree, but down-regulation of Xist didn't. Conversely, down-regulation of Xist inhibited cell growth, a sign of reversing metastatic potential, but down-regulation of vimentin had no effect on growth. Down-regulation of either did inhibit cell migration, another sign of metastatic reversal. Most of these findings are consistent with previous work based on ectopic expression of wild-type vimentin in MCF7 cells, but the mechanism of inhibition of cell migration by down-regulation of Xist remains speculative. More complete knockdown of vimentin or Xist by CRISPR technology may be helpful.
Overall the study describes an intriguing model of metastasis that is worthy of further investigation, especially at the molecular level to unravel the connection between vimentin and metastasis. The identification of a potential role for Xist in metastasis, beyond its normal role in female cells to inactivate one of the X chromosomes, corroborates the work of others demonstrating increased levels in a variety of tumours in women and even in some tumours in men. It would be of great interest to see where in metastatic cells Xist is expressed and what it binds to.
Comments on revisions:
The revised manuscript incorporates changes in the presentation of the data modelling the interaction between the region of vimentin that includes C238 and F-actin. An extra citation supporting the role of Xist in cancer stem cell differentiation has also been included.
-
Author response:
The following is the authors’ response to the original reviews
Reviewer #1 (Public review):
Summary, and Strengths:
The authors and their team have investigated the role of Vimentin Cysteine 328 in epithelial-mesenchymal transition (EMT) and tumorigenesis. Vimentin is a type III intermediate filament, and cysteine 328 is a crucial site for interactions between vimentin and actin. These interactions can significantly influence cell movement, proliferation, and invasion. The team has specifically examined how Vimentin Cysteine 328 affects cancer cell proliferation, the acquisition of stemness markers, and the upregulation of the non-coding RNA XIST. Additionally, functional assays were conducted using both wild-type (WT) and Vimentin Cysteine 328 mutant cells to demonstrate their effects on invasion, EMT, and cancer progression. Overall, the data support the essential role of Vimentin Cysteine 328 in regulating EMT, cancer stemness, and tumor progression. The data and their interpretation are on point and support the hypothesis. I believe the manuscript has great potential.
The authors are thankful to the reviewers for carefully reading the manuscript, evaluating the data, and making positive comments that support our conclusions.
Weaknesses:
Minor issues are related to the visibility and data representation in Figures 2E and 3A-F.
We have revised the figures (Figure 2E and Figure 3A-F) to increase the data visibility.
Reviewer #2 (Public review):
The aim of the investigation was to find out more about the mechanism(s) by which the structural protein vimentin can facilitate the epithelial-mesenchymal transition in breast cancer cells.
The authors focussed on a key amino acid of vimentin, C238, its role in the interaction between vimentin and actin microfilaments, and the downstream molecular and cellular consequences. They model the binding between vimentin and actin in silico to demonstrate the potential involvement of C238, but the outcome is described vaguely.
We have expanded the discussion of these results in the manuscript to more explicitly describe the critical role of C238 in the vimentin-actin interaction. Specifically, we highlight that C238 lies within a region of the vimentin rod domain known to mediate key protein-protein interactions. Our modeling shows that the thiol group of C238 enables specific hydrogen bonding and potential disulfide-mediated interactions with actin, which are disrupted upon mutation to serine. These findings provide mechanistic insight into the functional importance of this residue.
The phenotype of a non-metastatic breast cancer cell line MCF7, which doesn't express vimentin, could be changed to a metastatic phenotype when mutant C238S vimentin, but not wild-type vimentin, was expressed in the cells. Expression of vimentin was confirmed at the level of mRNA, protein, and microscopically. Patterns of expression of vimentin and actin reflected the distinct morphology of the two cell lines. Phenotypic changes were assessed through assay of cell adhesion, proliferation, migration, and morphology and were consistent with greater metastatic potential in the C238S MCF7 cells. Changes in the transcriptome of MCF7 cells expressing wild-type and C238S vimentins were compared, and Xist long ncRNA was found to be the transcript most markedly increased in the metastatic cells expressing C238S vimentin. Moreover, changes in expression of many other genes in the C238S cells are consistent with an epithelial-mesenchymal transition. Tumourigenic potential of MCF7 cells carrying C238S, but not wild-type, vimentin was confirmed by inoculation of cells into nude mice. This assay is a measure of the stem-cell quality of the cells and not a measure of metastasis. It does demonstrate phenotypic changes that could be linked to metastasis.
shRNA was used to down-regulate vimentin or Xist in the MCF7 C238S cells. The description of the data is limited in parts and data sets require careful scrutiny to understand the full picture. Down-regulation of vimentin reversed the morphological changes to some degree, but down-regulation of Xist didn't.
This is understandable given that vimentin interacts with actin, which is known to determine cell shape. XIST, being a non-coding RNA, will not have the same effect.
Conversely, down-regulation of XIST inhibited cell growth, a sign of reversing metastatic potential, but down-regulation of vimentin had no effect on growth.
XIST is known to be induced in a number of cancers (see Figure 3E), which is consistent with our observation that its downregulation inhibits cell growth. However, downregulation of vimentin had no effect on growth, which is consistent with our previously published observation that ectopic expression of wildtype vimentin in MCF-7 cells did not influence cell growth (Usman et al Cells 2022, 11(24), 4035; https://doi.org/10.3390/cells11244035).
Down-regulation of either did inhibit cell migration, another sign of metastatic reversal.
We have previously shown that ectopic expression of wildtype vimentin in MCF-7 stimulates cell migration due to downregulation of CDH5 (endothelial cadherins) (Usman et al Cells 2022, 11(24), 4035). Therefore, downregulation of vimentin is expected to inhibit cell migration, which is what we observed in this study. Why downregulation of XIST inhibited cell migration is not clear. It is conceivable that XIST downregulation affects Lamin expression which may suppress intercellular interactions to increase cell migration. This hypothesis is supported by the fact that vimentin expression in MCF-7 affects Lamin expression (Usman et al Cells 2022, 11(24), 4035).
The interpretation of this type of experiment is handicapped when full reversal of expression is not achieved, as was the case in this study.
Full reversal of any biological effect is almost impossible to achieve because shRNAs are by nature not 100% effective. This could, however, be tested using CRISPR-Cas9 gene editing to completely knock down a protein (this approach can't be used for XIST, as it is a non-coding RNA). In that case, one has to assume that it will have no off-target effects.
Overall the study describes an intriguing model of metastasis that is worthy of further investigation, especially at the molecular level to unravel the connection between vimentin and metastasis. The identification of a potential role for Xist in metastasis, beyond its normal role in female cells to inactivate one of the X chromosomes, corroborates the work of others demonstrating increased levels in a variety of tumours in women and even in some tumours in men. It would be of great interest to see where in metastatic cells Xist is expressed and what it binds to.
The authors fully agree that it is an interesting model of metastasis/oncogenesis that requires further investigation.
-
-
www.biorxiv.org
-
eLife Assessment
This study provides compelling evidence that SLC7A11 may serve as a potential therapeutic target for trastuzumab-resistant HER2-positive breast cancer. While the findings are well-supported by robust data, the study could have been further strengthened by incorporating additional cell line experiments and providing more detailed clarification on patient sample selection. Nevertheless, this valuable work represents a significant contribution and will be of considerable interest to researchers in the field of breast cancer.
-
Reviewer #1 (Public review):
Summary:
Hua et al. show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.
Strengths:
The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab resistant cells utilize cysteine metabolism.
Weaknesses:
However, there are some key aspects that need to be addressed.
Major:
(1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.
(2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.
(3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.
Minor:
(1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.
(2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.
Comments on revisions:
While the authors acknowledge the limitation of using only a single trastuzumab resistant/sensitive pair, stating that additional cell lines will be tested in future work is inadequate. The biological heterogeneity of HER2-positive breast cancer demands validation in at least one independent resistant model (e.g., HCC1954 or BT 474R) alongside its parental counterpart. Without demonstrating that SLC7A11 upregulation, cysteine dependency, and sensitivity to Erastin plus trastuzumab extend beyond the original cell line pair, the generalizability and translational relevance of the findings remain uncertain. The authors need to perform and report key functional results (cell viability, apoptosis, and SLC7A11 expression) in an additional resistant and sensitive HER2-positive cell line before this manuscript can be considered robust.
-
Reviewer #2 (Public review):
In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab resistant HER2 positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was insufficiently supported by the data.
Notably, most of the data (Figures 2-6) were based on two cell lines: JIMT1 as a representative trastuzumab resistant cell line, and SKBR3 as a representative trastuzumab sensitive cell line. As such, these findings could be cell line specific and unrelated to trastuzumab sensitivity. Furthermore, the authors' claim of ferroptosis induction is primarily based on lipid peroxidation assays (Figure 3). The rescuing effects of ferroptosis inhibitors on cell viability were missing. The xenograft experiments were also suspicious (Figure 4). Systemic cysteine starvation is known to cause adverse effects, including liver necrosis, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, the authors focus on epigenetic regulations (Figures 5 & 6) without first investigating well-established transcription factors, such as NRF2 and ATF4, which are known to regulate SLC7A11.
To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab sensitive and primary resistant HER2 positive breast cancer patients.
Comments on revisions:
(1) Figure 3: The unit of concentration should be "μM". "μm" means micrometer.
(2) Figure S5: Ferroptosis inhibitors should be used in cell viability assays to exclude the off-target effect of RSL3 and erastin. Note that erastin also targets VDAC, while RSL3 may inhibit other selenoproteins at high concentrations. Cell viability assays are critical for demonstrating ferroptosis and should be included in the main figure rather than relegated to the supplemental materials.
(3) Figure 4B & 4C: the data of "H" group and "Erastin" group are inconsistent. In panel B, the tumor size in the "H" group appears smaller than in the "Erastin" group, while in panel C, the opposite trend is observed.
(4) The catalog numbers for the cystine/cysteine-deficient DMEM (from BIOTREE) and diet (from Xietong Bio) should be provided. This information is essential for readers to identify and verify the specific products used in the study.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
Summary:
Hua et al. show how targeting amino acid metabolism can overcome Trastuzumab resistance in HER2+ breast cancer.
Strengths:
The authors used metabolomics, transcriptomics and epigenomics approaches in vitro and in preclinical models to demonstrate how trastuzumab-resistant cells utilize cysteine metabolism.
Thank you for your valuable comments. We would like to extend our appreciation for your efforts. Your constructive suggestions will help improve our research.
Weaknesses:
However, there are some key aspects that need to be addressed.
Major:
(1) Patient Samples for Transcriptomic Analysis: It is unclear from the text whether tumor tissues or blood samples were used for the transcriptomic analysis. This distinction is crucial, as these two sample types would yield vastly different inferences. The authors should clarify the source of these samples.
Thank you for your valuable comments. In the transcriptomic analysis, we included the data of HER2 positive breast cancer patients who received trastuzumab in the I-SPY2 trial (GSE181574). Tumor tissues were used in this dataset. We highlighted the usage of “pre-treatment breast cancer tumors” in Line 309 and included the overview of the transcriptomic data analysis of the I-SPY2 trial in Figure S1F.
(2) The study only tested one trastuzumab-resistant and one trastuzumab-sensitive cell line. It is unclear whether these findings are applicable to other HER2-positive tumor cell lines, such as HCC1954. The authors should validate their results in additional cell lines to strengthen their conclusions.
Thank you for your valuable comments. We agree with your opinion that the exploration of multiple cell lines would make our research findings more comprehensive. This is a limitation of our study, and we will continue to improve our design and methods in future experiments.
(3) Relevance to Metastatic Disease: Trastuzumab resistance often arises in patients during disease recurrence, which is frequently associated with metastasis. However, the mouse experiments described in this paper were conducted only in the primary tumors. This article would have more impact if the authors could demonstrate that the combination of Erastin or cysteine starvation with trastuzumab can also improve outcomes in metastasis models.
Thank you for your valuable comments. We agree with your suggestions. The exploration of metastatic disease would make our research more meaningful and help better address key clinical issues. In our future studies, we will continue to investigate the association between the invasive and metastatic capabilities of trastuzumab resistant HER2 positive breast cancer and cysteine metabolism.
Minor:
(1) The figures lack information about the specific statistical tests used. Including this information is essential to show the robustness of the results.
Thank you for your valuable comments. We added statistical information to our figure legends, including Lines 849-850, 865-867, 881-882, 898-900, 910-911 and 923-924.
(2) Figure 3K Interpretation: The significance asterisks in Figure 3K do not specify the comparison being made. Are they relative to the DMSO control? This should be clarified.
Thank you for your valuable comments. We have modified this figure to demonstrate it more clearly. In Figure 3K, the significance was determined by one-way ANOVA and the comparison presented was relative to the DMSO control. It was indicated that the combination of erastin or cysteine starvation and trastuzumab could increase lipid peroxidation, although trastuzumab monotherapy did not induce ferroptosis.
Additionally, the combination of erastin and trastuzumab could result in more lipid peroxidation than erastin alone. Similar results were also found for the combination of cysteine starvation and trastuzumab. These results showed that targeting cysteine metabolism plus trastuzumab could have synergistic effects in inducing ferroptosis in trastuzumab resistant HER2 positive breast cancer.
Reviewer #2 (Public review):
In this manuscript, Hua et al. proposed SLC7A11, a protein facilitating cellular cystine uptake, as a potential target for the treatment of trastuzumab-resistant HER2-positive breast cancer. If this claim holds true, the finding would be of significance and might be translated to clinical practice. Nevertheless, this reviewer finds that the conclusion was poorly supported by the data.
Notably, most of the data (Figures 2-6) were based on two cell lines: JIMT1 as a representative trastuzumab-resistant cell line, and SKBR3 as a representative trastuzumab sensitive cell line. As such, these findings could be cell-line specific and unrelated to trastuzumab sensitivity. Furthermore, the authors claimed ferroptosis simply based on lipid peroxidation (Figure 3). Cell viability was not determined, and the rescuing effects of ferroptosis inhibitors were missing. The xenograft experiments were also suspicious (Figure 4). The description of how cysteine starvation was performed on xenograft tumors was lacking, and the compound (i.e., erastin) used by the authors is not suitable for in vivo experiments due to low solubility and low metabolic stability. Finally, it is confusing why the authors focused on epigenetic regulations (Figures 5 & 6), without measuring major transcription factors (e.g., NRF2, ATF4) which are known to regulate SLC7A11.
To sum up, this reviewer finds that the most valuable data in this manuscript is perhaps Figure 1, which provides unbiased information concerning the metabolic patterns in trastuzumab-sensitive and primary resistant HER2-positive breast cancer patients.
Thank you for your valuable comments. We agree with your suggestions. Your feedback will help enhance the quality of our research.
(1) Our research was mainly conducted in JIMT1 (trastuzumab resistant) and SKBR3 (trastuzumab sensitive), and this is a limitation of our study. The experimental validation using different cell lines will make our research findings more persuasive. In our future research, we will continuously optimize experimental design and methods to make our findings more comprehensive.
(2) The detection of ferroptosis in our research was mainly performed by evaluating the lipid peroxidation. Experiments measuring cell viability and rescuing effects would help provide more evidence.
We utilized CCK8 tests to compare cell viabilities of JIMT1 and SKBR3 at different erastin and RSL3 concentrations, as well as at different exposure times of cysteine starvation. It was shown that JIMT1 was more sensitive to erastin and RSL3, but tolerant to cysteine starvation, which was consistent with the previous lipid peroxidation tests. This data was included in Figure S5C-E. We added the description in Line 375-379.
In addition, we also performed experiments to explore the rescuing effects of the ferroptosis inhibitor Fer-1. It was indicated that Fer-1 could suppress the lipid peroxidation resulting from erastin, RSL3 and cysteine starvation in both JIMT1 and SKBR3. This provided more evidence that cysteine metabolism played a vital role in modulating HER2 positive breast cancer ferroptosis. This data was included in Figure S5G and S5H. We added the description to Line 387-391.
(3) In xenograft experiments, the cysteine starvation was performed by feeding a cystine/cysteine-deficient diet (Xietong Bio). We added details of this diet on Line 236-237 in Methods.
We agree with your opinion on the role of erastin in in vivo experiments. We have tried to optimize drug dissolution and other conditions by referring to previous relevant literature. We will continue to improve our experimental design and methods.
(4) Epigenetic modifications have been recognized as crucial factors in drug resistance formation. An increasing number of studies have emphasized the importance of epigenetic changes in regulating the abnormal expression of oncogenes and tumor suppressor genes related to drug resistance. Currently, the role of epigenetic changes in the development of trastuzumab resistance in HER2 positive breast cancer is still being explored. We tried to investigate the dysregulation of histone modifications and DNA methylation in trastuzumab resistant HER2 positive breast cancer. Our findings indicated that targeting H3K4me3 and DNA methylation could decrease SLC7A11 expression and induce ferroptosis. This would provide more evidence in exploring trastuzumab resistance mechanisms. We have provided a detailed discussion on Line 598-607.
We would like to extend our appreciation for your constructive suggestions and continue to improve our research in future experiments.
Recommendations for the authors:
Reviewer #2 (Recommendations for the authors):
(1) Line 334: it would be helpful to clarify that JIMT1 cells are trastuzumab-resistant while SKBR3 cells are trastuzumab sensitive, especially for those not familiar with breast cancer cell lines.
Thank you for your valuable recommendations. We added the description of trastuzumab sensitive SKBR3 and trastuzumab resistant JIMT1 on Line 334-335.
(2) Figure 3: the concentrations of erastin and RSL3 should be indicated.
Thank you for your valuable recommendations. In Figure 3, the concentration of erastin was 10 μM and that of RSL3 was 1 μM. We added these details in the figure legends on Line 872-873.
(3) Figure 3: lipid peroxidation does not necessarily mean ferroptosis. Cell viability data and rescuing effects of ferroptosis inhibitors should be shown.
Thank you for your valuable recommendations. As we mentioned above, we utilized CCK8 tests to compare cell viabilities of JIMT1 and SKBR3 at different erastin and RSL3 concentrations, as well as at different exposure times of cysteine starvation. Consistent with the lipid peroxidation tests, JIMT1 was more sensitive to erastin and RSL3, but tolerant to cysteine starvation. This data was included in Figure S5C-E. We added the description in Line 375-379.
As described above, we also performed experiments to explore the rescuing effects of the ferroptosis inhibitor Fer-1. It was indicated that Fer-1 could suppress the lipid peroxidation resulting from erastin, RSL3 and cysteine starvation in both JIMT1 and SKBR3. This provided more evidence that cysteine metabolism played a vital role in modulating HER2 positive breast cancer ferroptosis. This data was included in Figure S5G and S5H. We added the description to Line 387-391.
(4) Figure 3H: how cysteine starvation was performed should be clarified in the Methods section.
Thank you for your valuable recommendations. We performed cell culture with cysteine starvation by utilizing cystine/cysteine-deficient DMEM (BIOTREE) and 1% penicillin streptomycin at 37 °C with 5% CO2. We added details of this medium on Line 141-143 in Methods.
(5) Figure 4: the meaning of "H" should be clarified.
Thank you for your valuable recommendations. “H” indicates trastuzumab. We clarified the meaning of “H” in the figure legends on Line 898.
(6) Figure 4B & 4C: the data of "H" group and "Erastin" group are inconsistent.
Thank you for your valuable recommendations. In the in vivo experiments, the tumor volume changes were analyzed using a paired approach, comparing the tumor size of each individual mouse before and after treatment. We noticed the confusion this caused and added more details about our in vivo experiments on Line 240 in Methods and Line 892-893 in the figure legends.
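As an illustration only, the following minimal sketch shows how an endpoint comparison and a paired before/after comparison can rank two groups differently. All numbers are hypothetical and are not data from this study; the function names `mean_endpoint` and `mean_paired_change` are introduced here for illustration and do not come from the manuscript.

```python
from statistics import mean


def mean_endpoint(after):
    """Group mean of the final tumor volumes (an endpoint comparison)."""
    return mean(after)


def mean_paired_change(before, after):
    """Group mean of the per-mouse change, after minus before (a paired comparison)."""
    if len(before) != len(after):
        raise ValueError("each mouse needs one 'before' and one 'after' value")
    return mean([a - b for a, b in zip(after, before)])


# Hypothetical tumor volumes in mm^3, one entry per mouse (illustration only).
group_h_before = [50.0, 60.0]
group_h_after = [100.0, 110.0]
group_erastin_before = [100.0, 110.0]
group_erastin_after = [120.0, 130.0]

h_endpoint = mean_endpoint(group_h_after)
erastin_endpoint = mean_endpoint(group_erastin_after)
h_change = mean_paired_change(group_h_before, group_h_after)
erastin_change = mean_paired_change(group_erastin_before, group_erastin_after)

# With these made-up numbers, the "H" group has the smaller endpoint volume
# but the larger per-mouse increase, so the two views rank the groups oppositely.
print(h_endpoint, erastin_endpoint, h_change, erastin_change)
```

Whether this effect accounts for the apparent discrepancy between panels B and C depends on the baseline tumor volumes in each group, which are not given here.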
(7) Figure 4: how cysteine starvation was performed should be clarified in the Methods section.
Thank you for your valuable recommendations. We performed cysteine starvation by utilizing a cystine/cysteine-deficient diet (Xietong Bio). We added details of this diet on Line 236-237 in Methods.
We have also corrected some grammatical errors in the manuscript. We would like to extend our great appreciation to all editors and reviewers for their invaluable contributions.
-
-
www.biorxiv.org
-
eLife Assessment
This study addresses the structural basis of voltage-activation of BK channels using atomistic simulations of several microseconds, to assess conformational changes that underlie both voltage-sensing and gating of the pore. Simulated effects of voltage on the movement of charged amino acids appear solid as they are generally consistent qualitatively and quantitatively with previous experimental and structural results, providing a potentially valuable way to calculate the contribution of individual charges to voltage-sensitivity. Simulations of conformational changes and interactions associated with channel opening and K+ conduction are likely incomplete owing to the timescale of the simulation and theoretical limitations in simulating K+ and water movement, but nonetheless provide helpful initial predictions and a framework for future improvement. This paper will likely be of interest to ion channel biologists and biophysicists focused on voltage-dependent channel gating mechanisms.
-
Reviewer #1 (Public review):
Summary:
This study provides new insight into the non-canonical voltage-gating mechanism of BK channels through prolonged (10 μs) MD simulations of the Slo1 transmembrane domain conformation and K+ conduction in response to high imposed voltages (300, 750 mV). The results support previous conclusions based on functional and structural data and MD simulations that the voltage-sensor domain (VSD) of Slo1 undergoes limited conformational changes compared to Kv channels, and predict gating charge movement comparable in magnitude to experimental results. The gating charge calculations further indicate that R213 and R210 in S4 are the main contributors owing to their large side chain movements and the presence of a locally focused electric field, consistent with recent experimental and MD simulation results by Carrasquel-Ursulaez et al., 2022. Most interestingly, changes in pore conformation and K+ conduction driven by VSD activation are resolved, providing information regarding changes in VSD/pore interaction through S4/S5/S6 segments proposed to underlie electromechanical coupling.
Strengths:
The prolonged timescale and high voltage of the simulation allow apparent equilibration in the voltage-sensor domain (VSD) conformational changes and at least partial opening of the pore. The study extends the results of previous MD simulations of VSD activation by providing quantitative estimates of gating charge movement, showing how the electric field distribution across the VSD is altered in resting and activated states, and testing the hypothesis that R213 and R210 are the primary gating charges by steered MD simulations. The ability to estimate gating charge contributions of individual residues in the WT channel is useful as a comparison to experimental studies based on mutagenesis, which have yielded conflicting results that could reflect perturbations in structure. Use of dynamic community analysis to identify coupling pathways and information flow for VSD-pore (electromechanical) coupling, as well as analysis of state-dependent S4/S5/S6 interactions that could mediate coupling, provides useful predictions extending beyond what has been experimentally tested.
Weaknesses:
A truncated channel (lacking the C-terminal gating ring) was used for simulations; this construct is known to have reduced single channel conductance and reduced electromechanical coupling compared to the full-length channel. In addition, as VSD activation in BK channels is much faster than opening, the timescale of simulations was likely insufficient to achieve a fully open state, as supported by differences in the degree of pore expansion in replicate simulations, which are also smaller than observed in Ca-bound open structures of the full-length channel. Taken together, these limitations suggest that the analysis regarding coupling pathways and interactions is incomplete. In addition, while the simulations convincingly demonstrate voltage-dependent channel opening as evidenced by pore expansion, and conduction of K+ and water through the pore, single channel conductance is underestimated by at least an order of magnitude, as in previous studies of other K+ channels. These quantitative discrepancies suggest that MD simulations may not yet be sufficiently advanced to provide insight into mechanisms underlying the extraordinarily large conductance of BK channels.
-
Reviewer #2 (Public review):
Summary:
This manuscript addresses the structural basis of voltage-activation of BK channels using computational approaches. Although a number of experimental studies using gating current and patch-clamp recording have analyzed voltage-activation in terms of observed charge movements and the apparent energetic coupling between voltage-sensor movement and channel opening, the structural changes that underlie this phenomenon have been unclear. The present studies use a reduced molecular system comprising the transmembrane portion of the BK channel (i.e., the cytosolic domain was deleted), embedded in a POPC membrane, with either 0 or 750 mV applied across the membrane. This system enabled acquisition of long simulations of 10 microseconds, to permit tracking of conformational changes of the channel. The authors' principal findings were that the side chains of R210 and R213 rapidly moved toward the extracellular side of the membrane (by 8 - 10 Å), with greater displacements than any of the other charged transmembrane residues. These movements appeared tightly coupled to the movement of the pore-lining helix, pore hydration, and ion permeation. The authors estimate that R210 and R213 contribute 0.25 and 0.19 elementary charges per residue to the gating current, which is roughly consistent with estimates based on electrophysiological measurements that used the full-length channel.
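For orientation, the per-residue estimates quoted above can be combined with a few lines of arithmetic. This is a minimal sketch under two assumptions that are not stated in the review: that the quoted values are per subunit, and that the four subunits of the tetrameric channel contribute equally.

```python
# Per-residue gating-charge estimates quoted in the summary above,
# in elementary charges (e).
per_residue_charge = {"R210": 0.25, "R213": 0.19}

# Assumption (not from the review): values are per subunit, and the
# four subunits of the tetramer contribute equally.
SUBUNITS_PER_CHANNEL = 4

# Combined contribution of the two arginines within one voltage sensor.
charge_per_vsd = sum(per_residue_charge.values())

# Combined contribution across the whole channel under the assumption above.
charge_per_channel = SUBUNITS_PER_CHANNEL * charge_per_vsd

print(f"R210 + R213 per voltage sensor: {charge_per_vsd:.2f} e")
print(f"R210 + R213 per channel:        {charge_per_channel:.2f} e")
```

These totals cover only the two residues named above; contributions from other charged residues are not included.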
Strengths:
The methodologies used in this work are sound, and these studies certainly contribute to our understanding of voltage-gating of BK channels. An intriguing observation is the strongly coupled movement of the S4, S5, and S6 helices that appears to underlie voltage-dependent opening. Based on Figures 2a-d, the substantial movements of the R210 and R213 side chains occur nearly simultaneously with the S6 movement (between 4 - 5 μs of simulation time). This seems to provide support for a "helix-packing" mechanism of voltage gating in the so-called "non-domain-swapped" voltage-gated K channels.
Weaknesses:
The main limitation is that these studies used a truncated version of the BK channel, and there are likely to be differences in VSD-pore coupling in the context of the full-length channels that will not be resolved in the present work. Nonetheless, the authors provide a strong rationale for their use of the truncated channel, and the results presented will provide a good starting point for future computational studies of this channel.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study presents a valuable investigation into cell-specific microstructural development in the neonatal rat brain using diffusion-weighted magnetic resonance spectroscopy. The evidence supporting the core claims is solid, with innovative in vivo data acquisition and modeling, although some conclusions would benefit from stronger validation and methodological justification. The work will be of interest to researchers studying brain development and biophysical imaging methods.
-
Reviewer #1 (Public review):
In this work, Ligneul and coauthors implemented diffusion-weighted MRS in young rats to follow longitudinally and in vivo the microstructural changes occurring during brain development. Diffusion-weighted MRS is here instrumental in assessing microstructure in a cell-specific manner, as opposed to the claimed gold-standard (manganese-enhanced MRI) that can only probe changes in brain volume. Differential microstructure and complexification of the cerebellum and the thalamus during rat brain development were observed non-invasively. In particular, lower metabolite ADC with increasing age were measured in both brain regions, reflecting increasing cellular restriction with brain maturation. Higher sphere (representing cell bodies) fraction for neuronal metabolites (total NAA, glutamate) and total creatine and taurine in the cerebellum compared to the thalamus were estimated, reflecting the unique structure of the cerebellar granular layer with a high density of cell bodies. Decreasing sphere fraction with age was observed in the cerebellum, reflecting the development of the dendritic tree of Purkinje cells and Bergmann glia. From morphometric analyses, the authors could probe non-monotonic branching evolution in the cerebellum, matching 3D representations of Purkinje cells expansion and complexification with age. Finally, the authors highlighted taurine as a potential new marker of cerebellar development.
From a technical standpoint, this work clearly demonstrates the potential of diffusion-weighted MRS at probing microstructure changes of the developing brain non-invasively, paving the way for its application in pathological cases. Ligneul and coauthors also show that diffusion-weighted MRS acquisitions in neonates are feasible, despite the known technical challenges of such measurements, even in adult rats. They also provide all necessary resources to reproduce and build upon their work, which is highly valuable for the community.
From a biological standpoint, claims are well supported by the microstructure parameters derived from advanced biophysical modelling of the diffusion MRS data.
Specific strengths:
(1) The interpretation of dMRS data in terms of cell-specific microstructure through advanced biophysical modelling (e.g. the sphere fraction, modelling the fraction of cell bodies versus neuronal or astrocytic processes) is a strong asset of the study, going beyond the more commonly used signal representation metrics such as the apparent diffusion coefficient, which lacks specificity to biological phenomena.
(2) The fairly good data quality despite the complexity of the experimental framework should be praised: diffusion-weighted MRS was acquired in two brain regions (although not in the same animals) and longitudinally, in neonates, including data at high b-values and multiple diffusion times, which altogether constitutes a large-scale dataset of high value for the diffusion-weighted MRS community.
(3) The authors have shared publicly data and codes used for processing and fitting, which will allow one to reproduce or extend the scope of this work to disease populations, and which goes in line with the current effort of the MR(S) community for data sharing.
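For readers less familiar with the signal representation metric mentioned in point (1), the apparent diffusion coefficient can be sketched as the slope of a mono-exponential decay, S(b) = S0 · exp(−b · ADC), fitted at low b-values. The snippet below is a minimal illustration under that assumption only; the function name `fit_adc` and all numerical values are hypothetical and are not taken from the study or its shared code.

```python
import numpy as np

def fit_adc(b_values, signals):
    """Estimate ADC and S0 from S(b) = S0 * exp(-b * ADC).

    Uses a log-linear least-squares fit, which is only a reasonable
    description of the decay at low b-values.
    """
    b = np.asarray(b_values, dtype=float)
    log_s = np.log(np.asarray(signals, dtype=float))
    slope, intercept = np.polyfit(b, log_s, 1)  # log S = intercept + slope * b
    return -slope, np.exp(intercept)

# Synthetic, noise-free example (hypothetical values, not study data).
b = np.array([0.0, 0.5, 1.0, 2.0, 3.0])   # b-values in ms/um^2
true_adc = 0.12                            # um^2/ms
signal = 1.0 * np.exp(-b * true_adc)

adc, s0 = fit_adc(b, signal)
print(f"ADC = {adc:.3f} um^2/ms, S0 = {s0:.3f}")
```

A single ADC summarises how fast the signal decays but says nothing about which geometry (cell bodies versus processes) causes the restriction, which is why the review credits the biophysical model for adding specificity.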
Specific weaknesses:
Ligneul and coauthors have convincingly addressed and included my comments in their revised manuscript.
I believe the following conceptual concerns, which are inherent to the nature of the study and do not require further adjustments of the manuscript, remain:
(1) Metabolite compartmentation in one cell type or the other has often been challenged and is currently impossible to validate in vivo. Here, Ligneul and coauthors did not use this assumption a priori and supported their claims also with non-MR literature (e.g., for taurine), but the interpretation of results in that direction should be made with care.
(2) Longitudinal MR studies of the developing brain make it difficult to extract parameters with an "absolute" meaning. Indirect assumptions used to derive such parameters may change with age and become confounding factors (brain structure, cell distribution, concentrations normalizing metabolites (here macromolecules), relaxation times...). While findings of the manuscript are convincing and supported with literature, the true underlying nature of such changes might be difficult to access.
(3) Diffusion MRI in addition to diffusion MRS would have been complementary and beneficial to validate some of the signal contributions, but was unfeasible in the time constraints of experiments on young animals.
-
Reviewer #2 (Public review):
This second revision has partially addressed criticisms previously raised; however, substantial inadequacies, particularly concerning rigorous validation and model justification, remain unresolved. While recognizing evident strength, novelty, and technical complexity of this work, the authors have yet to fully resolve key major concerns explicitly pointed out during revision in a satisfactory manner. As currently written, the manuscript does not yet provide sufficiently robust validation, methodological rigour, or clarity required for complete acceptance in a top-tier scientific journal.
Summary of Authors' Aim:
In this revised version, the authors aimed to address prior reviewer critiques harshly pinpointing the need for greater clarity in the manuscript's logical flow, rigorous external validation, clearer explanation of methodological normalization choices, and deeper elaboration of diffusion MRI method relevance and potential translation. The authors present a diffusion-weighted MRS approach paired with complex biophysical modelling to elucidate differential developmental trajectories of cellular structures in cerebellum and thalamus in rat neonates, providing a novel, non-invasive avenue for monitoring cellular microstructure.
Major Comments:
Rigorous Validation (Reviewer #1 - point R1.1, Reviewer #2 - point R2.2):
The major concern previously raised and reiterated here is the insufficient external cross-validation of the dMRS-derived interpretations about cellular changes, including the particularly speculative interpretation that taurine undergoes compartment switching between neuronal and glial compartments in the thalamus. The authors acknowledge this important shortcoming (R1.1, R2.2) but attempt to mitigate these concerns merely through additional contextual comparisons from existing literature (page 23, lines 877-878, Figure S11, Table S2). While better contextualization is welcome, the modified manuscript still falls notably short of the level of rigour necessary to validate such striking switches in compartmentalization. To justify claims of metabolites changing cellular compartments, explicit verification against independent molecular/histological data, ideally with additional immunohistochemical staining for cellular markers (e.g., glial fibrillary acidic protein, NeuN), is necessary. The mere presence of literature correlations (such as the reported visual comparisons to morphometric reconstructions, page 24, lines 883-884) does not constitute rigorous validation at the required standard for high-impact publication. The revised manuscript remains fundamentally weakened without such validation. To properly improve, the authors must consider incorporating independent ex vivo experiments or, if this is no longer feasible, extensively temper their compartment-switching claims, acknowledging explicitly and prominently the speculative nature of current interpretations.
Normalization of Metabolite Concentrations (Reviewer #1 - point R1.3):
The authors clearly responded to the reviewer's request for justification of metabolite normalisation to macromolecular concentrations (page 13, lines 493-503, Figure S2). However, the rationale provided remains only partially convincing. While the authors appropriately acknowledge the unusual nature of their methodological choice and possible confounding factors, they opt to supplement rather than substitute this approach with a more standard method (normalisation by water) in the main body of the manuscript. The additional supplementary Figure S2 is helpful, yet the conclusions derived with macromolecular normalisation remain potentially confounded by age-dependent macromolecular changes (Tkac et al., 2003). The justification given in the revised manuscript remains vague, unsatisfactory, and somewhat contradictory: the authors accept that macromolecule content likely changes with age, yet largely overlook this effect. At a minimum, the comparison between normalisation by macromolecules and by water should be explicitly discussed in the main text, and conclusions drawn from macromolecular normalisation must be cautiously framed.
Choice and Justification of Biophysical Model (Reviewer #1 - point R1.4):
The reviewers questioned model assumptions, particularly ignoring macroscopic anisotropy effects due to white matter presence, myelination, and fibre orientation dispersion in the cerebellar voxel. Authors provided newly included DTI data and acknowledged this limitation explicitly (R1.4, Figure S8, page 25, lines 921-924). However, the addition of these poor-quality DTI data with limited interpretability paradoxically weakens rather than strengthens the manuscript as a whole, since the authors now present unclear supplementary results with little additional interpretative value. Recognizing poor data quality in this scenario, although intellectually honest, does not substantially increase the current robustness of their chosen model nor improve justification. To address this fully, either higher-quality data should be collected to robustly probe anisotropy or fibre dispersion effects, or the authors must much further restrict their interpretations in view of this clear limitation. Currently, the solution proposed is incomplete and insufficient to clarify the consequences of their chosen model.
Logical Flow and Clarity (Reviewer #2 - points R2.1 and R2.3):
The authors attempted to respond to reviewer comments on logical flow and accessibility (page 3, introduction restructuring). While the manuscript readability has improved, the introduction and discussion remain overly intricate, and at times, detail-oriented without clear links into central claims. In particular, the biological rationale for choosing the specific metabolite markers (especially tCho, Ins, Tau, etc.) and their known relevance must be further streamlined and simplified to increase accessibility and directness. Although some helpful restructuring was carried out, further careful paragraph-level revision for logical flow and readability remains necessary.
Translation to Human Studies (Reviewer #2 - point R2.4):
The authors have extended contextual discussion on translational potential regarding taurine as a developmental marker in humans (pages 24-25, lines 906-917). However, mention remains vague and cursory, without presenting sufficiently solid arguments nor drawing from human developmental studies adequately. Translational potential must be assessed within the realistic limitations inherent in clinical translation of MRS studies, particularly given the technical complexities clearly identified even in preclinical studies of this paper. Discussion remains relatively superficial, and if retained, must be expanded to fully discuss realistic human translational hurdles and requirements.
-
Author response:
The following is the authors’ response to the original reviews
Summary of revisions:
Thanks to the careful review and comments from the reviewers, we restructured the introduction and the discussion to improve clarity and better contextualise findings. We notably discuss further the f<sub>sphere</sub> decrease observations in the cerebellum and the Tau-specific findings (Tau being a possible marker for Purkinje cells development and Tau switching compartment in the thalamus). We added material in Supplementary Information to support these discussion points. We added a figure to show the metabolic profiles normalised by water or by macromolecules and a figure and table related to a rough approximation of f<sub>sphere</sub>, leaning on existing literature. We report the DTI results for thoroughness.
Public Reviews:
Reviewer #1 (Public Review):
In this work, Ligneul and coauthors implemented diffusion-weighted MRS in young rats to follow longitudinally and in vivo the microstructural changes occurring during brain development. Diffusion-weighted MRS is here instrumental in assessing microstructure in a cell-specific manner, as opposed to the claimed gold-standard (manganese-enhanced MRI) that can only probe changes in brain volume. Differential microstructure and complexification of the cerebellum and the thalamus during rat brain development were observed noninvasively. In particular, lower metabolite ADC with increasing age were measured in both brain regions, reflecting increasing cellular restriction with brain maturation. Higher sphere (representing cell bodies) fraction for neuronal metabolites (total NAA, glutamate) and total creatine and taurine in the cerebellum compared to the thalamus were estimated, reflecting the unique structure of the cerebellar granular layer with a high density of cell bodies. Decreasing sphere fraction with age was observed in the cerebellum, reflecting the development of the dendritic tree of Purkinje cells and Bergmann glia. From morphometric analyses, the authors could probe non-monotonic branching evolution in the cerebellum, matching 3D representations of Purkinje cells expansion and complexification with age. Finally, the authors highlighted taurine as a potential new marker of cerebellar development.
From a technical standpoint, this work clearly demonstrates the potential of diffusion-weighted MRS at probing microstructure changes of the developing brain non-invasively, paving the way for its application in pathological cases. Ligneul and coauthors also show that diffusion-weighted MRS acquisitions in neonates are feasible, despite the known technical challenges of such measurements, even in adult rats. They also provide all necessary resources to reproduce and build upon their work, which is highly valuable for the community.
From a biological standpoint, claims are well supported by the microstructure parameters derived from advanced biophysical modelling of the diffusion MRS data. The assumption of metabolite compartmentation, forming the basis of cell-specific microstructure interpretation of dMRS data, remains debated and should be considered with care (Rae, Neurochem Res, 2014, https://doi.org/10.1007/s11064-013-1199-5). External cross-validation of some of the authors' claims, in particular taurine in the thalamus switching from neurons to astrocytes during brain development, would be a highly valuable addition to this study.
R1.1: We understand the reviewer's concerns. Metabolic compartmentation is not a one-to-one correspondence. Although we interpret the results in the light of metabolic compartmentation, our results are not driven by this assumption. We could not perform a direct cross-validation of the taurine switch in the thalamus, but we now clarify in the discussion why the dMRS results themselves indicate a switch, and we integrate our results better with existing literature on taurine. We now discuss this in more detail for the cerebellar results too.
Specific strengths:
(1) The interpretation of dMRS data in terms of cell-specific microstructure through advanced biophysical modelling (e.g. the sphere fraction, modelling the fraction of cell bodies versus neuronal or astrocytic processes) is a strong asset of the study, going beyond the more commonly used signal representation metrics such as the apparent diffusion coefficient, which lacks specificity to biological phenomena.
(2) The fairly good data quality despite the complexity of the experimental framework should be praised: diffusion-weighted MRS was acquired in two brain regions (although not in the same animals) and longitudinally, in neonates, including data at high b-values and multiple diffusion times, which altogether constitutes a large-scale dataset of high value for the diffusion-weighted MRS community.
(3) The authors have shared publicly data and codes used for processing and fitting, which will allow one to reproduce or extend the scope of this work to disease populations, and which goes in line with the current effort of the MR(S) community for data sharing.
Specific weaknesses:
(1) This work lacks an introduction and a discussion about diffusion MRI, which is already a validated technique to assess brain development non-invasively. Although water lacks cell-specificity compared to metabolites, several studies have reported a decrease in water ADC and increased fractional anisotropy with brain maturation, associated with the myelination process and decreased water content (overview in Hüppi, Chapt. 30 of "Diffusion MRI: Theory, Methods, and Applications", Oxford University Press, 2010). Interestingly, the same observations are found in this work (decreased ADC with age for most metabolites in both brain regions), which should have been commented on. Moreover, the authors could have reported water diffusion properties in addition to metabolites', as I believe the water signal, used for coil combination and/or eddy current corrections, is usually naturally acquired during diffusion-weighted MRS scans.
R1.2: Thank you for these helpful suggestions. We have now improved our introduction of the various modalities, and we contextualise the study in light of previous DTI findings, as suggested by the reviewer. We agree with the reviewer that the comparison with previous human DTI is relevant, and we now mention it at the beginning of the discussion. However, the very different nature of the dMRS signal compared to dMRI (intracellular and absence of exchange for metabolites) prevents us from drawing any strong conclusions.
(2) It is unclear why the authors have normalized metabolite concentrations (measured from low b-values diffusion-weighted MRS spectra) to the macromolecule concentrations. First, it is not specified whether in vivo macromolecules were acquired at each age or just at one time point. Second, such ratios are not standard practice in the MRS community so this choice should have been explained. Third, the macromolecule content was reported to change with age (Tkac et al., Magn Reson Med, 2003), therefore a change in metabolite to macromolecule ratio with age cannot be interpreted unequivocally.
R1.3: We agree with the reviewer that this needed further explanations. We now clarify in the Results section “Metabolic profile changes with age” the reasoning behind choosing macromolecules for normalisation. We also added in the Supplementary Information the metabolite concentrations change with age when normalising by water, and a direct comparison with MM normalisation (Figure S2).
(3) Some discussion is missing about the choice of the analytical biophysical model (although a few are compared in Supplementary Materials), in particular: is a model of macroscopic anisotropy relevant in cerebellum, made of a large fraction of oriented white matter tracks, and does the model remain valid at different ages given white matter maturation and the ongoing myelination process?
R1.4: We agree with the reviewer that this is a valid concern. We actually acquired some standard DTI at the end of the acquisition sessions (where possible) having in mind the fibre dispersion estimation. However, data could not be acquired in all animals, and the data quality was poor (see Figure S8, the experimental conditions would have required further optimisation). We now add a couple of sentences at the beginning and in the end of discussion to address this limitation, and we include the DTI data in Supplementary Information.
Reviewer #2 (Public Review):
Summary:
The authors set out to non-invasively track neuronal development in rat neonates, which they achieved with notable success. However, the direct relationship between the results and broader conclusions regarding developmental biology and potential human implications is somewhat overstretched without further validation.
Strengths:
If adequately revised and validated, this work could have a significant impact on the field, providing a non-invasive tool for longitudinal studies of brain development and neurodevelopmental disorders in preclinical settings.
Weaknesses:
(1) Consistency and Logical Flow:
The manuscript suffers from a lack of strategic flow in some sections. Specifically, transitions between major findings and methodological discussions need refinement to ensure a logical progression of ideas. For example, the jump from the introduction of developmental trajectories and the technicalities of MRS (Magnetic Resonance Spectroscopy) processing on page 3 could benefit from a bridging paragraph that explicitly states the study's hypotheses based on existing literature gaps.
R2.1: Thank you for this general feedback (along with your point (3)) that helped us restructure the introduction and the discussion to improve the clarity and flow.
(2) Scientific Rigour:
While the novel application of diffusion-weighted MRS is commendable, there's a notable gap in the rigorous validation of this approach against gold-standard histological or molecular techniques. Particularly, the assertions regarding the sphere fraction and morphological changes inferred from biophysical modelling mandate direct validation to solidify the claims made. A study comparing these in vivo findings with ex vivo confirmation in at least a subset of samples would significantly enhance the reliability of these conclusions.
R2.2: We agree with the reviewer that this would have been a great addition to the manuscript. Although we could not run new experiments to address these flaws, we now discuss the results more quantitatively, leaning on existing literature (addition of Figure S11 and Table S2). This helps us understand the results around Tau in both regions better, and illustrate the R<sub>sphere</sub> trend.
(3) Clarity and Novelty:
- The manuscript often delves deeply into technical specifics at the expense of accessibility to readers not deeply familiar with MRS technology. The introduction and discussions would benefit from a clearer elucidation of why these specific metabolite markers were chosen and their known relevance to neuronal and glial cells, placing this in the context of what is novel compared to existing literature.
- The novelty aspect could be reinforced by a more structured discussion on how this method could change the current understanding or practices within neurodevelopmental research, compared to the current state of the art.
R2.3: See answer to (1). By restructuring the introduction and the discussion, we hope to have addressed this point. We now discuss how these findings compare to the state of the art (notably added comparison with dMRI research). Along with the next comment, we better discuss potential implications of these findings for neurodevelopmental research.
(4) Completeness:
- The Discussion section requires expansion to offer a more comprehensive interpretation of how these findings impact the broader field of neurodevelopment and psychiatric disorders. Specifically, the implications for human studies or clinical translation are touched upon but not fully explored.
- Further, while supplementary material provides necessary detail on methodology, key findings from these analyses should be summarized and discussed in the main text to ensure the manuscript stands complete on its own.
R2.4: Thank you for these helpful suggestions. We now integrate the findings better into the existing literature. We notably discuss how the results might translate to humans.
(5) Grammar, Style, Orthography:
There are sporadic grammatical and typographical errors throughout the text which, while minor, detract from the overall readability. For example, inconsistencies in metabolite abbreviations (e.g., tCr vs Cr+PCr) should be standardized.
R2.5: Thank you for the careful review. This has been corrected.
(6) References and Additional Context:
The current reference list is extensive but lacks integration into the narrative. Direct comparisons with existing studies, especially those with conflicting or supportive findings, are scant. More dedicated effort to contextualize this work within the existing body of knowledge would be beneficial.
R2.6: Because the nature of this work is novel, it is difficult to find directly conflicting/similar works. However, we now integrate the findings into the broader literature.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Minor comments:
Thank you for the careful review. We have addressed most of the minor comments, except for the last one, which we discuss below.
- Some figures appear blurred in the printed PDF
- Introduction: "constrained and hindered by cell membranes," - maybe use "restricted" instead of "constrained", like everywhere else in the text
- Introduction: "(typically ~8cm3 vs ~8mm3 in dMRI in humans)" - here I suggest to put the rat brain sizes instead to help the reader understand how small the voxel was at P5 in this study, thus explaining the challenges
- Fig 1 - numbers 1 and 2 on panel A,B should be clarified and they do not match 1 and 2 on panel C, which is confusing
- Fig 2 - I am guessing the large dots are the mean and small are individual data points? Please clarify
- Please specify "Relative CRLB" rather than just "CRLB", in supp. mat as well
- Fig 3 - title of panel B, I would change "signal" into "concentration"
- Fig 3 - end of caption: "and levelled to get Signal(tCr,P30)/Signal(MM,P30)=8", I think "in the thalamus" is missing
- The results section "Biophysical modelling underlines different developmental trajectories of cell microstructure between the cerebellum and the thalamus" is sometimes imprecise, e.g.: "Cerebellum: The sphere fraction and the radius estimated from tNAA diffusion properties vary with age.", but the tNAA sphere fraction seems to vary more with age in the thalamus according to Table 1; "Cerebellum: fsphere decreases from 0.63 (P10) to 0.41 (P30), but R is stable": this is for tCr, I presume
- Table 1 - "pvalues" please add "before multiple comparison correction"
- Figure 5 - Panel B, the L-segment subpanel is unclear: which metabolites is it referring to? Why does Tau have a * in panel A?
- Update Ref 37 to the journal version
- Methods: "A STELASER (Ligneul et al., MRM 2017) sequence", add numbered reference instead
- Please specify that the DIVE toolbox uses Gaussian phase distribution approximation, it is important for the dMRS reader given that your diffusion gradient length is long and cannot be neglected, and that the SGP approximation does not apply.
The Gaussian phase distribution approximation and the SGP approximation are two different concepts. The gradient duration δ (7 ms) is short compared to the gradient separation Δ (100 ms), but it could still be considered too long for the SGP approximation to hold. However, the gradient duration is accounted for in DIVE in any case.
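The distinction drawn in this reply can be made concrete with a short calculation. The sketch below assumes idealised rectangular gradient pulses, for which the standard pulsed-gradient relation b = γ²G²δ²(Δ − δ/3) holds; the actual gradient shapes of the sequence, the gradient amplitude and the diffusivity used here are hypothetical illustration values, not parameters reported by the authors. Only the 7 ms duration and 100 ms separation come from the text above.

```python
import math

GAMMA_1H = 2.675e8  # proton gyromagnetic ratio, rad s^-1 T^-1

def effective_diffusion_time(big_delta, small_delta):
    """Effective diffusion time (Delta - delta/3) for rectangular pulses."""
    return big_delta - small_delta / 3.0

def b_value(gamma, g, small_delta, big_delta):
    """b = gamma^2 G^2 delta^2 (Delta - delta/3), SI units in, s/m^2 out."""
    return (gamma * g * small_delta) ** 2 * effective_diffusion_time(big_delta, small_delta)

def rms_displacement_1d(diffusivity, time):
    """Free-diffusion RMS displacement along one axis, sqrt(2 D t)."""
    return math.sqrt(2.0 * diffusivity * time)

# Timing from the reply: delta = 7 ms, Delta = 100 ms.
t_d_ms = effective_diffusion_time(100.0, 7.0)

# Hypothetical gradient amplitude of 0.3 T/m, for illustration only.
b_si = b_value(GAMMA_1H, 0.3, 7e-3, 100e-3)   # s/m^2
b_ms_um2 = b_si * 1e-9                         # converted to ms/um^2

# Hypothetical free diffusivity of 0.5 um^2/ms during the 7 ms pulse.
dx_um = rms_displacement_1d(0.5, 7.0)

print(f"effective diffusion time: {t_d_ms:.2f} ms")
print(f"b-value: {b_ms_um2:.1f} ms/um^2")
print(f"RMS displacement during one pulse: {dx_um:.2f} um")
```

Under these assumed numbers, a metabolite could move a few micrometres during a single gradient pulse, which is comparable to small compartment sizes. That is the sense in which a pulse can be short relative to Δ and still too long for the short-gradient-pulse limit, whereas the Gaussian phase approximation keeps the finite pulse duration in the signal expression.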
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This valuable study demonstrates that silencing of inhibitory interneurons in zebra finch HVC, a premotor nucleus critical for song production, disrupts song. However, song naturally recovers in a way that is surprisingly independent of LMAN, a distinct premotor nucleus required for normal song plasticity. The authors provide solid evidence that disruption is associated with microglial activation, activation of MHCI, synaptic changes, and altered neural dynamics in HVC. However, the manuscript would benefit from a clearer narrative structure, contextualization of the microglial results, and quantitative analyses to fully characterize song syntax and recovery after LMAN lesions.
-
Reviewer #1 (Public review):
Summary:
This study by Torok et al. takes a creative approach to studying circuit perturbations in a sensorimotor region for vocalization control, in a songbird species, the zebra finch. By expressing the light chain of tetanus toxin in neurons in a sensorimotor region HVC, the authors constrain neural firing and study the resulting degradation and then recovery of song, after a protracted (> 70-day) period. Recording data suggest a form of synaptic homeostasis emergent in both HVC and RA as a result of the profound loss of (inhibitory?) tone in HVC. The methods to analyze changes in song are particularly strong here, using dimension reduction and visualization techniques. Single-cell sequencing data showed accompanying changes in microglia abundance, as well as several other markers that were not observed in control viral injections. LFP analyses in birds during the tetanus onset phase showed clear dysregulation of typical voltage deflections and spectral power, each of which showed recovery in parallel with song recovery. Lastly, the authors present data indicating that the anterior forebrain region LMAN is not critical for the song degradation process, pointing instead to the direct relationship between HVC and RA in song plasticity in adults. The methods are generally well established, but my main concerns regard the validation of the viral construct, the lack of direct confirmation of tetanus toxin on inhibitory neurons or E/I balance in HVC, and a missed opportunity to look at song syllable sequence degradation and recovery.
Strengths:
The species under investigation is the premier model for the neural basis of vocal learning, and the telencephalic brain regions investigated are well mapped out for their control of vocal learning behavior. The methods for electrophysiology recording and analysis, song analysis, scRNAseq, and in situ hybridization pose no concern as they are well established for this group of co-authors.
Weaknesses:
The introduction lays out a case for pursuing long-term E/I imbalances, vis-à-vis transient perturbations that have shown effects on the behavior. However, the rationale is not clearly stated. Why should the reader care that "prolonged E/I imbalances" may occur? Do they occur naturally or in some disease states (as alluded to in the first paragraph)? Without this rationale, the reader is left with an impression that the experiments were done because of a technical capability rather than a conceptual thrust.
The cited works for the statement "AAV viral vector expressing TeNT under the human dlx promoter, which is selective for HVC inhibitory interneurons" (reference 5, Kosche et al., 2016; and reference 10, Vallentin et al., 2016) do not substantiate the targeting of this dlx5 promoter for interneurons in zebra finch HVC. Neither of these cited studies used viral vectors, and so this is a misattribution of the dlx5 promoter as targeting HVC inhibitory interneurons. However, the original development of this enhancer by Gord Fishell and others did have solid expression in HVC (Dimidschstein et al., 2016, Nature Neuroscience), and the enhancer was used to successfully target inhibitory neurons in nearby nidopallium NCM (Spool et al., 2022, Curr Biol). Citing these two studies would improve the standing of this viral approach. Nevertheless, the specific construct used here is not the same as the published studies mentioned above (AAV9-dlx-TeNT). The authors therefore need to show expression of the virus using some histological confirmation to cement the idea that they are indeed targeting inhibitory interneurons with this manipulation. The methods statement "a single injection (~100 nL) in the center of HVC was sufficient to label enough cells" is not convincing in the absence of quantified photomicrographs.
The authors present no physiological confirmation of TeNT on E/I balance directly, and so we don't have a clear picture of how/whether HVC interneurons are physiologically altered by this manipulation. That said, the Npix recordings show that there was a tremendous increase in gamma power following TeNT manipulation, which subsides as the protracted song recovery unfolds. This finding is somewhat counterintuitive, given that gamma oscillations are typically driven by inhibitory neurons in many systems (including songbird pallium) while the TeNT manipulation is purported to cause *reductions* in inhibitory neurotransmitter release within HVC. Some interpretation of these incongruent results would be useful in the Discussion.
The degradation and recovery of song is based mainly on the measures of duration of syllables and inter-syllable intervals, but HVC is also a key locus for song syllable sequence coding. The supplementary figures show some changes in sequences. It would improve the interpretation of both the degradation and recovery of the song to know whether syllable sequences (iiiABCCDDEF) truly recovered or were morphed in some way (e.g., iiiCDDDBEF). The PCA analyses (that the authors conducted) for these two potential outcomes would likely be very similar, but the actual songs would differ greatly under these two scenarios in terms of syllable sequence. From the representative spectrograms, it appears that the song syllable sequence does indeed recover well in these examples (perhaps less so in Supplementary Figure 3). A simple Markov-chain analysis of the syllable sequences across birds in the study would provide important confirmation of these insights.
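To make the suggested sequence analysis concrete, a minimal sketch of a first-order Markov-chain comparison follows. It assumes each song has already been labeled as a string of syllable letters; the function name `transition_probabilities` is illustrative and is not taken from the manuscript. The two example strings are the ones used in the comment above.

```python
from collections import Counter, defaultdict


def transition_probabilities(songs):
    """Estimate first-order syllable transition probabilities.

    `songs` is a list of strings, one per song bout, where each character
    is a syllable label (an assumed input format, for illustration only).
    Transitions are counted within bouts, never across bout boundaries.
    """
    counts = defaultdict(Counter)
    for song in songs:
        for current, following in zip(song, song[1:]):
            counts[current][following] += 1
    # Normalize each row of counts into probabilities.
    return {
        syllable: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for syllable, followers in counts.items()
    }


# The two hypothetical outcomes from the comment above.
baseline = transition_probabilities(["iiiABCCDDEF"])
morphed = transition_probabilities(["iiiCDDDBEF"])
```

Comparing the two resulting tables row by row (for example, which syllables can follow C or D) would separate a true recovery of the original sequence from a morphed one, even when duration-based measures look similar.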
-
Reviewer #2 (Public review):
This article addresses the question of how complex behavior is maintained despite perturbations in underlying motor circuits. Using zebra finch song production as a model system, the authors employ a genetic approach to perturb activity in GABAergic neurons within the vocal control nucleus HVC. Specifically, they use AAV to deliver the tetanus toxin light chain (TeNT) under the interneuron-specific DLX promoter, with the goal of silencing interneurons. This manipulation causes rapid degradation of song, followed by recovery over several weeks.
The authors characterize the recovery using a combination of transcriptomic analysis, electrophysiology, and lesion studies. Notably, the recovery does not require LMAN, which is typically considered critical for vocal learning and plasticity. The authors speculate that homeostatic mechanisms within the motor pathway, potentially involving microglial remodeling, may mediate this recovery.
The strength of the study lies in the striking behavioral effects - both degradation and recovery - resulting from a specific circuit perturbation, and the use of complementary approaches (gene expression, neurophysiology, behavior, and lesions) to link circuit changes to behavior. The approach is creative, and the findings are intriguing. More detailed comments are provided below that may help enhance the manuscript's value to the community.
(1) In Figure 1b, the authors show changes in the relative abundance of cell types following TeNT expression in HVC. The most prominent change, as noted by the authors, is an increase in microglia. However, there are also apparent changes in the proportions of other cell types, particularly decreases in neurons and radial glia. How do the authors interpret the observed reductions in GABAergic and glutamatergic cells, as well as radial glia? Are these decreases statistically significant? Given the magnitude of these changes, could they reflect sampling differences (e.g., inclusion of tissue outside HVC) or neuronal cell death? Alternatively, is it possible that the absolute number of mature neurons remains constant, and increases in other cell types shift the relative proportions? The authors should clarify how to interpret the Y-axis of this plot. It appears to reflect relative abundance rather than absolute cell numbers, which has important implications for interpretation.
(2) The authors appear to define their own cell type clusters and labels, rather than using standard classifications (e.g., Colquitt et al. 2021; Colquitt et al. 2023). This makes cross-study comparisons difficult. For example, Colquitt describes four classes of putative immature neurons (pre2-pre4, GABA-pre). In contrast, the authors refer to "neuroblasts" in Figure 1b. Are these equivalent to pre2-pre4 and/or to "GABA-pre"? What about "migrating neuroblasts" in Supplementary Figure 11? The authors could consider using the standard nomenclature, or if they disagree with that classification, explain why an alternative scheme is warranted.
(3) The transcriptomic data are underexplored. Many genes appear differentially expressed (e.g., in Figure 1c), however, the main text contains little discussion of differential gene expression beyond MHC I and B2M. It would be useful to discuss whether transcriptomic data support or rule out any other specific mechanistic hypotheses for recovery.
(4) The authors attribute increased microglial markers to interneuron silencing rather than inflammation from viral injection, based on control virus results (lines 143-146). However, is it plausible that TeNT expression itself, or batch variability, could drive differences in inflammation? The authors could address these alternatives with additional evidence or discussion.
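The concern in point (1) about relative versus absolute abundance can be illustrated with a small arithmetic sketch. All counts below are hypothetical and are not data from the study; they only show that a rise in one cell type lowers the fraction of every other type even when the absolute numbers of those types are unchanged.

```python
def relative_abundance(counts):
    """Convert absolute cell counts into fractions of the total."""
    total = sum(counts.values())
    return {cell_type: n / total for cell_type, n in counts.items()}


# Hypothetical absolute counts, chosen for illustration only.
control = {"neurons": 600, "microglia": 100, "other": 300}
# Same neuron and "other" counts, but four times as many microglia.
tent = {"neurons": 600, "microglia": 400, "other": 300}

control_fractions = relative_abundance(control)
tent_fractions = relative_abundance(tent)
```

In this toy case the neuron fraction falls from 600/1000 to 600/1300 without any neuronal loss, which is why the definition of the Y-axis matters for interpreting the apparent decreases.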
-
Reviewer #3 (Public review):
Summary:
This manuscript investigates at behavioral and mechanistic levels the recovery of zebra finch song production after a genetically targeted insult to HVC, a vocal premotor nucleus known to generate stereotyped neural sequences that drive the correspondingly stereotyped song. This study is a close follow-up to past work, published in Nature Neuroscience last year (Wang et al, 2024), in which custom lentiviruses were used to deliver a persistently active sodium channel, NaChBac, or TeNT to block synaptic release, specifically to the excitatory projection neurons in HVC. In this past work, these manipulations resulted in rapid degradation of song, followed by a slow recovery that, remarkably, did not require practice. Song recovery was associated with synaptic remodeling that appeared to homeostatically bring the affected neurons back to a normal firing regime. This past paper was important because it clearly demonstrated behaviorally and mechanistically how neural plasticity can restore a learned behavior without practice, showing that dominant reinforcement learning models of birdsong are not the full story.
This past work sets the context for the current paper, which instead targets the inhibitory neuronal population in HVC for silencing via viral-mediated expression of TeNT. Again, this sophisticated targeting of HVC interneurons resulted in rapid degradation of song, followed by a much slower but seemingly full recovery.
Strengths:
Overall, this paper has several strengths. First, it provides yet another convincing example of non-canonical vocal learning in the zebra finch because LMAN (a nucleus required for trial and error song learning) is not required for song recovery. Second, its targeting of interneurons clarifies the extent to which inhibition in HVC is essential for vocal patterning (not surprising but important to show). Third, by using RNAseq of HVC at the time of peak song disruption, it zeroes in on specific genetic/cellular activations associated with a lack of inhibition (e.g., microglial activation and MHC1 expression), opening up new avenues for future study. Using in vivo electrophysiology it also characterizes some gross circuit-level abnormalities in HVC-RA transmission and during sleep.
Weaknesses:
Yet the paper also has several areas for improvement, primarily:
Main issues
(1) Narrative-level confusion, a mix of results, many hanging threads
The arc of this paper is very hard to follow; new experiments arise without a clear setup or connection to past ones. Concepts jump around unpredictably. The reading experience would be dramatically improved if there were a clear single line of logic going through the entire paper, which could be accomplished by inserting a paragraph at the end of the intro section that walks the reader step-by-step through what they are going to see. I don't recommend this for all papers - but this paper requires it, in my opinion, because we have such an unusual combination of experimental approaches, outcomes, and data formats (behavior, RNA seq, targeted tests of microglial activation in the setting of adult impairment and song development, electrophysiology during sleep). It's very difficult for me to tie this all together into a crisp narrative that sticks with me days after reading the paper. Instead, it feels like some disconnected factoids. Examples:
a) Characterization of degradation and slow recovery (much slower than with targeting of projection neurons in past work; Wang et al, 2024).
b) Activation of microglia and MHC1 during the degraded period; microglia return to normal at recovery.
c) Developmental profile of microglia expression.
e) Sleep replay in HVC is perturbed during the degraded state. Mostly returns to normal following recovery, but *some* aspects are still abnormal.
f) Detailed ephys analysis of HVC excitability and RA suppression, invoking ideas that HVC drives RA inhibition.
g) LMAN lesions do not block degradation or recovery.
There are at least three threads of this paper - it therefore reads like three different papers stitched together into one - united only by the method of HVC interneuron targeting. In my view, a pretty major overhaul is required, even if it means cutting out specific details and figures that distract from the paper's message (for example there is a whole sub-section analyzing HVC impact on RA that vaguely invokes ideas of HVC engagement of RA
(2) Interpretation of microglia is confusing and unresolved
Microglia activation is measured at peak song disruption, and returns to normal following recovery. To test if this phenomenon is associated with learning or degradation, the authors measure microglia during development.
"The increased inhibitory tone in HVC and the number of microglia could induce synaptic changes that contribute to degraded song production. Alternatively, the rise in microglia could be part of the recovery response to produce synaptic changes needed to regain the song following perturbation."
This is a great if/then statement on how to interpret the microglial activation at the core of the paper. But it remains unresolved. Is there a causal experiment that could distinguish these possibilities?
(3) The quantification of song dynamics during the recovery process in LMAN lesioned birds is required to support claims. Perhaps the most interesting claim of the paper, that recovery happens without LMAN, is not sufficiently supported by data analyses. This is a major problem.
The same analysis used in the LMAN-intact degradation/recovery dataset should be used for the LMAN dataset. At present, there is no quantification, only example spectrograms. Also, Supplementary Figure 4 and Supplementary Figure 5 are identical, suggesting a lack of proofreading in this part of the manuscript. For example, the reader cannot even ascertain if the key aspect of song degradation - the production of exceedingly long syllables - is occurring in the LMAN lesioned animals.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The manuscript represents a fundamental advance in designing peptide inhibitors targeting Cdc20, a key activator and substrate-recognition subunit of the APC/C ubiquitin ligase. Supported by compelling biophysical and cellular evidence, the study lays a strong foundation for future developments in degron-based therapeutics. The revised manuscript has been strengthened by additional clarifications and data that address prior reviewer concerns. The work provides a robust framework for developing tools to manipulate protein degradation and will be of broad interest to researchers in protein engineering, cell cycle regulation, and targeted protein degradation.
-
Reviewer #1 (Public review):
Summary:
In this manuscript, Eapen et al. investigated peptide inhibitors of Cdc20. They applied a rational design approach, substituting residues found in the D-box consensus sequences to better align the peptides with the Cdc20-degron interface. In the process, the authors designed and tested a series of more potent binders, including ones that contain unnatural amino acids, and verified binding modes by elucidating the Cdc20-peptide structures. The authors further showed that these peptides can engage with Cdc20 in the cellular context, and can inhibit APC/C<sup>Cdc20</sup> ubiquitination activity. Finally, the authors demonstrated that these peptides could be used as portable degron motifs that drive the degradation of a fused fluorescent protein.
Strengths:
This manuscript is clear and straightforward to follow. The investigation of different peptide variations was comprehensive and well-executed. This work provided the groundwork for developing peptide drug modalities to inhibit degradation and for applying peptides as portable motifs to achieve targeted degradation, both of which are impactful. The additional points provided by the authors in response to reviewers further strengthened the manuscript and enhanced its clarity.
Weaknesses:
None, the authors have addressed all my comments, and I have no additional suggestions.
-
Reviewer #3 (Public review):
Summary:
Eapen and coworkers use a rational design approach to generate new peptide-inspired ligands at the D-box interface of Cdc20. These new peptides serve as new starting points for blocking APC/C in the context of cancer, as well as manipulating APC/C for targeted protein degradation therapeutic approaches.
Strengths:
The characterization of new peptide-like ligands is generally solid and multifaceted, including binding assays, thermal stability enhancement in vitro and in cells, X-ray crystallography, and degradation assays.
Comments on revisions:
I am satisfied with the changes in response to the first round of review.
-
Author response:
The following is the authors’ response to the original reviews
Public reviews:
Reviewer #1 (Public review):
Summary:
In this manuscript, Eapen et al. investigated peptide inhibitors of Cdc20. They applied a rational design approach, substituting residues found in the D-box consensus sequences to better align the peptides with the Cdc20-degron interface. In the process, the authors designed and tested a series of more potent binders, including ones that contain unnatural amino acids, and verified binding modes by elucidating the Cdc20-peptide structures. The authors further showed that these peptides can engage with Cdc20 in the cellular context, and can inhibit APC/C<sup>Cdc20</sup> ubiquitination activity. Finally, the authors demonstrated that these peptides could be used as portable degron motifs that drive the degradation of a fused fluorescent protein.
Strengths:
This manuscript is clear and straightforward to follow. The investigation of different peptide variations was comprehensive and well-executed. This work provided the groundwork for developing peptide drug modalities to inhibit degradation and for applying peptides as portable motifs to achieve targeted degradation, both of which are impactful.
Weaknesses:
A few minor comments:
(1) In my opinion, more attention to the solubility issue needs to be discussed and/or tested. On page 10, what is the solubility of D2 before a modification was made? The authors mentioned that position 2 is likely solvent exposed, it is not immediately clear to me why the mutation made was from one hydrophobic residue to another. What was the level of improvement in solubility? Are there any affinity data associated with the peptide that differ with D2 only at position 2?
The reviewer is correct that we have not done any detailed solubility characterisation; we refer only to observations rather than quantitative analysis. We wrote that we reverted from Leu to Ala due to solubility - we have clarified this statement (page 11) to say that we reverted to Ala, as it was the residue present in D1, for which we observed a measurable affinity by SPR and saw a concentration-dependent response in the thermal shift analysis. We do not have any peptides or affinity data that explore single-site mutations with the parental peptide of D2. D2 is included in the paper because of its link to the consensus D-box sequence and thus was the logical path to the investigations into positions 3 and 7 that come later in the manuscript.
(2) I'm not entirely convinced that the D19 density not observed in the crystal structure was due to crystal packing. This peptide is peculiar as it also did not induce any thermal stabilization of Cdc20 in the cellular thermal shift assay. Perhaps the binding of this peptide could be investigated in more detail (i.e., NMR?) Or at least more explanation could be provided.
This section has been clarified (page 16). The lack of observed density was likely due to the relatively low affinity of D19 and also to the lack of binding of the three C-terminal residues in the crystal, and consequently it has a further reduced affinity. The current wording in the manuscript puts greater emphasis on this second aspect being a D19-specific issue, even though it applies to all four soaked peptides. The extent of peptide-induced thermal stabilisations observed by TSA and CETSA is different, with the latter experiment consistently showing smaller shifts. This observation may be due to the more complex medium (cell lysate vs. purified protein) and/or different concentrations of the proteins in solution. In the CETSA, we over-expressed a HiBiT-tagged Cdc20, which is present in addition to any endogenously expressed Cdc20. Although we did not investigate it, the near identical D-box binding sites on Cdc20 and Cdh1 would suggest that there will be cross-specificity, which could further influence the CETSA experiments.
The section now reads:
“We therefore assume that this is the reason for the lack of observed density in this region of the peptides D20 and D21 (Fig. S3E and S3F, respectively). We believe that it causes a reduction in binding affinities of all peptides in crystallo, given the evidence from SPR highlighting a role of position 7 in the interaction (Table 1). Interestingly, the observed electron density of the peptide correlates with Cdc20 binding affinity: D21 and D20, having the highest affinities, display the clearest electron density allowing six amino acids to be modeled, whereas D7 shows relatively poor density permitting modelling of only four residues. For D19, the lack of density observed likely reflects its intrinsically weaker affinity compared to the other peptides, in addition to losing the interactions from position 7 due to crystal packing.”
Reviewer #2 (Public review):
Summary:
The authors took a well-characterised (partly by them), important E3 ligase, in the anaphase-promoting complex, and decided to design peptide inhibitors for it based on one of the known interacting motifs (called D-box) from its substrates. They incorporate unnatural amino acids to better occupy the interaction site, improve the binding affinity, and lay foundations for future therapeutics - maybe combining their findings with additional target sites.
Strengths:
The paper is mostly strengths - a logical progression of experiments, very well explained and carried out to a high standard. The authors use a carefully chosen variety of techniques (including X-ray crystallography, multiple binding analyses, and ubiquitination assays) to verify their findings - and they impressively achieve their goals by homing in on tight-binders.
Weaknesses:
Some things are not explained fully and it would be useful to have some clarification. Why did the authors decide to model their inhibitors on the D-box motif and not the other two SLiMs that they describe?
For completeness, in addition to the D-box we did originally construct peptides based on the ABBA and KEN-box motifs, but they did not show any shift in melting temperature of cdc20 in the thermal shift assay whereas the D-box peptides did; consequently, we focused our efforts on the D-box peptides. Moreover, there is much evidence from the literature that points to the unique importance of the D-box motif in mediating productive interactions of substrates with the APC/C (i.e. those leading to polyubiquitination & degradation). One of the clearest examples is a study by Mark Hall’s lab (described in Qin et al. 2016), which tested the degradation of 15 substrates of yeast APC/C in strains carrying alleles of Cdh1 in which the docking sites for D-box, KEN or ABBA were mutated. They observed that whereas degradation of all 15 substrates depended on D-box binding, only a subset required the KEN binding site on Cdh1 and only one required the ABBA binding site. A more recent study from David Morgan’s lab (Hartooni et al. 2022) looking at binding affinities of different degron peptides concluded that KEN motif has very low affinity for Cdc20 and is unlikely to mediate degradation of APC/C-Cdc20 substrates. Engagement of substrate with the D-box receptor is therefore the most critical event mediating APC/C activity and the interaction that needs to be blocked for most effective inhibition of substrate degradation.
We have added the following text to the Results section “Design of D-box peptides” (page 10):
“We focused on D-box peptides, as there is much evidence from the literature that points to the unique importance of the D-box motif in mediating productive interactions of substrates with the APC/C (i.e. those leading to polyubiquitination & degradation). One of the clearest examples is a study that tested the degradation of 15 substrates of yeast APC/C in strains carrying alleles of Cdh1 in which the docking sites for D-box, KEN or ABBA were mutated (Qin et al. 2017). They observed that, whereas degradation of all 15 substrates depended on D-box binding, only a subset required the KEN binding site on Cdh1 and only one required the ABBA binding site. A more recent study (Hartooni et al. 2022) of binding affinities of different degron peptides concluded that KEN motif has very low affinity for Cdc20 and is unlikely to mediate degradation of APC/C-Cdc20 substrates. Engagement of substrate with the D-box receptor is therefore the most critical event mediating APC/C activity and the interaction that needs to be blocked for most effective inhibition of substrate degradation.”
What exactly do they mean when they say their 'observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast 'pseudo-substrate' inhibitor Acm1, acts to impede polyubiquitination of the bound protein'? It's an interesting thing to think about, and probably the paper they cite explains it more but I would like to know without having to find that other paper.
Interesting results from a number of labs (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011, Qin et al. 2019) have shown that mutations of degron SLiMs in Acm1 that weaken interaction with the APC/C have the unexpected consequence of converting Acm1 from APC/C inhibitor to APC/C substrate. A necessary conclusion of these studies is that the outcome of degron binding (i.e. whether the binder functions as substrate or inhibitor) depends on factors other than D-box affinity and that D-box affinity can counteract them. One idea is that if a binder interacts too tightly, this removes some flexibility required for the polyubiquitination process. The most recent study on this question (Qin et al. 2019) specifically pins the explanation for the inhibitory function of the high affinity D-box in Acm1 on its ‘D-box Extension’ (i.e. residues 8-12) preventing interaction with APC10. In our current study, the binding affinity of peptides is measured against Cdc20. In cellular assays however, the D-box must also engage APC10 for degradation to occur. It may be that the peptide binding most strongly to the D-box pocket on Cdc20 is less able to bind to APC10 and therefore less effective in triggering APC10-dependent steps in the polyubiquitination pathway. The important Hartooni et al. paper from David Morgan’s lab confirms that even though the binding of D-box residues to APC10 is very weak on its own, it can contribute a 100-fold increase in affinity of a peptide by adding cooperativity to the interaction of D-box with co-activator. Regarding Figure 6 and the fact that we did look at peptide binding in cells, these experiments were done in unsynchronised cells, so most Cdc20 would not be bound to APC/C.
We have modified the text (page 18) from:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast ‘pseudo-substrate’ inhibitor Acm1, acts to impede polyubiquitination of the bound protein (Qin et al. 2019). Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. As shown in Qin et al., mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Qin et al. 2019). Overall, our results support the conclusions that all the D-box peptides engage productively with the APC/C and that the highest affinity interactors act as inhibitors rather than functional degrons of APC/C.”
to:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with conclusions from other studies that affinity of degron binding does not necessarily correlate with efficiency of degradation. Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. A number of studies of a yeast ‘pseudo-substrate’ inhibitor Acm1, have shown that mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011) through a mechanism that governs recruitment of APC10 (Qin et al. 2019). Our study does not consider the contribution of APC10 to binding of our peptides to APC/C<sup>Cdc20</sup> complex, but since there is strong cooperativity provided by this additional interaction (Hartooni et al. 2022) we propose this as the critical factor in determining the ability of the different peptides to mediate degradation of associated mNeon.”
Reviewer #3 (Public review):
Summary:
Eapen and coworkers use a rational design approach to generate new peptide-inspired ligands at the D-box interface of Cdc20. These new peptides serve as new starting points for blocking APC/C in the context of cancer, as well as manipulating APC/C for targeted protein degradation therapeutic approaches.
Strengths:
The characterization of new peptide-like ligands is generally solid and multifaceted, including binding assays, thermal stability enhancement in vitro and in cells, X-ray crystallography, and degradation assays.
Weaknesses:
One important finding of the study is that the strongest binders did not correlate with the fastest degradation in a cellular assay, but explanations for this behavior were not supported experimentally. Some minor issues regarding experimental replicates and details were also noted.
Interesting results from a number of labs (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011, Qin et al. 2019) have shown that mutations of degron SLiMs in Acm1 that weaken interaction with the APC/C have the unexpected consequence of converting Acm1 from APC/C inhibitor to APC/C substrate. A necessary conclusion of these studies is that the outcome of degron binding (i.e. whether the binder functions as substrate or inhibitor) depends on factors other than D-box affinity and that D-box affinity can counteract them. One idea is that if a binder interacts too tightly, this removes some flexibility required for the polyubiquitination process. The most recent study on this question (Qin et al. 2019) specifically pins the explanation for the inhibitory function of the high affinity D-box in Acm1 on its ‘D-box Extension’ (i.e. residues 8-12) preventing interaction with APC10. In our current study, the binding affinity of peptides is measured against Cdc20. In cellular assays however, the D-box must also engage APC10 for degradation to occur. It may be that the peptide binding most strongly to the D-box pocket on Cdc20 is less able to bind to APC10 and therefore less effective in triggering APC10-dependent steps in the polyubiquitination pathway. The important Hartooni et al. paper from David Morgan’s lab confirms that even though the binding of D-box residues to APC10 is very weak on its own, it can contribute a 100-fold increase in affinity of a peptide by adding cooperativity to the interaction of D-box with co-activator. Regarding Figure 6 and the fact that we did look at peptide binding in cells, these experiments were done in unsynchronised cells, so most Cdc20 would not be bound to APC/C.
We have modified the text (page 18) from:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with the idea that high-affinity binding at degron binding sites on APC/C, such as in the case of the yeast ‘pseudo-substrate’ inhibitor Acm1, acts to impede polyubiquitination of the bound protein (Qin et al. 2019). Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. As shown in Qin et al., mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Qin et al. 2019). Overall, our results support the conclusions that all the D-box peptides engage productively with the APC/C and that the highest affinity interactors act as inhibitors rather than functional degrons of APC/C.”
to:
“However, we found the opposite effect: D2 and D3 showed increased rates of mNeon degradation compared to D1 and D19 (Fig. 8C,D). This observation is consistent with conclusions from other studies that affinity of degron binding does not necessarily correlate with efficiency of degradation. Indeed, there is no evidence that Hsl1, which is the highest affinity natural D-box (D1) used in our study, is degraded any more rapidly than other substrates of APC/C in yeast mitosis. A number of studies of a yeast ‘pseudo-substrate’ inhibitor Acm1, have shown that mutation of the high affinity D-box in Acm1 converts it from inhibitor to substrate (Choi et al. 2008, Enquist-Newman et al. 2008, Burton et al. 2011) through a mechanism that governs recruitment of APC10 (Qin et al. 2019). Our study does not consider the contribution of APC10 to binding of our peptides to APC/C<sup>Cdc20</sup> complex, but since there is strong cooperativity provided by this additional interaction (Hartooni et al. 2022) we propose this as the critical factor in determining the ability of the different peptides to mediate degradation of associated mNeon.”
Recommendations for the authors:
Reviewer #1 (Recommendations for the authors):
(1) On page 12 (towards the end), the author stated D10 contained an A3P mutation, they meant P3A right? 'To test this hypothesis, we proceeded to synthesise D10, a derivative of D4 containing an A3P single point mutation.'
We thank the reviewer for spotting this typo, which we have corrected.
(2) Have the authors considered other orthogonal approaches to cross-examine/validate binding affinities? That said, I do not think extra experiments are necessary.
We did not explore further orthogonal approaches due to the challenges of producing sufficient amounts of the Cdc20 protein. Due to the low affinities of many peptides for Cdc20, many techniques would have required more protein than we were able to produce. We believe that the qualitative TSA combined with the SPR is sufficient to convince the readers; indeed there is a correlation between SPR-determined binding affinities and the thermal shifts. For the natural amino acid-containing peptides (Table 1), D19 has the highest affinity and causes the largest thermal shift in the Cdc20 melting temperature, D10 has the lowest affinity and causes the smallest thermal shift, and D1, D3, D4, and D5 all rank in the middle by both techniques. For those peptides containing unnatural amino acids (Table 2), again higher affinities are reflected in larger thermal shifts.
Reviewer #2 (Recommendations for the authors):
The data seem fine to me. I would appreciate a little more detail on the points mentioned in the public review. Also a thorough reread, maybe by a disinterested party as there are various typos that could be corrected - all in all an excellent clear paper that encompasses a lot of work.
A colleague has carefully checked the manuscript, and typos have been corrected.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This is an interesting study that adds useful new data addressing how different DAG pools influence cellular signaling. The study dissects how the enzyme Dip2 modulates the minor lipid signaling DAG pool, which is distinct from the lipid metabolism DAG pool utilized in membrane production. Overall the analysis is solid and broadly supports the claims.
-
Reviewer #1 (Public review):
Summary:
The study dissects distinct pools of diacylglycerol (DAG), continuing a line of research on the central concept that there is a major lipid metabolism DAG pool in cells, but also a smaller signaling DAG pool. It tests the hypothesis that the second pool is regulated by Dip2, which influences Pkc1 signaling. The group shows that stressed yeast increase specific DAG species C36:0 and C36:1, and proposes that this promotes Pkc1 activation via Pkc1 binding C36:0. The study also examines how perturbing the lipid metabolism DAG pool via various deletions such as lro1, dga1, and pah1 deletion impacts DAG and stress signaling. Overall this is an interesting study that adds new data to how different DAG pools influence cellular signaling.
Strengths:
The study nicely combined lipidomic profiling with stress signaling biochemistry and yeast growth assays.
Weaknesses:
One suggestion to improve the study is to examine the spatial organization of Dip2 within cells, and how this impacts its ability to modulate DAG pools. Dip2 has previously been proposed to function at mitochondria-vacuole contacts (Mondal 2022). Examining how Dip2 localization is impacted when different DAG pools are manipulated, such as by deletion of Pah1 (also suggested to work at yeast contact sites such as the nucleus-vacuole junction), or by Lro1 or Dga1 deletion, would broaden the scope of the study.
Comments on revisions:
The revision addresses several of the concerns raised previously. Most importantly, it softens several conclusions and more clearly delineates the limitations of the study. The study has yet to address how Dip2 and Pkc1 crosstalk, but new text addresses this limitation. There is also more analysis of Dip2 localization in other conditions where cell DAG pools are elevated (i.e., an LRO1 and DGA1 double KO, as well as a PAH1 KO). Loss of these proteins elevates ER DAG, but Dip2 remains mitochondrially associated. This may imply DAG specificity, or that global changes to DAG pools do not impact Dip2 import into mitochondria.
-
Reviewer #2 (Public review):
Summary:
The authors use yeast genetics, lipidomic and biochemical approaches to demonstrate that the DAG isoforms 36:0 and 36:1 can specifically activate PKC. Further, these DAG isoforms originate from PI and PI(4,5)P2. The authors propose that Psi1-Plc1-Dip2 functions to maintain a normal level of specific DAG species to modulate PKC signalling.
Strengths:
Data from yeast genetics are clear and strong. The concept is potentially interesting and novel.
Weaknesses: More evidence is needed to support the central hypothesis. The authors may consider the following:
(1) Figure 2: the authors should show/examine C36:1 DAG. Also, some structural evidence would be highly useful here. What is the structural basis for the assertion that the PKC C1 domain can only be activated by C36:0/1 DAG but not other DAGs? This is a critical conclusion of this work and clear evidence is needed.
(2) Does Dip2 colocalize with Plc1 or Pkc1? Does Dip2 reach the plasma membrane upon Plc activation?
Comments on revisions:
The authors have addressed my concerns.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public review):
Summary:
The study dissects distinct pools of diacylglycerol (DAG), continuing a line of research on the central concept that there is a major lipid metabolism DAG pool in cells, but also a smaller signaling DAG pool. It tests the hypothesis that the second pool is regulated by Dip2, which influences Pkc1 signaling. The group shows that stressed yeast increase specific DAG species C36:0 and C36:1, and proposes that this promotes Pkc1 activation via Pkc1 binding C36:0. The study also examines how perturbing the lipid metabolism DAG pool via various deletions such as lro1, dga1, and pah1 deletion impacts DAG and stress signaling. Overall this is an interesting study that adds new data to how different DAG pools influence cellular signaling.
Strengths:
The study nicely combined lipidomic profiling with stress signaling biochemistry and yeast growth assays.
We thank the reviewer for finding this study of interest and appreciating our multi-pronged approach to test our hypothesis that a distinct pool of DAGs regulated by Dip2 activates PKC signalling.
Weaknesses:
One suggestion to improve the study is to examine the spatial organization of Dip2 within cells, and how this impacts its ability to modulate DAG pools. Dip2 has previously been proposed to function at mitochondria-vacuole contacts (Mondal 2022). Examining how Dip2 localization is impacted when different DAG pools are manipulated, such as by deletion of Pah1 (also suggested to work at yeast contact sites such as the nucleus-vacuole junction), or by Lro1 or Dga1 deletion, would broaden the scope of the study.
We thank the reviewer for the suggestion to trace the localization of Dip2 in the absence of various DAG-acting enzymes. To address this, we generated Dip2-GFP knock-in (KI) in Δpah1, Δlro1 and Δdga1 strains, confirming successful integration by western blotting using an anti-GFP antibody. We then performed microscopy to examine the localization of Dip2. Since Dip2 is a mitochondria-vacuole contact site protein that predominantly localizes to mitochondria (approximately 60% puncta of Dip2 localize to mitochondria) (Mondal et al. 2022), we co-stained the cells with MitoTracker red to visualize mitochondria.
Consistent with our previous findings, Dip2 colocalizes with the MitoTracker red in WT (Figure 3-figure supplement 2A). As suggested by the reviewer, we deleted PAH1, which converts phosphatidic acid to DAGs and is also known to work at the nucleus-vacuole junction. On examining whether the absence of PAH1 influences the localization of Dip2, we found that there is no change in Dip2’s spatial organization. This could also be due to no observable change in the DAG species on deleting PAH1, as noted in our lipidomic studies (Figure 4-figure supplement 2A). These observations suggest that in a homeostatic condition, Pah1 does not affect the DAG pool acted upon by Dip2 and therefore has no influence on Dip2’s subcellular localization. This data has been incorporated in the revised manuscript (line no. 286-289) and Figure 4-figure supplement 2D-E.
Similarly, we probed for the localization of Dip2 in LRO1 and DGA1 knockout strains. These enzymes are responsible for converting bulk DAGs to TAGs. We have previously shown that Dip2 is selective for only C36:0 and C36:1 and does not act on the bulk DAGs (Mondal et al. 2022). Both Lro1 and Dga1 are endoplasmic reticulum (ER) resident proteins and the bulk DAG accumulation in their knockouts is shown to be in the ER (Li et al. 2020), not influencing the mitochondrial DAG pool. On tracing Dip2’s localization in these knockouts, we found that Dip2 remains in the mitochondria (Figure 3-figure supplement 2, Figure 4-figure supplement 2D,E). These results suggest that Dip2 localization is not influenced by bulk DAG accumulation, reinforcing its specificity toward selective DAGs, which are likely to be present at mitochondria and mitochondria-vacuole contact sites. We have added this data in the revised manuscript (line no. 240-246) with Figure 3-figure supplement 2.
Reviewer #2 (Public review):
Summary:
The authors use yeast genetics, lipidomic and biochemical approaches to demonstrate that the DAG isoforms 36:0 and 36:1 can specifically activate PKC. Further, these DAG isoforms originate from PI and PI(4,5)P2. The authors propose that Psi1-Plc1-Dip2 functions to maintain a normal level of specific DAG species to modulate PKC signalling.
Strengths:
Data from yeast genetics are clear and strong. The concept is potentially interesting and novel.
We would like to thank the reviewer for the positive comments on our work and finding the study novel and interesting.
Weaknesses:
More evidence is needed to support the central hypothesis. The authors may consider the following:
(1) Figure 2: the authors should show/examine C36:1 DAG. Also, some structural evidence would be highly useful here. What is the structural basis for the assertion that the PKC C1 domain can only be activated by C36:0/1 DAG but not other DAGs? This is a critical conclusion of this work and clear evidence is needed.
We thank the reviewer for the insightful comments. We were unable to include C36:1 DAG in our in vitro DAG binding assays because it is not commercially available. We have now explicitly mentioned it in the revised manuscript (Line no. 186).
We agree with the reviewer that activation of PKC by C36:0 and C36:1 DAGs is a critical conclusion of our work. While we understand that there is no obvious structural explanation as to how the DAG-binding C1 domain of PKC attains the acyl chain specificity for DAGs, our conclusion that yeast Pkc1 is selective for C36:0 and C36:1 DAGs is supported by a combination of robust in vitro and in vivo data:
(1) In Vitro Evidence: The liposome binding assays demonstrate that the Pkc1 C1 domain binds only to the selective DAG and does not interact with bulk DAGs.
(2) In Vivo Evidence: Lipidomic analyses of wild-type cells subjected to cell wall stress reveal increased levels of C36:0 and C36:1 DAGs, while levels of bulk DAGs remain unaffected.
These findings collectively indicate that Pkc1 neither binds nor is activated by bulk DAGs, reinforcing its specificity for C36:0 and C36:1 DAGs.
Moreover, establishing the structural basis of this selectivity would require a specific DAG-bound C1 domain structure of Pkc1, which is difficult to obtain owing to the flexibility of the longer acyl chains present in C36:0 and C36:1 DAGs. In addition, capturing the full-length Pkc1 structure, which might provide deeper insights, has been challenging for several other groups. Also, we hypothesize that the DAG selectivity of Pkc1 is more of a membrane phenomenon, wherein these DAGs might create a specific microdomain or form a particular curvature that is sensed by Pkc1. Investigating this would require extensive structural and biophysical studies that are beyond the scope of the current work but are planned for future research.
(2) Does Dip2 colocalize with Plc1 or Pkc1?
As shown in our previous study (Mondal et al. 2022) and in the above section (Figure 3-figure supplement 2A-B), Dip2 predominantly localizes to the mitochondria. Pkc1, on the other hand, is known to be found in the cytosol, plasma membrane and bud site (Andrews and Stark 2000). We also checked the localization of Pkc1, co-stained with MitoTracker red, and observed no significant overlap between the two, confirming that Pkc1 does not colocalize with Dip2 (Author response image 1).
Author response image 1.
Live cell microscopy for tracing Pkc1 localization. (A) Microscopy image panel showing DIC image (left), fluorescence for Pkc1 tagged with GFP, MitoTracker red for staining mitochondria and the merged image for both the fluorophores (right). Scale bar represents 5 µm. (B) Line scan plotted for the fluorescence intensity of Pkc1-GFP along with MitoTracker red across the line shown in the merged panel.
Moreover, as suggested by the reviewer, we also checked the localization of Plc1 and found that Plc1 is present in the cytosol and shows a partial colocalization with the mitochondria (Figure 4-figure supplement 3A-B). As some puncta of Dip2 also colocalize with the vacuoles, we checked whether Plc1 also follows such a localization pattern. We co-stained Plc1-GFP with FM4-64, a vacuolar membrane dye, and observed that Plc1 partially localizes to vacuoles as well (Figure 4-figure supplement 3C-D). This was also observed in a previous study where Plc1 was found in a subcellular fractionation of isolated yeast vacuoles and total cell lysate (Jun, Fratti, and Wickner 2004). We also checked whether Plc1, similar to Dip2, localizes to the mitochondria-vacuole contact site by using tri-colour imaging with FM4-64 for vacuole, DAPI for mitochondria and GFP-tagged Plc1. We were not able to trace Dip2 and Plc1 simultaneously as we could not generate a strain endogenously tagged with two different colours even after several attempts. However, from our observations, we can conclude that Plc1 partially localizes to mitochondria and vacuole and might be locally producing the selective DAGs to be acted upon by Dip2. We have incorporated this data in the revised manuscript (line no. 301-304) with Figure 4-figure supplement 3.
For probing the localization of Dip2 upon Plc1 activation, we used cell wall stress, a condition inducing Plc1 activation for selective DAG production (this study). Under this condition, we probed the localization of Dip2 by fluorescence microscopy and found that Dip2 does not move to the plasma membrane but remains localized to mitochondria (Figure 1-figure supplement 3). This result has been added in the revised manuscript (line no. 153-160) with Figure 1-figure supplement 3.
This raises intriguing questions regarding the spatial regulation of Pkc1 by Dip2. Since Dip2’s localization remains unaffected, whether the selective DAGs, presumably at the mitochondria, move to the plasma membrane for Pkc1 activation or Pkc1 translocates to the mitochondria needs further exploration. Addressing these possibilities will require a combination of genetic approaches, organellar lipidomics, and advanced microscopy, which we aim to explore in future studies.
References:
Andrews, P. D., and M. J. Stark. 2000. “Dynamic, Rho1p-Dependent Localization of Pkc1p to Sites of Polarized Growth.” Journal of Cell Science 113 (Pt 15): 2685–93. doi:10.1242/jcs.113.15.2685.
Jun, Youngsoo, Rutilio A. Fratti, and William Wickner. 2004. “Diacylglycerol and Its Formation by Phospholipase C Regulate Rab- and SNARE-Dependent Yeast Vacuole Fusion.” Journal of Biological Chemistry 279(51): 53186–95. doi:10.1074/jbc.M411363200.
Li, Dan, Shu-Gao Yang, Cheng-Wen He, Zheng-Tan Zhang, Yongheng Liang, Hui Li, Jing Zhu, et al. 2020. “Excess Diacylglycerol at the Endoplasmic Reticulum Disrupts Endomembrane Homeostasis and Autophagy.” BMC Biology 18(1): 107. doi:10.1186/s12915-020-00837-w.
Mondal, Sudipta, Priyadarshan Kinatukara, Shubham Singh, Sakshi Shambhavi, Gajanan S Patil, Noopur Dubey, Salam Herojeet Singh, et al. 2022. “DIP2 Is a Unique Regulator of Diacylglycerol Lipid Homeostasis in Eukaryotes.” eLife 11: e77665. doi:10.7554/eLife.77665.
-
-
-
eLife Assessment
This study reveals a neural signature of a common behavioural phenomenon: serial dependence, whereby estimates of a visual feature (here motion direction) are attracted towards the recent history of encoded and reported stimuli. The study provides solid evidence that this phenomenon arises primarily during working memory maintenance. The pervasiveness of serial dependencies across modalities and species makes these findings important for researchers interested in perceptual decision-making across subfields.
-
Reviewer #1 (Public review):
This study uses MEG to test for a neural signature of the trial history effect known as 'serial dependence.' This is a behavioral phenomenon whereby stimuli are judged to be more similar than they really are, in feature space, to stimuli that were relevant in the recent past (i.e., the preceding trials). This attractive bias is prevalent across stimulus classes and modalities, but a neural source has been elusive. This topic has generated great interest in recent years, and I believe this study makes a unique contribution to the field.
Specifically, while previous neuroimaging studies have found apparent reactivations of previous information, or repulsive biases that may indirectly relate to serial dependence, here Fischer et al. find an attractive bias in neural activity patterns that aligns with the direction of the behavioral effect. Moreover, the data show that the bias emerges later in a trial, after perceptual encoding, which speaks to an ongoing debate about whether such biases are perceptual or decisional.
The revised preprint thoroughly addresses many of the initial concerns, but the results are still open to interpretation. For instance, the model training/testing regime allows that some training data timepoints may be inherently noisier than others (e.g., delay period more so than encoding), and potentially more (or differently) susceptible to bias. The S1 and S2 epochs show no attractive bias, but they may also be based on higher-fidelity training sets (i.e., encoding), and therefore less susceptible to the bias that is evident in the retrocue epoch. So, the results could reflect that serial dependence is indeed a post-perceptual process, or it may instead be that the WM representations, as detected with these MEG analyses, become noisier and more prone to reveal the attractive bias over time.
The results are intriguing, but the study was not powered to examine whether there is any feature-specificity to the neural bias (e.g., whether it matches the behavioral pattern that biases are amplified within a particular range of feature distances between stimuli). Nor do the analyses get at temporally precise information about when attractive and repulsive biases appear, which would help to better reconcile the work with previous findings. That is, the reconstructions average across coarse trial epochs. The S1 and S2 reconstructions show no attractive bias, and appear to show subtle repulsion, but if the timing were examined more precisely, we might see repulsion magnified at earlier timepoints that shifts toward attraction at later timepoints, thereby counteracting the effect. That is to say that the averaging approach, across feature values and timepoints, still leaves these important theoretical questions unresolved.
Nonetheless, the work marks an important step in identifying the neurophysiological bases of serial dependence. Ideally, all of the data, including the eye-tracking, would be made available so that others might try to address some of these follow-up questions.
-
Reviewer #2 (Public review):
Summary:
The study aims to probe the neural correlates of visual serial dependence - the phenomenon that estimates of a visual feature (here motion direction) are attracted towards the recent history of encoded and reported stimuli. The authors utilize an established retro-cue working memory task together with magnetoencephalography, which makes it possible to probe neural representations of motion direction during the encoding and retrieval (retro-cue) periods of each trial. The main finding is that neural representations of motion direction are not systematically biased during the encoding of motion stimuli, but are attracted towards the motion direction of the previous trial's target during the retrieval (retro-cue) period, just prior to the behavioral response. By demonstrating a neural signature of attractive biases in working memory representations, which align with attractive behavioral biases, this study highlights the importance of post-encoding memory processes in visual serial dependence.
Strengths:
The main strength of the study is its elegant use of a retro-cue working memory task together with high temporal resolution MEG, making it possible to probe neural representations related to stimulus encoding and working memory. The behavioral task elicits robust behavioral serial dependence and replicates previous behavioral findings by the same research group. The careful neural decoding analysis benefits from a large number of trials per participant, considering the slow-paced nature of the working memory paradigm. This is crucial in a paradigm with considerable trial-by-trial behavioral variability (serial dependence biases are typically small, relative to the overall variability in response errors). While the current study is broadly consistent with previous studies showing that attractive biases in neural responses are absent during stimulus encoding (previous studies reported repulsive biases), to my knowledge, it is the first study showing attractive biases in current stimulus representations during working memory. The study also connects to previous literature showing reactivations of previous stimulus representations, although the link between reactivations and biases remains somewhat vague in the current manuscript. Together, the study reveals an interesting avenue for future studies investigating the neural basis of visual serial dependence.
Weaknesses:
The main weakness of the current manuscript is that the authors could have done more analyses to address the concern that their neural decoding results are driven by signals related to eye movements. The authors show that participants' gaze position systematically depended on the current stimuli's motion directions, which, together with previous studies on eye movement-related confounds in neural decoding, justifies such a concern. The authors seek to rule out this confound by showing that the consistency of stimulus-dependent gaze position does not correlate with (a) the neural reconstruction fidelity and (b) the attractive shift in reconstructed motion direction. However, the authors' approach of quantifying stimulus-dependent eye movements only considers gaze angle and not gaze amplitude, and thus potentially misses important features of eye movements that could manifest in the MEG data. Moreover, it is unclear whether the gaze consistency metric should correlate with attractive history biases in neural decoding, if there were a confound. These two concerns could be potentially addressed by (1) directly decoding stimulus motion direction from x-y gaze coordinates and relating this decoding performance to neural reconstruction fidelity, and (2) investigating whether gaze coordinates themselves are history-dependent and are attracted to the average gaze position associated with the previous trials' target stimulus. If the authors could show that (2) is not the case, I would be much more convinced that their main finding is not driven by eye movement confounds.
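Suggestion (1) above can be illustrated with a minimal sketch. It assumes, hypothetically, that each trial is summarized by one mean x-y gaze position and one motion direction in radians; the function name, the linear sine/cosine read-out, and the simulated data are illustrative assumptions and do not reproduce the authors' analysis pipeline. The score is the mean cosine of the cross-validated angular error, so values near 1 indicate that gaze carries direction information and values near 0 indicate chance.

```python
import numpy as np


def decode_direction_from_gaze(gaze_xy, direction_rad, n_folds=5, seed=0):
    """Cross-validated decoding of a circular variable from x-y gaze.

    A least-squares map from [x, y, 1] to [cos(theta), sin(theta)] is fit on
    training trials; the mean cosine of the angular error on held-out trials
    is returned (1 = perfect decoding, about 0 = chance).
    """
    gaze_xy = np.asarray(gaze_xy, dtype=float)
    direction_rad = np.asarray(direction_rad, dtype=float)
    n_trials = gaze_xy.shape[0]
    design = np.column_stack([gaze_xy, np.ones(n_trials)])
    target = np.column_stack([np.cos(direction_rad), np.sin(direction_rad)])

    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_trials), n_folds)
    predicted = np.empty(n_trials)
    for test_idx in folds:
        train_mask = np.ones(n_trials, dtype=bool)
        train_mask[test_idx] = False
        weights, *_ = np.linalg.lstsq(
            design[train_mask], target[train_mask], rcond=None
        )
        pred = design[test_idx] @ weights
        predicted[test_idx] = np.arctan2(pred[:, 1], pred[:, 0])
    return float(np.mean(np.cos(predicted - direction_rad)))


# Simulated data only (not from the study): gaze that follows the stimulus
# direction versus gaze that is unrelated to it.
rng = np.random.default_rng(1)
theta = rng.uniform(0.0, 2.0 * np.pi, 400)
informative_gaze = 0.3 * np.column_stack([np.cos(theta), np.sin(theta)])
informative_gaze = informative_gaze + rng.normal(0.0, 0.1, (400, 2))
random_gaze = rng.normal(0.0, 0.1, (400, 2))

score_informative = decode_direction_from_gaze(informative_gaze, theta)
score_random = decode_direction_from_gaze(random_gaze, theta)
```

A per-participant score of this kind could then be related to neural reconstruction fidelity, as the suggestion describes; a linear read-out is only one possible choice and would miss non-linear gaze-direction relationships.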
The sample size (n = 10) is definitely at the lower end of sample sizes in this field. The authors collected two sessions per participant, which partly alleviates the concern. However, given that serial dependencies can be very variable across participants, I believe that future studies should aim for larger sample sizes.
It would have been great to see an analysis in source space. As the authors mention in their introduction, different brain areas, such as PPC, mPFC and dlPFC, have been implicated in serial biases. This raises the question of which brain areas contribute to the serial dependencies observed in the current study. For instance, it would be interesting to see whether attractive shifts in current representations and pre-stimulus reactivations of previous stimuli are evident in the same or different brain areas.
-
Author response:
The following is the authors’ response to the original reviews
Public Reviews:
Reviewer #1 (Public Review):
This study uses MEG to test for a neural signature of the trial history effect known as 'serial dependence.' This is a behavioral phenomenon whereby stimuli are judged to be more similar than they really are, in feature space, to stimuli that were relevant in the recent past (i.e., the preceding trials). This attractive bias is prevalent across stimulus classes and modalities, but a neural source has been elusive. This topic has generated great interest in recent years, and I believe this study makes a unique contribution to the field. The paper is overall clear and compelling, and makes effective use of data visualizations to illustrate the findings. Below, I list several points where I believe further detail would be important to interpreting the results. I also make suggestions for additional analyses that I believe would enrich understanding but are inessential to the main conclusions.
(1) In the introduction, I think the study motivation could be strengthened, to clarify the importance of identifying a neural signature here. It is clear that previous studies have focused mainly on behavior, and that the handful of neuroscience investigations have found only indirect signatures. But what would the type of signature being sought here tell us? How would it advance understanding of the underlying processes, the function of serial dependence, or the theoretical debates around the phenomenon?
Thank you for pointing this out. Our MEG study was designed to address two questions: 1) we asked whether we could observe a direct neural signature of serial dependence, and 2) if so, whether this signature occurs at the encoding or post-encoding stage of stimulus processing in working memory. This second question directly concerns the current theoretical debate on serial dependence.
Previous studies have found only indirect signatures of serial dependence such as reactivations of information from the previous trial or signatures of a repulsive bias, which were in contrast to the attractive bias in behavior. Thus, it remained unclear whether an attractive neural bias can be observed as a direct reflection of the behavioral bias. Moreover, previous studies observed the neuronal repulsion during early visual processes, leading to the proposal that neural signals become attracted only during later, post-encoding processes. However, these later processing stages were not directly accessible in previous studies. To address these two questions, we combined MEG recordings with an experimental paradigm with two items and a retro-cue. This design allowed us to record neural signals during separable encoding and post-encoding task phases and thus to pinpoint the task phase at which a direct neural signature of serial dependence occurred that mirrored the behavioral effect.
We have slightly modified the Introduction to strengthen the study motivation.
(1a) As one specific point of clarification, on p. 5, lines 91-92, a previous study (St. John-Saaltink et al.) is described as part of the current study motivation, stating that "as the current and previous orientations were either identical or orthogonal to each other, it remained unclear whether this neural bias reflected an attraction or repulsion in relation to the past." I think this statement could be more explicit as to why/how these previous findings are ambiguous. The St. John-Saaltink study stands as one of very few that may be considered to show evidence of an early attractive effect in neural activity, so it would help to clarify what sort of advance the current study represents beyond that.
Thank you for this comment. In the study by St. John-Saaltink et al. (2016), two gratings oriented at 45° and 135° were always presented to either the left or right side of a central fixation point in a trial (90° orientation difference). As only the left/right position of the 45° and 135° gratings varied across trials, the target stimulus in the current trial was either the same or differed by exactly 90° from the previous trial. In consequence, this study could not distinguish whether the observed bias was attractive or repulsive, which concerned both the behavioral effect and the V1 signal. Furthermore, the bias in the V1 signal was partially explained by the orientation that was presented at the same position in the previous trial, which could reflect a reactivation of the previous orientation rather than an actual altered orientation.
We have changed the Introduction accordingly.
References:
St. John-Saaltink E, Kok P, Lau HC, de Lange FP (2016) Serial Dependence in Perceptual Decisions Is Reflected in Activity Patterns in Primary Visual Cortex. Journal of Neuroscience 36: 6186–6192.
(1b) The study motivation might also consider the findings of Ranieri et al. (2022, J. Neurosci), Fornaciai, Togoli, & Bueti (2023, J. Neurosci), and Lou & Collins (2023, J. Neurosci), who all test various neural signatures of serial dependence.
Thank you. As all listed findings showed neural signatures revealing a reactivation of the previous stimulus or a response during the current trial, we have added them to the paragraph in the Introduction referring to this class of evidence for the neural basis for serial dependence.
(2) Regarding the methods and results, it would help if the initial description of the reconstruction approach, in the main text, gave more context about what data is going into reconstruction (e.g., which sensors), a more conceptual overview of what the 'reconstruction' entails, and what the fidelity metric indexes. To me, all of that is important to interpreting the figures and results. For instance, when I first read, it was unclear to me what it meant to "reconstruct the direction of S1 during the S2 epoch" (p. 10, line 199)? As in, I couldn't tell how the data/model knows which item it is reconstructing, as opposed to just reporting whatever directional information is present in the signal.
(2a) Relatedly, what does "reconstruction strength" reflect in Figure 2a? Is this different than the fidelity metric? Does fidelity reflect the strength of the particular relevant direction, or does it just mean that there is a high level of any direction information in the signal? Please explain in the main text what reconstruction strength and fidelity are.
Thank you for pointing this out. We applied the inverted encoding model method to MEG data from all active sensors (271) within defined time-windows of 100 ms length. MEG data were recorded in two sessions on different days. Specifically, we constructed an encoding model with 18 motion direction-selective channels. Each channel was designed to show peak sensitivity to a specific motion direction, with gradually decreasing sensitivity to less similar directions. In a training step, the encoding model was fitted to the MEG data of one session to obtain a weight matrix that indicates how well the sensor activity can be explained by the modeled direction. In the testing step, the weight matrix was inverted and applied to the MEG data of the other session, resulting in a response profile of ‘reconstruction strengths’, i.e., how strongly each motion direction was present in a trial. When a specific motion direction was present in the MEG signal, the reconstruction strengths peaked at that specific direction and decreased with increasing direction difference. If no information was present, reconstruction strengths were comparable across all modeled directions, i.e., the response profile was flat. To integrate response profiles across trials, single trial profiles were aligned to a common center direction (i.e., 180°) and then averaged.
To quantify the accuracy of each IEM reconstruction, i.e., how well the response profile represents a specific motion direction relative to all other directions, we computed the ‘reconstruction fidelity’. Fidelity was obtained by projecting the polar vector of the reconstruction at every direction angle (in steps of 1°) onto the common center (180°) and averaging across all direction angles (Rademaker et al., 2019; Sprague, Ester & Serences, 2016). As such, ‘reconstruction fidelity’ is a summary metric, with fidelity greater than zero indicating an accurate reconstruction.
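The training, inversion, recentering and fidelity steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the tuning shape of the channels, the least-squares (pseudo-inverse) estimation, the restriction of stimulus directions to the 18 channel centers, and all function names are assumptions made for the example. Fidelity is evaluated here at the channel centers rather than in 1° steps.

```python
import numpy as np

N_CHANNELS = 18                                        # direction-selective channels
CENTERS = np.arange(N_CHANNELS) * 360.0 / N_CHANNELS   # 0, 20, ..., 340 degrees
CENTER_IDX = N_CHANNELS // 2                           # channel at 180 degrees


def channel_responses(directions_deg):
    """Idealized channel outputs (trials x channels) for each trial's direction.

    Assumed tuning shape: |cos(delta / 2)| ** (N_CHANNELS - 1), which peaks at
    the channel center and decreases with angular distance.
    """
    delta = np.deg2rad(np.asarray(directions_deg)[:, None] - CENTERS[None, :])
    return np.abs(np.cos(delta / 2.0)) ** (N_CHANNELS - 1)


def train_weights(meg_train, directions_train):
    """Training step: least-squares weights (sensors x channels) from one session."""
    design = channel_responses(directions_train)
    return (np.linalg.pinv(design) @ meg_train).T


def reconstruct(meg_test, weights):
    """Testing step: invert the weights to obtain one response profile per trial."""
    return meg_test @ np.linalg.pinv(weights.T)


def align_to_center(profiles, directions_deg):
    """Circularly shift each profile so that its labelled direction sits at 180 degrees.

    Assumes every label lies exactly on one of the channel centers.
    """
    step = 360.0 / N_CHANNELS
    idx = np.rint(np.asarray(directions_deg) / step).astype(int)
    return np.stack([np.roll(p, CENTER_IDX - k) for p, k in zip(profiles, idx)])


def fidelity(mean_profile):
    """Mean projection of the profile onto the 180-degree direction.

    A flat profile gives zero; a profile peaking at 180 degrees gives a positive value.
    """
    return float(np.mean(mean_profile * np.cos(np.deg2rad(CENTERS - 180.0))))
```

Which item is reconstructed is determined entirely by the labels passed to `train_weights` and `align_to_center`: passing the S1 directions together with S2-epoch data corresponds to the "S1 during the S2 epoch" case.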
How does the model know which direction to reconstruct? Our modelling procedure was informed about the stimulus in question during both the training and the testing step. Specifically, we informed our model during the training step about, e.g., the current S2. Then, we fit the model to training data from the S2 epoch and applied it to testing data from the S2 epoch. Crucially, during the testing step the motion direction in question, i.e., current S2, becomes relevant again. For example, when S2 was 120°, the reconstructions were shifted by 60° in order to align with the common center, i.e., 180°. In addition, we also tested whether we could reconstruct the motion direction of S1 during the S2 epoch. Here, we again used the MEG data from the S2 epoch, but now for S1 training, i.e., the model was informed about the S1 direction. Accordingly, the recentering step during testing was done with regard to the S1 direction. Similarly, we also reconstructed the motion direction of the previous target (i.e., the previous S1 or S2), e.g., during the S2 epoch.
Together, the multi-variate pattern of MEG activity across all sensors during the S2 epoch could contain information about the currently presented direction of S2, the direction of the preceding S1 and the direction of the target stimulus from the previous trial (i.e., either previous S1 or previous S2) at the same time. An important exception from this regime was the cross-reconstruction analysis (Appendix 1—figure 2). Here we trained the encoding model on the currently relevant item (S1 during the S1 epoch, S2 during the S2 epoch and the cued item during the retro-cue epoch) of one MEG session and reconstructed the previous target on the other MEG session.
Finally, to examine shifts of the neural representation, single-trial reconstructions were assigned to two groups, those with a previous target that was oriented clockwise (CW) in relation to the currently relevant item and those with a previous target that was oriented counter-clockwise (CCW). The CCW reconstructions were flipped along the direction space, hence, a negative deviation of the maximum of the reconstruction from 180° indicated an attraction toward the previous target, whereas a positive deviation indicated a repulsion. Those reconstructions were then first averaged within each possible motion direction and then across them to account for different presentation numbers of the directions, resulting in one reconstruction per participant, epoch and time point. To examine systematic shifts, we then tested if the maximum of the reconstruction was systematically different from the common center (180°). For display purposes, we subtracted the reconstructed maximum from 180° to compute the direction shifts. A positive shift thus reflected attraction and a negative shift reflected repulsion.
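The flipping, two-stage averaging and shift computation described in this paragraph can be sketched as below. It is an illustration only: the 1° grid, the choice of which sign of the previous-minus-current difference is flipped, and the function names are assumptions. Profiles are taken to be already aligned so that the currently relevant direction sits at 180°.

```python
import numpy as np

ANGLES = np.arange(360)  # 1-degree grid; the index equals the angle in degrees


def wrap180(deg):
    """Wrap angular differences into the range [-180, 180)."""
    return (np.asarray(deg, dtype=float) + 180.0) % 360.0 - 180.0


def neural_shift(profiles, current_deg, previous_deg):
    """Direction shift of the averaged reconstruction; positive means attraction.

    profiles     : trials x 360 array, each aligned to 180 degrees
    current_deg  : direction of the currently relevant item per trial
    previous_deg : direction of the previous target per trial
    """
    profiles = np.asarray(profiles, dtype=float)
    current_deg = np.asarray(current_deg)
    rel = wrap180(np.asarray(previous_deg) - current_deg)

    # Assumed convention: trials with a positive relative direction are mirrored
    # around 180 degrees, so the previous target always lies on the lower-angle side.
    mirrored = profiles[:, (-ANGLES) % 360]
    unified = np.where((rel > 0)[:, None], mirrored, profiles)

    # Average within each presented direction first, then across directions,
    # so that unequal presentation counts do not weight the result.
    per_direction = [unified[current_deg == d].mean(axis=0)
                     for d in np.unique(current_deg)]
    mean_profile = np.mean(per_direction, axis=0)

    peak = ANGLES[np.argmax(mean_profile)]
    return 180.0 - float(peak)
```

With this sign convention, a peak below 180° yields a positive return value (attraction) and a peak above 180° yields a negative one (repulsion), matching the display convention stated in the response.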
We have updated the Results accordingly.
References:
Rademaker RL, Chunharas C, Serences JT (2019) Coexisting representations of sensory and mnemonic information in human visual cortex. Nature Neuroscience. 22: 1336-1344.
Sprague TC, Ester EF, Serences JT (2016) Restoring Latent Visual Working Memory Representations in Human Cortex. Neuron. 91: 694-707
(3) Then in the Methods, it would help to provide further detail still about the IEM training/testing procedure. For instance, it's not entirely clear to me whether all the analyses use the same model (i.e., all trained on stimulus encoding) or whether each epoch and timepoint is trained on the corresponding epoch and timepoint from the other session. This speaks to whether the reconstructions reflect a shared stimulus code across different conditions vs. that stimulus information about various previous and current trial items can be extracted if the model is tailored accordingly.
As reported above, our modeling procedure was informed about the same stimulus during both the training and the testing step, except for the cross-reconstruction analysis.
Regarding the training and testing data, the model was always trained on data from one session and tested on data from the other session, so that each MEG session once served as the training data set and once as the test data set, hence, training and test data were independent. Importantly, training and testing was always performed in an epoch- and time point-specific way: For example, the model that was trained on the first 100-ms time bin from the S1 epoch of the first MEG session was tested on the first 100-ms time bin from the S1 epoch of the second MEG session.
Specifically, when you say "aim of the reconstruction" (p. 31, line 699), does that simply mean the reconstruction was centered in that direction (that the same data would go into reconstructing S1 or S2 in a given epoch, and what would differentiate between them is whether the reconstruction was centered to the S1 or S2 direction value)?
As reported above, during testing the reconstruction was centered at the currently relevant direction. The encoding model was trained with the direction labels of S1, S2 or the target item, corresponding to the currently relevant direction, i.e., S1 in S1 epochs, S2 in S2 epochs and target item (S1 or S2) in the retro-cue epoch. The only exception was the reconstruction of S1 during the S2 epoch. Here the encoding model was trained on the S1 direction, but with data from the S2 epoch and then applied to the S2 epoch data and recentered to the S1 direction. So here, S1 and S2 were indeed trained and tested separately for the same epoch.
(4) I think training and testing were done separately for each epoch and timepoint, but this could have important implications for interpreting the results. Namely if the models are trained and tested on different time points, and reference directions, then some will be inherently noisier than others (e.g., delay period more so than encoding), and potentially more (or differently) susceptible to bias. For instance, the S1 and S2 epochs show no attractive bias, but they may also be based on more high-fidelity training sets (i.e., encoding), and therefore less susceptible to the bias that is evident in the retrocue epoch.
Thanks for pointing this out. Training and testing were performed in an epoch- and time point-specific way. Thus, potential differences in the signal-to-noise ratio between different task phases could cause quality differences between the corresponding reconstructed MEG signals. However, we did not observe such differences. Instead, we found comparable time courses of the reconstruction fidelities and the averaged reconstruction strengths between epochs (Figure 2b and 2c, respectively). Fig. 2b, e.g., shows that reconstruction fidelity for motion direction stimuli built up slowly during the stimulus presentation, reaching its maximum only after stimulus offset. This observation may contrast with other stimulus materials that show faster build-ups, like the orientation of a Gabor.
We agree with the reviewer that, regardless of the comparable but not perfectly equal reconstruction fidelities, there are good arguments to assume that the neural representation of the stimulus during its encoding is typically less noisy than during its post-encoding processing and that this difference could be one of the reasons why serial dependence emerged in our study only during the retro-cue epoch. However, the argument could also be reversed: a biased representation, which represents a small and hard-to-detect neural effect, might be easier to observe for less noisy data. So, the fact that we found a significant bias only during the potentially “noisier” retro-cue epoch makes the effect even more noteworthy.
We mentioned the limitation related to our stimulus material already at the end of the Discussion. We have now added a new paragraph to the Discussion to address the two opposing lines of reasoning.
(4) I believe the work would benefit from a further effort to reconcile these results with previous findings (i.e., those that showed repulsion, like Sheehan & Serences), potentially through additional analyses. The discussion attributes the difference in findings to the "combination of a retro-cue paradigm with the high temporal resolution of MEG," but it's unclear how that explains why various others observed repulsion (thought to happen quite early) that is not seen at any stage here. In my view, the temporal (as well as spatial) resolution of MEG could be further exploited here to better capture the early vs. late stages of processing. For instance, by separately examining earlier vs. later time points (instead of averaging across all of them), or by identifying and analyzing data in the sensors that might capture early vs. late stages of processing. Indeed, the S1 and S2 reconstructions show subtle repulsion, which might be magnified at earlier time points but then shift (toward attraction) at later time points, thereby counteracting any effect. Likewise, the S1 reconstruction becomes biased during the S2 epoch, consistent with previous observations that the SD effects grow across a WM delay. Maybe both S1 and S2 would show an attractive bias emerging during the later (delay) portion of their corresponding epoch? As is, the data nicely show that an attractive bias can be detected in the retrocue period activity, but they could still yield further specificity about when and where that bias emerges.
We are grateful for this suggestion. Before going into detail, we would like to explain our motivation for choosing the present analysis approach that included averaging time points within an epoch of interest.
Our aim was to detect a neuronal signature of serial dependence, which manifests as an attractive shift of about 3.5° within the 360° direction space. To be able to detect such a small effect in the neural data, given the limited resolution of the reconstruction method and the noisy MEG signals, we needed to maximize the signal-to-noise ratio. A common way to achieve this is to average data points. In our study, we asked subjects to perform 1022 trials, down-sampled the MEG data from the recorded sampling rate of 1200 Hz to 10 Hz (one data point per 100 ms) for the estimation of reconstruction fidelity, and calculated the final neural shift estimates by averaging across time points that showed a robust reconstruction fidelity and thus represented interpretable data points.
Our procedure to maximize the signal-to-noise ratio was successful as we were able to reliably reconstruct the presented and remembered motion direction in all epochs (Figure 1a and 1b in the manuscript). However, the reconstruction did not work equally well for all time points within each epoch. In particular, there were time points with a non-significant reconstruction fidelity. In consequence, for the much smaller neural shift effect we did not expect to observe reliable time-resolved results, i.e., when considering each time point separately. Instead, we used the reconstruction results to define the time window in order to calculate the neural shift, i.e., we averaged across all time points with a significant reconstruction fidelity.
Author response image 1 depicts the neural shift separately for each time point during the retro-cue epoch. Importantly, the gray parts of the time courses indicate time points where the reconstruction of the presented or cued stimulus was not significant. This means that the reconstructed maxima at those time points were very variable/unreliable and therefore the neural shifts were hardly interpretable.
Author response image 1.
Time courses of the reconstruction shift reveal a tendency for an attractive bias during the retrocue phase. Time courses of the neural shift separately for each time point during the S1 (left panel), S2 (middle panel) and retro-cue epochs (right panel). Gray lines indicate time points with non-significant reconstruction fidelities and therefore very variable and non-interpretable neural reconstruction shifts. The colored parts of the lines correspond to the time periods of significant reconstruction fidelities with interpretable reconstruction shifts. Error bars indicate the middle 95% of the resampling distribution. Time points with less than 5% (equaling p < .05) of the resampling distribution below 0° are indicated by a colored circle. N = 10.
First, the time courses in Author response image 1 show that the neural bias varied considerably between subjects at given time points, as revealed by the resampling distributions. In this resampling procedure, we drew 10 participants with replacement in each of 10,000 iterations and calculated the reconstruction shift based on the mean reconstruction of the resampled participants. The observed variability stresses the necessity to average the values across all time points that showed a significant reconstruction fidelity to increase the signal-to-noise ratio.
Second, despite this high variability/low signal-to-noise ratio, Author response image 1 (right panel) shows that our choice of this procedure was sensible, as it revealed a clear tendency of an attractive shift at almost all time points from 300 to 1500 ms after retro-cue onset, with only a few individual time points showing a significant effect (uncorrected for multiple comparisons). It is worth mentioning that this time course did not overlap with the time course of previous target cross-reconstruction (Appendix 1—figure 2, right panel), as there was no significant target cross-reconstruction during the retro-cue epoch, with an almost flat profile around zero. Also, there was no overlap with previous target decoding in the retro-cue epoch (Figure 5 in the manuscript). Here, the previous target was reactivated significantly only at early time points of 200 and 300 ms post cue onset (i.e., at time points with a non-significant reconstruction fidelity and therefore no interpretable neural shift), while the nominally highest values of the attractive neural shift were visible at later time points that also showed a significant reconstruction fidelity (Figure 2b in the manuscript).
Third, Author response image 1 (left and middle panel) shows the time courses of the neural shift during the S1 and S2 epochs. While no neural shift could be observed for S1, during the S2 epoch the time-resolved analysis indicated an initial attractive shift followed by a (nonsignificant) tendency for a repulsive shift. After averaging neural shifts across time points with a significant reconstruction fidelity, there was no significant effect with an overall tendency for repulsion, as reported in the paper. The attractive part of the neural shift during the S2 epoch was nominally strongest at very early time points (at 100-300 ms after S2 onset) and overlapped perfectly with the reactivation of the previous target as shown by the cross-reconstruction analysis (Appendix 1—figure 2, middle panel). This overlap suggests that the neural attractive shift did not reflect an actual bias of the early S2 representation, but rather a consequence of the concurrent reactivation of the previous target in the same neural code as the current representation. Finally, this neural attractive shift during S2 presentation did not correlate with the behavioral error (single trial-wise correlation: no significant time points during S2 epoch) or the behavioral bias (subject-wise correlation). In contrast, for the retro-cue epoch, we observed a significant correlation between the neural attractive shift and behavior.
Together, the time-resolved results show a clear tendency for an attractive neural bias during the retro-cue phase, thus supporting our interpretation that the attractive shift during the retro-cue phase reflects a direct neuronal signature of serial dependence. However, these additional analyses also demonstrated a large variability between participants and across time points, warranting a cautious interpretation. We conclude that our initial approach of averaging across time points was an appropriate way of reducing the high level of noise in the data and revealed the reported significant and robust attractive neural shift in the retrocue phase.
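The participant-level resampling used for the error bars and significance markers in Author response image 1 can be sketched as follows, for a single time point. This is a minimal sketch under assumptions: per-participant reconstructions are taken to be on a 1° grid aligned to 180°, the shift is read off as 180° minus the peak angle, and the function name and return values are hypothetical.

```python
import numpy as np


def bootstrap_shift(participant_profiles, n_iter=10_000, seed=0):
    """Resample participants with replacement and recompute the group shift.

    participant_profiles : participants x 360 array of aligned reconstructions
    Returns the resampled shifts, the middle 95% interval, and the proportion
    of the resampling distribution that lies below 0 degrees.
    """
    rng = np.random.default_rng(seed)
    profiles = np.asarray(participant_profiles, dtype=float)
    n = profiles.shape[0]

    shifts = np.empty(n_iter)
    for i in range(n_iter):
        sample = rng.integers(0, n, size=n)          # draw n participants with replacement
        mean_profile = profiles[sample].mean(axis=0)
        shifts[i] = 180.0 - np.argmax(mean_profile)  # positive value = attraction

    lo, hi = np.percentile(shifts, [2.5, 97.5])
    p_below_zero = float(np.mean(shifts < 0))
    return shifts, (float(lo), float(hi)), p_below_zero
```

A time point would be marked as showing an attractive shift when `p_below_zero` is smaller than .05; as noted in the response, such per-time-point markers are uncorrected for multiple comparisons.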
(5) A few other potentially interesting (but inessential) considerations: A benchmark property of serial dependence is its feature-specificity, in that the attractive bias occurs only between current and previous stimuli that are within a certain range of similarity to each other in feature space. I would be very curious to see if the neural reconstructions manifest this principle - for instance, if one were to plot the trialwise reconstruction deviation from 0, across the full space of current-previous trial distances, as in the behavioral data. Likewise, something that is not captured by the DoG fitting approach, but which this dataset may be in a position to inform, is the commonly observed (but little understood) repulsive effect that appears when current and previous stimuli are quite distinct from each other. As in, Figure 1b shows an attractive bias for direction differences around 30 degrees, but a repulsive one for differences around 170 degrees - is there a corresponding neural signature for this component of the behavior?
We appreciate the reviewer's idea to split the data. However, given that our results strongly relied on the inclusion of all data points, i.e., including all distances in motion direction between the current S1, S2 or target and the previous target and requiring data averaging, we are concerned that our study was vastly underpowered to be able to inform whether the attractive bias occurs only within a certain range of inter-stimulus similarity. To address this important question, future studies would require neural measurements with much higher signal-to-noise-ratio than the present MEG recordings with two sessions per participant and 1022 trials in total.
Reviewer #2 (Public Review):
Summary:
The study aims to probe the neural correlates of visual serial dependence - the phenomenon that estimates of a visual feature (here motion direction) are attracted towards the recent history of encoded and reported stimuli. The authors utilize an established retro-cue working memory task together with magnetoencephalography, which allows them to probe neural representations of motion direction during encoding and retrieval (retro-cue) periods of each trial. The main finding is that neural representations of motion direction are not systematically biased during the encoding of motion stimuli, but are attracted towards the motion direction of the previous trial's target during the retrieval (retro-cue) period, just prior to the behavioral response. By demonstrating a neural signature of attractive biases in working memory representations, which align with attractive behavioral biases, this study highlights the importance of post-encoding memory processes in visual serial dependence.
Strengths:
The main strength of the study is its elegant use of a retro-cue working memory task together with high temporal resolution MEG, which makes it possible to probe neural representations related to stimulus encoding and working memory. The behavioral task elicits robust behavioral serial dependence and replicates previous behavioral findings by the same research group. The careful neural decoding analysis benefits from a large number of trials per participant, considering the slow-paced nature of the working memory paradigm. This is crucial in a paradigm with considerable trial-by-trial behavioral variability (serial dependence biases are typically small, relative to the overall variability in response errors). While the current study is broadly consistent with previous studies showing that attractive biases in neural responses are absent during stimulus encoding (previous studies reported repulsive biases), to my knowledge it is the first study showing attractive biases in current stimulus representations during working memory. The study also connects to previous literature showing reactivations of previous stimulus representations, although the link between reactivations and biases remains somewhat vague in the current manuscript. Together, the study reveals an interesting avenue for future studies investigating the neural basis of visual serial dependence.
Weaknesses:
(1) The main weakness of the current manuscript is that the authors could have done more analyses to address the concern that their neural decoding results are driven by signals related to eye movements. The authors show that participants' gaze position systematically depended on the current stimuli's motion directions, which together with previous studies on eye movement-related confounds in neural decoding justifies such a concern. The authors seek to rule out this confound by showing that the consistency of stimulus-dependent gaze position does not correlate with (a) the neural reconstruction fidelity and (b) the repulsive shift in reconstructed motion direction. However, neither of these controls directly addresses the concern. If I understand correctly, the metric quantifying the consistency of stimulus-dependent gaze position (Figure S3a) only considers gaze angle and not gaze amplitude. Furthermore, it does not consider gaze position as a function of continuous motion direction, but instead treats motion directions as categorical variables. Therefore, assuming an eye movement confound, it is unclear whether the gaze consistency metric should strongly correlate with neural reconstruction fidelity, or whether there are other features of eye movements (e.g., amplitude differences across participants, and tuning of gaze in the continuous space of motion directions) which would impact the relationship with neural decoding. Moreover, it is unclear whether the consistency metric, which does not consider history dependencies in eye movements, should correlate with attractive history biases in neural decoding.
It would be more straightforward if the authors would attempt to (a) directly decode stimulus motion direction from x-y gaze coordinates and relate this decoding performance to neural reconstruction fidelity, and (b) investigate whether gaze coordinates themselves are history-dependent and are attracted to the average gaze position associated with the previous trials' target stimulus. If the authors could show that (b) is not the case, I would be much more convinced that their main finding is not driven by eye movement confounds.
The reviewer is correct that our eye-movement analysis approach considered gaze angle (direction) and not gaze amplitude. We considered gaze direction to be the more important feature to control for when investigating the neural basis of serial dependence that manifests, given the stimulus material used in our study, as a shift/deviation of angle/direction of a representation towards the previous target motion direction. To directly relate gaze direction and MEG data to each other, we matched the temporal resolution of the eye tracking data to that of the MEG data. Specifically, our analysis procedure of gaze direction provided a measure indicating to what extent the variance of the gaze directions was reduced compared with random gaze direction patterns, in relation to the specific stimulus direction within each 100 ms time bin. Importantly, this procedure was able to reveal not only systematic gaze directions that were in accordance with the stimulus direction or the opposite direction, but also picked up all stimulus-related gaze directions, even if the relation differed across participants or time.
Our analysis approach was highly sensitive to detect stimulus-related gaze directions during all task phases (Appendix 1—figure 3). As expected, we found systematic gaze directions when S1 and S2 were presented on the screen, and they were reduced thereafter, indicating a clear relationship between stimulus presentation and eye movement. Systematic gaze directions were also present in the retro-cue phase where no motion direction was presented. Here they showed a clearly different temporal dynamic as compared to the S1 and S2 phases. They appeared at later time points and with a higher variability between participants, indicating that they coincided with retrieving the target motion direction from working memory.
To relate gaze directions to the MEG results, we calculated Spearman rank correlations. We found that there was no systematic relationship at any time point between the stimulus-related reconstruction fidelity and the amount of stimulus-related gaze direction. Moreover, the correlation varied strongly from time point to time point, revealing its random nature. In addition to the lack of significant correlations, we observed clearly distinct temporal profiles for gaze direction (Appendix 1—figure 3a and Appendix 1—figure 3b) and the reconstruction fidelities (Figure 2b in the manuscript, Appendix 1—figure 3c), in particular in the critical retro-cue phase.
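A per-time-bin Spearman correlation across participants, as used in this control analysis, could look like the sketch below. The array layout (participants x time bins), the function names, and the simple ranking that assumes no tied values are assumptions made for illustration; significance testing is omitted.

```python
import numpy as np


def ranks_no_ties(values):
    """Ranks 1..n of a 1-D array, assuming that no two values are equal."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    return ranks


def spearman_per_timebin(fidelity, gaze_consistency):
    """Spearman rho between two participant-level measures, one value per time bin.

    Both inputs are participants x time-bins arrays, e.g. reconstruction fidelity
    and the amount of stimulus-related gaze direction in each 100 ms bin.
    """
    fidelity = np.asarray(fidelity, dtype=float)
    gaze_consistency = np.asarray(gaze_consistency, dtype=float)
    rhos = np.empty(fidelity.shape[1])
    for t in range(fidelity.shape[1]):
        # Spearman's rho is the Pearson correlation of the ranks.
        rhos[t] = np.corrcoef(ranks_no_ties(fidelity[:, t]),
                              ranks_no_ties(gaze_consistency[:, t]))[0, 1]
    return rhos
```

A correlation that changes sign erratically from one bin to the next, as reported above, would appear as an unstructured `rhos` time course.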
We favored this analysis approach over one that directly decoded stimulus motion direction from x-y gaze coordinates, as we considered it hardly feasible to compute an inverted encoding model with only two eye-tracker channels as an input (in comparison to 271 MEG sensors), and to our knowledge, this has not been done before. Other decoding methods have previously been applied to x-y gaze coordinates. However, in contrast to the inverted encoding model, they did not provide a measure of the representation shift which would be crucial for our investigation of serial dependence.
We appreciate the suggestion to conduct additional analyses on eye tracking data (including different temporal and spatial resolution and different features) and their relation to MEG data. However, the first author, who ran all the analyses, has in the meantime left academia. Unfortunately, we currently do not have sufficient resources to perform additional analyses.
While the presented eye movement control analysis makes us confident that our MEG finding was not crucially driven by stimulus-related gaze directions, we agree with the reviewer that we cannot completely exclude that other eye movement-related features could have contributed to our MEG findings. However, we would like to stress that whatever the main source of the observed MEG effect was (shift of the neuronal stimulus representation, (other) features of gaze movement, or shift of the neuronal stimulus representation that leads to systematic gaze movement), our study still provided clear evidence that serial dependence emerged at a later post-encoding stage of object processing in working memory. This central finding of our study is hard to observe with behavioral measures alone and is not affected by the possible effects of eye movements.
We have slightly modified our conclusion in the Results and Appendix 1. Please see also our response to comment 1 from reviewer 3.
(2) I am not convinced by the across-participant correlation between attractive biases in neural representations and attractive behavioral biases in estimation reports. One would expect a correlation with the behavioral bias amplitude, which is not borne out. Instead, there is a correlation with behavioral bias width, but no explanation of how bias width should relate to the bias in neural representations. The authors could be more explicit in their arguments about how these metrics would be functionally related, and why there is no correlation with behavioral bias amplitude.
We are grateful for this suggestion. We correlated the individual neuronal shift with the two individual parameter fits of the behavior shift, i.e., amplitude (a) and tuning width (w). We found a significant correlation between the individual neural bias and the w parameter (r = .70, p = .0246) but not with the a parameter (r = -.35, p = .3258) during the retro-cue period (Appendix 1—figure 1). This indicates that a broader tuning width of the individual bias (as reflected by a smaller w parameter) was associated with a stronger individual neural attraction.
It is important to note that for the calculation of the neural shift, all trials entered the analysis to increase the signal-to-noise ratio, i.e., it included many trials where current and previous targets were separated by, e.g., 100° or more. These trials were unlikely to produce serial dependence. Subjects with a more broadly tuned serial dependence had more interitem differences that showed a behavioral attraction and therefore more trials affected by serial dependence that entered the calculation of the neural shift. In contrast, individual differences in the amplitude (a) parameter were most likely too small, and higher individual amplitude did not involve more trials as compared to smaller amplitude to affect the neural bias in a way to be observed in a significant correlation.
We have added this explanation to Appendix 1.
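To make the relation between the w parameter and tuning width concrete, here is a minimal sketch that assumes the first-derivative-of-Gaussian (DoG) curve commonly used to fit serial dependence. Whether the authors used exactly this parameterization is an assumption; the function names are hypothetical. In this form, a is the peak height of the bias and a smaller w moves the peak to larger direction differences, i.e., broader tuning.

```python
import math

# Scaling constant chosen so that the parameter `a` equals the peak height.
C = math.sqrt(2) / math.exp(-0.5)


def dog(x, a, w):
    """Bias (deg) as a function of the direction difference x (deg)
    between previous and current target, for amplitude a and width w."""
    return x * a * w * C * math.exp(-((w * x) ** 2))


def peak_location(w):
    """Direction difference (deg) at which the attraction is maximal."""
    return 1.0 / (math.sqrt(2) * w)
```

For example, under this assumed parameterization w = 0.05 places the peak at about 14°, whereas w = 0.02 places it at about 35°, so a participant with the smaller w shows attraction over a wider range of direction differences.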
(3) The sample size (n = 10) is definitely at the lower end of sample sizes in this field. The authors collected two sessions per participant, which partly alleviates the concern. However, given that serial dependencies can be very variable across participants, I believe that future studies should aim for larger sample sizes.
We want to express our appreciation for raising this issue. We apologize that we did not explicitly explain and justify the choice of sample size in our paper, particularly as we had in fact performed a formal a-priori power analysis.
At the time of the sample size calculation, there were no comparable EEG or MEG studies to inform our power calculation. Thus, we based our calculation merely on the behavioral effect reported in the literature and, in particular, observed in a behavioral study from our lab that included four different experiments with overall more than 100 participants with 1632 trials each (see Fischer et al., 2020), in which the behavioral serial dependence effect (target vs. nontarget) was very robust. Based on the contrast between target and non-target with an effect size of 1.359 in Experiment 1, a power analysis with 80% desired power led to a small, estimated sample size of 6 subjects.
However, we expected that the detection of the neural signature of this effect would require more participants. Therefore, we based our power calculation on a much smaller behavioral effect, i.e. the modulation of serial dependence by the context-feature congruency that we observed in our previous study (Fischer et al., 2020). In particular, we focused on Experiment 1 of the previous study that used color as the feature for retro-cueing, as we planned to use exactly the same paradigm for the MEG study. In contrast to the serial dependence effect, its modulation by color resulted in a more conservative power estimate: Based on an effect size of 0.856 in that experiment, a sample size of n = 10 should yield a power of 80% with two MEG sessions per subject.
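The dependence of the required sample size on the effect size can be sketched with a simple normal approximation. The test type, alpha level, and software used by the authors are not stated here, so the defaults below (one-sample or paired design, one-sided alpha = 0.05) are assumptions, and the function name is hypothetical. The approximation is a lower bound and is not expected to reproduce the reported sample sizes exactly; an exact calculation based on the noncentral t distribution yields somewhat larger values.

```python
import math
from statistics import NormalDist


def approx_sample_size(effect_size, power=0.80, alpha=0.05, one_sided=True):
    """Normal-approximation sample size for a one-sample or paired test.

    n = ((z_alpha + z_beta) / d)^2, rounded up. This ignores the extra
    uncertainty of estimating the variance, so it underestimates n when
    samples are small.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_beta) / effect_size) ** 2)
```

Calling `approx_sample_size(1.359)` and `approx_sample_size(0.856)` shows the qualitative point made in the response: the smaller modulation effect requires a clearly larger sample than the main serial dependence effect.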
At the time when we conducted our study, two other studies were published that investigated serial dependence on the neural level. Both studies included a smaller number of data points than our study: Sheehan & Serences (2022) recorded about 840 trials in each of 6 participants, resulting in fewer data points both on the participant and on the trial level. Hajonides et al. (2023) measured 20 participants with 400 trials each, again resulting in fewer datapoints than our study (10 participants with 1022 trials each). Taken together, our a-priori sample size estimation resulted in comparable if not higher power as compared to other similar studies, making us feel confident that the estimated sample was sufficient to yield reliable results.
We have now included this description and the results of this power analysis in the Materials and Methods section.
Despite this, we fully agree with the reviewer that our study would profit from higher power. With the knowledge of the results from this study, future projects should attempt to substantially increase the signal-to-noise ratio, in particular by increasing the number of trials, in order to observe, e.g., robust time-resolved effects (see our comments to reviewer 1).
References:
Fischer C, Czoschke S, Peters B, Rahm B, Kaiser J, Bledowski C (2020) Context information supports serial dependence of multiple visual objects across memory episodes. Nature Communications 11: 1932.
Sheehan TC, Serences JT (2022) Attractive serial dependence overcomes repulsive neuronal adaptation. PLOS Biology 20: e3001711.
Hajonides JE, Van Ede F, Stokes MG, Nobre AC, Myers NE (2023) Multiple and Dissociable Effects of Sensory History on Working-Memory Performance. Journal of Neuroscience 43: 2730–2740.
(4) It would have been great to see an analysis in source space. As the authors mention in their introduction, different brain areas, such as PPC, mPFC, and dlPFC have been implicated in serial biases. This begs the question of which brain areas contribute to the serial dependencies observed in the current study. For instance, it would be interesting to see whether attractive shifts in current representations and pre-stimulus reactivations of previous stimuli are evident in the same or different brain areas.
We appreciate this suggestion. As mentioned above, we currently do not have sufficient resources to perform a MEG source analysis.
Reviewer #3 (Public Review):
Summary:
This study identifies the neural source of serial dependence in visual working memory, i.e., the phenomenon that recall from visual working memory is biased towards recently remembered but currently irrelevant stimuli. Whether this bias has a perceptual or post-perceptual origin has been debated for years - the distinction is important because of its implications for the neural mechanism and ecological purpose of serial dependence. However, this is the first study to provide solid evidence based on human neuroimaging that identifies a post-perceptual memory maintenance stage as the source of the bias. The authors used multivariate pattern analysis of magnetoencephalography (MEG) data while observers remembered the direction of two moving dot stimuli. After one of the two stimuli was cued for recall, decoding of the cued motion direction re-emerged, but with a bias towards the motion direction cued on the previous trial. By contrast, decoding of the stimuli during the perceptual stage was not biased.
Strengths:
The strengths of the paper are its design, which uses a retrospective cue to clearly distinguish the perceptual/encoding stage from the post-perceptual/maintenance stage, and the rigour of the careful and well-powered analysis. The study benefits from high within participant power through the use of sensitive MEG recordings (compared to the more common EEG), and the decoding and neural bias analysis are done with care and sophistication, with appropriate controls to rule out confounds.
Weaknesses:
A minor weakness of the study is the remaining (but slight) possibility of an eye movement confound. A control analysis shows that participants make systematic eye movements that are aligned with the remembered motion direction during both the encoding and maintenance phases of the task. The authors go some way to show that this eye gaze bias seems unrelated to the decoding of MEG data, but in my opinion do not rule it out conclusively. They merely show that the strength of the gaze bias and the strength of MEG-based decoding/neural bias are uncorrelated across the 10 participants. Therefore, this argument seems to rest on a null result from an underpowered analysis.
Both our MEG and our eye-movement analyses were sensitive enough to robustly pick up stimulus-related effects, for presented as well as remembered motion directions. When relating both signals to each other by correlating MEG reconstruction strength with gaze direction, we found a null effect, as pointed out by the reviewer. Importantly, there was also a null effect when the shift of the reconstruction (representing our main finding) was correlated with gaze direction. Furthermore, an examination of the individual time courses of gaze direction and individual MEG reconstruction strength revealed that the lack of a relationship between MEG and gaze data did not rest on a single observation but was present across all time points. Moreover, the correlation varied strongly from time point to time point, revealing its random nature and indicating that there was no hint of a pattern that merely failed to reach significance. Taking these observations together, our MEG findings were unlikely to be explained by eye position.
Nevertheless, we agree with the reviewer that there is a general problem of interpreting a null effect with a limited number of observations (and an analysis approach that focused on one out of many possible features of the gaze movement). Thus, we admit that there is a (slight) possibility that eye movements contributed to the observed MEG effects. This possibility, however, did not affect our novel finding that serial dependence occurred during the post-encoding stage of object processing in working memory.
Please see also our response to point 1 from reviewer 2.
Impact:
This important study contributes to the debate on serial dependence with solid evidence that biased neural representations emerge only at a relatively late post-perceptual stage, in contrast to previous behavioural studies. This finding is of broad relevance to the study of working memory, perception, and decision-making by providing key experimental evidence favouring one class of computational models of how stimulus history affects the processing of the current environment.
Recommendations for the authors:
Reviewer #1 (Recommendations For The Authors):
Minor concerns:
The significance statement opens "Our perception is biased towards sensory input from the recent past." This is a semantic point, but it seems a somewhat odd statement, given there is so much debate about whether serial dependence is perceptual vs. decisional, and that the current work indeed claims that it emerges at a late, post-encoding stage.
Thank you for this point. We agree. “Visual cognition is biased towards sensory input from the recent past.” would be a more appropriate statement. According to the Journal's guidelines, however, the paragraph with the Significance Statement will not be included in the final manuscript.
It would be preferable for data and code to be available at review so that reviewers might verify some procedural points for clarity.
Code and preprocessed data used for the presented analyses are now available on OSF via http://osf.io/yjc93/. Due to storage limitations, only the preprocessed MEG data for the main IEM analyses focusing on the current direction are uploaded. For access to additional data, please contact the authors.
For instance, I could use some clarification on the trial sequence. The methods first say the direction was selected randomly, but then later say each direction occurred equally often, and there were restrictions on the relationships between current and previous trial items. So it seems it couldn't have truly been random direction selection - was the order selected randomly from a predetermined set of possibilities?
For the S1/S2 stimuli in a trial, the dots moved fully coherently in a direction randomly drawn from a pool of directions between 5° and 355° spaced 10° from one another, therefore avoiding cardinal directions. Across trials, there was a predetermined set of possible differences in motion direction between the current and the previous target. This set included 18 motion direction differences, ranging from -170° to 180°, in steps of 10°. Trial sequences were balanced in a way that each of these differences occurred equally often during a MEG session.
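The direction pool and the wrapped direction difference described above can be sketched as follows. This is an illustration only: the names are hypothetical, the sign convention (previous minus current) is an assumption, and the balancing of trial sequences across a session is not implemented.

```python
import random

# Pool of motion directions: 5°, 15°, ..., 355° (no cardinal directions).
DIRECTION_POOL = list(range(5, 360, 10))


def signed_difference(previous, current):
    """Previous-minus-current direction difference, wrapped to (-180°, 180°]."""
    diff = (previous - current) % 360
    return diff - 360 if diff > 180 else diff


def draw_direction(rng):
    """Draw one motion direction at random from the pool."""
    return rng.choice(DIRECTION_POOL)
```

Because all pool entries are offset by 5° on a 10° grid, every difference between two pool directions is a multiple of 10° and falls between -170° and 180° after wrapping.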
I could also use some additional assurance the sample size (participants or data points) is sufficient for the analysis approach deployed here.
We performed a formal a-priori power analysis to justify our choice for the sample size. Please see our response to reviewer 2, point 3, where we explained the procedure of the a-priori power analysis in detail. We have now included this description and the results of this power analysis in the Materials and Methods.
Did you consider a decoding approach, instead of reconstruction, to test what information predominates the signal, in an unbiased way?
Thank you for this argument. We believe that our analysis approach based on the inverted encoding model is unbiased, since we first tested whether the MEG signal contained information about the presented and remembered motion direction. Only in the next step did we test whether this reconstructed signal showed an offset and, if so, whether this offset was biased towards or away from the previous target. A decoding approach aims to answer classification questions and is not suitable to reveal the actual shifts of the neural information. In our study, we could decode, e.g., the current direction or the previous target, but this would not answer the question of whether and at which stage of object processing the current representation was biased towards the past. Moreover, in a decoding approach to reveal which information predominates in the signal, we would have to classify different options (e.g., current information vs. previous), thereby biasing the possible set of results more than in our chosen analysis.
I think the claim of a "direct" neural signature may come off as an overstatement when the spatial and temporal aspects of the attractive bias are still so coarsely specified here.
Thank you for pointing this out. We agree that the term “direct neural signature” can be seen as an overstatement when it is interpreted to indicate a narrowly defined activity of a brain region (ideally via “direct” invasive recordings) that reflects serial dependence. Our definition of the term “direct” referred to the observation of an attractive shift in a neural representation of the current target motion direction item towards the previous target. This was in contrast to previous “indirect” evidence for the neural basis of serial dependence based on either repulsive shifts of neural representations that were opposite to the attractive bias in behavior or on a reactivation of previous information in the current trial without presenting evidence for the actual neural shift. With this definition in mind, we consider the title of our study a valid description of our findings.
Reviewer #2 (Recommendations For The Authors):
I was wondering why the authors chose a bootstrap test for their neural bias analysis instead of a permutation test, similar to the one they used for their behavioral analysis. As far as I know, bootstrap tests do not provide guaranteed type-1 error rate control. The procedure for the permutation test would be quite straightforward here, randomly permuting the sign of each participant's neural shift and recording the group-average shift in a permutation distribution. This test seems more adequate and more consistent with the behavioral analysis.
Thank you for this comment. We adopted a resampling approach (bootstrapping) similar to that of Ester et al. (2020), who also investigated categorical biases and applied a reconstruction method (Inverted Encoding Model) to assess the significance of a bias of the reconstructed orientation against zero in a certain direction. The bootstrapping method relied on a) detecting an offset against zero and b) evaluating the robustness of the observed effect across participants. In contrast, a permutation approach, as suggested by the reviewer, assesses whether an empirical neural shift is more extreme than the permutation distribution. The permutation approach seems more suited to assess the magnitude of the shift, which was not a priority in our study. Therefore, we reasoned that bootstrapping was better suited for our inferential statistics, i.e., to assess the direction of the neural shift and its robustness across participants.
We have added this additional information to the Materials and Methods:
References:
Ester EF, Sprague TC, Serences JT (2020) Categorical biases in human occipitoparietal cortex. Journal of Neuroscience 40:917–931.
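The two resampling schemes contrasted in the exchange above can be sketched side by side. This is a minimal illustration, not the authors' or Ester et al.'s implementation: `shifts` is a hypothetical list with one neural shift (in degrees) per participant, and the one-sided definitions of the p-values are assumptions. With 10 participants, all 1024 sign-flip permutations can be enumerated exactly.

```python
import itertools
import random


def sign_flip_permutation_p(shifts):
    """Exact one-sided p-value: proportion of sign-flipped data sets whose
    group-mean shift is at least as large as the observed one."""
    n = len(shifts)
    observed = sum(shifts) / n
    count = 0
    total = 0
    for signs in itertools.product((-1, 1), repeat=n):
        permuted = sum(s * v for s, v in zip(signs, shifts)) / n
        count += permuted >= observed
        total += 1
    return count / total


def bootstrap_p(shifts, n_resamples=10000, seed=0):
    """Proportion of bootstrap group means at or below zero, obtained by
    resampling participants with replacement."""
    rng = random.Random(seed)
    n = len(shifts)
    at_or_below_zero = 0
    for _ in range(n_resamples):
        sample = [shifts[rng.randrange(n)] for _ in range(n)]
        at_or_below_zero += sum(sample) / n <= 0
    return at_or_below_zero / n_resamples
```

The permutation test builds its null distribution under the hypothesis that the sign of each participant's shift is arbitrary, whereas the bootstrap asks how often the group mean falls on the other side of zero when participants are resampled, which reflects the robustness of the direction of the effect across participants.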
The manuscript could be improved by more clearly spelling how the training and testing data were labelled, particularly for the reactivation analyses. If I understood correctly, in the first reactivation analysis the authors train and test on current trial data, but label both training and testing data according to the previous trial's motion direction. In the second analysis, they label the training data according to the current motion direction, but label the testing data according to the previous motion direction. Is that correct?
Yes, this is correct. Please see also our response to reviewer 1, point 2 and 3, for a detailed description.
I was surprised to see that the shift in the reconstructed direction is about three times larger than the behavioral attraction bias. Would one not expect these to be comparable in magnitude? It would be helpful to address and discuss this in the discussion section.
Thank you for pointing this out. We agree with the reviewer that as both measures provided an identical metric (angle degree), one would expect that their magnitudes should be directly comparable. However, we speculate that these magnitudes inform only about the direction of the bias and their significant difference from zero, thus they operate on different scales and are not directly comparable. For example, Hallenbeck et al. (2022) showed that fMRI-based reconstructed orientation bias and behavioral bias correlated on both individual and group level, despite strong magnitude differences. This is in line with our observation and supports the speculation that the magnitudes of neural and behavioral biases operate on different scales and, thus, are not directly comparable.
We have updated the Discussion accordingly.
References:
Hallenbeck GE, Sprague TC, Rahmati M, Sreenivasan KK, Curtis CE (2022) Working memory representations in visual cortex mediate distraction effects. Nature Communications 12: 471.
Reviewer #3 (Recommendations For The Authors):
(1) It may be worth showing that the gaze bias towards the current/cued stimulus is not biased towards the previous target. One option might be to run the same analysis pipeline used for the MEG decoding but on the eye-tracking data. Another could be to remove all participants with significant gaze bias, but given the small sample size, this might not be feasible.
We appreciate this suggestion. However, as mentioned above, we currently do not have sufficient resources to conduct additional analyses on the eye tracking data.
(2) Minor typo: Figure 3c - bias should be 11.7º, not -11.7º.
Corrected. Thank you!
Note on data/code availability: The authors state that preprocessed data and analysis code will be made available on publication, but are not available yet.
Code and preprocessed data used for the present analyses are now available on OSF via http://osf.io/yjc93/. Due to storage limitations, only the preprocessed MEG data for the main IEM analyses focusing on the current direction are uploaded. For access to additional data, please contact the authors.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
The authors show that an automated approach using artificial neural networks, which focuses on behaviourally relevant dimensions, can predict human similarity data up to a certain level of granularity. This study has the potential to be a valuable contribution to the broader field of cognitive computational neuroscience, as it provides a tool for the automated collection of similarity judgments under certain conditions. However, as of now, the significance of this method is somewhat limited because of its inability to generalise beyond between-category distinctions and the limited model evaluation. In terms of broader implications, the degree to which this work provides insights into DNN-brain alignment and a better understanding of the functional organisation of the visual system is supported by incomplete evidence.
-
Reviewer #1 (Public review):
Summary:
This manuscript addresses the challenge of understanding and capturing the similarity among large numbers of visual images. The authors show that an automated approach using artificial neural networks that focuses upon the embedding of similarity through behaviorally relevant dimensions can predict human similarity data up to a certain level of granularity.
Strengths:
The manuscript starts with a very useful introduction that sets the stage with an insightful Figure 1. The methods are state of the art and well thought out, and the data are compelling. The authors demonstrate the added value of their approach in several directions, resulting in a manuscript that is highly relevant for different domains. The authors also explore its limitations (e.g., granularity).
Weaknesses:
Although this manuscript and the work it describes are already of high quality, I see several ways in which it could be further improved. Below I rank these suggestions tentatively in order of importance.
Predictions obtain correlations above 0.80, often close to correlations of 0.90. The performance of DimPred is not trivial, given how much better it performs relative to classic RSA and feature reweighting. Yet, the ceiling is not sufficiently characterized. What is the noise ceiling in the main and additional similarity sets that are used? If the noise ceiling is higher than the prediction correlations, then can the authors try to find the stimulus pairs for which the approach systematically fails to capture similarity? Or is the mismatch very distributed across the full stimulus set?
Also in the section on p. 8-p.9, it is crucial to provide information on the noise ceiling of the various datasets.
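One common way to estimate the noise ceiling requested here is a leave-one-participant-out correlation. The sketch below assumes that per-participant similarity ratings for the same stimulus pairs are available, which may not hold for every dataset in the manuscript; the function names and the `ratings[p][k]` layout are hypothetical.

```python
def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def leave_one_out_noise_ceiling(ratings):
    """Lower-bound noise ceiling.

    ratings[p][k] is the similarity rating of participant p for stimulus
    pair k. Each participant is correlated with the mean of all other
    participants, and the correlations are averaged.
    """
    n = len(ratings)
    n_pairs = len(ratings[0])
    correlations = []
    for p in range(n):
        others = [
            sum(ratings[q][k] for q in range(n) if q != p) / (n - 1)
            for k in range(n_pairs)
        ]
        correlations.append(pearson(ratings[p], others))
    return sum(correlations) / n
```

If the prediction correlations of a model approach this value, the remaining mismatch is within the range expected from between-participant variability; if they fall clearly below it, a pair-wise inspection of systematic failures would be informative.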
This consideration of noise ceiling brings me to another consideration. Arguments have been made that a focus on overall prediction accuracy might mask important differences in underlying processes that can be demonstrated in more specific, experimental situations (Bowers et al., 2023). Can the authors exclude the possibility that their automatic approach would fail dramatically in specifically engineered situations? Some examples can be found in the 2024 challenge of the BrainScore platform. How can future users of this approach know whether they are in such a situation or not?
The authors demonstrated one limitation of the DimPred approach: it fails to capture fine-grained similarity among highly similar stimuli. The implications of this finding were not clear to me from the Abstract etc., because the summaries do not sufficiently highlight that in this case DimPred performs much worse than simpler approaches like feature reweighting and even than classic RSA. I would discuss this outcome in more detail. With hindsight, this problem might not be so surprising given that DimPred relies upon an embedding with a few tens of dimensions that mostly capture between-category differences. To me, this seems like a more fundamental limitation than a mere problem of granularity or lack of data, as suggested in the abstract.
The DimPred approach is based on the dimensions of a similarity embedding derived from human behavior. Is the important aspect here (i) that DimPred is based upon an approach that tries to capture latent dimensions, or (ii) that these dimensions are behaviorally relevant? There are a lot of dimension-focused approaches. Generic ones are PCA, MDS, etc. More domain-specific approaches in cogneuro include the following: (i) for two-dimensional shape representations, good results have been obtained with image-computable dimensions of various levels of complexity (Morgenstern et al., 2021, PLOS Comput. Biol.); (ii) another dimension-focused approach has focused upon identifying dimensions that are universal across networks & human representations (Chen & Bonner, 2024, arXiv). Would such generic or more specific approaches work as well as DimPred?
-
Reviewer #2 (Public review):
In this paper, the authors successfully incorporated the 49 dimensions found in a human similarity judgment task to better train DNNs to perform accurate human-like object similarity judgments. The results of the model performance are impressive but I am not totally convinced that the present modeling approach may bring new insights regarding the mental and neural representations of visual objects in the human brain. I have a few thoughts that I would like the authors to consider.
(1) Can the authors provide a detailed description of what these off-the-shelf DNNs are trained on? For models trained on visual images only, because semantic information was never present during training, it is not surprising they fail to capture such information, even with additional DimPred training. For the CLIP models, because visual-semantic associations were included during training, it again comes as no surprise that these models can do better even without DimPred training. Similarly, the results of homogeneous image sets are not particularly surprising. In this regard, I find that the paper reports many obvious results. Better motivations should be used to justify why particular models and analyses were performed, what predictions can be made, and how the results may be informative beyond what we already know.
(2) I am curious as to what DimPred training is doing exactly. If you create an arbitrary similarity structure (i.e., not the one derived from human similarity judgment) by, e.g., shuffling the values during training or creating 49 arbitrary dimensions, can the models be trained to follow this new arbitrary structure? In other words, do the models intrinsically contain a human-like structure, but we just have to find the right parameters to align them with the human structure or do we actually impose/force the human similarity structure onto the model with DimPred training?
Is it also an issue that you are including more parameters during DimPred training and that increased parameters alone can increase performance?
(3) There is very little information on how Figure 8 is generated. I couldn't find in the Methods any detailed descriptions of how the values were calculated. Are results from both the category-insensitive and category-sensitive embedding obtained from the same OpenCLIP-RN50x64? Figure 8 reports the relative improvement. What do the raw activation maps look like for the category-insensitive and category-sensitive embedding? I am surprised that the improvement is seen primarily in the early visual cortex (EVC) and higher visual areas but not more extensively in association areas sensitive to semantics. Why should EVC show such large improvements, given that category information is stored elsewhere?
Related to this point, how do other DNN models account for human brain fMRI responses in the present study? Many prior studies have documented the similarities and differences between DNN and human fMRI visual object representations. Do category-sensitive CLIP models outperform other DNN models? It is important to report the full results. Even though category-sensitive CLIP models outperform category-insensitive CLIP ones, if the overall model performance is low compared to the other DNNs, the results would not be very meaningful/impressive. I am just wondering if, in the process of achieving better human-like similarity judgment performance, these models lose some of the ability to account for visual object representations in the human ventral visual cortex.
(4) I am wondering how precisely the present results may yield new insights into the mental and neural representations of visual objects in the human brain. Prior human studies have already identified 49 dimensions that can capture human similarity judgment. Beyond predicting performance for new pairs of objects, how would the present modeling approach help us understand more about the human brain? The authors discussed this, but I am not sure the arguments are convincing.
-
Reviewer #3 (Public review):
Summary:
The authors compare how well their automatic dimension prediction approach (DimPred) can support similarity judgements and compare it to more standard RSA approaches. The authors show that the DimPred approach does better when assessing out-of-sample heterogeneous image sets, but worse for out-of-sample homogeneous image sets. DimPred also does better at predicting brain-behaviour correspondences compared to an alternative approach. The work appears to be well done, but I'm left unsure what conclusions the authors are drawing.
In the abstract, the authors write: "Together, our results demonstrate that current neural networks carry information sufficient for capturing broadly-sampled similarity scores, offering a pathway towards the automated collection of similarity scores for natural images". If that is the main claim, then they have done a reasonable job supporting this conclusion. However, the importance of automating this process for broadly-sampled object categories is not made so clear.
But the authors also highlight the importance that similarity judgements have had for theories of cognition and brain. For example, in the first paragraph of the paper they write: "Similarity judgments allow us to improve our understanding of a variety of cognitive processes, including object recognition, categorization, decision making, and semantic memory [6-13]. In addition, they offer a convenient means for relating mental representations to representations in the human brain [14,15] and other domains [16,17]". The fact that the authors also assess how well a CLIP model using DimPred can predict brain activation suggests that their work is not just about automating similarity judgements, but also about highlighting how their approach reveals that ANNs are more similar to brains than previously assessed.
My main concern is with regard to the claim that DimPred is revealing better similarities between ANNs and brains (a claim that the authors may not be making, but this should be clarified). The fact that predictions are poor for homogeneous images is problematic for this claim, and I expect their DimPred scores would be very poor under many conditions, such as when applied to line drawings of objects, or a variety of additional out-of-sample stimuli that are easily identified by humans. The fact that so many different models get such similar prediction scores (Fig 3) also raises questions as to the inferences you can make about ANN-brain similarity based on the results. Do the authors want to claim that CLIP models are more like brains?
With regard to the brain prediction results, why is the DimPred approach doing so much better in V1? I would not think the 49 interpretable categories are encoded in V1, and the ability to predict would likely reflect a confound rather than V1 encoding these categories (e.g., if a category was "things that are burning" then DNN might predict V1 activation based on the encoding of colour).
In addition, more information is needed on the baseline model, as it is hard to appreciate whether we should be impressed by the better performance of DimPred based on what is provided: "As a baseline, we fit a voxel encoding model of all 49 dimensions. Since dimension scores were available only for one image per category [36], for the baseline model, we used the same value for each image of the same category and estimated predictive performance using cross-validation". Is it surprising that predictions are not good with one image per category? Is this a reasonable comparison?
Relatedly, what was the ability of the baseline model to predict? (I don't think that information was provided). Did the authors attempt to predict outside the visual brain areas? What would it mean if predictions were still better there?
Minor points:
The authors write: "Please note that, for simplicity, we refer to the similarity matrix derived from this embedding as "ground-truth", even though this is only a predicted similarity". Given this, it does not seem a good idea to use "ground truth" as this clarification will be lost in future work citing this article.
It would be good to have the 49 interpretable dimensions listed in the supplemental materials rather than having to go to the original paper.
Strengths:
The experiments seem well done.
Weaknesses:
It is not clear what claims are being made.
-