Author response:
We would like to thank the three Reviewers for their thoughtful comments and detailed feedback. We are pleased to hear that the Reviewers found our paper to be “providing more direct evidence for the role of signals in different frequency bands related to predictability and surprise” (R1), “well-suited to test evidence for predictive coding versus alternative hypotheses” (R2), and “timely and interesting” (R3).
Our impression is that the Reviewers have an overall positive view of the experiments and analyses, but find the text somewhat dense and would like to see additional statistical rigor, as well as, in some cases, additional analyses included in supplementary material. We therefore provide this provisional letter, addressing point-by-point the revisions we have already performed and outlining those we are planning. Each enumerated point begins with the Reviewer’s quoted text, followed by our response.
Reviewer 1:
(1) Introduction:
The authors write in their introduction: "H1 further suggests a role for θ oscillations in prediction error processing as well." Without being fleshed out further, it is unclear what role this would be, or why. Could the authors expand this statement?
We have edited the text to indicate that theta-band activity has been related to prediction error processing as an empirical observation; inferring its functional role must regrettably be left to future work, with experiments designed specifically to elicit theta-band activity.
(2) Limited propagation of gamma band signals:
Some recent work (e.g. https://www.cell.com/cell-reports/fulltext/S2211-1247(23)00503-X) suggests that gamma-band signals reflect mainly entrainment of the fast-spiking interneurons, and don't propagate from V1 to downstream areas. Could the authors connect their findings to these emerging findings, suggesting no role in gamma-band activity in communication outside of the cortical column?
We have not specifically claimed that gamma propagates between columns/areas in our recordings, only that it synchronizes synaptic current flows across laminar layers within a column/area. We nonetheless suggest that gamma can locally synchronize a column, and potentially neighboring columns within an area via entrainment of local recurrent spiking, to update an internal prediction/representation upon onset of a prediction error. We also point the Reviewer to our Discussion section, where we state that our results fit with a model “whereby θ oscillations synchronize distant areas, enabling them to exchange relevant signals during cognitive processing.” In the present work, we therefore remain agnostic about whether theta, gamma, both, or alternative mechanisms transmit prediction error signals between areas.
(3) Paradigm:
While I agree that the paradigm tests whether a specific type of temporal prediction can be formed, it is not a type of prediction that one would easily observe in mice, or even humans. The regularity that must be learned, in order to be able to see a reflection of predictability, integrates over 4 stimuli, each shown for 500 ms with a 500 ms blank in between (and a 1000 ms interval separating the 4th stimulus from the 1st stimulus of the next sequence). In other words, the mouse must keep in working memory three stimuli, which partly occurred more than a second ago, in order to correctly predict the fourth stimulus (and signal a 1000 ms interval as evidence for starting a new sequence).
A problem with this paradigm is that positive findings are easier to interpret than negative findings. If mice do not show a modulation to the global oddball, is it because "predictive coding" is the wrong hypothesis, or simply because the authors generated a design that operates outside of the boundary conditions of the theory? I think the latter is more plausible. Even in more complex animals, (eg monkeys or humans), I suspect that participants would have trouble picking up this regularity and sequence, unless it is directly task-relevant (which it is not, in the current setting). Previous experiments often used simple pairs (where transitional probability was varied, eg, Meyer and Olson, PNAS 2012) of stimuli that were presented with an intervening blank period. Clearly, these regularities would be a lot simpler to learn than the highly complex and temporally spread-out regularity used here, facilitating the interpretation of negative findings (especially in early cortical areas, which are known to have relatively small temporal receptive fields).
I am, of course, not asking the authors to redesign their study. I would like to ask them to discuss this caveat more clearly, in the Introduction and Discussion, and situate their design in the broader literature. For example, Jeff Gavornik has used much more rapid stimulus designs and observed clear modulations of spiking activity in early visual regions. I realize that this caveat may be more relevant for the spiking paper (which does not show any spiking activity modulation in V1 by global predictability) than for the current paper, but I still think it is an important general caveat to point out.
We appreciate the Reviewer’s concern about working memory limitations in mice. Our paradigm and training followed on from previous paradigms such as Gavornik and Bear (2014), in which predictive effects were observed in mouse V1 with presentation times of 150ms and interstimulus intervals of 1500ms. In addition, we note that Jamali et al. (2024) recently utilized a similar global/local paradigm in the auditory domain with inter-sequence intervals as long as 28-30 seconds, and still observed effects of a predicted sequence (https://elifesciences.org/articles/102702). For the revised manuscript, we plan to expand on this in the Discussion section.
That being said, as the Reviewer also pointed out, this would be a greater concern had we not found any positive findings in our study. However, even with the rather long sequence periods we used, we did find positive evidence for predictive effects, supporting the use of our current paradigm. We agree with the reviewer that these positive effects are easier to interpret than negative effects, and plan to expand upon this in the Discussion when we resubmit.
(4) Reporting of results:
I did not see any quantification of the strength of evidence of any of the results, beyond a general statement that all reported results pass significance at an alpha=0.01 threshold. It would be informative to know, for all reported results, what exactly the p-value of the significant cluster is; as well as for which performed tests there was no significant difference.”
For the revised manuscript, we can include the p-values after cluster-based testing for each significant cluster, as well as show data that passes a more stringent threshold of p<0.001 (1/1000) or p<0.005 (1/200) rather than our present p<0.01 (1/100).
(5) Cluster test:
The authors use a three-dimensional cluster test, clustering across time, frequency, and location/channel. I am wondering how meaningful this analytical approach is. For example, there could be clusters that show an early difference at some location in low frequencies, and then a later difference in a different frequency band at another (adjacent) location. It seems a priori illogical to me to want to cluster across all these dimensions together, given that this kind of clustering appears neurophysiologically implausible/not meaningful. Can the authors motivate their choice of three-dimensional clustering, or better, facilitating interpretability, cluster eg at space and time within specific frequency bands (2d clustering)?
We consider our three-dimensional cluster test an “unsupervised” way of uncovering significant contrasts, with no theory-driven assumptions about which bounded frequency bands or layers carry which effects. To clarify this statistical approach for the Reviewer, we are happy to include a 3D plot of a time-channel-frequency cluster in the revised manuscript.
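To make the cluster-based logic concrete, the following is a deliberately simplified sketch of a cluster-mass permutation test in one dimension (time); our actual analysis forms clusters jointly over channels, time points, and frequencies, and all names and parameters here are illustrative rather than taken from our pipeline:

```python
import numpy as np

def cluster_masses(stat, threshold):
    """Sum the test statistic within contiguous supra-threshold runs (1-D clusters)."""
    above = stat > threshold
    masses, current = [], 0.0
    for s, a in zip(stat, above):
        if a:
            current += s
        elif current:
            masses.append(current)
            current = 0.0
    if current:
        masses.append(current)
    return masses

def cluster_permutation_test(cond_a, cond_b, threshold=2.0, n_perm=1000, seed=0):
    """Max-cluster-mass permutation test between two (trials x time) arrays."""
    rng = np.random.default_rng(seed)

    def tmap(a, b):
        # Welch-style t statistic at each time point
        return (a.mean(0) - b.mean(0)) / np.sqrt(a.var(0) / len(a) + b.var(0) / len(b) + 1e-12)

    observed = cluster_masses(np.abs(tmap(cond_a, cond_b)), threshold)
    pooled = np.vstack([cond_a, cond_b])
    null_max = np.empty(n_perm)
    for i in range(n_perm):
        # shuffle condition labels and record the largest cluster mass
        perm = rng.permutation(len(pooled))
        pa, pb = pooled[perm[:len(cond_a)]], pooled[perm[len(cond_a):]]
        m = cluster_masses(np.abs(tmap(pa, pb)), threshold)
        null_max[i] = max(m) if m else 0.0
    # cluster-level p-value: fraction of permutations with an equal or larger max mass
    return [(np.sum(null_max >= m) + 1) / (n_perm + 1) for m in observed]
```

The same max-statistic logic extends to three dimensions by replacing the run-finding step with connected-component labeling over the channel-time-frequency array.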
Reviewer 2:
Sennesh and colleagues analyzed LFP data from 6 regions of rodents while they were habituated to a stimulus sequence containing a local oddball (xxxy) and later exposed to either the same (xxxY) or a deviant global oddball (xxxX). Subsequently, they were exposed to a controlled random sequence (XXXY) or a controlled deterministic sequence (xxxx or yyyy). From these, the authors looked for differences in spectral properties (both oscillatory and aperiodic) between three contrasts (only for the last stimulus of the sequence).
(1) Deviance detection: unpredictable random (XXXY) versus predictable habituation (xxxy)
(2) Global oddball: unpredictable global oddball (xxxX) versus predictable deterministic (xxxx), and
(3) "Stimulus-specific adaptation:" locally unpredictable oddball (xxxY) versus predictable deterministic (yyyy).
They found evidence for an increase in gamma (and theta in some cases) for unpredictable versus predictable stimuli, and a reduction in alpha/beta, which they consider evidence towards the "predictive routing" scheme.
While the dataset and analyses are well-suited to test evidence for predictive coding versus alternative hypotheses, I felt that the formulation was ambiguous, and the results were not very clear. My major concerns are as follows:
We appreciate the reviewer’s concerns and outline how we will address them below:
(1) The authors set up three competing hypotheses, in which H1 and H2 make directly opposite predictions. However, it must be noted that H2 is proposed for spatial prediction, where the predictability is computed from the part of the image outside the RF. This is different from the temporal prediction that is tested here. Evidence in favor of H2 is readily observed when large gratings are presented, for which there is substantially more gamma than in small images. Actually, there are multiple features in the spectral domain that should not be conflated, namely (i) the transient broadband response, which includes all frequencies, (ii) contribution from the evoked response (ERP), which is often in frequencies below 30 Hz, (iii) narrow-band gamma oscillations which are produced by large and continuous stimuli (which happen to be highly predictive), and (iv) sustained low-frequency rhythms in theta and alpha/beta bands which are prominent before stimulus onset and reduce after ~200 ms of stimulus onset. The authors should be careful to incorporate these in their formulation of PC, and in particular should not conflate narrow-band and broadband gamma.
We have clarified in the manuscript that while the gamma-as-prediction hypothesis (our H2) was originally proposed in a spatial prediction domain, further work (specifically Singer (2021)) has extended the hypothesis to cover temporal-domain predictions as well.
To address the reviewer’s point about multiple features in the spectral domain: our analysis specifically separated aperiodic components using FOOOF analysis (Supp. Fig. 1) and explicitly fit and tested aperiodic vs. periodic components (Supp. Figs. 1 and 2). We did not find strong effects in the aperiodic components but did in the periodic components (Supp. Fig. 2), allowing us to be more confident that our conclusions concern genuine narrow-band oscillations. In the revised manuscript, we will include an analysis of the pre-stimulus time window to address the reviewer’s point (iv) on sustained low-frequency oscillations.
(2) My understanding is that any aspect of predictive coding must be present before the onset of stimulus (expected or unexpected). So, I was surprised to see that the authors have shown the results only after stimulus onset. For all figures, the authors should show results from -500 ms to 500 ms instead of zero to 500 ms.
In our revised manuscript we will include a pre-stimulus analysis and supplementary figures with time ranges from -500 ms to 500 ms. We refrained from doing so in the initial manuscript only because our paradigm’s short interstimulus interval makes it difficult to interpret whether activity in the ISI reflects post-stimulus dynamics or pre-stimulus prediction. Nonetheless, we can readily show that in our paradigm, alpha/beta-band activity is elevated in the interstimulus interval after the offset of the previous stimulus, provided that we baseline to the pre-trial period.
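For concreteness, baselining to the pre-trial period amounts to expressing each time-frequency power map in dB relative to its mean pre-trial power; a minimal sketch (the window bounds and array layout here are hypothetical, not those of our pipeline):

```python
import numpy as np

def baseline_db(tfr, times, baseline=(-1.0, -0.5)):
    """Express a (freq x time) power map in dB relative to the mean
    power in a pre-trial baseline window (hypothetical window bounds)."""
    mask = (times >= baseline[0]) & (times < baseline[1])
    base = tfr[:, mask].mean(axis=1, keepdims=True)
    return 10.0 * np.log10(tfr / base)
```

With this convention, 0 dB means power equal to the pre-trial baseline, so elevated interstimulus alpha/beta appears as sustained positive values.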
(3) In many cases, some change is observed in the initial ~100 ms of stimulus onset, especially for the alpha/beta and theta ranges. However, the evoked response contributes substantially in the transient period in these frequencies, and this evoked response could be different for different conditions. The authors should show the evoked responses to confirm the same, and if the claim really is that predictions are carried by genuine "oscillatory" activity, show the results after removing the ERP (as they had done for the CSD analysis).
We have included an extra sentence in our Materials and Methods section clarifying that the evoked potential/ERP was removed in our existing analyses, prior to performing the spectral decomposition of the LFP signal. We also note that the FOOOF analysis we applied separates aperiodic components of the spectral signal from the strictly oscillatory ones.
In our revised manuscript we will include an analysis of the evoked responses as suggested by the reviewer.
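The core idea behind the FOOOF separation, fitting the aperiodic 1/f component in log-log space and treating what remains as the periodic part, can be illustrated with a deliberately crude numpy-only sketch (the actual FOOOF/specparam fitting procedure is considerably more sophisticated, modeling oscillatory peaks explicitly):

```python
import numpy as np

def split_aperiodic_periodic(freqs, power):
    """Fit log10(power) = offset + slope * log10(freq) and return
    (aperiodic_fit, periodic_residual_in_log_power). Crude stand-in for FOOOF."""
    logf, logp = np.log10(freqs), np.log10(power)
    slope, offset = np.polyfit(logf, logp, 1)  # slope is the negated 1/f exponent
    aperiodic = 10 ** (offset + slope * logf)
    periodic = logp - np.log10(aperiodic)  # log-power above/below the 1/f fit
    return aperiodic, periodic
```

A pure 1/f spectrum yields a near-zero periodic residual, while a narrow-band oscillation shows up as a bump in the residual at its center frequency.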
(4) I was surprised by the statistics used in the plots. Anything that is even slightly positive or negative is turning out to be significant. Perhaps the authors could use a more stringent criterion for multiple comparisons?
As noted above to Reviewer 1 (point 4), we are happy to include supplemental figures in our resubmission showing the effects on our results of setting the statistical significance threshold with considerably greater stringency.
(5) Since the design is blocked, there might be changes in global arousal levels. This is particularly important because the more predictive stimuli in the controlled deterministic stimuli were presented towards the end of the session, when the animal is likely less motivated. One idea to check for this is to do the analysis on the 3rd stimulus instead of the 4th? Any general effect of arousal/attention will be reflected in this stimulus.
To check for brain-wide effects of arousal, we plan to repeat our existing analyses on the 3rd stimulus in each block, rather than just the 4th “oddball” stimulus. Clusters that contrast significantly for both the 3rd and 4th stimuli may be attributable to arousal. We will also analyze pupil size as an index of arousal to check for arousal differences between conditions in our contrasts, possibly stratifying our data before performing comparisons to equalize pupil size within contrasts. We plan to include these analyses in our resubmission.
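The stratification we have in mind would subsample trials so that pupil-size distributions are matched across the two conditions of a contrast; a hypothetical sketch, assuming a per-trial scalar pupil measure (function name and binning scheme are illustrative):

```python
import numpy as np

def stratify_by_pupil(pupil_a, pupil_b, n_bins=5, seed=0):
    """Subsample trial indices from two conditions so that their pupil-size
    distributions are matched bin-by-bin (hypothetical stratification scheme)."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([pupil_a, pupil_b])
    edges = np.quantile(pooled, np.linspace(0, 1, n_bins + 1))
    edges[-1] += 1e-9  # include the maximum in the last bin
    keep_a, keep_b = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ia = np.where((pupil_a >= lo) & (pupil_a < hi))[0]
        ib = np.where((pupil_b >= lo) & (pupil_b < hi))[0]
        n = min(len(ia), len(ib))  # equalize trial counts within the bin
        keep_a.extend(rng.choice(ia, n, replace=False))
        keep_b.extend(rng.choice(ib, n, replace=False))
    return np.array(keep_a), np.array(keep_b)
```

By construction the retained trials occupy each pupil-size bin in equal numbers across conditions, so any remaining spectral contrast is less likely to reflect a global arousal difference.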
(6) The authors should also acknowledge/discuss that typical stimulus presentation/attention modulation involves both (i) an increase in broadband power early on and (ii) a reduction in low-frequency alpha/beta power. This could be just a sensory response, without having a role in sending prediction signals per se. So the predictive routing hypothesis should involve testing for signatures of prediction while ruling out other confounds related to stimulus/cognition. It is, of course, very difficult to do so, but at the same time, simply showing a reduction in low-frequency power coupled with an increase in high-frequency power is not sufficient to prove PR.
Since different predictive coding and predictive processing hypotheses make very different claims about how predictions might be encoded in neurophysiological recordings, we have focused on prediction error encoding in this paper.
For the hypothesis space we have considered (H1-H3), each hypothesis makes clearly distinguishable predictions about the spectral response during the time period in the task when prediction errors should be present. As noted by the reviewer, a transient increase in broadband frequencies would be a signature of H3. Changes to oscillatory power in the gamma band in distinct directions (e.g., increasing or decreasing with prediction error) would support either H1 or H2, depending on the direction of change. We believe our data, especially our use of FOOOF analysis and separation of periodic from aperiodic components, coupled with the three experimental contrasts, speak clearly in favor of the Predictive Routing model, but we do not claim to have “proved” it. This study provides just one datapoint, and we will acknowledge this in our revised Discussion.
(7) The CSD results need to be explained better - you should explain on what basis they are being called feedforward/feedback. Was LFP taken from Layer 4 LFP (as was done by van Kerkoerle et al, 2014)? The nice ">" and "<" CSD patterns (Figure 3B and 3F of their paper) in that paper are barely observed in this case, especially for the alpha/beta range.
We consider a feedforward pattern as flowing from L4 outwards to L2/3 and L5/6, and a feedback pattern as flowing in the opposite direction, from L1 and L6 to the middle layers. We will clarify this in the revised manuscript.
Since gamma-band oscillations are strongest in L2/3, we re-epoched LFPs to the oscillation troughs in L2/3 in the initial manuscript. We can include in the revised manuscript equivalent plots after finding oscillation troughs in L4 instead, as well as calculating the difference in trough times within-band between layers to quantify the transmission delay and add additional rigor to our feedforward vs. feedback interpretation of the CSD data.
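The re-epoching procedure can be sketched schematically as follows, assuming a band-limited reference signal (hypothetically the gamma-band LFP from L2/3 or L4); band-pass filtering and our actual epoch parameters are omitted for brevity:

```python
import numpy as np

def find_troughs(signal, min_separation=10):
    """Indices of local minima, greedily enforcing a minimum separation (samples)."""
    cand = np.where((signal[1:-1] < signal[:-2]) & (signal[1:-1] <= signal[2:]))[0] + 1
    troughs, last = [], -min_separation
    for idx in cand:
        if idx - last >= min_separation:
            troughs.append(idx)
            last = idx
    return np.array(troughs, dtype=int)

def epoch_to_troughs(signals, reference, half_width=25, min_separation=10):
    """Cut windows from a (channels x samples) array around troughs of a
    reference channel (hypothetically the band-filtered L2/3 or L4 signal)."""
    epochs = []
    for t in find_troughs(reference, min_separation):
        if half_width <= t < signals.shape[1] - half_width:
            epochs.append(signals[:, t - half_width:t + half_width])
    return np.stack(epochs) if epochs else np.empty((0, signals.shape[0], 2 * half_width))
```

Averaging the trough-locked epochs across cycles, and comparing trough times between layers within a band, is how the transmission delay mentioned above could be quantified.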
(8) Figure 4a-c, I don't see a reduction in the broadband signal in a compared to b in the initial segment. Maybe change the clim to make this clearer?
We are looking into the clim/colorbar settings and plot-generation code to resolve the visibility issue that the Reviewer has kindly pointed out.
(9) Figure 5 - please show the same for all three frequency ranges, show all bars (including the non-significant ones), and indicate the significance (p-values or by *, **, ***, etc) as done usually for bar plots.
We will add the requested bar plots for all frequency ranges, though we note that the bars shown here result from summing the spectral power within the channel-time-frequency clusters that already passed significance testing, so adding secondary significance tests here may not prove informative.
(10) Their claim of alpha/beta oscillations being suppressed for unpredictable conditions is not as evident. A figure akin to Figure 5 would be helpful to see if this assertion holds.
As noted above, we will include the requested bar plot, as well as examining alpha/beta in the pre-stimulus time-series rather than after the onset of the oddball stimulus.
(11) To investigate the prediction and violation or confirmation of expectation, it would help to look at both the baseline and stimulus periods in the analyses.
We will include a supplementary figure showing the spectrograms for the baseline and full-trial periods, to examine the difference between baseline and pre-stimulus expectation.
Reviewer 3:
Summary:
In their manuscript entitled "Ubiquitous predictive processing in the spectral domain of sensory cortex", Sennesh and colleagues perform spectral analysis across multiple layers and areas in the visual system of mice. Their results are timely and interesting as they provide a complement to a study from the same lab focussed on firing rates, instead of oscillations. Together, the present study argues for a hypothesis called predictive routing, which argues that non-predictable stimuli are gated by Gamma oscillations, while alpha/beta oscillations are related to predictions.
Strengths:
(1) The study contains a clear introduction, which provides a clear contrast between a number of relevant theories in the field, including their hypotheses in relation to the present data set.
(2) The study provides a systematic analysis across multiple areas and layers of the visual cortex.
We thank the Reviewer for their kind comments.
Weaknesses:
(1) It is claimed in the abstract that the present study supports predictive routing over predictive coding; however, this claim is nowhere in the manuscript directly substantiated. Not even the differences are clearly laid out, much less tested explicitly. While this might be obvious to the authors, it remains completely opaque to the reader, e.g., as it is also not part of the different hypotheses addressed. I guess this result is meant in contrast to reference 17, by some of the same authors, which argues against predictive coding, while the present work finds differences in the results, which they relate to spectral vs firing rate analysis (although without direct comparison).
We agree that in this manuscript we should restrict ourselves to the hypotheses that were directly tested. We have revised our abstract accordingly, and softened our claim to note only that our LFP results are compatible with predictive routing.
(2) Most of the claims about a direction of propagation of certain frequency-related activities (made in the context of Figures 2-4) are - to the eyes of the reviewer - not supported by actual analysis but glimpsed from the pictures, sometimes, with very little evidence/very small time differences to go on. To keep these claims, proper statistical testing should be performed.
In our revised manuscript, we will either substantiate (with quantification of CSD delays between layers) or soften the claims about feedforward/feedback direction of flow within the cortical column.
(3) Results from different areas are barely presented. While I can see that presenting them in the same format as Figures 2-4 would be quite lengthy, it might be a good idea to contrast the right columns (difference plots) across areas, rather than just the overall averages.
In our revised manuscript we will gladly include a supplementary figure showing the right-column difference plots across areas, in order to make sure to include aspects of our dataset that span up and down the cortical hierarchy.
(4) Statistical testing is treated very generally, which can help to improve the readability of the text; however, in the present case, this is a bit extreme, with even obvious tests not reported or not even performed (in particular in Figure 5).
We appreciate the Reviewer’s concern for statistical rigor, and as noted to the other Reviewers, we can add different levels of statistical description and report the p-values associated with specific clusters. Regarding Figure 5, we must note that the bar heights were computed from clusters already subjected to statistical testing and found significant. We could add a supplementary figure that considers untested narrowband activity and tests it only in the “bar height” domain, if the Reviewer would like.
(5) The description of the analysis in the methods is rather short and, to my eye, was missing one of the key descriptions, i.e., how the CSD plots were baselined (which was hinted at in the results, but, as far as I know, not clearly described in the analysis methods). Maybe the authors could section the methods more to point out where this is discussed.
We have expanded our Materials and Methods section, specifying in particular that CSD, having physical rather than arbitrary units, does not require baselining.
(6) While I appreciate the efforts of the authors to formulate their hypotheses and test them clearly, the text is quite dense at times. Partly this is due to the compared conditions in this paradigm; however, it would help a lot to show a visualization of what is being compared in Figures 2-4, rather than just showing the results.
In the revised manuscript we will add a visual aid for the three contrasts we consider.
We are happy to inform the editors that we have implemented, for the Reviewed Preprint, the direct textual Recommendations for the Authors given by Reviewers 2 and 3. We will implement the suggested Figure changes in our revised manuscript. We thank them for their feedback in strengthening our manuscript.
