10,000 Matching Annotations
  1. Jan 2026
    1. To write a paper in a Mathematica notebook is to reveal your results and methods at the same time; the published paper and the work that begot it. Which shouldn’t just make it easier for readers to understand what you did—it should make it easier for them to replicate it (or not). With millions of scientists worldwide producing incremental contributions, the only way to have those contributions add up to something significant is if others can reliably build on them. “That’s what having science presented as computational essays can achieve,” Wolfram said.

      The idea and use of computational notebooks is less general than that of (Inter) Personal Knowledge Management, as was to be expected and as our practices in the Grafoscopio community have shown: its interactive notebooks were used extensively, following the needs discovered with the community, to create and articulate custom document workflows. Something similar can be said of the social and human sciences, where people also write, but do not simulate complex systems.

      In these community contexts, and in the broader sciences and critical studies, hypertext and interactivity can help, but more for exploring one's own memory and for imagining and enacting other ways of communicating and building it.

      The current emphasis on Cardumem, rather than Grafoscopio, and the possible links between the two acknowledge these other possibilities for interaction and computation grounded in that interpersonal, interactive memory.

    2. “Pick any field X, from archeology to zoology. There either is now a ‘computational X’ or there soon will be. And it’s widely viewed as the future of the field.” As practitioners in those fields become more literate with computation, Wolfram argues, they’ll vastly expand the range of what’s discoverable. The Mathematica notebook could be an accelerant for science because it could spawn a new kind of thinking.

      I have noticed this with Computational Archival Science (CAS, by its English acronym) and through my role in the Information Science department at PUJ, where I encourage that transition.

    3. In the mid-1600s, Gottfried Leibniz devised a notation for integrals and derivatives (the familiar ∫ and dx/dt) that made difficult ideas in calculus almost mechanical. Leibniz developed the sense that a similar notation applied more broadly could create an “algebra of thought.” Since then, logicians and linguists have lusted after a universal language that would eliminate ambiguity and turn complex problem-solving of all kinds into a kind of calculus.

      Bret Victor, in Media for Thinking the Unthinkable, compares Leibniz with Steve Jobs, saying that the former was a great interface inventor of his era, in the form of new notations.

      Cardumem, in fact, began as a thought exercise about a new notation for expressing "hypertextual algebras" that could be embedded in a wiki engine (at first via Lua in JavaScript, and later on the server side, with hypermedia systems). That notation was conceived because external tools for manipulating hypertext, such as TiddlyWiki and Pharo, generated so much friction among members of the Grafoscopio community that their use remained marginal. A new tool with a new notation would encourage shared uses and customizations that, with separate tools, were very sporadic and mostly solitary, involving a handful of people at most.

      Since the pieces for assembling the wiki (Djot and YueScript) were already integrated into the Lua ecosystem, and since the original plan was to integrate it directly into TiddlyWiki, on the browser side rather than on the server, Lua was chosen over Pharo for the prototyping work, with very good initial results so far.

    4. This is, of course, the whole problem of scientific communication in a nutshell: Scientific results today are as often as not found with the help of computers. That’s because the ideas are complex, dynamic, hard to grab ahold of in your mind’s eye. And yet by far the most popular tool we have for communicating these results is the PDF—literally a simulation of a piece of paper. Maybe we can do better.

      Or have multi-format transitions, between more analog and more digital systems, depending on where the article is shared, as I said in this other comment.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Here, the authors have addressed the recruitment and firing patterns of motor units (MUs) from the long and lateral heads of the triceps in the mouse. They used their newly developed Myomatrix arrays to record from these muscles during treadmill locomotion at different speeds, and they used template-based spike sorting (Kilosort) to extract units. Between MUs from the two heads, the authors observed differences in their firing rates, recruitment probability, phase of activation within the locomotor cycle, and interspike interval patterning. Examining different walking speeds, the authors find increases in both recruitment probability and firing rates as speed increases. The authors also observed differences in the relation between recruitment and the angle of elbow extension between motor units from each head. These differences indicate meaningful variation between motor units within and across motor pools and may reflect the somewhat distinct joint actions of the two heads of triceps.

      Strengths:

      The extraction of MU spike timing for many individual units is an exciting new method that has great promise for exposing the fine detail in muscle activation and its control by the motor system. In particular, the methods developed by the authors for this purpose seem to be the only way to reliably resolve single MUs in the mouse, as the methods used previously in humans and in monkeys (e.g. Marshall et al. Nature Neuroscience, 2022) do not seem readily adaptable for use in rodents.

      The paper provides a number of interesting observations. There are signs of interesting differences in MU activation profiles for individual muscles here, consistent with those shown by Marshall et al. It is also nice to see fine-scale differences in the activation of different muscle heads, which could relate to their partially distinct functions. The mouse offers greater opportunities for understanding the control of these distinct functions, compared to the other organisms in which functional differences between heads have previously been described.

      The Discussion is very thorough, providing a very nice recounting of a great deal of relevant previous results.

      We thank the Reviewer for these comments.

      Weaknesses:

      The findings are limited to one pair of muscle heads. While an important initial finding, the lack of confirmation from analysis of other muscles acting at other joints leaves the general relevance of these findings unclear.

      The Reviewer raises a fair point. While outside the scope of this paper, future studies should certainly address a wider range of muscles to better characterize motor unit firing patterns across different sets of effectors with varying anatomical locations. Still, the importance of results from the triceps long and lateral heads should not be understated, as this paper is, to our knowledge, the first to capture the difference in firing patterns of motor units across any set of muscles in the locomoting mouse.

      While differences between muscle heads with somewhat distinct functions are interesting and relevant to joint control, differences between MUs within individual muscles, like those in Marshall et al., are more striking because they cannot be attributed to differences in each head's function. The present manuscript does show some signs of differences for MUs within individual heads: in Figure 2C, we see what looks like two clusters of motor units within the long head in terms of their recruitment probability. However, a statistical basis for the existence of two distinct subpopulations is not provided, and no subsequent analysis is done to explore the potential for differences among MUs within individual heads.

      We agree with the Reviewer and have revised the manuscript to better examine potential subpopulations of units within each muscle as presented in Figure 2C. We performed Hartigan’s dip test on motor units within each muscle to test for multimodal distributions. For both muscles, p > 0.05, so we cannot reject the null hypothesis that the units in each muscle come from a unimodal distribution. However, Hartigan’s test and similar statistical methods have poor statistical power for the small sample sizes (n=17 and 16 for long and lateral heads, respectively) considered here, so the failure to achieve statistical significance might reflect either the absence of a true difference or a lack of statistical resolution.

      Still, the limited sample size warrants further data collection and analysis since the varying properties across motor units may lead to different activation patterns. Given these results, we have edited the text as follows:

      “A subset of units, primarily in the long head, were recruited in under 50% of the total strides and with lower spike counts (Figure 2C). This distribution of recruitment probabilities might reflect a functionally different subpopulation of units. However, the distribution of recruitment probabilities was not found to be significantly multimodal (p>0.05 in both cases, Hartigan’s dip test; Hartigan, 1985). That said, Hartigan’s test and similar statistical methods have poor statistical power for the small sample sizes (n=17 and 16 for long and lateral heads, respectively) considered here, so the failure to achieve statistical significance might reflect either the absence of a true difference or a lack of statistical resolution.”
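      A minimal sketch of this dip-test check (not the authors’ code), assuming the third-party Python `diptest` package and illustrative recruitment-probability values rather than the recorded data:

      ```python
      # Minimal sketch (not the authors' pipeline): Hartigan's dip test on per-unit
      # recruitment probabilities. Assumes the third-party `diptest` package; the
      # probability values below are illustrative, not the recorded data.
      import numpy as np
      from diptest import diptest

      long_head = np.array([0.15, 0.22, 0.30, 0.35, 0.41, 0.55, 0.60, 0.62, 0.70,
                            0.74, 0.80, 0.83, 0.88, 0.90, 0.93, 0.95, 0.97])   # n = 17
      lateral_head = np.array([0.45, 0.52, 0.58, 0.63, 0.70, 0.72, 0.78, 0.81,
                               0.85, 0.88, 0.90, 0.92, 0.94, 0.96, 0.97, 0.99])  # n = 16

      for name, probs in [("long head", long_head), ("lateral head", lateral_head)]:
          dip, pval = diptest(probs)   # null hypothesis: the distribution is unimodal
          print(f"{name}: dip = {dip:.3f}, p = {pval:.3f}, n = {probs.size}")
      ```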

      The statistical foundation for some claims is lacking. In addition, the description of key statistical analysis in the Methods is too brief and very hard to understand. This leaves several claims hard to validate.

      We thank the Reviewer for these comments and have clarified the text related to key statistical analyses throughout the manuscript, as described in our other responses below.

      Reviewer #2 (Public review):

      The present study, led by Thomas and collaborators, aims to describe the firing activity of individual motor units in mice during locomotion. To achieve this, they implanted small arrays of eight electrodes in two heads of the triceps and performed spike sorting using a custom implementation of Kilosort. Simultaneously, they tracked the positions of the shoulder, elbow, and wrist using a single camera and a markerless motion capture algorithm (DeepLabCut). Repeated one-minute recordings were conducted in six mice at five different speeds, ranging from 10 to 27.5 cm·s<sup>-1</sup>.

      From these data, the authors reported that:

      (1) a significant portion of the identified motor units was not consistently recruited across strides,

      (2) motor units identified from the lateral head of the triceps tended to be recruited later than those from the long head,

      (3) the number of spikes per stride and peak firing rates were correlated in both muscles, and

      (4) the probability of motor unit recruitment and firing rates increased with walking speed.

      The authors conclude that these differences can be attributed to the distinct functions of the muscles and the constraints of the task (i.e., speed).

      Strengths:

      The combination of novel electrode arrays to record intramuscular electromyographic signals from a larger muscle volume with an advanced spike sorting pipeline capable of identifying populations of motor units.

      We thank the Reviewer for this comment.

      Weaknesses:

      (1) There is a lack of information on the number of identified motor units per muscle and per animal.

      The Reviewer is correct that this information was not explicitly provided in the prior submission. We have therefore added Table 1 that quantifies the number of motor units per muscle and per animal.

      (2) All identified motor units are pooled in the analyses, whereas per-animal analyses would have been valuable, as motor units within an individual likely receive common synaptic inputs. Such analyses would fully leverage the potential of identifying populations of motor units.

      Please see our answer to the following point, where we address questions (2) and (3) together.

      (3) The current data do not allow for determining which motor units were sampled from each pool. It remains unclear whether the sample is biased toward high-threshold motor units or representative of the full pool.

      We thank the Reviewer for these comments. To clarify how motor unit responses were distributed across animals and muscle targets, we updated or added the following figures:  

      Figure 2C

      Figure 4–figure supplement 1

      Figure 5–figure supplement 2

      Figure 6–figure supplement 2

      These provide a more complete look at the range of activity within each motor pool, suggesting that we do measure from units with different activation thresholds within the same motor pool, rather than this variation being due to cross-animal differences. For example, Figure 2C illustrates that motor units from the same muscle and animal show a wide variety of recruitment probabilities. However, the limited number of motor units recorded from each individual animal does not allow a statistically rigorous test for examining cross-animal differences.

      (4) The behavioural analysis of the animals relies solely on kinematics (2D estimates of elbow angle and stride timing). Without ground reaction forces or shoulder angle data, drawing functional conclusions from the results is challenging.

      The Reviewer is correct that we did not measure muscular force generation or ground reaction forces in the present study. Although outside the scope of this study, future work might employ buckle force transducers as used in larger animals (Biewener et al., 1988; Karabulut et al., 2020) to examine the complex interplay between neural commands, passive biomechanics, and the complex force-generating properties of muscle tissue.

      Major comments:

      (1) Spike sorting

      The conclusions of the study rely on the accuracy and robustness of the spike sorting algorithm during a highly dynamic task. Although the pipeline was presented in a previous publication (Chung et al., 2023, eLife), a proper validation of the algorithm for identifying motor unit spikes is still lacking. This is particularly important in the present study, as the experimental conditions involve significant dynamic changes. Under such conditions, muscle geometry is altered due to variations in both fibre pennation angles and lengths.

      This issue differs from electrode drift, and it is unclear whether the original implementation of Kilosort includes functions to address it. Could the authors provide more details on the various steps of their pipeline, the strategies they employed to ensure consistent tracking of motor unit action potentials despite potential changes in action potential waveforms, and the methods used for manual inspection of the spike sorting algorithm's output?

      This is an excellent point and we agree that the dynamic behavior used in this investigation creates potential new challenges for spike sorting. In our analysis, Kilosort 2.5 provides key advantages in comparing unit waveforms across multiple channels and in detecting overlapping spikes. We modified this version of Kilosort to construct unit waveform templates using only the channels within the same muscle (Chung et al., 2023), as clarified in the revised Methods section (see “Electromyography (EMG)”):

      “A total of 33 units were identified across all animals. Each unit’s isolation was verified by confirming that no more than 2% of inter-spike intervals violated a 1 ms refractory limit. Additionally, we manually reviewed cross-correlograms to ensure that each waveform was only reported as a single motor unit.”
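      As an illustration of the refractory-period criterion described above, here is a minimal sketch (not the authors’ code) that computes the fraction of inter-spike intervals violating a 1 ms limit; the spike times are synthetic.

      ```python
      # Minimal sketch (not the authors' pipeline): fraction of inter-spike intervals
      # shorter than a 1 ms refractory limit, computed for one unit's spike times (s).
      import numpy as np

      def refractory_violation_fraction(spike_times, refractory_s=0.001):
          isis = np.diff(np.sort(spike_times))
          return float(np.mean(isis < refractory_s)) if isis.size else 0.0

      # Synthetic spike train over a 60 s recording, for illustration only
      spike_times = np.sort(np.random.default_rng(0).uniform(0, 60, size=500))
      frac = refractory_violation_fraction(spike_times)
      print(f"ISI violations: {100 * frac:.2f}% (unit kept if no more than 2%)")
      ```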

      The Reviewer is correct that our ability to precisely measure a unit’s activity based on its waveform will depend on the relationship between the embedded electrode and the muscle geometry, which alters over the course of the stride. As a follow-up to the original text, we have included new analyses to characterize the waveform activity throughout the experiment and stride (also in Methods):

      “We further validated spike sorting by quantifying the stability of each unit’s waveform across time (Figure 1–figure supplement 1). First, we calculated the median waveform of each unit across every trial to capture long-term stability of motor unit waveforms. Additionally, we calculated the median waveform through the stride binned in 50 ms increments using spiking from a single trial. This second metric captures the stability of our spike sorting during the rapid changes in joint angles that occur during the burst of an individual motor unit. In doing so, we calculated each motor unit’s waveforms from the single channel in which that unit’s amplitude was largest and did not attempt to remove overlapping spikes from other units before measuring the median waveform from the data. We then calculated the correlation of a unit’s waveform across either trials or bins in which at least 30 spikes were present. The high correlation of a unit waveform over time, despite potential changes in the electrodes’ position relative to muscle geometry over the dynamic task, provides additional confidence in both the stability of our EMG recordings and the accuracy of our spike sorting.”
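      A minimal sketch of the waveform-stability check described in the passage above, under the assumption that spike waveform snippets from the unit’s largest-amplitude channel are already available; the data are synthetic and the function names are illustrative.

      ```python
      # Minimal sketch (assumed workflow, not the authors' code): correlate each
      # trial's median spike waveform with the unit's grand median waveform.
      import numpy as np

      def median_waveform(waveforms):
          # waveforms: (n_spikes, n_samples) snippets from one channel
          return np.median(waveforms, axis=0)

      def waveform_stability(waveforms_per_trial, min_spikes=30):
          # Keep only trials (or stride-time bins) with at least `min_spikes` spikes
          usable = [w for w in waveforms_per_trial if w.shape[0] >= min_spikes]
          grand = median_waveform(np.vstack(usable))
          return [np.corrcoef(median_waveform(w), grand)[0, 1] for w in usable]

      # Synthetic example: five "trials" of noisy copies of one biphasic template
      rng = np.random.default_rng(1)
      t = np.linspace(0, 1, 60)
      template = np.sin(2 * np.pi * 3 * t) * np.exp(-4 * t)
      trials = [template + 0.1 * rng.standard_normal((200, t.size)) for _ in range(5)]
      print([f"{r:.3f}" for r in waveform_stability(trials)])   # values near 1.0
      ```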

      (2) Yield of the spike sorting pipeline and analyses per animal/muscle

      A total of 33 motor units were identified from two heads of the triceps in six mice (17 from the long head and 16 from the lateral head). However, precise information on the yield per muscle per animal is not provided. This information is crucial to support the novelty of the study, as the authors claim in the introduction that their electrode arrays enable the identification of populations of motor units. Beyond reporting the number of identified motor units, another way to demonstrate the effectiveness of the spike sorting algorithm would be to compare the recorded EMG signals with the residual signal obtained after subtracting the action potentials of the identified motor units, using a signal-to-residual ratio.

      Furthermore, motor units identified from the same muscle and the same animal are likely not independent due to common synaptic inputs. This dependence should be accounted for in the statistical analyses when comparing changes in motor unit properties across speeds and between muscles.

      We thank the Reviewer for this comment. Regarding motor unit yield, as described above, the newly added Table 1 displays the yield from each animal and muscle.

      Regarding spike sorting, while signal-to-residual is often an excellent metric, it is not ideal for our high-resolution EMG signals since isolated single motor units are typically superimposed on a “bulk” background consisting of the low-amplitude waveforms of other motor units. Because these smaller units typically cannot be sorted, it is challenging to estimate the “true” residual after subtracting (only) the largest motor unit, since subtracting each sorted unit’s waveform typically has a very small effect on the RMS of the total EMG signal. To further address concerns regarding spike sorting quality, we added Figure 1–figure supplement 1 that demonstrates motor units’ consistency over the experiment, highlighting that the waveform maintains its shape within each stride despite muscle/limb dynamics and other possible sources of electrical noise or artifact.

      Finally, the Reviewer is correct that individual motor units in the same muscle are very likely to receive common synaptic inputs. These common inputs may be reflected in sparsely active motor units being recruited in overlapping rather than distinct strides. Indeed, in the following passage added to the Results, we show that motor units are recruited with higher probability when additional units are recruited.

      “Probabilistic recruitment is correlated across motor units

      Our results show that the recruitment of individual motor units is probabilistic even within a single speed quartile (Figure 5A-C) and predicts body movements (Figure 6), raising the question of whether the recruitment of individual motor units is correlated across the population or independent. Correlated recruitment might reflect shared input onto the population of motor units innervating the muscle (De Luca, 1985; De Luca & Erim, 1994; Farina et al., 2014). For example, two motor units, each with low recruitment probabilities, may still fire during the same set of strides. To assess the independence of motor unit recruitment across the recorded population, we compared each unit’s empirical recruitment probability across all strides to its conditional recruitment probability during strides in which another motor unit from the same muscle was recruited (Figure 7). Doing this for all motor unit pairs revealed that motor units in both muscles were biased towards greater recruitment when additional units were active (p<0.001, Wilcoxon signed-rank tests for both the lateral and long heads of triceps). This finding suggests that probabilistic recruitment reflects common synaptic inputs that covary together across locomotor strides.”
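      A minimal sketch of this pairwise comparison (not the authors’ code), assuming a boolean unit-by-stride recruitment matrix for one muscle; the data below are synthetic.

      ```python
      # Minimal sketch (assumed analysis, not the authors' code): compare each unit's
      # overall recruitment probability with its probability conditioned on another
      # unit in the same muscle being recruited, then test the bias across all pairs
      # with a Wilcoxon signed-rank test.
      import numpy as np
      from itertools import permutations
      from scipy.stats import wilcoxon

      def conditional_recruitment(recruited):
          # recruited: (n_units, n_strides) boolean matrix for one muscle
          marginal, conditional = [], []
          for i, j in permutations(range(recruited.shape[0]), 2):
              strides_j = recruited[j]
              if strides_j.any():
                  marginal.append(recruited[i].mean())                 # p(unit i recruited)
                  conditional.append(recruited[i, strides_j].mean())   # p(i | j recruited)
          return np.array(marginal), np.array(conditional)

      rng = np.random.default_rng(2)
      recruited = rng.random((6, 400)) < rng.uniform(0.3, 0.9, size=(6, 1))  # synthetic
      marg, cond = conditional_recruitment(recruited)
      stat, p = wilcoxon(cond - marg)   # H0: no bias toward greater conditional recruitment
      print(f"median difference = {np.median(cond - marg):.3f}, p = {p:.3g}")
      ```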

      (3) Representativeness of the sample of identified motor units

      However, to draw such conclusions, the authors should exclusively compare motor units from the same pool and systematically track violations of the recruitment order. Alternatively, they could demonstrate that the motor units that are intermittently active across strides correspond to the smallest motor units, based on the assumption that these units should always be recruited due to their low activation thresholds.

      One way to estimate the size of motor units identified within the same muscle would be to compare the amplitude of their action potentials, assuming that all motor units are relatively close to the electrodes (given the selectivity of the recordings) and that motoneurons innervating more muscle fibres generate larger motor unit action potentials.

      We thank the Reviewer for this comment. Below, we provide more detailed analyses of the relationships between motor unit spike amplitude and the recruitment probability as well as latency (relative to stride onset) of activation.

      We generated the figures below to illustrate the relationship between the amplitude of motor units and their firing properties. As suspected, units with larger-amplitude waveforms fired with lower probability and produced their first spikes later in the stride. If we were comfortable assuming that larger spike amplitudes mean higher-force units, then this would be consistent with a key prediction of the size principle (i.e., that higher-force units are recruited later). However, we are hesitant to base any conclusions on this assumption or emphasize this point with a main-text figure, since EMG signal amplitude may also vary due to the physical properties of the electrode and distance from muscle fibers. Thus, it is possible that a large motor unit may have a smaller waveform amplitude relative to the rest of the motor pool.

      Author response image 1.

      Relation between motor unit amplitude and (A) recruitment probability and (B) mean first spike time within the stride. Colored lines indicate the outcome of linear regression analyses.

      Currently, the data seem to support the idea that motor units that are alternately recruited across strides have recruitment thresholds close to the level of activation or force produced during slow walking. The fact that recruitment probability monotonically increases with speed suggests that the force required to propel the mouse forward exceeds the recruitment threshold of these "large" motor units. This pattern would primarily reflect spatial recruitment following the size principle rather than flexible motor unit control.

      We thank the Reviewer for this comment. We agree with this interpretation, particularly in relation to the references suggested in later comments, and have added the following text to the Discussion to better reflect this argument:

      “To investigate the neuromuscular control of locomotor speed, we quantified speed-dependent changes in both motor unit recruitment and firing rate. We found that the majority of units were recruited more often and with larger firing rates at faster speeds (Figure 5, Figure 5–figure supplement 1). This result may reflect speed-dependent differences in the common input received by populations of motor neurons with varying spiking thresholds (Henneman et al., 1965). In the case of mouse locomotion, faster speeds might reflect a larger common input, increasing the recruitment probability as more neurons, particularly those that are larger and generate more force, exceed threshold for action potentials (Farina et al., 2014).”

      (4) Analysis of recruitment and firing rates

      The authors currently report active duration and peak firing rates based on spike trains convolved with a Gaussian kernel. Why not report the peak of the instantaneous firing rates estimated from the inverse of the inter-spike interval? This approach appears to be more aligned with previous studies conducted to describe motor unit behaviour during fast movements (e.g., Desmedt & Godaux, 1977, J Physiol; Van Cutsem et al., 1998, J Physiol; Del Vecchio et al., 2019, J Physiol).

      We thank the Reviewer for this comment. In the revised Discussion (see ‘Firing rates in mouse locomotion compared to other species’), we reference several examples of previous studies that quantified spike patterns based on the instantaneous firing rate. We chose to report the peak of the smoothed firing rate because that quantification includes strides with zero spikes or only one spike, which occur regularly in our dataset (and for which ISI rate measures, which require two spikes to define an instantaneous firing rate, cannot be computed). Regardless, in the revised Figure 4B, we present an analysis that uses inter-spike intervals as suggested, which yielded ranges of firing rates similar to those from the primary analysis.
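      To illustrate the two quantifications discussed here, the following is a minimal sketch (not the authors’ code) comparing the peak of a Gaussian-smoothed rate with the peak instantaneous rate from inverse inter-spike intervals; the kernel width and spike times are illustrative assumptions.

      ```python
      # Minimal sketch (assumed parameters, not the authors' code): peak of a
      # Gaussian-smoothed firing rate (defined even for 0- or 1-spike strides)
      # versus peak instantaneous rate from inverse inter-spike intervals.
      import numpy as np

      def peak_smoothed_rate(spike_times, stride_dur, sigma=0.010, dt=0.001):
          t = np.arange(0.0, stride_dur, dt)
          rate = np.zeros_like(t)
          for s in spike_times:                     # sum of unit-area Gaussian kernels
              rate += np.exp(-0.5 * ((t - s) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
          return rate.max() if len(spike_times) else 0.0

      def peak_isi_rate(spike_times):
          isis = np.diff(np.sort(spike_times))      # needs at least two spikes
          return 1.0 / isis.min() if isis.size else np.nan

      spikes = np.array([0.080, 0.095, 0.108, 0.124, 0.150])   # illustrative burst (s)
      print(f"smoothed peak: {peak_smoothed_rate(spikes, 0.25):.1f} Hz, "
            f"ISI peak: {peak_isi_rate(spikes):.1f} Hz")
      ```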

      (5) Additional analyses of behaviour

      The authors currently analyse motor unit recruitment in relation to elbow angle. It would be valuable to include a similar analysis using the angular velocity observed during each stride. More broadly, comparing stride-by-stride changes in firing rates with changes in elbow angular velocity would further strengthen the final analyses presented in the Results section.

      We thank the Reviewer for this comment. To address this, we have modified Figure 6 and the associated Supplemental Figures to show relationships between unit activation and both the range of elbow extension and the range of elbow velocity for each stride. These new Supplemental Figures show that the trends shown in main text Figure 6C and 6E (which show data from all speed quartiles on the same axes) are also apparent in both the slower and faster quartiles individually, although single-quartile statistical tests (with smaller sample size than the main analysis) did not reach statistical significance in all cases.

      Reviewer #3 (Public review):

      Summary:

      Using the approach of Myomatrix recording, the authors report that:

      (1) Motor units are recruited differently in the two types of muscles.

      (2) Individual units are probabilistically recruited during the locomotion strides, whereas the population bulk EMG has a more reliable representation of the muscle.

      (3) The recruitment of units was proportional to walking speed.

      Strengths:

      The new technique provides a unique data set, and the data analysis is convincing and well-performed.

      We thank the Reviewer for the comment.

      Weaknesses:

      The implications of "probabilistic recruitment" should be explored, addressed, and analyzed further.

      Comments:

      One of the study's main findings (perhaps the main finding) is that the motor units are "probabilistically" recruited. The authors do not define what they mean by probabilistically recruited, nor do they present an alternative scenario to such recruitment or discuss why this would be interesting or surprising. However, on page 4, they do indicate that the recruitment of units from both muscles was only active in a subset of strides, i.e., they are not reliably active in every step.

      If probabilistic means irregular spiking, this is not new. Variability in spiking has been seen numerous times, for instance in human biceps brachii motor units during isometric contractions (Pascoe, Enoka, Exp physiology 2014) and elsewhere. Perhaps the distinction the authors are seeking is between fluctuation-driven and mean-driven spiking of motor units as previously identified in spinal motor networks (see Petersen and Berg, eLife 2016, and Berg, Frontiers 2017). Here, it was shown that a prominent regime of irregular spiking is present during rhythmic motor activity, which also manifests as a positive skewness in the spike count distribution (i.e., log-normal).

      We thank the Reviewer for this comment and have clarified several passages in response. The Reviewer is of course correct that irregular motor unit spiking has been described previously and may reflect motor neurons’ operating in a high-sensitivity (fluctuation-driven) regime. We now cite these papers in the Discussion (see ‘Firing rates in mouse locomotion compared to other species’). Additionally, the revision clarifies that “probabilistically” - as defined in our paper - refers only to the empirical observation that a motor unit spikes during only a subset of strides, either when all locomotor speeds are considered together (Figure 2) or separately (Figure 5A-C):

      “Motor units in both muscles exhibited this pattern of probabilistic recruitment (defined as a unit’s firing on only a fraction of strides), but with differing distributions of firing properties across the long and lateral heads (Figure 2).”

      “Our findings (Figure 4) highlight that even with the relatively high firing rates observed in mice, there are still significant changes in firing rate and recruitment probability across the spikes within bursts (Figure 4B) and across locomotor speeds (Figure 5F). Future studies should more carefully examine how these rapidly changing spiking patterns derive from both the statistics of synaptic inputs and intrinsic properties of motor neurons (Manuel & Heckman, 2011; Petersen & Berg, 2016; Berg, 2017).”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      As mentioned above, there are several issues with the statistics that need to be corrected to properly support the claims made in the paper.

      The authors compare the fractions of MUs that show significant variation across locomotor speeds in their firing rate and recruitment probability. However, it is not statistically founded to compare the results of separate statistical tests that are based on different kinds of measurements and thus have unconstrained differences in statistical power. The comparison of the fractional changes in firing rates and recruitment across speeds that follows is helpful, though in truth, by contemporary standards, one would like to see error bars on these estimates. These could be generated using bootstrapping.

      The Reviewer is correct, and we have revised the manuscript to better clarify which quantities should or should not be compared, including the following passage (see “Motor unit mechanisms of speed control” in Results):

      “Speed-dependent increases in peak firing rate were therefore also present in our dataset, although in a smaller fraction of motor units (22/33) than changes in recruitment probability (31/33). Furthermore, the mean (± SE) magnitude of speed-dependent increases was smaller for spike rates (mean rate<sub>fast</sub>/rate<sub>slow</sub> of 111% ± 20% across all motor units) than for recruitment probabilities (mean p(recruitment) <sub>fast</sub>/p(recruitment) <sub>slow</sub> of 179% ± 3% across all motor units). While fractional changes in rate and recruitment probability are not readily comparable given their different upper limits, these findings could suggest that while both recruitment and peak rate change across speed quartiles, increased recruitment probability may play a larger role in driving changes in locomotor speed.”

      The description in the Methods of the tests for variation in firing rates and recruitment probability across speeds is extremely hard to understand - after reading it many times, it is still not clear what was done, or why the method used was chosen. In the main text, the authors quote p-values and then state "bootstrap confidence intervals," which is not a statistical test that yields a p-value. While there are mathematical relationships between confidence intervals and statistical tests such that a one-to-one correspondence between them can exist, the descriptions provided fall short of specifying how they are related in the present instance. For this reason, and those described in what follows, it is not clear what the p-values represent.

      Next, the authors state that they fit a model ("a Poisson distribution") to the data to estimate firing rate and recruitment probability, that the model results agree with their actual data, and that they then bootstrapped from the model estimates to get confidence intervals and compute p-values. Why do this? Why not just do something much simpler, like use the actual spike counts, and resample from those? I understand that it is hard to distinguish between no recruitment and just no spikes given some low Poisson firing rate, but how does that challenge the ability to test if the firing rates or the number of spiking MUs changes significantly across speeds? I can come up with some reasons why I think the authors might have decided to do this, but reasoning like this really should be made explicit.

      In addition, the authors should provide an unambiguous description of the model, perhaps using an equation and a description of how it was fit. For the bootstrapping, a clear description of how the resampling was done should be included. The focus on peak firing rate instead of mean (or median) firing rate should also be justified. Since peaks are noisier, I would expect the statistical power to be lower compared to using the mean or median.

      We thank the Reviewer for the comments and have revised and expanded our discussion of the statistical tests employed. We expanded and clarified our description of these techniques in the updated Methods section:

      “Joint model of rate and recruitment

      We modeled the recruitment probability and firing rate based on empirical data to best characterize firing statistics within the stride. In particular, this allowed for multiple solutions to explain why a motor unit would not spike within a stride. From the empirical data alone, strides with zero spikes would have been assumed to have no recruitment of a unit. However, to create a model of motor unit activity that includes both recruitment and rate, it must be possible that a recruited unit can have a firing rate of zero. To quantify the firing statistics that best represent all spiking and non-spiking patterns, we modeled recruitment probability and peak firing rate using the following piecewise function:
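      A plausible reconstruction of this piecewise function (an assumption, based on the zero-inflated Poisson form implied by the definitions of y, p, and λ below; equation labels chosen to match the “Eq. 1 and 2” cited later):

      $$P(y = 0) \;=\; (1 - p) \;+\; p\,e^{-\lambda} \qquad \text{(Eq. 1)}$$

      $$P(y = k) \;=\; p\,\frac{\lambda^{k}\,e^{-\lambda}}{k!}, \qquad k = 1, 2, \ldots \qquad \text{(Eq. 2)}$$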

      where y denotes the observed peak firing rate on a given stride (determined by convolving motor unit spike times with a Gaussian kernel as described above), p denotes the probability of recruitment, and λ denotes the expected peak firing rate from a Poisson distribution of outcomes. Thus, an inactive unit on a given stride may be the result of either non-recruitment or recruitment with a stochastically zero firing rate. The above equations were fit by minimizing the negative log-likelihood of the parameters given the data.

      “Permutation test for joint model of rate and recruitment and type 2 regression slopes

      To quantify differences in firing patterns across walking speeds, we subdivided each mouse’s total set of strides into speed quartiles and calculated rate (𝜆, Eq. 1 and 2, Fig. 5A-C) and recruitment probability terms (p, Eq. 1 and 2, Fig. 5D-F) for each unit in each speed quartile. Here we calculated the difference in both the rate and recruitment terms across the fastest and slowest speed quartiles (p<sub>fast</sub>-p<sub>slow</sub> and 𝜆<sub>fast</sub>-𝜆<sub>slow</sub>). To test whether these model parameters were significantly different depending on locomotor speed, we developed a null model combining strides from both the fastest and slowest speed quartiles. After pooling strides from both quartiles, we randomly distributed the pooled set of strides into two groups with sample sizes equal to the original slow and fast quartiles. We then calculated the null model parameters for each new group and found the difference between like terms. To estimate the distribution of possible differences, we bootstrapped this result using 1000 random redistributions of the pooled set of strides. Following the permutation test, the 95% confidence interval of this final distribution reflects the null hypothesis of no difference between groups. Thus, the null hypothesis can be rejected if the true difference in rate or recruitment terms exceeds this confidence interval.

      We followed a similar procedure to quantify cross-muscle differences in the relationship between firing parameters. For each muscle, we estimated the slope across firing parameters for each motor unit using type 2 regression. In this case, the true difference was the difference in slopes between muscles. To test the null hypothesis that there was no difference in slopes, the null model reflected the pooled set of units from both muscles. Again, slopes were calculated for 1000 random resamplings of this pooled data to estimate the 95% confidence interval.”
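      A minimal sketch of the permutation procedure described above (not the authors’ code), applied to one unit’s recruitment probability; the stride data and parameter choices are synthetic.

      ```python
      # Minimal sketch (assumed analysis, not the authors' code): permutation test for
      # a speed-dependent change in recruitment probability. Pool fast and slow strides,
      # reshuffle, recompute the difference 1000 times, and compare the observed
      # difference to the null 95% confidence interval.
      import numpy as np

      def permutation_test(recruited_slow, recruited_fast, n_perm=1000, seed=0):
          # recruited_*: boolean arrays, one entry per stride (unit recruited or not)
          rng = np.random.default_rng(seed)
          observed = recruited_fast.mean() - recruited_slow.mean()
          pooled = np.concatenate([recruited_slow, recruited_fast])
          n_slow = recruited_slow.size
          null = np.empty(n_perm)
          for k in range(n_perm):
              shuffled = rng.permutation(pooled)
              null[k] = shuffled[n_slow:].mean() - shuffled[:n_slow].mean()
          lo, hi = np.percentile(null, [2.5, 97.5])
          return observed, (lo, hi), not (lo <= observed <= hi)

      rng = np.random.default_rng(3)
      slow = rng.random(300) < 0.55       # synthetic: recruited on ~55% of slow strides
      fast = rng.random(300) < 0.80       # synthetic: recruited on ~80% of fast strides
      obs, ci, significant = permutation_test(slow, fast)
      print(f"observed diff = {obs:.3f}, null 95% CI = ({ci[0]:.3f}, {ci[1]:.3f}), "
            f"significant = {significant}")
      ```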

      The argument for delayed activation of the lateral head is interesting, but I am not comfortable saying the nervous system creates a delay just based on observations of the mean time of the first spike, given the potential for differential variability in spike timing across muscles and MUs. One way to make a strong case for a delay would be to show aggregate PSTHs for all the spikes from all the MUs for each of the two heads. That would distinguish between a true delay and more gradual or variable activation between the heads.

      This is a good point, and we agree that the claim made about the nervous system is too strong given the results. Even with the aggregate PSTH the Reviewer suggested (Author response image 2 below), there is still not enough evidence to isolate the role of the nervous system in the muscles’ activation.

      Author response image 2.

      Aggregate peristimulus time histogram (PSTH) for all motor unit spike times in the long head (top) and lateral head (bottom) within the stride.

      In the ideal case, we would have more simultaneous recordings from both muscles to make a more direct claim on the delay. Still, within the current scope of the paper, to correct this and better describe the difference in timing of muscle activity, we edited the text to the following:

      “These findings demonstrate that despite the synergistic (extensor) function of the long and lateral heads of the triceps at the elbow, the motor pool for the long head becomes active roughly 100 ms before the motor pool supplying the lateral head during locomotion (Figure 3C).”

      The results from Marshall et al. 2022 suggest that the recruitment of some MUs is not just related to muscle force, but also the frequency of force variation - some of their MUs appear to be recruited only at certain frequencies. Figure 5C could have shown signs of this, but it does not appear to. We do not really know the force or its frequency of variation in the measurements here. I wonder whether there is additional analysis that could address whether frequency-dependent recruitment is present. It may not be addressable with the current data set, but this could be a fruitful direction to explore in the future with MU recordings from mice.

      We agree that this would be a fruitful direction to explore; however, the Reviewer is correct that this is not easily addressable with the current dataset. As the Reviewer points out, stride frequency increases with increased speed, potentially offering the opportunity to examine how motor unit activity varies with the frequency, phase, and amplitude of locomotor movements. However, given our lack of force data (either joint torques or ground reaction forces), dissociating the frequency/phase/amplitude of skeletal kinematics from the frequency/phase/amplitude of muscle force is not possible here. Marshall et al. (2022) mitigated these issues by using an isometric force-production task. Therefore, while we agree that it would be a major contribution to extend such investigations to whole-body movements like locomotion, given the complexities described above we believe this is a project for the future, and beyond the scope of the present study.

      Minor:

      Page 5: "Units often displayed no recruitment in a greater proportion of strides than for any particular spike count when recruited (Figures 2A, B)," - I had to read this several times to understand it. I suggest rephrasing for clarity.

      We have changed the text to read:

      “Units demonstrated a variety of firing patterns, with some units producing 0 spikes more frequently than any non-zero spike count (Figure 2A, B),...”

      Figure 3 legend: "Mean phase ({plus minus} SE) of motor unit burst duration across all strides.": It is unclear what this means - durations are not usually described as having a phase. Do we mean the onset phase?

      We have changed the text to read:

      “Mean phase ± SE of motor unit burst activity within each stride”

      Page 9: "suggesting that the recruitment of individual motor units in the lateral and long heads might have significant (and opposite) effects on elbow angle in strides of similar speed (see Discussion)." I wouldn't say "opposite" here - that makes it sound like the authors are calling the long head a flexor. The authors should rephrase or clarify the sense in which they are opposite.

      This is a fair point and we agree we should not describe the muscles as ‘opposite’ when both muscles are extensors. We have removed the phrase ‘and opposite’ from the text.

      Page 11: "in these two muscles across in other quadrupedal species" - typo.

      We have corrected this error.

      Page 16: This reviewer cannot decipher after repeated attempts what the first two sentences of the last paragraph mean. - “Future studies might also use perturbations of muscle activity to dissociate the causal properties of each motor unit’s activity from the complex correlation structure of locomotion. Despite the strong correlations observed between motor unit recruitment and limb kinematics (Fig. 6, Supplemental Fig. 3), these results might reflect covariations of both factors with locomotor speed rather than the causal properties of the recorded motor unit.”

      For better clarity, we have changed the text to read:

      “Although strong correlations were observed between motor unit recruitment and limb kinematics during locomotion (Figure 6, Figure 6–figure supplement 1), it remains unclear whether such correlations actually reflect the causal contributions that those units make to limb movement. To resolve this ambiguity, future studies could use electrical or optical perturbations of muscle contraction levels (Kim et al., 2024; Lu et al., 2024; Srivastava et al., 2015, 2017) to test directly how motor unit firing patterns shape locomotor movements. The short-latency effects of patterned motor unit stimulation (Srivastava et al., 2017) could then reveal the sensitivity of behavior to changes in muscle spiking and the extent to which the same behaviors can be performed with many different motor commands.”

      Reviewer #2 (Recommendations for the authors):

      Minor comments:

      Introduction:

      (1) "Although studies in primates, cats, and zebrafish have shown that both the number of active motor units and motor unit firing rates increase at faster locomotor speeds (Grimby, 1984; Hoffer et al., 1981, 1987; Marshall et al., 2022; Menelaou & McLean, 2012)." I would remove Marshall et al. (2022) as their monkeys performed pulling tasks with the upper limb. You can alternatively remove locomotor from the sentence and replace it with contraction speed.

      Thank you for the comment. While we intended to reference this specific paper to highlight the rhythmic activity in muscles, we agree that this deviates from ‘locomotion’ as it is referenced in the other cited papers which study body movement. We have followed the Reviewer’s suggestion to remove the citation to Marshall et al.

      (2) "The capability and need for faster force generation during dynamic behavior could implicate motor unit recruitment as a primary mechanism for modulating force output in mice."

      The authors could add citations to this sentence, of works that showed that recruitment speed is the main determinant of the rate of force development (see for example Dideriksen et al. (2020) J Neurophysiol; J. L. Dideriksen, A. Del Vecchio, D. Farina, Neural and muscular determinants of maximal rate of force development. J Neurophysiol 123, 149-157 (2020)).

      Thank you for pointing out this important reference. We have included this as a citation as recommended.

      Results:

      (3) "Electrode arrays (32-electrode Myomatrix array model RF-4x8-BHS-5) were implanted in the triceps brachii (note that Figure 1D shows the EMG signal from only one of the 16 bipolar recording channels), and the resulting data were used to identify the spike times of individual motor units (Figure 1E) as described previously (Chung et al., 2023)."

      This sentence can be misleading for the reader as the array used by the researchers has 4 threads of 8 electrodes. Would it be possible to specify the number of electrodes implanted per head of interest? I assume 8 per head in most mice (or 4 bipolar channels), even if that's not specifically written in the manuscript.

      Thank you for the suggestion. As described above, we have added Table 1, which includes all array locations, and we edited the statement referenced in the comment as follows:

      “Electrode arrays (32-electrode Myomatrix array model RF-4x8-BHS-5) were implanted in forelimb muscles (note that Figure 1D shows the EMG signal from only one of the 16 bipolar recording channels), and the resulting data were used to identify the spike times of individual motor units in the triceps brachii long and lateral heads (Table 1, Figure 1E) as described previously (Chung et al., 2023).“

      (4) "These findings demonstrate that despite the overlapping biomechanical functions of the long and lateral heads of the triceps, the nervous system creates a consistent, approximately 100 ms delay (Figure 3C) between the activation of the two muscles' motor neuron pools. This timing difference suggests distinct patterns of synaptic input onto motor neurons innervating the lateral and long heads."

      Both muscles don't have fully overlapping biomechanical functions, as one of them also acts on the shoulder joint. Please be more specific in this sentence, saying that both muscles are synergistic at the elbow level rather than "have overlapping biomechanical functions".

      We agree with the above reasoning and that our manuscript should be clearer on this point. We edited the above text in accordance with the Reviewer suggestion as follows:

      "These findings demonstrate that despite the synergistic (extensor) function of the long and lateral heads of the triceps at the elbow, …”  

      (5) "Together with the differences in burst timing shown in Figure 3B, these results again suggest that the motor pools for the lateral and long heads of the triceps receive distinct patterns of synaptic input, although differences in the intrinsic physiological properties of motor neurons innervating the two muscles might also play an important role."

      It is difficult to draw such an affirmative conclusion on the synaptic inputs from the data presented by the authors. The differences in firing rates may solely arise from other factors than distinct synaptic inputs, such as the different intrinsic properties of the motoneurons or the reception of distinct neuromodulatory inputs.

      To better explain our findings, we adjusted the above text in the Results (see “Motor unit firing patterns in the long and lateral heads of the triceps”):

      “Together with the differences in burst timing shown in Figure 3B, these results again suggest that the motor pools for the lateral and long heads of the triceps receive distinct patterns of synaptic input, although differences in the intrinsic physiological properties of motor neurons innervating the two muscles might also play an important role.”

      We also included the following distinction in the Discussion (see “Differences in motor unit activity patterns across two elbow extensors”) to address the other plausible mechanisms mentioned.

      “The large differences in burst timing and spike patterning across the muscle heads suggest that the motor pools for each muscle receive distinct inputs. However, differences in the intrinsic physiological properties of motor units and neuromodulatory inputs across motor pools might also make substantial contributions to the structure of motor unit spike patterns (Martínez-Silva et al., 2018; Miles & Sillar, 2011).”

      (6) "We next examined whether the probabilistic recruitment of individual motor units in the triceps and elbow extensor muscle predicted stride-by-stride variations in elbow angle kinematics."

      I'm not sure that the wording is appropriate here. The analysis does not predict elbow angle variations from parameters extracted from the spiking activity. It rather compares the average elbow angle between two conditions (motor unit active or not active).

      We thank the Reviewer for this comment and agree that the wording could be improved here to better reflect our analysis. To soften our claim, we replaced the word ‘predict’ with ‘correlates’ in the above text and throughout the paper when discussing this result.

      Methods:

      (7) "Using the four threads on the customizable Myomatrix array (RF-4x8-BHS-5), we implanted a combination of muscles in each mouse, sometimes using multiple threads within the same muscle. [...] Some mice also had threads simultaneously implanted in their ipsilateral or contralateral biceps brachii although no data from the biceps is presented in this study."

      A precise description of the localisation of the array (muscles and the number of arrays per muscle) for each animal would be appreciated.

      (8) "A total of 33 units were identified and manually verified across all animals." A precise description of the number of motor units concurrently identified per muscle and per animal would be appreciated. Moreover, please add details on the manual inspection. Does it involve the manual selection of missing spikes? What are the criteria for considering an identified motor unit as valid?

      As discussed earlier, we added Table 1 to the main text to provide the details mentioned in the above comments.

      Regarding spike sorting, given the very large number of spikes recorded, we did not rely on manually adjusting mislabeled spikes. Instead, as described in the revised Methods section, we verified unit isolation by ensuring that >98% of each unit's inter-spike intervals exceeded 1 ms. Moreover, as described above, we have added new analyses (Figure 1–figure supplement 1) confirming the stability of motor unit waveforms both across the duration of individual recording sessions (roughly 30 minutes) and across the rapid changes in limb position within individual stride cycles (roughly 250 ms).

      Reviewer #3 (Recommendations for the authors):

      Figure 2 (and supplement) show spike count distributions with strong positive skewness, which is in accordance with the prediction of a fluctuation-driven regime. I suggest plotting these on a logarithmic x-axis (in addition to the linear axis), which should reveal a bell-shaped distribution, maybe even Gaussian, in a majority of the units.

      We thank the Reviewer for the suggestion. We present the requested analysis below, which shows bell-shaped distributions for some (but not all) distributions. However, we believe that investigating why some replotted distributions are Gaussian and others are not falls beyond the scope of this paper, and likely requires a larger dataset than the one we were able to obtain.

      Author response image 3.

      Spike count distributions for each motor unit on a logarithmic x-axis.
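      For reference, a minimal sketch (not the authors’ figure code) of the log-axis replot suggested by the Reviewer, using synthetic, positively skewed spike counts and logarithmically spaced bins:

      ```python
      # Minimal sketch (assumed plotting choice, not the authors' figure code):
      # replot a positively skewed spike-count distribution on a logarithmic x-axis.
      import numpy as np
      import matplotlib.pyplot as plt

      rng = np.random.default_rng(4)
      spike_counts = rng.lognormal(mean=1.5, sigma=0.6, size=2000).round().astype(int)
      spike_counts = spike_counts[spike_counts > 0]        # log axis needs counts > 0

      bins = np.logspace(0, np.log10(spike_counts.max() + 1), 20)
      fig, (ax_lin, ax_log) = plt.subplots(1, 2, figsize=(8, 3))
      ax_lin.hist(spike_counts, bins=30)
      ax_lin.set(title="linear x-axis", xlabel="spikes per stride")
      ax_log.hist(spike_counts, bins=bins)
      ax_log.set_xscale("log")
      ax_log.set(title="logarithmic x-axis", xlabel="spikes per stride")
      plt.tight_layout()
      plt.show()
      ```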

      Why not more data? I tried to get an overview of how much data was collected.

      Supplemental Figure 1 has all the isolated units, which amounts to 38 (are the colors the two muscle types?). Given there are 16 leads in each myomatrix, in two muscles, of six mice, this seems like a low yield. Could the authors comment on the reasons for this low yield?

      Regarding motor unit yield, even with multiple electrodes per muscle and a robust sorting algorithm, we often isolated only a few units per muscle. This yield likely reflects two factors. First, because of the highly dynamic nature of locomotion and high levels of muscle contraction, isolating individual spikes reliably across different locomotor speeds is inherently challenging, regardless of the algorithm being employed. Second, because the results of spike-train analyses can be highly sensitive to sorting errors, we have only included the motor units that we can sort with the highest possible confidence across thousands of strides.

      Minor:

      Figure captions especially Figure 6: The text is excessively long. Can the text be shortened?

      We thank the Reviewer for this comment. Generally, we seek to include a description of the methods and results within the figure captions, but we concede that the information can be condensed. In several cases, we have moved some of the descriptive text from the caption to the Methods section.



    1. analizaron ["they analyzed"]

      by means of the numerical implementation presented in chapter 6 and the computational tools described in chapter 7


    1. Reviewer #3 (Public review):

      The paper presents a synaptic mechanism for chunking in working memory, extending previous work of the last author by introducing specialized "chunking clusters", neural populations that can dynamically segment incoming items into chunks. The idea is that this enables hierarchical representations that increase the effective capacity of working memory. They also derive a theoretical bound for working memory capacity based on this idea, suggesting that hierarchical chunking expands the number of retrievable items beyond the basic WM capacity. Finally, they present neural and behavioral data related to their hypothesis.

      Strengths

      A major strength of the paper is its clear theoretical ambition of developing a mechanistic model of working memory chunking.

      Weaknesses

      Despite the inspiration in biophysical mechanisms (short-term synaptic plasticity with different time constants), the model is "cartoonish". It is unclear whether the proposed mechanism would work reliably in the presence of noise and non-zero background activity or in a more realistic implementation (e.g., a spiking network).

      As far as I know, there is no evidence for cyclic neural activation patterns, which are supposed to limit WM capacity (such as in Figure 1d). In fact, I believe there is no evidence for population bursts in WM, which are a crucial ingredient of the model. For example, Panicello et al. 2024 have found evidence for periods during which working memory decoding accuracy decreases, but no population bursts were observed in their data. In brief, my critique is that including some biophysical mechanism in an abstract model does not make the model plausible per se.

      It is claimed that "our proposed chunking mechanism applies to both the persistent-activity and periodic-activity regimes, with chunking clusters serving the same function in each", but this is not shown. If the results and model predictions are the same, irrespective of whether WM is activity-silent or persistent, I suggest highlighting this more and including the corresponding simulations.

      The empirical validations of the model are weak. The single-unit analysis is purely descriptive, without any statistical quantification of the apparent dip-ramp pattern. I agree that the dip-ramp pattern may be consistent with the proposed model, but I don't believe that this pattern is a specific prediction of the proposed model. It seems just to be an interesting observation that may be compatible with several network mechanisms involving some inhibition and a rebound.

      Moreover, the reanalyses of n-gram behavioral data do not constitute a mechanistic test of the model. The "new magic number" depends strongly on structural assumptions about how chunking operates, and it is unclear whether human working memory uses the specific hierarchical scheme required to achieve the predicted limit.

      The presentation of the modeling results is highly compressed in two figures and is rather hard to follow. Plotting the activity of different neural clusters in separate subplots or as heatmaps (x-axis time, y-axis neural population, color = firing rate) would help to clarify (Figure 1d). Also, control signals that activate the chunking clusters should be shown.
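      As a concrete illustration of this plotting suggestion (our own sketch with placeholder data, not taken from the paper), such a heatmap could be drawn as follows:

      ```python
      import numpy as np
      import matplotlib.pyplot as plt

      # Placeholder firing rates: 12 hypothetical neural clusters x 500 time bins.
      rates = np.random.rand(12, 500)

      # x-axis = time, y-axis = neural population, color = firing rate.
      plt.imshow(rates, aspect="auto", origin="lower",
                 extent=[0, 5.0, 0, rates.shape[0]])  # 5 s of simulated time, assumed
      plt.colorbar(label="Firing rate (a.u.)")
      plt.xlabel("Time (s)")
      plt.ylabel("Neural population")
      plt.show()
      ```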

      Overall, the theoretical proposal is interesting, but its empirical grounding and biological plausibility need to be substantially reinforced.

    1. One solution would be single-source articles with distinct outputs (PDF and static/interactive Web), in which the marginalia can be used to place extended notes in the form of shortened links, AprilTags, or QR codes that point to the expanded versions of those notes in the static formats. This would let that expanded content be presented on demand, if the reader/explorer wants it, in whichever medium (print, web) they access the article.

      Fortunately, for the humanities and social sciences, this kind of computation is still largely unused, and other dynamics of digital writing can be explored while the approaches criticized here mature and while the reproducibility problem is worked on from other places, such as Cardumem
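      As a minimal sketch of the mechanics behind this idea (the URL and file name are hypothetical, and the qrcode library is just one possible tool for it):

      ```python
      # Generate a QR code that links a printed margin note to its expanded online version.
      # Requires: pip install qrcode[pil]
      import qrcode

      expanded_note_url = "https://example.org/articulo/nota-extendida-01"  # hypothetical URL
      img = qrcode.make(expanded_note_url)
      img.save("nota-extendida-01.png")  # image to be placed in the article's margin
      ```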

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      I read the paper by Parrotta et al with great interest. The authors are asking an interesting and important question regarding pain perception, which is derived from predictive processing accounts of brain function. They ask: If the brain indeed integrates information coming from within the body (interoceptive information) to comprise predictions about the expected incoming input and how to respond to it, could we provide false interoceptive information to modulate its predictions, and subsequently alter the perception of such input? To test this question, they use pain as the input and the sounds of heartbeats (falsified or accurate) as the interoceptive signal.

      Strengths:

      I found the question well-established, interesting, and important, with important implications and contributions for several fields, including neuroscience of prediction-perception, pain research, placebo research, and health psychology. The paper is well-written, the methods are adequate, and the findings largely support the hypothesis of the authors. The authors carried out a control experiment to rule out an alternative explanation of their finding, which was important.

      Weaknesses:

      I will list here one theoretical weakness or concern I had, and several methodological weaknesses.

      The theoretical concern regards what I see as a misalignment between a hypothesis and a result, which could influence our understanding of the manipulation of heartbeats, and its meaning: The authors indicate, based on prior literature and their own findings, that when preparing for an aversive incoming stimulus, heartbeats *decrease*. However, in their findings, manipulating the heartbeats that participants hear to be slower than their own prior to receiving a painful stimulus had *no effect* on participants' actual heartbeats, nor on their pain perceptions. What the authors did find is that when listening to heartbeats that are *increased* in frequency - that was when their own heartbeats decreased (meaning they expected an aversive stimulus) and their pain perceptions increased.

      This is quite complex - but here is my concern: If the assumption is that the brain is collecting evidence from both outside and inside the body to prepare for an upcoming stimulus, and we know that the *slowing down* of heartbeats predicts an aversive stimulus, why is it that participants responded with a change in pain perception and physiological response when listening to *increased heartbeats* and not decreased? My interpretation is that the manipulation did not fool the interoceptive signals that the brain collects, but rather the more conscious experience of participants, which may then have been translated to fear/preparation for the incoming stimulus. As the authors indicate in the discussion (lines 704-705), participants do not *know* that decreased heartbeats indicate an upcoming aversive stimulus, and I would even argue the opposite - the common knowledge or intuitive response is to increase alertness when we hear increased heartbeats, like in horror films or similar scenarios. Therefore, the unfortunate conclusion is that what the authors assume is a manipulation of interoception seems to me like a manipulation of participants' alertness or conscious experience of possible danger. I hope the (important) distinction between the two is clear enough, because I find this issue of utmost importance for the point the paper is trying to make. To summarize in one sentence - if it is decreased heartbeats that lead the brain to predict an approaching aversive input, and we assume the manipulation is altering the brain's interoceptive data collection, why isn't it responding to the decreased signal? My conclusion is that this is not, in fact, a manipulation of interoception, unfortunately.

      We thank the reviewer for their comment, which gives us the opportunity to clarify what we believe is a theoretical misunderstanding that we have not sufficiently made clear in the previous version of the manuscript. The reviewer suggests that a decreased heart rate itself might act as an internal cue for a forthcoming aversive stimulus, and questions why our manipulation of slower heartbeats then did not produce measurable effects.

      The central point is this: decreased heart rate is not a signal the brain uses to predict a threat, but is a consequence of the brain having already predicted the threat. This distinction is crucial. The well-known anticipatory decrease of heartrate serves an allostatic function: preparing the body in advance so that physiological responses to the actual stressor (such as an increase in sympathetic activation) do not overshoot. In other words, the deceleration is an output of the predictive model, not an input from which predictions are inferred. It would be maladaptive for the brain to predict threat through a decrease in heartrate, as this would then call for a further decrease, creating a potential runaway cycle.

      Instead, increased heart rate is a salient and evolutionarily conserved cue for arousal, threat, and pain. This association is reinforced both culturally - for example, through the use of accelerating heartbeats in films and media to signal urgency, as R1 mentions - and physiologically, as elevated heart rates reliably occur in response to actual (not anticipated) stressors. Decreased heartrates, in contrast, are reliably associated with the absence of stressors, for example during relaxation and before (and during) sleep. Thus, across various everyday experiences, increased (instead of decreased) heartrates are robustly associated with actual stressors, and there is no a priori reason to assume that the brain would treat decelerating heartrates as cue for threat. As we argued in previous work, “the relationship between the increase in cardiac activity and the anticipation of a threat may have emerged from participants’ first-hand experience of increased heart rates to actual, not anticipated, pain” (Parrotta et al., 2024). The changes in heart rate and pain perception that we hypothesize (and observe) are therefore fully in line with the prior literature on the anticipatory compensatory heartrate response (Bradley et al., 2008, 2005; Colloca et al., 2006; Lykken et al., 1972; Taggart et al., 1976; Tracy et al., 2017; Skora et al., 2022), as well as with Embodied Predictive Coding models (Barrett & Simmons, 2015; Pezzulo, 2014; Seth, 2013; Seth et al., 2012), which assume that our body is regulated through embodied simulations that anticipate likely bodily responses to upcoming events, thereby enabling anticipatory or allostatic regulation of physiological states (Barrett, 2017).

      We now add further explanation to this point to the Discussion (lines 740-758) and Introduction (lines 145-148; 154-156) of our manuscript to make this important point clearer.

      Barrett, L. F., & Simmons, W. K. (2015). Interoceptive predictions in the brain. Nature reviews neuroscience, 16(7), 419-429.

      Barrett, L. F. (2017). The theory of constructed emotion: An active inference account of interoception and categorization. Social cognitive and affective neuroscience, 12(1), 1-23.

      Bradley, M. M., Moulder, B., & Lang, P. J. (2005). When good things go bad: The reflex physiology of defense. Psychological science, 16(6), 468-473.

      Bradley, M. M., Silakowski, T., & Lang, P. J. (2008). Fear of pain and defensive activation. PAIN®, 137(1), 156-163.

      Colloca, L., Petrovic, P., Wager, T. D., Ingvar, M., & Benedetti, F. (2010). How the number of learning trials affects placebo and nocebo responses. Pain®, 151(2), 430-439.

      Lykken, D., Macindoe, I., & Tellegen, A. (1972). Preception: Autonomic response to shock as a function of predictability in time and locus. Psychophysiology, 9(3), 318-333.

      Taggart, P., Hedworth-Whitty, R., Carruthers, M., & Gordon, P. D. (1976). Observations on electrocardiogram and plasma catecholamines during dental procedures: The forgotten vagus. British Medical Journal, 2(6039), 787-789.

      Tracy, L. M., Gibson, S. J., Georgiou-Karistianis, N., & Giummarra, M. J. (2017). Effects of explicit cueing and ambiguity on the anticipation and experience of a painful thermal stimulus. PloS One, 12(8), e0183650.

      Parrotta, E., Bach, P., Perrucci, M. G., Costantini, M., & Ferri, F. (2024). Heart is deceitful above all things: Threat expectancy induces the illusory perception of increased heartrate. Cognition, 245, 105719.

      Pezzulo, G. (2014). Why do you fear the bogeyman? An embodied predictive coding model of perceptual inference. Cognitive, Affective & Behavioral Neuroscience, 14(3), 902-911.

      Seth, A., Suzuki, K., & Critchley, H. (2012). An Interoceptive Predictive Coding Model of Conscious Presence. Frontiers in Psychology, 2. https://www.frontiersin.org/articles/10.3389/fpsyg.2011.00395

      Seth, A. K. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17(11), 565-573.

      Skora, L. I., Livermore, J. J. A., & Roelofs, K. (2022). The functional role of cardiac activity in perception and action. Neuroscience & Biobehavioral Reviews, 104655.

      I will add that the control experiment - with an exteroceptive signal (knocking of wood) manipulated in a similar manner - could be seen as evidence of the fact that heartbeats are regarded as an interoceptive signal, and it is an important control experiment, however, to me it seems that what it is showing is the importance of human-relevant signals to pain prediction/perception, and not directly proves that it is considered interoceptive. For example, it could be experienced as a social cue of human anxiety/fear etc, and induce alertness.

      The reviewer asks us to consider whether our measured changes in pain response happen not because the brain treats the heartrate feedback in Experiment 1 as interoceptive stimulus, but because heartbeat sounds could have signalled threat on a more abstract, perhaps metacognitive or affective, level, in contrast to the less visceral control sounds in Experiment 2. We deem this highly unlikely for several reasons.

      First, as we point out in our response to Reviewer 3 (Point 3), if this were the case, the different sounds in the two experiments should have induced overall (between-experiment) differences in pain perception and heart rate, driven by the (supposedly) generally more threatening heartbeat sounds. However, when we added such comparisons, no such between-experiment differences were obtained (see Results, Experiment 2, and Supplementary Materials, Cross-experiment analysis, between-subjects model). Instead, we only find a significant interaction between experiment and feedback (faster, slower). Thus, it is not the heartbeat sounds per se that induce the measured changes in pain perception but the modulation of their rate; identical changes to the rate of non-heartbeat sounds produce no such effects. In other words, pain perception is sensitive to a change in heart rate feedback, as we predicted, rather than to the overall presence of heartbeat sounds (as one would need to predict if heartbeat sounds had more generally induced threat or stress).

      Second, one may suspect that it is precisely the acceleration of heartrate feedback that could act as cue to arousal, while accelerated exteroceptive feedback would not. However, if this were the case, one would need to predict a general heart rate increase with accelerated feedback, as this is the general physiological marker of increasing alertness and arousal (e.g. Tousignant-Laflamme et al., 2005; Terkelsen et al., 2005; for a review, see Forte et al., 2022). However, the data shows the opposite, with real heartrates decreasing when the heartrate feedback increases. This result is again fully in line with the predicted interoceptive consequences of accelerated heartrate feedback, which mandates an immediate autonomic regulation, especially when preparing for an anticipated stressor.

      Third, our view is further supported by neurophysiological evidence showing that heartbeat sounds, particularly under the belief they reflect one’s own body, are not processed merely as generic aversive or “human-relevant” signals. For instance, Vicentin et al. (2024) showed that simulated faster heartbeat sounds elicited stronger EEG alpha-band suppression, indicative of increased cortical activation  over frontocentral and right frontal areas, compatible with the localization of brain regions contributing to interoceptive processes (Kleint et al., 2015). Importantly, Kleint et al. also demonstrated via fMRI that heartbeat sounds, compared to acoustically matched tones, selectively activate bilateral anterior insula and frontal operculum, key hubs of the interoceptive network. This suggests that the semantic identity of the sound as a heartbeat is sufficient to elicit internal body representations, despite its exteroceptive nature. Further evidence comes from van Elk et al. (2014), who found that heartbeat sounds suppress the auditory N1 component, a neural marker of sensory attenuation typically associated with self-generated or predicted stimuli. The authors interpret this as evidence that the brain treats heartbeat sounds as internally predicted bodily signals, supporting interoceptive predictive coding accounts in which exteroceptive cues (i.e., auditory cardiac feedback) are integrated with visceral information to generate coherent internal body representations.

      Finally, it is worth noting that the manipulation of heartrate feedback in our study elicited measurable compensatory changes in participants’ actual heart rate. This is striking compared to our previous work (Parrotta et al., 2024), wherein we used a design highly similar to the one used here, combined with a very strong threat manipulation. Specifically, we presented participants with highly salient threat cues (knives directed at an anatomical depiction of a heart), which predicted forthcoming pain with 100% validity (compared to flowers that predicted the absence of pain with 100% validity). In other words, these cues perfectly predicted actual pain, through highly visceral stimuli. Nevertheless, we found no measurable decrease in actual heartrate. From an abstract threat perspective, it is therefore striking that the much weaker manipulation of slightly increased or decreased heartrates we used here would induce such a change. The difference therefore suggests that the response here was not caused by an abstract feeling of threat, but arose because the brain indeed treated the increased heartrate feedback as an interoceptive signal for (stressor-induced) sympathetic activation, which would then be immediately down-regulated.

      Together, we hope you agree that these considerations make a strong case against a non-specific, arousal- or alertness-related explanation of our data. We now make this point clearer in the new paragraph of the Discussion (Accounting for general unspecific contributions, lines 796-830), and have added the relevant between-experiment comparisons to the Results of Experiment 2.

      Forte, G., Troisi, G., Pazzaglia, M., Pascalis, V. D., & Casagrande, M. (2022). Heart rate variability and pain: a systematic review. Brain sciences, 12(2), 153.

      Vicentin, S., Guglielmi, S., Stramucci, G., Bisiacchi, P., & Cainelli, E. (2024). Listen to the beat: behavioral and neurophysiological correlates of slow and fast heartbeat sounds. International Journal of Psychophysiology, 206, 112447.

      Kleint, N. I., Wittchen, H. U., & Lueken, U. (2015). Probing the interoceptive network by listening to heartbeats: an fMRI study. PloS one, 10(7), e0133164.

      Parrotta, E., Bach, P., Perrucci, M. G., Costantini, M., & Ferri, F. (2024). Heart is deceitful above all things: Threat expectancy induces the illusory perception of increased heartrate. Cognition, 245, 105719.

      Terkelsen, A. J., Mølgaard, H., Hansen, J., Andersen, O. K., & Jensen, T. S. (2005). Acute pain increases heart rate: differential mechanisms during rest and mental stress. Autonomic Neuroscience, 121(1-2), 101-109.

      Tousignant-Laflamme, Y., Rainville, P., & Marchand, S. (2005). Establishing a link between heart rate and pain in healthy subjects: a gender effect. The journal of pain, 6(6), 341-347.

      van Elk, M., Lenggenhager, B., Heydrich, L., & Blanke, O. (2014). Suppression of the auditory N1-component for heartbeat-related sounds reflects interoceptive predictive coding. Biological psychology, 99, 172-182.

      Several additional, more methodological weaknesses include the very small number of trials per condition - the methods mention 18 test trials per participant for the 3 conditions, with varying pain intensities, which are later averaged (and whether this is appropriate is a different issue). This means 6 trials per condition, and only 2 trials per condition and pain intensity. I thought that this number could be increased, though it is not a huge concern of the paper. It is, however, necessary to show some statistics about the distribution of responses, given the very small trial number (see recommendations for authors). The sample size is also rather small, on the verge of "just right" to meet the required sample size according to the authors' calculations.

      We provide detailed responses to these points in the “Recommendations for The Authors” section, where each of these issues is addressed point by point in response to the specific questions raised.

      Finally, and just as important, the data exists to analyze participants' physiological responses (ECG) after receiving the painful stimulus - this could support the authors' claims about the change in both subjective and objective responses to pain. It could also strengthen the physiological evidence, which is rather weak in terms of its effect. Nevertheless, this is missing from the paper.

      This is indeed an interesting point, and we agree that analyzing physiological responses such as ECG following the painful stimulus could offer additional insights into the objective correlates of pain. However, it is important to clarify that the experiment was not designed to investigate post-stimulus physiological responses. Our primary focus was on the anticipatory processes leading up to the pain event. Notably, in the time window immediately following the stimulus - when one might typically expect to observe physiological changes such as an increase in heart rate - participants were asked to provide subjective ratings of their nociceptive experience. It is therefore not a “clean” interval that would lend itself for measurement, especially as a substantial body of evidence indicates that one’s heart rate is strongly modulated by higher-order cognitive processes, including attentional control, executive functioning, decision-making and action itself (e.g., Forte et al., 2021a; Forte et al., 2021b; Luque-Casado et al., 2016).

      This limitation is particularly important as the change in pain ratings induced by our heart rate manipulation is substantially smaller than the changes in heart rate induced by actual pain (e.g., Loggia et al., 2011). To confirm this for our study, we simply estimated how much change in heart rate is produced by a change in actual stimulus intensity in the initial no-feedback phase of our experiment. There, we find that a change between stimulus intensities 2 and 4 induces an NPS change of 32.95 and a heart rate acceleration response of 1.19 (difference in heart rate response relative to baseline, Colloca et al., 2006), d = .52, p < .001. The change in NPS induced by our implicit heart rate manipulation, however, is only a seventh of this (4.81 on the NPS). This means that the expected effect size of heart rate acceleration produced by our manipulation would be only d = .17. A power analysis, using G*Power, reveals that a sample size of n = 266 would be required to detect such an effect, if it exists. Thus, while we agree that this is an exciting hypothesis to be tested, it requires a specifically designed study, and a much larger sample than was possible here.

      Colloca, L., Benedetti, F., & Pollo, A. (2006). Repeatability of autonomic responses to pain anticipation and pain stimulation. European Journal of Pain, 10(7), 659-665.

      Forte, G., Morelli, M., & Casagrande, M. (2021a). Heart rate variability and decision-making: Autonomic responses in making decisions. Brain sciences, 11(2), 243.

      Forte, G., Favieri, F., Oliha, E. O., Marotta, A., & Casagrande, M. (2021b). Anxiety and attentional processes: the role of resting heart rate variability. Brain sciences, 11(4), 480.

      Loggia, M. L., Juneau, M., & Bushnell, M. C. (2011). Autonomic responses to heat pain: Heart rate, skin conductance, and their relation to verbal ratings and stimulus intensity. PAIN®, 152(3), 592-598.

      Luque-Casado, A., Perales, J. C., Cárdenas, D., & Sanabria, D. (2016). Heart rate variability and cognitive processing: The autonomic response to task demands. Biological psychology, 113, 83-90

      I have several additional recommendations regarding data analysis (using an ANOVA rather than multiple t-tests, using raw normalized data rather than change scores, questioning the averaging across 3 pain intensities) - which I will detail in the "recommendations for authors" section.

      We provide detailed responses to these points in the “Recommendations for The Authors” section, where each of these issues is addressed point by point in response to the specific questions raised.

      Conclusion:

      To conclude, the authors have shown in their findings that predictions about an upcoming aversive (pain) stimulus - and its subsequent subjective perception - can be altered not only by external expectations, or manipulating the pain cue, as was done in studies so far, but also by manipulating a cue that has fundamental importance to human physiological status, namely heartbeats. Whether this is a manipulation of actual interoception as sensed by the brain is - in my view - left to be proven.

      Still, the paper has important implications in several fields of science ranging from neuroscience prediction-perception research, to pain and placebo research, and may have implications for clinical disorders, as the authors propose. Furthermore, it may lead - either the authors or someone else - to further test this interesting question of manipulation of interoception in a different or more controlled manner.

      I salute the authors for coming up with this interesting question and encourage them to continue and explore ways to study it and related follow-up questions.

      We sincerely thank the reviewer for the thoughtful and encouraging feedback. We hope our responses to your points below convince you a bit more that what we are measuring does indeed capture interoceptive processes, but we of course fully acknowledge that additional measures - for example from brain imaging (or computational modelling, see Reviewer 3) - could further support our interpretation, which we now highlight in the Limitations and Future Directions section.

      Reviewer #2 (Public Review):

      In this manuscript, Parrotta et al. tested whether it is possible to modulate pain perception and heart rate by providing false HR acoustic feedback before administering electrical cutaneous shocks. To this end, they performed two experiments. The first experiment tested whether false HR acoustic feedback alters pain perception and the cardiac anticipatory response. The second experiment tested whether the same perceptual and physiological changes are observed when participants are exposed to a non-interoceptive feedback. The main results of the first experiment showed a modulatory effect for faster HR acoustic feedback on pain intensity, unpleasantness, and cardiac anticipatory response compared to a control (acoustic feedback congruent to the participant's actual HR). However, the results of the second experiment also showed an increase in pain ratings for the faster non-interoceptive acoustic feedback compared to the control condition, with no differences in pain unpleasantness or cardiac response.

      The main strengths of the manuscript are the clarity with which it was written, and its solid theoretical and conceptual framework. The researchers make an in-depth review of predictive processing models to account for the complex experience of pain, and how these models are updated by perceptual and active inference. They follow with an account of how pain expectations modulate physiological responses and draw attention to the fact that most previous studies focus on exteroceptive cues. At this point, they make the link between pain experience and heart rate changes, and introduce their own previous work showing that people may illusorily perceive a higher cardiac frequency when expecting painful stimulation, even though anticipating pain typically goes along with a decrease in HR. From here, they hypothesize that false HR acoustic feedback evokes more intense and unpleasant pain perception, although the actual HR actually decreases due to the orienting cardiac response. Furthermore, they also test the hypothesis that an exteroceptive cue will lead to no (or less) changes in those variables. The discussion of their results is also well-rooted in the existing bibliography, and for the most part, provides a credible account of the findings.

      Thank you for the clear and thoughtful review. We appreciate your positive comments on the manuscript’s clarity, theoretical framework, and interpretation of results.

      The main weaknesses of the manuscript lie in a few choices in methodology and data analysis that hinder the interpretation of the results and the conclusions as they stand.

      The first peculiar choice is the convoluted definition of the outcomes. Specifically, pain intensity and unpleasantness are first normalized and then transformed into variation rates (sic) or deltas, which makes the interpretation of the results unnecessarily complicated. This is also linked to the definitions of the smallest effect of interest (SESOI) in terms of these outcomes, which is crucial to determining the sample size and gauging the differences between conditions. However, the choice of SESOI is not properly justified, and strangely, it changes from the first experiment to the second.

      We thank the reviewer for this important observation. In the revised manuscript, we have made substantial changes and clarifications to address both aspects of this concern: (1) the definition of outcome variables and their normalization, and (2) the definition of the SESOI.

      First, as explained in our response to Reviewer #1, we have revised the analyses and removed the difference-based change scores from the main results, addressing concerns about interpretability. However, we retained the normalization procedure: all variables (heart rate, pain intensity, unpleasantness) are normalized relative to the no-feedback baseline using a standard proportional-change formula, (X − bX) / bX, where X is the feedback-phase mean and bX is the no-feedback baseline. This is a widely used normalization procedure (e.g., Bartolo et al., 2013; Cecchini et al., 2020). This method controls for interindividual variability by expressing responses relative to each participant’s own baseline. The resulting normalized values are then used directly in all analyses and are not further transformed into deltas.
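      For concreteness, the baseline normalization described above can be sketched as follows (our own illustration; the function and variable names are not the authors'):

      ```python
      def normalize_to_baseline(feedback_mean: float, baseline_mean: float) -> float:
          """Proportional change relative to the no-feedback baseline: (X - bX) / bX."""
          return (feedback_mean - baseline_mean) / baseline_mean

      # Example: a feedback-phase pain-intensity mean of 55 against a no-feedback
      # baseline of 50 gives +0.10, i.e., a 10% increase over baseline.
      print(normalize_to_baseline(55.0, 50.0))
      ```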

      To address potential concerns about this baseline correction approach and its interpretability, we also conducted a new set of supplementary analyses (now reported in the supplementary materials) that include the no-feedback condition explicitly in the models, rather than treating it as a baseline for normalization. These models confirm that our main effects are not driven by the choice of normalization and hold even when no-feedback is analyzed as an independent condition. The new analyses and results are now reported in the Supplementary Materials.

      Second, concerning the SESOI values and their justification: The difference in SESOI values between Experiment 1 and Experiment 2 reflects the outcome of sensitivity analyses conducted for each dataset separately, rather than a post-hoc reinterpretation of our results. Specifically, we followed current methodological recommendations (Anderson, Kelley & Maxwell, 2017; Albers & Lakens, 2017; Lakens, 2022), which advise against estimating statistical power based on previously published effect sizes, especially when working with novel paradigms or when effect sizes in the literature may be inflated or imprecise. Instead, we used the sensitivity analysis function in G*Power (Version 3.1) to determine the smallest effect size our design was capable of detecting with high statistical power (90%), given the actual sample size, test type, and alpha level used in each experiment. This is a prospective, design-based estimation rather than a post-hoc analysis of observed effects. The slight differences in SESOI are due to more participants meeting our exclusion criteria in Experiment 2, leading to slightly larger detectable effect sizes (d = 0.62 vs d = 0.57). Importantly, both experiments remain adequately powered to detect effects of a size commonly reported in the literature on top-down pain modulation. For instance, Iodice et al. (2019) reported effects of approximately d = 0.7, which is well above the minimum detectable thresholds of our designs.

      We have now clarified the logic in the Participant section of Experiment 1 (193-218).

      Anderson, S. F., Kelley, K., & Maxwell, S. E. (2017). Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty. Psychological Science, 28(11), 1547-1562.

      Bartolo, M., Serrao, M., Gamgebeli, Z., Alpaidze, M., Perrotta, A., Padua, L., Pierelli, F., Nappi, G., & Sandrini, G. (2013). Modulation of the human nociceptive flexion reflex by pleasant and unpleasant odors. PAIN®, 154(10), 2054-2059.

      Cecchini, M. P., Riello, M., Sandri, A., Zanini, A., Fiorio, M., & Tinazzi, M. (2020). Smell and taste dissociations in the modulation of tonic pain perception induced by a capsaicin cream application. European Journal of Pain, 24(10), 1946-1955.

      Lakens, D. (2022). Sample size justification. Collabra: psychology, 8(1), 33267.

      Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of experimental social psychology, 74, 187-195.

      Furthermore, the researchers propose the comparison of faster vs. slower delta HR acoustic feedback throughout the manuscript when the natural comparison is the incongruent vs. the congruent feedback.

      We very much disagree that the natural comparison is congruent vs incongruent feedback. First, please note that congruency simply refers to whether the heartrate feedback was congruent with (i.e., matched) the participant’s heartrate measurements in the no feedback trials, or whether it was incongruent, and was therefore either faster or slower than this baseline frequency. As such, simply comparing congruent with incongruent feedback could only indicate that pain ratings change when the feedback does not match the real heart rate, irrespective of whether it is faster or slower. Such a test can therefore only reveal potential general effects of surprise or salience, when the feedback heartrate does not match the real one.

      We therefore assume that the reviewer specifically refers to the comparison of congruent vs incongruent faster feedback. However, this is not a good test either, as this comparison is, by necessity, confounded with the factor of surprise described above. In other words, if a difference were found, it would not be clear whether it emerged because, as we assume, faster feedback is represented as an interoceptive signal for threat, or simply because participants are surprised by heartrate feedback that diverges from their real heartrate. Note that even a non-significant result in the analogous comparison of congruent vs incongruent slower feedback would not be able to resolve this confound, as in null hypothesis testing the absence of a significant effect does not, by definition, indicate that there is no effect - only that it could not be detected here.

      Instead, the only possible test of our hypothesis is the one we have designed our experiment around and focussed on with our central t-test: the comparison of incongruent faster with incongruent slower feedback. This keeps any possible effects of surprise/salience from generally altered feedback constant and allows us to test our specific hypothesis: that real heart rates will decrease and pain ratings will increase when receiving false interoceptive feedback about increased compared to decreased heartrates. Note that this test of faster vs slower feedback is also statistically the most appropriate, as it collapses our prediction onto a single, highest-powered hypothesis test: as faster and slower heartrate feedback are assumed to induce effects in opposite directions, the effect size of their difference is, by definition, roughly double the average effect size for the two separate tests of faster vs congruent feedback and slower vs congruent feedback.
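      To make the doubling argument concrete (our own illustration, under the simplifying assumption that faster and slower feedback shift the outcome symmetrically and that the paired differences have comparable variability): if congruent feedback yields a mean outcome μ, and faster and slower feedback shift it to μ + δ and μ − δ respectively, then

      (μ + δ) − μ = δ for the faster-vs-congruent contrast, whereas (μ + δ) − (μ − δ) = 2δ for the faster-vs-slower contrast,

      so the expected mean difference, and hence the approximate standardized effect size, of the faster-vs-slower comparison is about twice that of either single comparison against congruent feedback.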

      That being said, we also included comparisons with the congruent condition in our revised analysis, in line with the reviewer’s suggestion and previous studies. These analyses help explore potential asymmetries in the effect of false feedback. While faster feedback (both interoceptive and exteroceptive) significantly modulated pain relative to congruent feedback, the slower feedback did not, consistent with previous literature showing stronger effects for arousal-increasing cues (e.g., Valins, 1966; Iodice et al., 2019). To address this point, in the revised manuscript we have added a paragraph to the Data Analysis section of Experiment 1 (lines 405-437) to make this logic clearer.

      Valins, S. (1966). Cognitive effects of false heart-rate feedback. Journal of personality and social psychology, 4(4), 400.

      Iodice, P., Porciello, G., Bufalari, I., Barca, L., & Pezzulo, G. (2019). An interoceptive illusion of effort induced by false heart-rate feedback. Proceedings of the National Academy of Sciences, 116(28), 13897-13902.

      This could be influenced by the fact that the faster HR exteroceptive cue in experiment 2 also shows a significant modulatory effect on pain intensity compared to congruent HR feedback, which puts into question the hypothesized differences between interoceptive vs. exteroceptive cues. These results could also be influenced by the specific choice of exteroceptive cue: the researchers imply that the main driver of the effect is the nature of the cue (interoceptive vs. exteroceptive) and not its frequency. However, they attempt to generalize their findings using knocking wood sounds to all possible sounds, but it is possible that some features of these sounds (e.g., auditory roughness or loomingness) could be the drivers behind the observed effects.

      We appreciate this thoughtful comment. We agree that low-level auditory features can potentially introduce confounds in the experimental design, and we acknowledge the importance of distinguishing these factors from the higher-order distinction that is central to our study: whether the sound is perceived as interoceptive (originating from within the body) or exteroceptive (perceived as external). To this end, the knocking sound was chosen not for its specific acoustic profile, but because it lacked bodily relevance, thus allowing us to test whether the same temporal manipulations (faster, congruent, slower) would have different effects depending on whether the cue was interpreted as reflecting an internal bodily state or not. In this context, the exteroceptive cue served as a conceptual contrast rather than an exhaustive control for all auditory dimensions.

      Several aspects of our data make it unlikely that the observed effects are driven by unspecific acoustic characteristics of the sounds used in the exteroceptive and interoceptive experiments (see also our responses to Reviewer 1 and Reviewer 3 who raised similar points).

      First, if the knocking sound had inherent acoustic features that strongly influenced perception or physiological responses, we would expect it to have produced consistent effects across all feedback conditions (Faster, Slower, Congruent), regardless of the interpretive context. This would have manifested as an overall difference between experiments in the between-subjects analyses and in the supplementary mixed-effects models that included Experiment as a fixed factor. Yet, we observed no such main effects in any of our variables. Instead, significant differences emerged only in specific theoretically predicted comparisons (e.g., Faster vs. Slower), and critically, these effects depended on the cue type (interoceptive vs. exteroceptive), suggesting that perceived bodily relevance, rather than a specific acoustic property, was the critical modulator. In other words, any alternative explanation based on acoustic features would need to be able to explain why these acoustic properties would induce not an overall change in heart rate and pain perception (i.e., similarly across slower, faster, and congruent feedback), but the brain’s response to changes in the rate of this feedback – increasing pain ratings and decreasing heartrates for faster relative to slower feedback. We hope you agree that a simple effect of acoustic features would not predict such a sensitivity to the rate with which the sound was played.

      Please refer to our responses to Reviewers 1 and 2 for further aspects of the data that argue strongly against the possibility that other features associated with the sounds (e.g., alertness, arousal) could be responsible for the results, as the data pattern again goes in the direction opposite to that predicted by such accounts (e.g., faster heartrate feedback decreased real heartrates instead of increasing them, as would be expected if accelerated heartrate feedback increased arousal).

      Finally, to further support this interpretation, we refer to neurophysiological evidence showing that heartbeat sounds are not processed as generic auditory signals, but as internal, bodily-relevant cues, especially when believed to reflect one’s own physiological state. For instance, fMRI research (Kleint et al., 2015) shows that heartbeat sounds engage key interoceptive regions such as the anterior insula and frontal operculum more than acoustically matched control tones. EEG data (Vicentin et al., 2024) showed that faster heartbeat sounds produce stronger alpha suppression over frontocentral areas, suggesting enhanced processing in networks associated with interoceptive attention. Moreover, van Elk et al. (2014) found that heartbeat sounds attenuate the auditory N1 response, a neural signature typically linked to self-generated or predicted bodily signals. These findings consistently demonstrate that heartbeat sounds are processed as interoceptive and self-generated signals, which is in line with our rationale that the critical factor at play is whether the sound is perceived as reflecting one’s own bodily state, rather than the physical properties of the sound.

      We now explicitly discuss these issues in the revised Discussion section (lines 740-758).

      Kleint, N. I., Wittchen, H. U., & Lueken, U. (2015). Probing the interoceptive network by listening to heartbeats: an fMRI study. PloS one, 10(7), e0133164.

      van Elk, M., Lenggenhager, B., Heydrich, L., & Blanke, O. (2014). Suppression of the auditory N1-component for heartbeat-related sounds reflects interoceptive predictive coding. Biological psychology, 99, 172-182.

      Vicentin, S., Guglielmi, S., Stramucci, G., Bisiacchi, P., & Cainelli, E. (2024). Listen to the beat: behavioral and neurophysiological correlates of slow and fast heartbeat sounds. International Journal of Psychophysiology, 206, 112447.

      Finally, it is noteworthy that the researchers divided the study into two experiments when it would have been optimal to test all the conditions with the same subjects in a randomized order in a single cross-over experiment to reduce between-subject variability. Taking this into consideration, I believe that the conclusions are only partially supported by the evidence. Despite the outcome transformations, a clear effect of faster HR acoustic feedback can be observed in the first experiment, which is larger than its proposed exteroceptive counterpart. This work could be of broad interest to pain researchers, particularly those working on predictive coding of pain.

      We appreciate the reviewer’s suggestion regarding a within-subject crossover design. While such a design indeed offers increased statistical power by reducing interindividual variability (Charness, Gneezy, & Kuhn, 2012), we intentionally opted for a between-subjects design due to theoretical and methodological considerations specific to studies involving deceptive feedback. Most importantly, carryover effects are a major concern in deception paradigms. Participants exposed to one type of feedback initially (e.g., interoceptive), and then the other (exteroceptive) would be more likely to develop suspicion or adaptive strategies that would alter their responses. Such expectancy effects could contaminate results in a crossover design, particularly when participants realize that feedback is manipulated. In line with this idea, past studies on false cardiac feedback (e.g., Valins, 1966; Pennebaker & Lightner, 1980) often employed between-subjects or blocked designs to mitigate this risk.

      Pennebaker, J. W., & Lightner, J. M. (1980). Competition of internal and external information in an exercise setting. Journal of personality and social psychology, 39(1), 165.

      Valins, S. (1966). Cognitive effects of false heart-rate feedback. Journal of personality and social psychology, 4(4), 400.

      Reviewer #3 (Public Review):

      In their manuscript titled "Exposure to false cardiac feedback alters pain perception and anticipatory cardiac frequency", Parrotta and colleagues describe an experimental study on the interplay between false heart rate feedback and pain experience in healthy, adult humans. The experimental design is derived from Bayesian perspectives on interoceptive inference. In Experiment 1 (N=34), participants rated the intensity and unpleasantness of an electrical pulse presented to their middle fingers. Participants received auditory cardiac feedback prior to the electrical pulse. This feedback was congruent with the participant's heart rate or manipulated to have a higher or lower frequency than the participant's true heart rate (incongruent high/ low feedback). The authors find heightened ratings of pain intensity and unpleasantness as well as a decreased heart rate in participants who were exposed to the incongruent-high cardiac feedback. Experiment 2 (N=29) is equivalent to Experiment 1 with the exception that non-interoceptive auditory feedback was presented. Here, mean pain intensity and unpleasantness ratings were unaffected by feedback frequency.

      Strengths:

      The authors present interesting experimental data that was derived from modern theoretical accounts of interoceptive inference and pain processing.

      (1) The motivation for the study is well-explained and rooted within the current literature, in which pain is understood as the result of a multimodal, inferential process. The separation of nociceptive stimulation and pain experience is explained clearly and stringently throughout the text.

      (2) The idea of manipulating pain-related expectations via an internal, instead of an external cue, is very innovative.

      (3) An appropriate control experiment was implemented, where an external (non-physiological) auditory cue with parallel frequency to the cardiac cue was presented.

      (4) The chosen statistical methods are appropriate, although averaging may limit the opportunity for mechanistic insight (see the Weaknesses section).

      (5) The behavioral data, showing increased unpleasantness and intensity ratings after exposure to incongruent-high cardiac feedback, but not exteroceptive high-frequency auditory feedback, is backed up by ECG data. Here, the decrease in heart rate during the incongruent-high condition speaks towards a specific, expectation-induced physiological effect that can be seen as resulting from interoceptive inference.

      We thank the reviewer for their positive feedback. We are glad that the study’s theoretical foundation, innovative design, appropriate control conditions, and convergence of behavioral and physiological data were well received.

      Weaknesses:

      Additional analyses and/ or more extensive discussion are needed to address these limitations:

      (1) I would like to know more about potential learning effects during the study. Is there a significant change in ∆ intensity and ∆ unpleasantness over time; e.g. in early trials compared to later trials? It would be helpful to exclude the alternative explanation that over time, participants learned to interpret the exteroceptive cue more in line with the cardiac cue, and the effect is driven by a lack of learning about the slightly less familiar cue (the exteroceptive cue) in early trials. In other words, the heartbeat-like auditory feedback might be "overlearned", compared to the less naturalistic tone, and more exposure to the less naturalistic cue might rule out any differences between them w.r.t. pain unpleasantness ratings.

      We thank the reviewer for raising this important point. Please note that the repetitions in our task were relatively limited (6 trials per condition), which limits the potential influence of such differential learning effects between experiments. To address this concern, we performed an additional analysis, reported in the Supplementary Materials, using a Linear Mixed-Effects Model approach. This method allowed us to include "Trial" (the rank order of each trial) as a variable to account for potential time-on-task effects such as learning, adaptation, or fatigue (e.g., Möckel et al., 2015). All feedback conditions (no-feedback, congruent, faster, slower) and all stimulus intensity levels were included.

      Specifically, we tested the following models:

      Likert Pain Unpleasantness Ratings ~ Experiment × Feedback × StimInt × Trial + (StimInt + Trial | Subject)

      Numeric Pain Scale of Intensity Ratings ~ Experiment × Feedback × StimInt × Trial + (StimInt + Trial | Subject)
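      For illustration, a model of this form could be fit in Python roughly as follows (a sketch only, not the authors' code; the data file and column names are assumptions):

      ```python
      import pandas as pd
      import statsmodels.formula.api as smf

      # Hypothetical trial-level data with one row per trial and columns matching the
      # model terms above (names are ours, not the authors').
      df = pd.read_csv("trial_level_data.csv")

      model = smf.mixedlm(
          "unpleasantness ~ C(experiment) * C(feedback) * stim_int * trial",
          data=df,
          groups=df["subject"],
          re_formula="~stim_int + trial",  # by-subject random slopes for StimInt and Trial
      )
      result = model.fit()
      print(result.summary())
      ```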

      In both models, no significant interactions involving Trial × Experiment or Trial × Feedback × Experiment were found. Instead, we just find generally larger effects in early trials compared to later ones (Main effect of Trial within each Experiment), similar to other cognitive illusions where repeated exposure diminishes effects. Thus, although some unspecific changes over time may have occurred (e.g., due to general task exposure), these changes did not differ systematically across experimental conditions (interoceptive vs. exteroceptive) or feedback types. However, we are fully aware that the absence of significant higher-order interactions does not conclusively rule out the possibility of learning-related effects. It is possible that our models lacked the statistical power to detect more subtle or complex time-dependent modulations, particularly if such effects differ in magnitude or direction across feedback conditions.

      We report the full description of these analyses and results in the Supplementary Materials (1. Cross-experiment analysis, between-subjects model).

      (2) The origin of the difference in Cohen's d (Exp. 1: .57, Exp. 2: .62) and subsequently sample size in the sensitivity analyses remains unclear, it would be helpful to clarify where these values are coming from (are they related to the effects reported in the results? If so, they should be marked as post-hoc analyses).

      Following recommendations (Anderson, Kelley & Maxwell, 2017; Albers &  Lakens, 2017), we do not report theoretical power based on previously reported effect sizes as this neglects uncertainty around effect size measurements, especially for new effects for which no reliable expected effect size estimates can be derived across the literature. Instead, the power analysis is based on a sensitivity analysis, conducted in G*Power (Version 3.1). Importantly, these are not post-hoc analyses, as they are not based on observed effect sizes in our study, but derived a priori. Sensitivity analyses estimate effect sizes that our design is well-powered (90%) to detect (i.e. given target power, sample size, type of test), for the crucial comparison between faster and slower feedback in both experiments (Lakens, 2022). Following recommendations, we also report the smallest effect size this test can in principle detect in our study (SESOI, Lakens, 2022). This yields effect sizes of d = .57 in Experiment 1 and d = .62 in Experiment 2 at 90% power and SESOIs of d = .34 and .37, respectively. Note that values are slightly higher in Experiment 2, as more participants were excluded based on our exclusion criteria. Importantly, detectable effect sizes in both experiments are smaller than reported effect sizes for comparable top-down effects on pain measurements of d = .7 (Iodice et al., 2019).  We have now added more information to the power analysis sections to make this clearer (lines 208-217).
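      For readers without G*Power, the same sensitivity computation can be approximated in Python (a sketch under our own assumptions: a two-sided, paired-samples t-test and the per-experiment sample sizes of 34 and 29 mentioned in the public reviews):

      ```python
      from statsmodels.stats.power import TTestPower

      analysis = TTestPower()  # one-sample / paired-samples t-test power model

      for label, n in [("Experiment 1", 34), ("Experiment 2", 29)]:
          # Smallest standardized effect size detectable with 90% power (alpha = .05)
          d_90 = analysis.solve_power(nobs=n, alpha=0.05, power=0.90, alternative="two-sided")
          print(f"{label}: detectable effect size at 90% power, d ~= {d_90:.2f}")
      ```

      Under these assumptions, the sketch should recover values close to the reported d = .57 and d = .62.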

      Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of experimental social psychology, 74, 187-195.

      Anderson, S. F., Kelley, K., & Maxwell, S. E. (2017). Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty. Psychological Science, 28(11), 1547-1562.

      Lakens, D. (2022). Sample size justification. Collabra: psychology, 8(1), 33267.
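      As a supplement to the description above, the following is a minimal sketch of an analogous sensitivity calculation in Python (statsmodels) rather than G*Power; the sample sizes are illustrative placeholders, not the actual Ns of the two experiments.

```python
# Illustrative sensitivity analysis for a paired (within-subject) t-test, analogous
# to the G*Power procedure described above. The sample sizes are placeholders.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
for label, n_pairs in [("Example design A", 30), ("Example design B", 26)]:
    # Solve for the effect size detectable with 90% power at a two-sided alpha of .05.
    d_detectable = analysis.solve_power(effect_size=None, nobs=n_pairs,
                                        alpha=0.05, power=0.90)
    print(f"{label}: effect size detectable with 90% power, d = {d_detectable:.2f}")
```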

      (3) As an alternative explanation, it is conceivable that the cardiac cue may have just increased unspecific arousal or attention to a larger extent than the exteroceptive cue. It would be helpful to discuss the role of these rather unspecific mechanisms, and how it may have differed between experiments.

      We thank the reviewer for raising this important point. We agree that, in principle, unspecific mechanisms such as increased arousal or attention driven by cardiac feedback could be an alternative explanation for the observed effects. However, several aspects of our data indicate that this is unlikely:

      (1) No main effect of Experiment on pain ratings:

      If the cardiac feedback had simply increased arousal or attention in a general (non-specific) way, we would expect a main effect of Experiment (i.e., interoceptive vs exteroceptive condition) on pain intensity or unpleasantness ratings, regardless of feedback frequency. However, such a main effect was never observed when we compared between experiments (see between-experiment t-tests in results, and in supplementary analyses). Instead, effects were specific to the manipulation of feedback frequency.

      (2) Heart rate as an arousal measure:

      Heart rate (HR) is a classical physiological index of arousal. If there had been an unspecific increase in arousal in the interoceptive condition, we would expect a main effect of Experiment on HR. However, no such main effect was found. Instead, our HR analyses revealed a significant interaction between feedback and experiment, suggesting that HR changes depended specifically on the feedback manipulation rather than reflecting a general arousal increase.

      (3) Arousal predicts faster, not slower, heart rates

      In Experiment 1, faster interoceptive cardiac feedback led to a slowdown in heart rates, both when compared to slower feedback and when compared to congruent cardiac feedback. This is in line with the predicted compensatory response to faster heart rates. In contrast, if faster feedback had merely increased general arousal, heart rates should have increased rather than decreased, as indicated by several prior studies (Tousignant-Laflamme et al., 2005; Terkelsen et al., 2005; for a review, see Forte et al., 2022), which predict the opposite pattern of responses to the one found in Experiment 1.

      Taken together, these findings indicate that the effects observed are unlikely to be driven by unspecific arousal or attention mechanisms, but rather are consistent with feedback-specific modulations, in line with our interoceptive inference framework.

      We have now integrated these considerations in the revised discussion (lines 796-830), and added the relevant between-experiment comparisons to the Results of Experiment 2 and the supplementary analysis.

      Terkelsen, A. J., Mølgaard, H., Hansen, J., Andersen, O. K., & Jensen, T. S. (2005). Acute pain increases heart rate: differential mechanisms during rest and mental stress. Autonomic Neuroscience, 121(1-2), 101-109.

      Tousignant-Laflamme, Y., Rainville, P., & Marchand, S. (2005). Establishing a link between heart rate and pain in healthy subjects: a gender effect. The journal of pain, 6(6), 341-347.

      Forte, G., Troisi, G., Pazzaglia, M., Pascalis, V. D., & Casagrande, M. (2022). Heart rate variability and pain: a systematic review. Brain sciences, 12(2), 153.

      (4) The hypothesis (increased pain intensity with incongruent-high cardiac feedback) should be motivated by some additional literature.

      We thank the reviewer for this helpful suggestion. Please note that the current phenomenon was tested in this experiment for the first time. Therefore, there is no specific prior study that motivated our hypotheses; they were driven theoretically and derived from our model of interoceptive integration of pain and cardiac perception. The idea that accelerated cardiac feedback (relative to decelerated feedback) will increase pain perception and reduce heart rates is grounded in embodied predictive coding frameworks. Accordingly, expectations and signals from different sensory modalities (exteroceptive, proprioceptive, interoceptive) are integrated both to efficiently infer crucial homeostatic and physiological variables, such as hunger, thirst, and, in this case, pain, and to regulate the body’s own autonomic responses based on these inferences.

      Within this framework, the concept of an interoceptive schema (Tschantz et al., 2022; Iodice et al., 2019; Parrotta et al., 2024; Schoeller et al., 2022) offers the basis for understanding interoceptive illusions, wherein inferred levels of interoceptive states (i.e., pain) deviate from the actual physiological state. Cardiac signals conveyed by the feedback manipulation act as a misleading prior, shaping the internal generative model of pain. Specifically, an increased heart rate may signal a state of threat, establishing a prior expectation of heightened pain. Building on predictive models of interoception, we predict that this cardiac prior is integrated with interoceptive (i.e., actual nociceptive signal) and exteroceptive inputs (i.e., auditory feedback input), leading to a subjective experience of increased pain even when there is no corresponding increase in the nociceptive input.

      This idea is not completely new, but it is based on our previous findings of an interoceptive cardiac illusion driven by misleading priors about anticipated threat (i.e., pain). Specifically, in Parrotta et al. (2024), we tested whether the common false belief that heart rate increases in response to threat leads to an illusory perception of accelerated cardiac activity when anticipating pain. In two experiments, we asked participants to monitor and report their heartbeat while their ECG was recorded. Participants performed these tasks while visual cues reliably predicted a forthcoming harmless (low-intensity) vs. threatening (high-intensity) cutaneous electrical stimulus. We showed that anticipating a painful vs. harmless stimulus caused participants to report an increased cardiac frequency, which did not reflect their real cardiac response, but the common (false) belief that heart rates accelerate under threat, reflecting the hypothesised integration of prior expectations and interoceptive inputs when estimating cardiac activity.

      Here we tested the counterpart of such a cardiac illusion. We reasoned that if cardiac interoception is shaped by expectations about pain, then the inverse should also be true: manipulating beliefs about cardiac activity (via cardiac feedback) in the context of pain anticipation should influence the perception of pain. Specifically, we hypothesized that presenting accelerated cardiac feedback would act as a misleading prior, leading to an illusory increase in pain experience, even in the absence of an actual change in nociceptive input.

      Moreover, beyond the references already provided in the previous version of the manuscript, there is ample prior research that provides more general support for such relationships. Specifically, studies have shown that providing mismatched cardiac feedback in contexts where cardiovascular changes are typically expected (e.g., sexual arousal, Rupp & Wallen, 2008; Valins, 1966; physical exercise, Iodice et al., 2019) can enhance the perception of interoceptive states associated with those experiences. Furthermore, findings that false cardiac feedback can influence emotional experience suggest that it is the conscious perception of physiological arousal, combined with the cognitive interpretation of the stimulus, that plays a key role in shaping emotional responses (Crucian et al., 2000).

      This point is now addressed in the revised Introduction, wherein additional references have been integrated (lines 157-170).

      Crucian, G. P., Hughes, J. D., Barrett, A. M., Williamson, D. J. G., Bauer, R. M., Bowers, D., & Heilman, K. M. (2000). Emotional and physiological responses to false feedback. Cortex, 36(5), 623-647.

      Iodice, P., Porciello, G., Bufalari, I., Barca, L., & Pezzulo, G. (2019). An interoceptive illusion of effort induced by false heart-rate feedback. Proceedings of the National Academy of Sciences, 116(28), 13897-13902.

      Parrotta, E., Bach, P., Perrucci, M. G., Costantini, M., & Ferri, F. (2024). Heart is deceitful above all things: Threat expectancy induces the illusory perception of increased heartrate. Cognition, 245, 105719.

      Rupp, H. A., & Wallen, K. (2008). Sex differences in response to visual sexual stimuli: A review. Archives of sexual behavior, 37(2), 206-218.

      Schoeller, F., Horowitz, A., Maes, P., Jain, A., Reggente, N., Moore, L. C., Trousselard, M., Klein, A., Barca, L., & Pezzulo, G. (2022). Interoceptive technologies for clinical neuroscience.

      Tschantz, A., Barca, L., Maisto, D., Buckley, C. L., Seth, A. K., & Pezzulo, G. (2022). Simulating homeostatic, allostatic and goal-directed forms of interoceptive control using active inference. Biological Psychology, 169, 108266.

      Valins, S. (1966). Cognitive effects of false heart-rate feedback. Journal of personality and social psychology, 4(4), 400.

      (5) The discussion section does not address the study's limitations in a sufficient manner. For example, I would expect a more thorough discussion on the lack of correlation between participant ratings and self-reported bodily awareness and reactivity, as assessed with the BPQ.

      We thank the reviewer for this valuable observation. In response, we have revised the Discussion section to explicitly acknowledge and elaborate on the lack of significant correlations between participants’ pain ratings and their self-reported bodily awareness and reactivity as assessed with the BPQ.

      We now clarify that the inclusion of this questionnaire was exploratory. While it would be theoretically interesting to observe a relationship between subjective pain modulation and individual differences in interoceptive awareness, detecting robust correlations between within-subject experimental effects and between-subjects trait measures such as the BPQ typically requires much larger sample sizes (often exceeding N = 200) due to the inherently low reliability of such cross-level associations (see Hedge, Powell & Sumner, 2018; the “reliability paradox”). As such, the absence of a significant correlation in our study does not undermine the conclusions we draw from our main findings. Future studies with larger samples will be needed to systematically address this question. We now acknowledge this point explicitly in the revised manuscript (lines 501-504; 832-851).

      Hedge, C., Powell, G., & Sumner, P. (2018). The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods, 50(3), 1166-1186. https://doi.org/10.3758/s13428-017-0935-1

      (a) Some short, additional information on why the authors chose to focus on body awareness and supradiaphragmatic reactivity subscales would be helpful.

      We chose to focus on the body awareness and supradiaphragmatic reactivity subscales because these aspects are closely tied to emotional and physiological processing, particularly in the context of interoception. Body awareness plays a critical role in how individuals perceive and interpret bodily signals, which in turn affects emotional regulation and self-awareness. Supradiaphragmatic reactivity refers to the reactivity of organs located above the diaphragm (i.e., the muscle that separates the chest cavity from the abdomen), which include the heart, in contrast to the subdiaphragmatic reactivity subscale, which covers organs below it. Our decision to include these subscales is further motivated by recent research, including the work by Petzschner et al. (2021), which demonstrates that the focus of attention can modulate the heartbeat-evoked potential (HEP), and that this modulation is predicted by participants’ responses on the supradiaphragmatic reactivity subscale. Thus, this subscale, together with the more general body awareness scale, allows us to explore the interplay between bodily awareness, physiological reactivity, and emotional processing in our study. We now clarify this point in the revised version of the Methods - Body Perception Questionnaire (lines 384-393).

      (6) The analyses presented in this version of the manuscript allow only limited mechanistic conclusions - a computational model of participants' behavior would be a very strong addition to the paper. While this may be out of the scope of the article, it would be helpful for the reader to discuss the limitations of the presented analyses and outline avenues towards a more mechanistic understanding and analysis of the data. The computational model in [7] might contain some starting ideas.

      Thank you for your valuable feedback. We agree that a computational model would enhance the mechanistic understanding of our findings. While this is beyond the current scope, we now discuss the limitations of our analysis in the Limitations and Future directions section (lines 852-863). Specifically, we acknowledge that future studies could use computational models to better understand the interactions between physiological, cognitive, and perceptual factors.

      Some additional topics were not considered in the first version of the manuscript:

      (1) The possible advantages of a computational model of task behavior should be discussed.

      We agree that a computational model of task behavior could provide several advantages. By formalizing principles of predictive processing and active inference, such a model could generate quantitative predictions about how heart rate (HR) and feedback interact, providing a more precise understanding of their respective contributions to pain modulation. However, this is a first demonstration of a theoretically predicted phenomenon, and computationally modelling it is currently outside the scope of the article. We would be excited to explore this in the future. We have added a brief discussion of these potential advantages in the revised manuscript and suggest that future work could integrate computational modelling to further deepen our understanding of these processes (lines 852-890).

      (2) Across both experiments, there was a slightly larger number of female participants. Research suggests significant sex-related differences in pain processing [1,2]. It would be interesting to see what role this may have played in this data.

      Thank you for your insightful comment. While we acknowledge that sex-related differences in pain processing are well documented in the literature, our sample does not include enough participants to test this in a well-powered way. As such, the role of sex differences in pain perception will need to be addressed in future studies with more balanced samples. It would also be interesting to test whether more sensitive individuals, with a more precise representation of pain, show smaller modulation effects on pain perception. We have noted this point in the revised manuscript (lines 845-851) and suggest that future research could specifically investigate how sex differences might influence the modulation of pain and physiological responses in similar experimental contexts.

      (3) There are a few very relevant papers that come to mind which may be of interest. These sources might be particularly useful when discussing the roadmap towards a mechanistic understanding of the inferential processes underlying the task responses [3,4] and their clinical implications.

      Thank you for highlighting these relevant papers. We appreciate your suggestion and have now cited them in the Limitations and Future directions paragraph (lines 852-863).

      (4) In this version of the paper, we only see plots that illustrate ∆ scores, averaged across pain intensities - to better understand participant responses and the relationship with stimulus intensity, it would be helpful to see a more descriptive plot of task behavior (e.g. stimulus intensity and raw pain ratings)

      To directly address the reviewer’s request, we now provide additional descriptive plots in the supplementary material of the revised manuscript, showing raw pain ratings across different stimulus intensities and feedback conditions. These plots offer a clearer view of participant behavior without averaging across pain levels, helping to better illustrate the relationship between stimulus intensity and reported pain.

      Mogil, J. S. (2020). Qualitative sex differences in pain processing: emerging evidence of a biased literature. Nature Reviews Neuroscience, 21(7), 353-365. https://www.nature.com/articles/s41583-020-0310-6

      Sorge, R. E., & Strath, L. J. (2018). Sex differences in pain responses. Current Opinion in Physiology, 6, 75-81. https://www.sciencedirect.com/science/article/abs/pii/S2468867318300786?via%3Dihub

      Unal, O., Eren, O. C., Alkan, G., Petzschner, F. H., Yao, Y., & Stephan, K. E. (2021). Inference on homeostatic belief precision. Biological Psychology, 165, 108190.

      Allen, M., Levy, A., Parr, T., & Friston, K. J. (2022). In the body's eye: the computational anatomy of interoceptive inference. PLoS Computational Biology, 18(9), e1010490.

      Stephan, K. E., Manjaly, Z. M., Mathys, C. D., Weber, L. A., Paliwal, S., Gard, T., ... & Petzschner, F. H. (2016). Allostatic self-efficacy: A metacognitive theory of dyshomeostasis-induced fatigue and depression. Frontiers in human neuroscience, 10, 550.

      Friston, K. J., Stephan, K. E., Montague, R., & Dolan, R. J. (2014). Computational psychiatry: the brain as a phantastic organ. The Lancet Psychiatry, 1(2), 148-158.

      Eckert, A. L., Pabst, K., & Endres, D. M. (2022). A Bayesian model for chronic pain. Frontiers in Pain Research, 3, 966034.

      We thank the reviewer for highlighting these relevant references which have now been integrated in the revised version of the manuscript.

      Recommendations For The Authors: 

      Reviewer #1 (Recommendations For The Authors):

      At the time I was reviewing this paper, I could not think of a detailed experiment that would answer my biggest concern: Is this a manipulation of the brain's interoceptive data integration, or rather a manipulation of participants' alertness which indirectly influences their pain prediction?

      One incomplete idea that came to mind was delivering this signal in a more "covert" manner (though I am not sure it will suffice), or perhaps correlating the effect size of a participant with their interoceptive abilities, as measured in a different task or through a questionnaire.... Another potential idea is to tell participants that  this is someone else's HR that they hear and see if that changes the results (though requires further thought). I leave it to the authors to think further, and perhaps this is to be answered in a different paper - but if so, I am sorry to say that I do not think the claims can remain as they are now, and the paper will need a revision of its arguments, unfortunately. I urge the authors to ask further questions if my point about the concern was not made clear enough for them to address or contemplate it.

      We thank the reviewer for raising this important point. As detailed in our previous response, this point invites an important clarification regarding the role of cardiac deceleration in threat processing. Rather than serving as an interoceptive input from which the brain infers the likelihood of a forthcoming aversive event, heart rate deceleration is better described as an output of an already ongoing predictive process, as it reflects an allostatic adjustment of the bodily state aimed at minimizing the impact of the predicted perturbation (e.g., pain) and preventing sympathetic overshoot. It would be maladaptive for the brain to use a decelerating heart rate as evidence of impending threat, since this would paradoxically trigger further parasympathetic activation, initiating a potentially destabilizing feedback loop. Conversely, increased heart rate represents an evolutionarily conserved cue for arousal, threat, and pain. Our results therefore align with the idea that the brain treats externally manipulated increases in cardiac signals as congruent with anticipated sympathetic activation, prompting a compensatory autonomic and perceptual response consistent with embodied predictive processing frameworks (e.g., Barrett & Simmons, 2015; Seth, 2013).

      We would also like to reiterate that our results cannot be explained by general differences induced by the heart rate sounds relative to the exteroceptive sounds (see also our detailed comments to your point above, and our response to a similar point from Reviewer 3), for three main reasons.

      (1) No main effect of Experiment on pain ratings:

      If the cardiac feedback had simply increased arousal or attention in a general (non-specific) way, we would expect a main effect of Experiment (i.e., interoceptive vs exteroceptive condition) on pain intensity or unpleasantness ratings, regardless of feedback frequency. However, such a main effect was never observed. Instead, effects were specific to the manipulation of feedback frequency.

      (2) Heart rate as an arousal measure:

      Heart rate (HR) is a classical physiological index of arousal. If there had been an unspecific increase in arousal in the interoceptive condition, we would expect a main effect of Experiment on HR. However, no such main effect was found. Instead, our HR analyses revealed a significant interaction between feedback and experiment, suggesting that HR changes depended specifically on the feedback manipulation rather than reflecting a general arousal increase.

      (3) Arousal predicts faster, not slower, heart rates

      In Experiment 1, faster interoceptive cardiac feedback led to a slowdown in heart rates, both when compared to slower feedback and when compared to congruent cardiac feedback. This is in line with the predicted compensatory response to faster heart rates. In contrast, if faster feedback had merely increased general arousal, heart rates should have increased rather than decreased, as indicated by several prior studies (for a review, see Forte et al., 2022), which predict the opposite pattern of responses to the one found in Experiment 1.

      Taken together, these findings indicate that the effects observed are unlikely to be driven by unspecific arousal or attention mechanisms, but rather are consistent with feedback-specific modulations, in line with our interoceptive inference framework. We now integrate these considerations in the general discussion (lines 796-830).

      Barrett, L. F., & Simmons, W. K. (2015). Interoceptive predictions in the brain. Nature reviews neuroscience, 16(7), 419-429.

      Forte, G., Troisi, G., Pazzaglia, M., Pascalis, V. D., & Casagrande, M. (2022). Heart rate variability and pain: a systematic review. Brain sciences, 12(2), 153.

      Seth, A. K. (2013). Interoceptive inference, emotion, and the embodied self. Trends in Cognitive Sciences, 17(11), 565-573.

      Additional recommendations:

      Major (in order of importance):

      (1) Number of trials per participant, per condition: as I mentioned, having only 6 trials for each condition is very little. The minimum requirement to accept so few trials would be to show data about the distribution of participants' responses to these trials, both per pain intensity (which was later averaged across - another issue discussed later), and across pain intensities, and see that it allows averaging across and that it is not incredibly variable such that the mean is unreliable.

      We appreciate the reviewer’s concern regarding the limited number of trials per condition. This choice was driven by both theoretical and methodological considerations.

      First, as is common in body illusion paradigms (e.g., the Rubber Hand Illusion, Botvinick & Cohen, 1998; the Full Body Illusion, Ehrsson, 2007; the cardio-visual full body illusion, Pratviel et al., 2022), only a few trials are typically employed due to the immediate effects these manipulations elicit. Repetition can reduce the strength of the illusion through habituation, increased awareness, or loss of believability.

      Second, the experiment was already quite long (1.5h to 2h per participant) and cognitively demanding. It would not have been feasible to expand it further without compromising data quality due to fatigue, attentional decline, or participant disengagement.

      Third, the need for a large number of trials is more relevant when using implicit measures such as response times or physiological indices, which are typically indirectly related to the psychological constructs of interest. In contrast, explicit ratings are often more sensitive and less noisy, and thus require fewer repetitions to yield reliable effects (e.g., Corneille et al., 2024).

      Importantly, we also addressed your concern analytically. We therefore ran linear mixed-effects model analyses across all dependent variables (see Supplementary Materials), with Trial (i.e., the rank order of each trial) included as a predictor to account for potential time-on-task effects such as learning, adaptation, or fatigue (e.g., Möckel et al., 2015). These models captured trial-by-trial variability and allowed us to test for systematic changes in heart rate (HR) and pain ratings, including interactions with feedback conditions (e.g., Kliegl et al., 2011; Baayen & Milin, 2010; Ambrosini et al., 2023). The consistent effects of Trial suggest that repetition dampens the illusion, reinforcing our decision to limit the number of exposures.

      In the interoceptive experiment, these analyses revealed a significant Feedback × Trial interaction (F(3, 711.19) = 6.16, p < .001), indicating that the effect of feedback on HR was not constant over time. As we suspected, and in line with other illusion-like effects, the difference between Faster and Slower feedback, which was significant early on (estimate = 1.68 bpm, p = .0007), decreased by mid-session (estimate = 0.69 bpm, p = .0048), and was no longer significant in later trials (estimate = 0.30 bpm, p = .4775). At the end of the session, HR values in the Faster and Slower conditions even numerically converged (Faster: M = 74.4, Slower: M = 74.1), and the non-significant contrast confirms that the difference had effectively vanished (for further details about slope estimation, see Supplementary material).

      The same pattern emerged for pain-unpleasantness ratings. A significant Feedback × Trial interaction (F(3, 675.33) = 3.44, p = .0165) revealed that the difference between Faster and Slower feedback was strongest at the beginning of the session and progressively weakened. Specifically, Faster feedback produced higher unpleasantness than Slower in early trials (estimate = -0.28, p = .0058) and mid-session (estimate = -0.19, p = .0001), but this contrast was no longer significant in the final trials, wherein all the differences between active feedback conditions vanished (all ps > .55).

      Finally, pain intensity ratings yielded similar results. A significant Feedback × Trial interaction (F(3, 669.15) = 9.86, p < .001) showed that the Faster vs. Slower difference was greatest at the start of the session and progressively vanished over trials. In early trials, Faster feedback exceeded Slower (estimate = -8.33, p = .0001); by mid-session this gap had shrunk to 4.48 points (p < .0001); and in the final trials it was no longer significant (all ps > .94).

      Taken together, our results show that the illusion induced by Faster relative to Slower feedback fades with repetition; adding further trials would likely have masked this key effect, confirming the methodological choice to restrict each condition to fewer exposures. To conclude, given that this is the first study to investigate an illusion of pain using a heartbeat-based manipulation, we intentionally limited repeated exposures to preserve the integrity of the illusion. The use of mixed models as complementary analyses strengthens the reliability of our conclusions within these necessary design constraints. We now clarify this point in the Procedure paragraph (lines 328-335).

      Ambrosini, E., Peressotti, F., Gennari, M., Benavides-Varela, S., & Montefinese, M. (2023). Aging-related effects on the controlled retrieval of semantic information. Psychology and Aging, 38(3), 219.

      Baayen, R. H., & Milin, P. (2010). Analyzing reaction times. International Journal of Psychological Research, 3(2), 12-28.

      Botvinick, M., & Cohen, J. (1998). Rubber hands ‘feel’ touch that eyes see. Nature, 391(6669), 756.

      Corneille, O., & Gawronski, B. (2024). Self-reports are better measurement instruments than implicit measures. Nature Reviews Psychology, 3(12), 835–846.

      Ehrsson, H. H. (2007). The experimental induction of out-of-body experiences. Science, 317(5841), 1048-1048.

      Kliegl, R., Wei, P., Dambacher, M., Yan, M., & Zhou, X. (2011). Experimental effects and individual differences in linear mixed models: Estimating the relation of spatial, object, and attraction effects in visual attention. Frontiers in Psychology, 1, 238. https://doi.org/10.3389/fpsyg.2010.00238

      Möckel, T., Beste, C., & Wascher, E. (2015). The effects of time on task in response selection-an ERP study of mental fatigue. Scientific reports, 5(1), 10113.

      Pratviel, Y., Bouni, A., Deschodt-Arsac, V., Larrue, F., & Arsac, L. M. (2022). Avatar embodiment in VR: Are there individual susceptibilities to visuo-tactile or cardio-visual stimulations?. Frontiers in Virtual Reality, 3, 954808.

      (2) Using different pain intensities: what was the purpose of training participants on correctly identifying pain intensities? You state that the aim of having 5 intensities is to cause ambiguity. What is the purpose of making sure participants accurately identify the intensities? Also, why then only 3 intensities were used in the test phase? The rationale for these is lacking.

      We thank the reviewer for raising these important points regarding the use of different pain intensities. The purpose of using five levels during the calibration and training phases was to introduce variability and increase ambiguity in the participants’ sensory experience. This variability aimed to reduce predictability and prevent participants from forming fixed expectations about stimulus intensity, thereby enhancing the plausibility of the illusion. It also helped prevent habituation to a single intensity and made the manipulation subtler and more credible. We had no specific theoretical hypotheses about this manipulation. Regarding the accuracy training, although the paradigm introduced ambiguity, it was important to ensure that participants developed a stable and consistent internal representation of the pain scale. This step was essential to control for individual differences in sensory discrimination and to ensure that illusion effects were not confounded by participants’ inability to reliably distinguish between intensities.

      As for the use of only three pain intensities in the test phase, the rationale was to focus on a manageable subset that still covered a meaningful range of the stimulus spectrum. This approach followed the same logic as Iodice et al. (2019, PNAS), who used five (rather than all seven) intensity levels during their experimental session. Specifically, they excluded the extreme levels (45 W and 125 W) used during baseline, to avoid floor and ceiling effects and to ensure that each test intensity could be paired with both a “slower” and a “faster” feedback from an adjacent level. This would not have been possible at the extremes of the intensity range, where no adjacent level exists in one direction. We adopted the same strategy to preserve the internal consistency and plausibility of our feedback manipulation.

      We further clarified these points in the revised manuscript (lines 336-342).

      Iodice, P., Porciello, G., Bufalari, I., Barca, L., & Pezzulo, G. (2019). An interoceptive illusion of effort induced by false heart-rate feedback. Proceedings of the National Academy of Sciences, 116(28), 13897-13902.

      (3) Averaging across pain intensities: this is, in my opinion, not the best approach as by matching a participant's specific responses to a pain stimulus before and after the manipulation, you can more closely identify changes resulting from the manipulation. Nevertheless, the minimal requirement to do so is to show data of distributions of pain intensities so we know they did not differ between conditions per participant, and in general - as you indicate they were randomly distributed.

      We thank the reviewer for this thoughtful comment. The decision to average across pain intensities in our main analyses was driven by the specific aim of the study: we did not intend to determine at which exact intensity level the illusion was most effective, and the limited number of trials makes such an analysis difficult. Rather, we introduced variability in nociceptive input to increase ambiguity and reduce predictability in the participants’ sensory experience. This variability was critical for enhancing the plausibility of the illusion by preventing participants from forming fixed expectations about stimulus strength. Additionally, using a range of intensities helped to minimize habituation effects and made the feedback manipulation subtler and more credible.

      That said, we appreciate the reviewer’s point that matching specific responses before and after the manipulation at each intensity level could provide further insights into how the illusion operates across varying levels of nociceptive input. We therefore conducted supplementary analyses using linear mixed-effects models in which all three stimulus intensities were included as a continuous fixed factor. This allowed us to examine whether the effects of feedback were intensity-specific or generalized across different levels of stimulation.

      These analyses revealed that, in both the interoceptive and exteroceptive experiments, the effect of feedback on pain ratings was significantly modulated by stimulus intensity, as indicated by a Feedback × Stimulus Intensity interaction (Interoceptive: unpleasantness F(3, 672.32) = 3.90, p = .0088; intensity ratings F(3, 667.07) = 3.46, p = .016; Exteroceptive: unpleasantness F(3, 569.16) = 8.21, p < .0001; intensity ratings F(3, 570.65) = 3.00, p = .0301). The interaction term confirmed that the impact of feedback varied with stimulus strength, yet the pattern that emerged in each study diverged markedly.

      In the interoceptive experiment, the accelerated-heartbeat feedback (Faster) systematically heightened pain relative to the decelerated version (Slower) at every level of noxious input: for low-intensity trials Faster exceeded Slower by 0.22 ± 0.08 points on the unpleasantness scale (t = 2.84, p = .0094) and by 3.87 ± 1.69 units on the numeric intensity scale (t = 2.29, p = .0448); at the medium intensity the corresponding differences were 0.19 ± 0.05 (t = -4.02, p = .0001) and 4.52 ± 1.06 (t = 4.28, p < .0001); and even at the highest intensity, Faster still surpassed Slower by 0.17 ± 0.08 on unpleasantness (t = 2.21, p = .0326) and by 5.16 ± 1.67 on intensity (t = 3.09, p = .0032). This uniform Faster > Slower pattern indicates that the interoceptive manipulation amplifies perceived pain in a stimulus-independent fashion.

      The exteroceptive control experiment told a different story: the Faster-Slower contrast reached significance only at the most noxious setting (unpleasantness: estimate = 0.24 ± 0.07, t = -3.24, p = .0019; intensity: estimate = -5.14 ± 1.82, t = 2.83, p = .0072) and was absent at the medium level (intensity: p = .29; unpleasantness: p = .45), while at the lowest level Slower actually produced numerically higher unpleasantness (2.56 versus 2.40) and intensity ratings (44.7 versus 42.2).

      Thus, although both studies show that feedback effects depend on the actual nociceptive level of the stimulus, the results suggest that the faster vs. slower interoceptive feedback manipulation delivers a robust and intensity-invariant enhancement of pain, whereas the exteroceptive cue exerts a sporadic influence that surfaces solely under maximal stimulation.

      These new results are now included in the Supplementary Materials, where we report the detailed analyses for both the Interoceptive and Exteroceptive experiments on the Likert unpleasantness ratings and the numeric pain intensity ratings.

      (4) Sample size: It seems that the sample size was determined after the experiment was conducted, as the required N is identical to the actual N. I would be transparent about that, and say that retrospective sample size analyses support the ability of your sample size to support your claims. In general, a larger sample size than is required is always recommended, and if you were to run another study, I suggest you increase the sample size.

      As also addressed in our responses to your later comments (see our detailed reply regarding the justification of SESOI and power analyses), the power analyses reported here were not post-hoc power analyses based on obtained results. In line with current recommendations (Anderson, Kelley & Maxwell, 2017; Albers & Lakens, 2018), we did not base our analyses on previously reported effect sizes, as these can carry considerable uncertainty, particularly for novel effects where robust estimates are lacking. Instead, we used sensitivity analyses, conducted using the sensitivity analysis function in G*Power (Version 3.1). Sensitivity analyses allow us to report effect sizes that our design was adequately powered (90%) to detect, given the actual sample size, desired power level, and the statistical test used in each experiment (Lakens, 2022). Following further guidance (Lakens, 2022), we also report the smallest effect size of interest (SESOI) that these tests could reliably detect.

      This approach indicated that our design was powered to detect effect sizes of d = 0.57 in Experiment 1 and d = 0.62 in Experiment 2, with corresponding SESOIs of d = 0.34 and d = 0.37, respectively. The slightly higher value in Experiment 2 reflects the greater number of participants excluded (from an equal number originally tested) based on pre-specified criteria. Importantly, both experiments were well-powered to detect effects smaller than those typically reported in similar top-down pain modulation studies, where effect sizes around d = 0.7 have been observed (Iodice et al., 2019).

      We have now clarified this rationale in the revised manuscript, Experiment 1- Methods - Participants (lines 208-217).

      Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of experimental social psychology, 74, 187-195.

      Anderson, S. F., Kelley, K., & Maxwell, S. E. (2017). Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty. Psychological Science, 28(11), 1547-1562. https://doi.org/10.1177/0956797617723724

      Lakens, D. (2022). Sample size justification. Collabra: psychology, 8(1), 33267.

      (5) Analysis: the use of change scores instead of the actual scores is not recommended, as it is a loss of data, but could have been ignored if it didn't have a significant effect on the analyses conducted. Instead of conducting an RM-ANOVA of conditions (faster, slower, normal heartbeats) across participants, finding significant interaction, and then moving on to specific post-hoc paired comparisons between conditions, the authors begin with the change score but then move on to conduct the said paired comparisons without ever anchoring these analyses in an appropriate larger ANOVA. I strongly recommend the use of an ANOVA but if not, the authors would have to correct for multiple comparisons at the minimum.

      We thank the reviewer for their comment regarding the use of change scores. These were originally derived from the difference between the slower and faster feedback conditions relative to the congruent condition. In line with the reviewer’s recommendation, we have now removed these difference-based change scores from the main analysis. The results remain identical. Please note that we have retained the normalization procedure, relative to each participant’s initial baseline in the no feedback trials, as it is widely used in the interoceptive and pain literature (e.g., Bartolo et al., 2013; Cecchini et al., 2020; Riello et al., 2019). This approach helps to control for interindividual variability and baseline differences by expressing each participant’s response relative to their no-feedback baseline. As before, normalization was applied across all dependent variables (heart rate, pain intensity, and pain unpleasantness).
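      As an illustration of this kind of baseline normalization, the sketch below subtracts each participant's mean from the no-feedback trials; this is one plausible implementation with placeholder column names, not a reproduction of the exact normalization formula used in the manuscript.

```python
# Illustrative baseline normalization relative to the no-feedback trials.
# Column names are placeholders; the exact formula in the manuscript may differ
# (e.g., percentage change rather than simple subtraction).
import pandas as pd

df = pd.read_csv("trial_level_data.csv")  # hypothetical long-format file

# Per-participant means of each dependent variable in the no-feedback trials.
baseline = (
    df[df["Feedback"] == "no-feedback"]
    .groupby("Subject")[["HR", "PainIntensity", "PainUnpleasantness"]]
    .mean()
    .add_suffix("_baseline")
    .reset_index()
)

df = df.merge(baseline, on="Subject")
for var in ["HR", "PainIntensity", "PainUnpleasantness"]:
    df[f"{var}_norm"] = df[var] - df[f"{var}_baseline"]
```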

      To address the reviewer’s concern about statistical validity, we now first report a 1-factor repeated-measures ANOVA (Greenhouse-Geisser corrected) for each dependent variable, with feedback condition (slower, congruent, faster) as the within-subject factor.

      These show in each case a significant main effect, which we then follow with planned paired-sample t-tests comparing:

      Faster vs. slower feedback (our main hypothesis, as these manipulations are expected to produce the largest difference and thus the most powerful test of our hypothesis; see response to Reviewer 3),

      Faster vs. congruent and slower vs. congruent (to test for potential asymmetries, as suggested by previous false heart rate feedback studies).

      The rationale of these analyses is further discussed in the Data Analysis of Experiment 1 (lines 405-437).

      Although we report the omnibus one-factor RM-ANOVAs to satisfy conventional expectations, we note that such tests are not statistically necessary, nor even optimal, when the research question is fully captured by a priori, theory-driven contrasts. Extensive methodological work shows that, in this situation, going straight to planned contrasts maximises power without inflating Type I error and avoids the logical circularity of first testing an effect one does not predict (e.g., Rosenthal & Rosnow, 1985). In other words, an omnibus F is warranted only when one wishes to protect against unspecified patterns of differences. Here our hypotheses were precise (Faster ≠ Slower; potential asymmetry relative to Congruent), so the planned paired comparisons would have sufficed statistically. We therefore include the RM-ANOVAs solely for readers who expect to see them, but our inferential conclusions rest on the theoretically motivated contrasts.

      Rosenthal, R., & Rosnow, R. L. (1985). Contrast analysis. New York: Cambridge University Press.
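      To make the analysis pipeline described above concrete, here is a minimal sketch (one-factor repeated-measures ANOVA followed by planned paired contrasts) in Python; the column names and file name are assumed placeholders, and sphericity corrections would be applied separately, since AnovaRM does not provide them.

```python
# Sketch of the reported pipeline: omnibus RM-ANOVA over the three feedback
# conditions, then planned paired t-tests. Column and file names are placeholders.
import pandas as pd
from scipy import stats
from statsmodels.stats.anova import AnovaRM

df = pd.read_csv("condition_means.csv")  # hypothetical: one row per subject x feedback level

# Omnibus one-factor repeated-measures ANOVA (Greenhouse-Geisser correction would
# be applied separately; AnovaRM reports uncorrected degrees of freedom).
anova = AnovaRM(df, depvar="Rating", subject="Subject", within=["Feedback"]).fit()
print(anova.anova_table)

# Planned contrasts: faster vs. slower (main hypothesis), and each vs. congruent.
wide = df.pivot(index="Subject", columns="Feedback", values="Rating")
for a, b in [("faster", "slower"), ("faster", "congruent"), ("slower", "congruent")]:
    t, p = stats.ttest_rel(wide[a], wide[b])
    print(f"{a} vs {b}: t = {t:.2f}, p = {p:.4f}")
```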

      (6) Correlations: were there correlations between subjects' own heartbeats (which are considered a predictive cue) and pain perceptions? This is critical to show that the two are in fact related.

      We thank the reviewer for this thoughtful suggestion. We agree that testing for a correlation between anticipatory heart rate responses and subjective pain ratings is theoretically relevant. However, we have not conducted this analysis in the current manuscript, as our study was not designed or powered to reliably detect such individual differences. As noted by Hedge, Powell, and Sumner (2018), robust within-subject experimental designs tend to minimize between-subject variability in order to detect clear experimental effects. This reduction in variance at the between-subject level limits the reliability of correlational analyses involving trait-like or individual response patterns. This issue, known as the reliability paradox, highlights that measures showing robust within-subject effects may not show stable individual differences; correlations with other individual-level variables (such as the subjective ratings used here) therefore require much larger samples than are available here, or commonly used in the literature, to produce interpretable results, typically more than 200 participants. For these reasons, we believe that running such an analysis on our current dataset would not yield informative results and could be misleading.

      We now explicitly acknowledge this point in the revised version of the manuscript (Limitations and future directions, lines 832-851) and suggest that future studies specifically designed to examine individual variability in anticipatory physiological responses and pain perception would be better suited to address this question.

      Hedge, C., Powell, G., & Sumner, P. (2018). The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behavior Research Methods, 50(3), 1166-1186. https://doi.org/10.3758/s13428-017-0935-1

      (7) The direct comparison between studies is great! and finally the use of ANOVA - but why without the appropriate post-hoc tests to support the bold claims in lines 542-544? This is needed. Same for 556-558.

      We apologize if our writing was not clear here, but the result of the ANOVAs fully warrants the claims in 542-544 (now lines 616-618) and 556-558 (now lines 601-603).

      In a 2x2 design, the interaction term is mathematically identical to comparing the difference induced by Factor 1 at one level of Factor 2 with the same difference induced at the other level of Factor 2. In our 2x2 analysis with the factors Experiment (Cardiac feedback, Exteroceptive feedback - between participants) and Feedback Frequency (faster, slower - within participants), the interaction therefore directly tests whether the effect of Feedback frequency differs statistically (i.e., is larger or smaller) between participants in the interoceptive and exteroceptive experiments. Thus, the conclusion that “faster feedback affected the perceptual bias more strongly in the Experiment 1 than in Experiment 2” captures the outcome of the significant interaction exactly. Indeed, this test would be statistically equivalent (and would produce identical p values) to a simple between-group t-test comparing each participant’s difference between the faster and slower feedback in the interoceptive group with the analogous difference between the faster and slower feedback in the exteroceptive group, as illustrated in standard examples of factorial analysis (see, e.g., Maxwell, Delaney and Kelley, 2017).

      Please note that, for the above reason, mathematically the conclusion of larger effects in one experiment than the other is licensed by the significant interaction even without follow-up t-tests. However, if the reader would like to see these tests, they are simply the main analysis results reported in each of the two experiment sections, where significant (t-test) differences between faster and slower feedback were induced with interoceptive cues (Experiment 1) but not exteroceptive cues (Experiment 2). Reporting them in the between-experiment comparison section again would therefore be redundant.
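      To illustrate this equivalence with a toy example (simulated data, not our dataset), the sketch below runs the between-group t-test on faster-minus-slower difference scores; the corresponding 2 × 2 mixed-ANOVA interaction would give F = t² and the same p-value.

```python
# Toy illustration (simulated data): in a 2 (Experiment, between) x 2 (Feedback,
# within) design, the interaction test is equivalent to an independent-samples
# t-test on each participant's faster-minus-slower difference score.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 40
diff_intero = rng.normal(0.6, 1.0, n)  # simulated faster - slower differences, Experiment 1
diff_extero = rng.normal(0.0, 1.0, n)  # simulated faster - slower differences, Experiment 2

t, p = stats.ttest_ind(diff_intero, diff_extero)
print(f"Between-group t-test on difference scores: t = {t:.2f}, p = {p:.4f}")
# The mixed-ANOVA Experiment x Feedback interaction yields F = t**2 with the same p-value.
print(f"Equivalent interaction F = {t**2:.2f}")
```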

      To avoid this lack of clarity, we have now re-written the results section of each experiment. First, as noted above, we now precede our main hypothesis test - the crucial t-test comparing heartrate and pain ratings after faster vs slower feedback - with an ANOVA including all three levels (faster, congruent, slower feedback). Moreover, we removed the separate between-experiment comparison section. Instead, in the Result section of the exteroceptive Experiment 2, we now directly compare the (absent or reversed) effects of faster vs slower feedback directly, with a between-groups t-test, with the present effects in the interoceptive Experiment 1. This shows conclusively, and hopefully more clearly, that the effects in both experiments differ. We hope that this makes the logic of our analyses clearer.

      Maxwell, S. E., Delaney, H. D., & Kelley, K. (2017). Designing experiments and analyzing data: A model comparison perspective. Routledge.

      (8) The discussion is missing a limitation paragraph.

      Thank you for the suggestion. We have now added a dedicated limitations paragraph in the Discussion section (lines 832-890).

      Additional recommendations:

      Minor (chronological order):

      (1) Sample size calculations for both experiments: what was the effect size based on? A citation or further information is needed. Also, clarify why the effect size differed between the two experiments.

      Please see our response above.

      (2) "Participants were asked to either not drink coffee or smoke cigarettes" - either is implying that one of the two was asked. I suspect it is redundant as both were not permitted.

      The intention was to restrict both behaviors, so we have corrected the sentence to clarify that participants were asked not to drink coffee or smoke cigarettes before the session.

      (3) Normalization of ECG - what exactly was normalized, namely what measure of the ECG?

      The normalized measure was the heart rate, expressed in beats per minute (bpm). We now clarify this in the Data Analysis section of Experiment 1 (“Measures of the heart rate recorded with the ECG (beats per minute) in the feedback phase were normalized”).

      (4) Line 360: "Mean Δ pain unpleasantness ratings were analysed analogously" - this is unclear, if already described in methods then should be removed here, if not - should be further explained here.

      Thank you for your observation. We are no longer using change scores.

      (5) Lines 418-420: "Consequently, perceptual and cardiac modulations associated with the feedback manipulation should be reduced over the exposure to the faster exteroceptive sound." - why reduced and not unchanged? I didn't follow the logic.

      We chose the term “reduced” rather than “unchanged” to remain cautious in our interpretation. Statistically, the absence of a significant effect in one experiment does not necessarily mean that no effect is present; it simply means we did not detect one. For this reason, we avoided using language that would suggest a complete absence of modulation. This phrasing also more closely matches the results of the between-experiment comparisons that we report in the Results section of Experiment 2, which can in principle only show that the effect in Experiment 2 was smaller than that in Experiment 1, not that it was absent. Even the TOST analysis that we use to show the absence of an effect can only show that any effect that is present is smaller than we could reasonably expect to detect with our experimental design, not that it is completely absent.

      Also, on a theoretical level, pain is a complex, multidimensional experience influenced not only by sensory input but also by cognitive, emotional, social and expectancy factors. For this reason, we considered it important to remain open to the possibility that other mechanisms beyond the misleading cardiac prior induced by the feedback might have contributed to the observed effects. If such other influences had contributed to the induced differences between faster and slower feedback in Experiment 1, some remainder of this difference could have been observed in Experiment 2 as well.

      Thus, for both statistical and theoretical reasons, we were careful to predict a reduction of the crucial difference, not its complete elimination. However, to allow for the possibility that effects could be completely eliminated, we now write that “perceptual and cardiac modulations associated with the feedback manipulation should be reduced or eliminated with exteroceptive feedback”.

      (6) Study 2 generation of feedback - was this again tailored per participants (25% above and beyond their own HR at baseline + gradually increasing or decreasing), or identical for everyone?

      Yes, in Study 2 the generation of feedback was tailored to each participant, mirroring the procedure of Experiment 1. Specifically, the feedback was set to 25% above or below each participant's baseline heart rate, with the feedback gradually increasing or decreasing. This individualized approach ensured that each participant experienced feedback relative to their own baseline heart rate. We now clarify this in the Methods section (lines 306-318).
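      For illustration, a minimal sketch of how such individually tailored feedback rates could be computed is shown below; only the ±25% endpoints relative to baseline follow the description above, while the ramp length is an assumed placeholder.

```python
# Illustrative computation of individually tailored feedback rates. Only the
# +/- 25% endpoints relative to baseline follow the text; the number of ramp
# steps is an assumed placeholder.
import numpy as np

def feedback_trajectory(baseline_bpm: float, direction: str, n_steps: int = 10) -> np.ndarray:
    """Return a gradually increasing or decreasing sequence of feedback rates (bpm)."""
    target = baseline_bpm * (1.25 if direction == "faster" else 0.75)
    return np.linspace(baseline_bpm, target, n_steps)

print(feedback_trajectory(70.0, "faster"))  # ramps from 70 bpm toward 87.5 bpm
print(feedback_trajectory(70.0, "slower"))  # ramps from 70 bpm toward 52.5 bpm
```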

      (7) I did not follow why we need the TOST and how to interpret its results.

      We thank the reviewer for raising this important point. In classical null hypothesis significance testing (NHST), a non-significant p-value (e.g., p > .05) only indicates that we failed to find a statistically significant difference, not that there is no difference. It therefore does not allow us to conclude that two conditions are equivalent – only that we cannot confidently say they are different. In our case, to support the claim that exteroceptive feedback does not induce perceptual or physiological changes (unlike interoceptive feedback), we needed a method to test for the absence of a meaningful effect, not just the absence of a statistically detectable one.

      The TOST (Two One-Sided Tests) procedure reverses the logic of NHST by testing whether the observed effect falls within a predefined equivalence interval, defined here by the smallest effect size of interest (SESOI) that is in principle detectable with our design parameters (e.g., type of test, number of participants). This approach is necessary when the goal is not to detect a difference, but rather to demonstrate that an observed effect is so small that it can be considered negligible – or at least smaller than we could in principle expect to observe in the given experiment. We used the TOST procedure in Experiment 2 to test for statistical equivalence between the effects of faster and slower exteroceptive feedback on pain ratings and heart rate.

      We hope that the clearer explanation now provided in data analysis of Experiment 2 section (lines 5589-563) fully addresses the reviewer’s concern.
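      For readers unfamiliar with the procedure, a minimal sketch of a paired TOST in Python (statsmodels) is shown below; the data are simulated and the equivalence bounds are placeholders standing in for the SESOI-derived bounds described above.

```python
# Minimal sketch of the TOST (two one-sided tests) equivalence procedure for
# paired data. Data are simulated; the equivalence bounds are placeholders for
# the SESOI-derived bounds described in the text.
import numpy as np
from statsmodels.stats.weightstats import ttost_paired

rng = np.random.default_rng(1)
faster = rng.normal(50.0, 10.0, 30)         # simulated ratings after faster feedback
slower = faster + rng.normal(0.0, 3.0, 30)  # simulated ratings after slower feedback

low, upp = -4.0, 4.0  # illustrative equivalence bounds on the raw rating scale
p_overall, lower_test, upper_test = ttost_paired(faster, slower, low, upp)
print(f"TOST overall p = {p_overall:.3f}")
print(f"lower-bound test p = {lower_test[1]:.3f}, upper-bound test p = {upper_test[1]:.3f}")
# p_overall < .05 indicates that the faster-slower difference lies within the bounds,
# i.e. any true effect is smaller than the smallest effect of interest.
```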

      (8) Lines 492-3: authors say TOST significant, while p value = 0.065

      We thank the reviewer for spotting this inconsistency. The discrepancy was due to a typographical error in the initial manuscript. During the revision of the paper, we rechecked and fully recomputed all TOST analyses, and the results have now been corrected throughout the manuscript to accurately reflect the statistical outcomes. In particular, for the comparison of heart rate between faster and slower exteroceptive feedback in Experiment 2, the corrected TOST analysis now shows a significant equivalence, with the observed effect size being d = -0.19 (90% CI [-0.36, -0.03]) and both one-sided tests yielding p = .025 and p < .001. These updated results are reported in the revised Results section.

      Reviewer #2 (Recommendations For The Authors):

      I would suggest the authors revise their definition of pain in the introduction, since it is not always a protective experience. The new IASP definition specifically takes this into consideration.

      We thank the reviewer for this suggestion. We have updated the definition of pain in the Introduction (lines 2-4) to align with the most recent IASP definition (2020), which characterizes pain as “an unpleasant sensory and emotional experience associated with, or resembling that associated with, actual or potential tissue damage” (lines 51-53).

      The work on exteroceptive cues does not necessarily neglect the role of interoceptive sources of information, although it is true that it has been comparatively less studied. I suggest rephrasing this sentence to reflect this.

      We thank the reviewer for pointing out this important nuance. We agree that studies employing exteroceptive cues to modulate pain perception do not necessarily neglect the role of interoceptive sources, even though these are not always the primary focus of investigation. Our intention was not to imply a strict dichotomy, but rather to highlight that interoceptive mechanisms have been comparatively under-investigated. We have revised the sentence in the Introduction accordingly to better reflect this perspective (Introduction, lines 110-112, “Although interoceptive processes may have contributed to the observed effects, these studies did not specifically target interoceptive sources of information within the inferential process.”).

      The last paragraph of the introduction (lines 158-164) contains generalizations beyond what can be supported by the data and the results, about the generation of predictive processes and the origins of these predictions. The statements regarding the understanding of pain-related pathologies in terms of chronic aberrant predictions in the context of this study are also unwarranted.

      We have deleted this paragraph now.

      I could not find the study registration (at least in clinicaltrials.gov). This is curious considering that the hypothesis and the experimental design seem in principle well thought out, and a study pre-registration improves the credibility of the research (Nosek et al., 2018). I also find the choice for the smallest effect of interest (SESOI) odd. Besides the unnecessary variable transformations (more on that later), there is no justification for why that particular SESOI was chosen, or why it changes between experiments (Dienes, 2021; King, 2011), which makes the choice look arbitrary. The SESOI is a fundamental component of a priori power analysis (Lakens, 2022), and without rationale and preregistration, it is impossible to tell whether this is a case of SPARKing or not (Sasaki & Yamada, 2023).

      We acknowledge that the study was not preregistered. Although our hypotheses and design were developed a priori and informed by established theoretical frameworks, the lack of formal preregistration is a limitation.

The SESOI values for Experiments 1 and 2 were derived from sensitivity analyses based on the fixed design parameters (type of test, number of participants, alpha level) of our study, not from any post-hoc interpretation based on observed results - they can therefore not be a case of SPARKing. Following current recommendations (Anderson, Kelley & Maxwell, 2017; Albers & Lakens, 2018; Lakens, 2022), we avoided basing power estimates on published effect sizes, as no such values exist for novel paradigms and published estimates are typically inflated by publication and other biases. Instead, sensitivity analyses (using G*Power, v 3.1) allowed us to calculate, prospectively, the smallest effect each design could detect with 90% power, given the actual sample size, test type, and α level. Because more participants were excluded in Experiment 2, that design can only detect slightly larger effects (d = 0.62) than Experiment 1 (d = 0.57). Please note that both studies therefore remain well powered to capture effects of the magnitude typically reported in previous research using feedback manipulations to explore interoceptive illusions (e.g., Iodice et al., 2019, d ≈ 0.7).

      We have added this clarification to the Participants section of Experiment 1 (Lines 208-217).
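For illustration, the same kind of sensitivity computation can be sketched with statsmodels (the sample sizes below are placeholders rather than the actual numbers of included participants; G*Power 3.1 is the tool that was actually used):

```python
# Sketch: smallest effect size (Cohen's d) detectable with 90% power in a paired
# (one-sample) t-test at alpha = .05, for two hypothetical sample sizes.
from statsmodels.stats.power import TTestPower

analysis = TTestPower()
for label, n in [("Experiment 1", 34), ("Experiment 2", 28)]:  # placeholder n
    d_min = analysis.solve_power(effect_size=None, nobs=n, alpha=0.05,
                                 power=0.90, alternative="two-sided")
    print(f"{label}: smallest detectable d = {d_min:.2f}")
```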

      Anderson, S. F., Kelley, K., & Maxwell, S. E. (2017). Sample-Size Planning for More Accurate Statistical Power: A Method Adjusting Sample Effect Sizes for Publication Bias and Uncertainty. Psychological Science, 28(11), 1547-1562.

      Lakens, D. (2022). Sample size justification. Collabra: psychology, 8(1), 33267.

      Albers, C., & Lakens, D. (2018). When power analyses based on pilot data are biased: Inaccurate effect size estimators and follow-up bias. Journal of experimental social psychology, 74, 187-195.

      In the Apparatus subsection, it is stated that the intensity of the electrical stimuli was fixed at 2 ms. I believe the authors refer to the duration of the stimulus, not its intensity.

      You are right, thank you for pointing that out. The text should refer to the duration of the electrical stimulus, not its intensity. We have corrected this wording in the revised manuscript to avoid confusion.

      It would be interesting to report (in graphical form) the stimulation intensities corresponding to the calibration procedure for the five different pain levels identified for all subjects.

That's a good suggestion. We have included a supplementary figure showing the stimulation intensities corresponding to the five individually calibrated pain levels across all participants (Supplementary Figure 11).

It is questionable that researchers state that "pain and unpleasantness should be rated independently" but then the first level of the Likert scale for unpleasantness is "1=no pain". This is particularly relevant since stimulation (and specifically electrical stimulation) can be unpleasant but non-painful at the same time. Since the experiments were already performed, the researchers should at least explain this choice.

Thank you for raising this point. You are right that the label of “no pain” on the pain unpleasantness scale was not ideal, and we now acknowledge this in the text (lines 886-890). Please note that this was always the second rating that participants gave (after pain intensity), and the strongest results come from the first rating (pain intensity).

      Discussion.

      I did not find in the manuscript the rationale for varying the frequency of the heart rate by 25% (instead of any other arbitrary quantity).

      We thank the Reviewer for this observation, which prompted us to clarify the rationale behind our choice of a ±25% manipulation of heart rate feedback. False feedback paradigms have historically relied on a variety of approaches to modulate perceived cardiac signals. Some studies have adopted non-individualised values, using fixed frequencies (e.g., 60 or 110 bpm) to evoke states of calm or arousal, independently of participants’ actual physiology (Valins, 1966; Shahidi & Baluch, 1991; Crucian et al., 2000; Tajadura-Jiménez et al., 2008). Others have used the participant’s real-time heart rate as a basis, introducing accelerations or decelerations without applying a specific percentage transformation (e.g., Iodice et al., 2019). More recently, a growing body of work has employed percentage-based alterations of the instantaneous heart rate, offering a controlled and participant-specific manipulation. These include studies using −20% (Azevedo et al., 2017), ±30% (Dey et al., 2018), and even ±50% (Gray et al., 2007).

      These different methodologies - non-individualised, absolute, or proportionally scaled - have all been shown to effectively modulate subjective and physiological responses. They suggest that the impact of false feedback does not depend on a single fixed method, but rather on the plausibility and salience of the manipulation within the context of the task. We chose to apply a ±25% variation because it falls well within the most commonly used range and strikes a balance between producing a detectable effect and maintaining the illusion of physiological realism. The magnitude is conceptually justified as being large enough to shape interoceptive and emotional experience (as shown by Azevedo and Dey), yet small enough to avoid implausible or disruptive alterations, such as those approaching ±50%. We have now clarified this rationale in the revised Procedure paragraph of Experiment 1 (lines 306-318).
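As an illustration of what a percentage-based manipulation of this kind amounts to, a minimal sketch follows (the function and the online heart-rate estimate are hypothetical; only the ±25% factor and the three feedback conditions come from the manuscript):

```python
# Sketch: convert the participant's current heart rate into the inter-tone interval
# of the auditory feedback, scaled by -25%, 0% or +25% depending on condition.
def feedback_interval_ms(current_bpm: float, condition: str) -> float:
    factors = {"slower": 0.75, "congruent": 1.00, "faster": 1.25}
    feedback_bpm = current_bpm * factors[condition]
    return 60_000.0 / feedback_bpm  # milliseconds between feedback tones

# Example: at 70 bpm, 'faster' feedback plays a tone roughly every 686 ms.
print(round(feedback_interval_ms(70, "faster")))  # 686
```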

Azevedo, R. T., Bennett, N., Bilicki, A., Hooper, J., Markopoulou, F., & Tsakiris, M. (2017). The calming effect of a new wearable device during the anticipation of public speech. Scientific reports, 7(1), 2285.

      Crucian, G. P., Hughes, J. D., Barrett, A. M., Williamson, D. J. G., Bauer, R. M., Bowers, D., & Heilman, K. M. (2000). Emotional and physiological responses to false feedback. Cortex, 36(5), 623-647.

      Dey, A., Chen, H., Billinghurst, M., & Lindeman, R. W. (2018, October). Effects of manipulating physiological feedback in immersive virtual environments. In Proceedings of the 2018 Annual Symposium on Computer-Human Interaction in Play (pp. 101-111).

      Gray, M. A., Harrison, N. A., Wiens, S., & Critchley, H. D. (2007). Modulation of emotional appraisal by false physiological feedback during fMRI. PLoS one, 2(6), e546.

      Shahidi, S., & Baluch, B. (1991). False heart-rate feedback, social anxiety and self-attribution of embarrassment. Psychological reports, 69(3), 1024-1026.

      Tajadura-Jiménez, A., Väljamäe, A., & Västfjäll, D. (2008). Self-representation in mediated environments: the experience of emotions modulated by auditory-vibrotactile heartbeat. CyberPsychology & Behavior, 11(1), 33-38.

      Valins, S. (1966). Cognitive effects of false heart-rate feedback. Journal of personality and social psychology, 4(4), 400.

      The researchers state that pain ratings collected in the feedback phase were normalized to the no-feedback phase to control for inter-individual variability in pain perception, as established by previous research. They cite three studies involving smell and taste, of which the last two contain the same normalization presented in this study. However, unlike these studies, the outcomes here require no normalization whatsoever, because there should be no (or very little) inter-individual variability in pain intensity ratings. Indeed, pain intensity ratings in this study are anchored to 30, 50, and 70 / 100 as a condition of the experimental design. The researchers go to extreme lengths to ensure this is the case, by adjusting stimulation intensities until at least 75% of stimulation intensities are correctly matched to their pain ratings counterpart in the pre-experiment procedure. In other words, inter-individual variability in this study is in stimulation intensities, and not pain intensity ratings. Even if it could be argued that pain unpleasantness and heart rate still need to account for inter-individual variability, the best way to do this is by using the baseline (no-feedback) measures as covariates in a mixed linear model. Another advantage of this approach is that all the effects can be described in terms of the original scales and are readily understandable, and post hoc tests between levels can be corrected for multiple comparisons. On the contrary, the familywise error rate for the comparisons between conditions in the current analysis is larger than 5% (since there is a "main" paired t-test and additional "simple" tests).

      We disagree that there is little to no variability in the no feedback phase. Participants were tested in their ability to distinguish intensities in an initial pre-experiment calibration phase. In the no feedback phase, participants rated the pain stimuli in the full experimental context.

In the pre-experiment calibration phase, participants were tested only once in their ability to match five electrical-stimulation levels to the 0-100 NPS scale, before any feedback manipulation started. During this pre-experiment calibration we required that each level was classified correctly on ≥ 75% of the four repetitions; “correct” meant falling within ± 5 NPS units of the target anchor (e.g., a response of 25–35 was accepted for the 30/100 anchor). This procedure served one purpose only: to make sure that every participant entered the main experiment with three unambiguously distinguishable stimulation levels (30 / 50 / 70). We have integrated this point into the revised manuscript (lines 263-270).
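For concreteness, the calibration criterion described above can be written as a short check (a sketch with illustrative numbers and variable names, not the actual experiment code):

```python
# A stimulation level passes calibration if at least 75% of its four repetitions
# are rated within +/- 5 NPS units of the target anchor (30, 50 or 70).
def level_calibrated(ratings, anchor, tolerance=5, criterion=0.75):
    hits = sum(abs(r - anchor) <= tolerance for r in ratings)
    return hits / len(ratings) >= criterion

# Example: three of four ratings fall within 25-35 for the 30/100 anchor -> passes.
print(level_calibrated([28, 33, 31, 44], anchor=30))  # True
```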

Once the real task began, the context changed: shocks are unpredictable, attention is drawn to the heartbeat, and participants must judge both intensity and unpleasantness. In this full experimental setting the no-feedback block indeed shows considerable variability, even for the pain intensity ratings. Participants’ mean rating on the NPS scale was 46.4, with a standard deviation of 11.9 - thus participants vary quite strongly in their mean ratings (range 14.5 to 70). Moreover, while all participants show a positive correlation between actual intensities and their ratings (i.e., they rate the higher intensities as more intense than the lower ones), they vary in how much of the scale they use, with differences between reported highest and lowest intensities ranging between 8 and 91, for the participants showing the smallest and largest differences, respectively.

Thus, while we simplified the analysis to remove the difference scoring relative to the congruent trials and now use these congruent trials as an additional condition in the analysis, we retained the normalisation procedure to account for the between-participant variability that does in fact exist, and to ensure consistency with prior research (Bartolo et al., 2013; Cecchini et al., 2020; Riello et al., 2019) and with our a priori analysis plan.

However, to ensure we fully address your point here (and the other reviewers’ points about potential additional factors affecting the effects, like trial number and stimulus intensity), we also report an additional linear mixed-effects model analysis without normalization. It includes every feedback level as a condition (No-Feedback, Congruent, Slower, Faster), plus additional predictors for actual stimulus intensity and trial rank within the experiment (as suggested by the other reviewers). This confirms that all relevant results remain intact once baseline and congruent trials are explicitly included in the model.
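For illustration, a minimal sketch of a mixed-effects model of this general form in Python (the column names, the synthetic data, and the restriction to a random intercept are our assumptions for the example; they are not the authors’ data or their full model specification):

```python
# Sketch: non-normalised trial-level analysis with feedback condition, stimulus
# intensity and trial rank as fixed effects and a per-participant random intercept.
# All data below are synthetic placeholders that only illustrate the long format.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_participants, n_trials = 20, 40
df = pd.DataFrame({
    "participant": np.repeat(np.arange(n_participants), n_trials),
    "condition": rng.choice(["No-Feedback", "Congruent", "Slower", "Faster"],
                            size=n_participants * n_trials),
    "intensity": rng.choice([30, 50, 70], size=n_participants * n_trials),
    "trial_rank": np.tile(np.arange(n_trials), n_participants),
})
df["rating"] = df["intensity"] + rng.normal(0, 5, size=len(df))  # placeholder outcome

model = smf.mixedlm(
    "rating ~ C(condition, Treatment(reference='Slower')) + intensity + trial_rank",
    data=df,
    groups=df["participant"],  # random intercept per participant
)
print(model.fit().summary())
```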

In brief, cross-experiment analyses demonstrated that the Faster vs Slower contrast was markedly larger when the feedback was interoceptive than when it was exteroceptive. This held for heart-rate deceleration (b = 0.94 bpm, p = .005), for increases in unpleasantness (b = -0.16 Likert units, p = .015), and for pain-intensity ratings (b = -3.27 NPS points, p = .037).

These findings were then further confirmed by within-experiment analyses. Within the interoceptive experiment, the mixed model on raw scores replicated every original effect: heart rate was lower after Faster than Slower feedback (estimate = –0.69 bpm, p = .005); unpleasantness was higher after Faster than Slower feedback (estimate = 0.19, p < .001); pain-intensity ratings rose after Faster versus Slower feedback (estimate = -4.285, p < .001). In the exteroceptive experiment, however, none of these Faster–Slower contrasts reached significance for heart rate (all ps > .33), unpleasantness (all ps > .43) or intensity (all ps > .10). Because these effects remain significant even with No-Feedback and Congruent trials explicitly included in the model and vanish under exteroceptive control, the supplementary, non-normalised analyses confirm that the faster vs. slower interoceptive feedback uniquely lowers anticipatory heart rate while amplifying both intensity and unpleasantness of pain, independent of data transformation or reference conditions. Please see Supplementary analyses for further details.

      Bartolo, M., Serrao, M., Gamgebeli, Z., Alpaidze, M., Perrotta, A., Padua, L., Pierelli, F., Nappi, G., & Sandrini, G. (2013). Modulation of the human nociceptive flexion reflex by pleasant and unpleasant odors. PAIN®, 154(10), 2054-2059.

      Cecchini, M. P., Riello, M., Sandri, A., Zanini, A., Fiorio, M., & Tinazzi, M. (2020). Smell and taste dissociations in the modulation of tonic pain perception induced by a capsaicin cream application. European Journal of Pain, 24(10), 1946-1955.

      Riello, M., Cecchini, M. P., Zanini, A., Di Chiappari, M., Tinazzi, M., & Fiorio, M. (2019). Perception of phasic pain is modulated by smell and taste. European Journal of Pain, 23(10), 1790-1800.

      I could initially not find a rationale for bringing upfront the comparison between faster vs. slower HR acoustic feedback when in principle the intuitive comparisons would be faster vs. congruent and slower vs. congruent feedback. This is even more relevant considering that in the proposed main comparison, the congruent feedback does not play a role: since Δ outcomes are calculated as (faster - congruent) and (slower - congruent), a paired t-test between Δ faster and Δ slower outcomes equals (faster - congruent) - (slower - congruent) = (faster - slower). I later realized that the statistical comparison (paired t-test) of pain intensity ratings of faster vs. slower acoustic feedback is significant in experiment 1 but not in experiment 2, which in principle would support the argument that interoceptive, but not exteroceptive, feedback modulates pain perception. However, the "simple" t-tests show that faster feedback modulates pain perception in both experiments, although the effect is larger in experiment 1 (interoceptive feedback) compared to experiment 2 (exteroceptive feedback).

The comparison between faster and slower feedback is indeed crucial, and we regret not having made this clearer in the first version of the manuscript. As noted in our response to your point in the public review, this comparison is both statistically the most powerful and theoretically the most appropriate, as it controls for any influence of salience or surprise when heart rates deviate (in either direction) from what is expected. It therefore provides a clean measure of how much an accelerated heart rate affects pain perception and physiological responses, relative to an equal change in the opposite direction. However, as noted above, in the new version of the manuscript we have now removed the analysis via difference scores, and directly compared all three relevant conditions (faster, congruent, slower), first via an ANOVA and then with follow-up planned t-tests.

      Please refer to our previous response for further details (i.e., Furthermore, the researchers propose the comparison of faster vs. slower delta HR acoustic feedback throughout the manuscript when the natural comparison is the incongruent vs. the congruent feedback [..]).

      The design of experiment two involves the selection of knocking wood sounds to act as exteroceptive acoustic feedback. Since the purpose is to test whether sound affects pain intensity ratings, unpleasantness, and heart rate, it would have made sense to choose sounds that would be more likely to elicit such changes, e.g. Taffou et al. (2021), Chen & Wang (2022), Zhou et al. (2022), Tajadura-Jiménez et al. (2010). Whereas I acknowledge that there is a difference in effect sizes between experiment 1 and experiment 2 for the faster acoustic feedback, I am not fully convinced that this difference is due to the nature of the feedback (interoceptive vs. exteroceptive), since a similar difference could arguably be obtained by exteroceptive sound with looming or rough qualities. Since the experiment was already carried out and this hypothesis cannot be tested, I suggest that the researchers moderate the inferences made in the Discussion regarding these results.

Please refer to our detailed answer to this point in the Public Review (i.e., This could be influenced by the fact that the faster HR exteroceptive cue in experiment 2 also shows a significant modulatory effect [..]). As we describe there, we see little reason to suspect such a non-specific influence of acoustic parameters, as it is specifically the sensitivity to the change in heart rate (faster vs slower) that is affected by our between-experiment manipulation, not the overall response to the different exteroceptive or interoceptive sounds. Moreover, the specific change induced by the faster interoceptive feedback - a heart rate deceleration - is not consistent with a change in arousal or alertness (which would have predicted an increase in heart rate with increasing arousal). See also the Discussion section “Accounting for general unspecific contributions”.

      Additionally, the fact that no significant effects were found for unpleasantness ratings or heart rate (absence of evidence) should not be taken as proof that faster exteroceptive feedback does not induce an effect on these outcomes (evidence of absence). In this case, it could be that there is actually no effect on these variables, or that the experiment was not sufficiently powered to detect those effects. This would depend on the SESOIs for these variables, which as stated before, was not properly justified.

We very much agree that the absence of significant effects should not be interpreted as definitive evidence of absence. Indeed, we were careful not to overinterpret the null findings for heart rate and unpleasantness ratings, and we conducted additional analyses to clarify their interpretation. First, the TOST analysis shows that any effects in Experiment 2 are (significantly) smaller than the smallest effect size that could in principle be detected in our experiment, given the experimental parameters (number of participants, type of test, alpha level). Second, and more importantly, we ran between-experiment comparisons (see Results, Experiment 2, and Supplementary materials, Cross-experiment analysis, between-subjects model) of the crucial difference in the changes induced by faster and slower feedback. These showed that the differences were larger with interoceptive cues (Experiment 1) than with exteroceptive cues (Experiment 2). Thus, even if the exteroceptive cues in Experiment 2 do induce some effect below what is in principle detectable, that effect is smaller than the one induced by interoceptive cues in Experiment 1.

To ensure we fully address this point, we have now simplified our main analysis (main manuscript), replicated it with a different analysis (Supplementary material), motivated more clearly why the comparison between faster and slower feedback is crucial (Methods, Experiment 1), and made clearer that the difference between these conditions is larger in Experiment 1 than in Experiment 2 (Results, Experiment 2). Moreover, we went through the manuscript and ensured that our wording does not over-interpret the absence of significant effects in Experiment 2 as evidence of the absence of a difference.

      The section "Additional comparison analysis between experiments" encompasses in a way all possible comparisons between levels of the different factors in both experiments. My original suggestion regarding the use of a mixed linear model with covariates is still valid for this case. This analysis also brings into question another aspect of the experimental design: what is the rationale for dividing the study into two experiments, considering that variability and confounding factors would have been much better controlled in a single experimental session that includes all conditions?

We thank the reviewer for their comment. We would like to note, first, that the between-experiment analyses did not encompass all possible comparisons between levels, as they only included faster and slower feedback for the within-experiment comparison. Instead, they focus on the specific interaction between faster and slower feedback on the one hand, and interoceptive vs exteroceptive cues on the other. This interaction essentially compares, for each dependent measure (HR, pain unpleasantness, pain intensity), the difference between faster and slower feedback in Experiment 1 with the same difference in Experiment 2 (and would produce identical p values to a between-experiment t-test). The significant interactions therefore indicate larger effects of interoceptive cues than of exteroceptive ones for each of the measures. To make this clearer, we have now replaced this analysis with between-experiment t-tests of the difference between faster and slower feedback for each measure (Results, Experiment 2), producing identical results. Moreover, as suggested, we also now report linear mixed model analyses (see Supplementary Materials), which provide a comprehensive comparison across experiments.

Regarding the experimental design, we appreciate the reviewer’s suggestion of a within-subject crossover design. While such an approach indeed offers greater statistical power by reducing interindividual variability (Charness, Gneezy, & Kuhn, 2012), we intentionally chose a between-subjects design due to theoretical and methodological considerations specific to deceptive feedback paradigms. First, carryover effects are a major concern in deception studies. Participants exposed to one type of feedback could develop suspicion or adaptive strategies that would alter their responses in subsequent conditions (Martin & Sayette, 1993). Expectancy effects could thus contaminate results in a crossover design, particularly when the feedback manipulation becomes apparent. In line with this idea, past studies on false cardiac feedback (e.g., Valins, 1966; Pennebaker & Lightner, 1980) often employed between-subjects or blocked designs to maintain the ecological validity of the illusion.

      Charness, G., Gneezy, U., & Kuhn, M. A. (2012). Experimental methods: Between-subject and within-subject design. Journal of economic behavior & organization, 81(1), 1-8.

      Martin, C. S., & Sayette, M. A. (1993). Experimental design in alcohol administration research: limitations and alternatives in the manipulation of dosage-set. Journal of studies on alcohol, 54(6), 750-761.

      Pennebaker, J. W., & Lightner, J. M. (1980). Competition of internal and external information in an exercise setting. Journal of personality and social psychology, 39(1), 165.

      Valins, S. (1966). Cognitive effects of false heart-rate feedback. Journal of personality and social psychology, 4(4), 400.

      References

      Chen ZS, Wang J. Pain, from perception to action: A computational perspective. iScience. 2022 Dec 1;26(1):105707. doi: 10.1016/j.isci.2022.105707.

      Dienes Z. Obtaining Evidence for No Effect. Collabra: Psychology 2021 Jan 4; 7 (1): 28202. doi: 10.1525/collabra.28202

      King MT. A point of minimal important difference (MID): a critique of terminology and methods. Expert Rev Pharmacoecon Outcomes Res. 2011 Apr;11(2):171-84. doi: 10.1586/erp.11.9.

      Lakens D. Sample Size Justification. Collabra: Psychology 2022 Jan 5; 8 (1): 33267. doi: 10.1525/collabra.33267

      Nosek BA, Ebersole CR, DeHaven AC, Mellor DT. The preregistration revolution. Proc Natl Acad Sci U S A. 2018 Mar 13;115(11):2600-2606. doi: 10.1073/pnas.1708274114.

      Sasaki K, Yamada Y. SPARKing: Sample-size planning after the results are known. Front Hum Neurosci. 2023 Feb 22;17:912338. doi: 10.3389/fnhum.2023.912338.

      Taffou M, Suied C, Viaud-Delmon I. Auditory roughness elicits defense reactions. Sci Rep. 2021 Jan 13;11(1):956. doi: 10.1038/s41598-020-79767-0.

Tajadura-Jiménez A, Väljamäe A, Asutay E, Västfjäll D. Embodied auditory perception: The emotional impact of approaching and receding sound sources. Emotion. 2010, 10(2), 216-229. https://doi.org/10.1037/a0018422

      Zhou W, Ye C, Wang H, Mao Y, Zhang W, Liu A, Yang CL, Li T, Hayashi L, Zhao W, Chen L, Liu Y, Tao W, Zhang Z. Sound induces analgesia through corticothalamic circuits. Science. 2022 Jul 8;377(6602):198-204. doi: 10.1126/science.abn4663.

      Reviewer #3 (Recommendations For The Authors):

      The manuscript would benefit from some spelling- and grammar checking.

      Done

      Discussion:

      The discussion section is rather lengthy and would benefit from some re-structuring, editing, and sub-section headers.

      In response, we have restructured and edited the Discussion section to improve clarity and flow.

      I personally had a difficult time understanding how the data relates to the rubber hand illusion (l.623-630). I would recommend revising or deleting this section.

      We thank the reviewer for this valuable feedback. We have revised the paragraph and made the parallel clearer (lines 731-739).

      Other areas are a bit short and might benefit from some elaboration, such as clinical implications. Since they were mentioned in the abstract, I had expected a bit more thorough discussion here (l. 718).

      Thank you for this suggestion. We have expanded the discussion to more thoroughly address the clinical implications of our interoceptive pain illusion (See Limitations and Future Directions paragraph).

      Further, clarification is needed for the following:

      I would like some more details on participant instructions; in particular, the potential difference in instruction between Exp. 1 and 2, if any. In Exp. 1, it says: (l. 280) "Crucially, they were also informed that over the 60 seconds preceding the administration of the shock, they were exposed to acoustic feedback, which was equivalent to their ongoing heart rate". Was there a similar instruction for Exp. 2? If yes, it would suggest a more specific effect of cardiac auditory feedback; if no, the ramifications of this difference in instructions should be more thoroughly discussed.

Thank you for this suggestion. We have clarified this point in the Procedure of Experiment 2 (lines 548-550).

    1. Reviewer #2 (Public review):

      Summary:

The manuscript by Sy SKH et al. on pallium-encoded chemosensory impact on eye-body coordination describes how the valence of chemosensory stimuli can affect the coordination of eye saccades with tail flips. They show that aversive valence stimuli can increase both the strength and frequency of tail flips through a pallium-mediated circuit.

Overall, the manuscript is well-written and easy to follow (although the figures are quite dense), the methodology is mostly sound, and the improvement to the fish on chips system is very interesting. The methods description is thorough and welcome, making the experiments clear. The limited number of animals and the spread between 5 and 6 dpf are a concern, as most of the statistics seem to have been done on the individual events and not on the number of biological samples.

      The initial behavioural experiments are very promising. However, the conclusions surrounding the role of the pallium are a lot more speculative and not supported by the results.

      Comments:

      (1) The fish on chips 2.0 methods show a lot of promise for future studies of chemosensory stimuli, combined with whole-brain imaging. This will provide new avenues of research for zebrafish neuroscientists.

      (2) Chemosensory cues would have a very different timing than visual cues; timing is very important for multisensory integration. How do the authors suggest those are integrated? How would they differentiate between an integration of various cues or a different arousal state, as they describe in the introduction?

      (3) Studies have looked at chemosensation in Drosophila, including multisensory integration, which should be discussed by the authors (see the work of Mark Frye, amongst others).

(4) In the brain imaging methods, there is a mention of robustly behaving larvae. Does that mean that an exclusion criterion was used to select only 5 larvae? If so, this should be stated clearly. The authors also do not mention how they avoid the switch to a passive state that one of the coauthors has observed in a closed-loop setup. The authors should comment on this point.

      (5) Were the statistics in Figure 2 done with an n of 5, or do they assume that each tail flip and saccade is an independent event? I would imagine the latter would have inflated p-values and should be avoided.

      (7) Page 7: Why do the authors think that the cumulative effect of these minor differences could lead to very different behavioural goals? Especially when comparing to actual startle responses, which are extremely strong and stereotypical. How do their observations compare to the thermosensory navigation of larval zebrafish observed by Martin Haesemeyer, for example, or the work of the RoLi lab?

      (8) Page 8: Figure 5, I am confused by the y-axis of g, in e and f, the values are capped at 2, whereas in g they go up to 6, with apparently a number of cells whose preference is out of the y-axis limit (especially in Q2). Having the number of cells in each quadrant would also help to assess if indeed there is some preference in the pallium towards Q1.

      (9) Figure 6: How is the onset of neuronal activity determined compared to the motor stimulus? Looking at Supplementary Figure 8, it is quite unclear how the pallium is different from the OB or subpallium. The label of onset delay is also confusing in this figure.

      (10) Page 9: I do not think that the small differences observed in the pallium are as clear-cut as the authors make them out to be, or that they provide such strong evidence of their importance. As there are no interventions showing any causality in the presence of these pallium responses and the sensorimotor responses, these could represent different arousal states rather than any integration of sensory information.

    1. origin of cottage industry

      • maritime empires and commercial revolution led to Indian cotton imports

• before, people spun their own fabric

      • Indian import came in after than the other spun

• to compete with Indian cotton, Britain used slave labor in the Americas to make thread and export it

• Gave women in Britain a little power because they did their own weaving work; however it's slow, so they need new technologies

    1. permettant d’organiser de gérer les captures comme des artefacts

There seems to be a break in the sentence. Shouldn't it rather read "d’organiser et de gérer les captures [...]"?

« objectif politique de montrer que le futur Mucem n’était pas qu’un musée de faux et de marteaux, qu’il y avait une collection d’une grande richesse et une utilisation des moyens les plus modernes et les plus performants de l’époque pour les montrer »

In my latest version I removed the italics from all direct quotations, leaving only the quotation marks.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the Reviewers for their positive assessment of the quality and significance of our work, as well as for their insightful comments, which have helped us to further improve the manuscript. We have addressed the majority of the comments in the revised version and, for those that require additional time, we outline below a detailed plan of the experiments we intend to perform.

      We agree with Reviewer #2 that a more detailed mechanistic understanding of the drug effects would further strengthen the study, and we are grateful to both reviewers for the constructive experimental suggestions provided to address this point. In particular, we are highly motivated to better define the causal role of C18 sphingolipid alterations in mediating the effects of the drugs, as suggested by Reviewer #2, as well as to investigate the involvement of the retromer complex in the lysosome-to-Golgi connection, as suggested by Reviewer #1.

Below, we provide a point-by-point description of the revisions already incorporated into the manuscript, along with the planned experiments that will address the remaining comments.

      REVIEWER #1:

      VPS13B is a bridge-like lipid transfer protein, the loss or mutation of which is associated with Cohen syndrome (CS) involving Golgi fragmentation. In this study, the authors performed image-based chemical screens to identify compounds capable of rescuing the Golgi morphology in VPS13B-KO HeLa cells. They identified 50 compounds, the majority of which are lysosomotropic compounds or cationic amphiphilic drugs (CADs). Treatment of cells with several of these compounds causes lysosomal lipid storage, as assessed by BMP/LBPA staining, filipin staining, or LipidTOX staining. Interestingly, most LipidTOX puncta colocalized with transferrin receptor-positive compartments but not lysosomes. Similar to lysosomotropic compounds, knocking down NPC1 or SMPD1, mimicking lysosomal storage disease, also substantially rescued Golgi morphology. The authors show that VPS13B-KO cells have reduced C18 sphingolipids, which is reversed by treatment with CADs. Finally, the authors show that two CADs partially rescue neurite outgrowth in neuronal cultures. However, these drugs do not rescue the size of VPS13B KO organoids.

      Overall, this is an impressive study identifying CADs as potential therapeutics for CS and suggesting sphingolipid upregulation as a general strategy for CS treatment. The morphological and lipidomics analyses unravel important molecular basis of CS pathology. This study will be of high interest to the field of lipid biology and organelle homeostasis. I have a few comments to help improve the quality of this study.

1. The reversal of lipid changes in VPS13B-KO cells by CADs is intriguing. Are CAD-mediated benefits such as Golgi morphology recovery permanent or only transient within 24 hours of treatment? How do the CADs affect the Golgi morphology in WT HeLa cells?

      RESPONSE:

We thank the reviewer for this insightful question. Indeed, the effects of CADs on Golgi organization are most evident in VPS13B KO cells, where the Golgi apparatus is severely fragmented and becomes more compact upon drug treatment, whereas the effect is much less apparent in wild-type cells. Nevertheless, a careful quantitative analysis of the images (now presented in the new Fig. S7) demonstrates that the impact of these compounds on Golgi morphology is not restricted to KO cells but is likely more general, supporting a link between lysosomal storage and Golgi organization. Although this observation indicates an indirect effect (consistent with the proposed mechanism of action) rather than a direct correction of VPS13B loss, it does not, in our opinion, compromise their potential beneficial effect for KO cells, as also shown by the results obtained in organoid-derived neurons.

Under continuous treatment, azelastine keeps the Golgi in a compact state for 72 hours without any noticeable deleterious effect on the cells (see new Fig. S10). Raloxifene, on the contrary, proved to be toxic over the same time period. We believe this difference reflects the mechanism of action of CADs, which progressively accumulate within acidic organelles and may eventually reach a toxic threshold upon prolonged exposure. For this reason, lower drug concentrations administered over longer treatment periods may represent a viable alternative strategy. In this regard, we also refer the reviewer to our response to the comment on brain organoids below.

2. Is it surprising that azelastine induced lipid storage in transferrin receptor compartments (early and recycling endosomes)? I suggest more controls to examine LipidTOX overlap with Golgi markers or other late endosome/lysosome markers such as LBPA and CD63.

      RESPONSE:

      We agree with the reviewer that this observation is somewhat unexpected. However, we would like to clarify that we do not intend to suggest that lipid storage occurs primarily in early or recycling endosomes, which would indeed contradict a substantial body of existing evidence. Rather, our data indicate that this particular dye (LipidTOX) labels recycling endosomes, at least in HeLa cells. This finding is consistent with the widely accepted view that lysosomal lipid storage exerts broader effects on intracellular trafficking, not limited to late endosomes/lysosomes. We corrected the text in order to clarify this concept.

LipidTOX was specifically developed to detect drug-induced phospholipidosis, and based on our data, it appears suitable for this purpose. To our knowledge, there is no published information detailing its intracellular localization, which motivated us to perform these control experiments. Unfortunately, the proprietary formulation of this product does not allow informed speculation about the observed localization, or about whether it reflects the intact molecule or a catabolite.

As suggested by the reviewer, we plan to perform co-staining with additional markers to further clarify this point.

3. Does the LipidTOX/TFRC overlap suggest potential roles of retrograde transport in supplying sphingolipids to the Golgi? The authors can quickly test if the knockdown of a retromer subunit (VPS35) blocks Azelastine-induced recovery of Golgi morphology.

      RESPONSE:

      We thank the reviewer for this insightful suggestion. Indeed, the retromer complex represents one of the best-characterized trafficking pathways from the endosomal system to the Golgi, and this relatively straightforward experiment could help to mechanistically clarify our observations. We plan to test whether VPS35 knockdown interferes with the effects of the drugs.

What is the rationale to use 500 nM to 1 µM azelastine and raloxifene for neuronal cultures and organoids? At such concentrations, no obvious changes in Golgi morphology or lipid storage were observed (Fig 4). Also, the lipidomics analysis was performed after 10 µM compound treatment. It might be worth trying dose-response experiments in organoid tests.

      RESPONSE:

We thank the reviewer for this question. The rationale for this choice was indeed missing from the previous version of the manuscript. The lower concentrations were chosen on the basis of toxicity tests performed preliminarily over long-term treatment of both WT and VPS13B KO organoids. This information has now been explicitly included in the Results section of the revised manuscript, and the broader implications are also discussed in the Discussion section.

      MINOR COMMENTS:

      It is important to know whether the authors used TGN or cis-Golgi markers for Golgi morphology analysis. Please label the two channels in Fig. 2C and throughout all figures. In many cases, it is not clear what is stained in the green channel to show the Golgi morphology. It was not even stated in the legend.

      RESPONSE:

We have now included the antibody staining in all figure legends where it was previously missing.

      The authors stated that Recovery of Golgi morphology is dependent on lysosomal lipid storage. However, while the data show positive correlation between the two, no causal relationship is established by the data. It seems true that in all conditions (CADs or genetic knockdown) where lysosomal lipid storage was observed, the authors detect the Recovery of Golgi morphology. However, budesonide did not depend on lysosomal lipid storage to recover the Golgi morphology. Thus, the recovery of Golgi morphology is NOT dependent on lysosomal lipid storage, but inducing lysosomal lipid storage appears sufficient to recover Golgi morphology in VPS13B-KO HeLa cells.

      RESPONSE:

We thank the reviewer for this comment and we agree that the previous title of the paragraph could have been misleading. It has now been changed to “Lysosomal lipid storage mediates the recovery of Golgi morphology”, which is probably less prone to ambiguous interpretations.

In the previous version of the title, we meant that Golgi recovery is dependent on lipid storage “in the context of CAD treatment”, not as a general statement.

      With respect to the cause–effect relationship, we believe that the strongest evidence supporting this link is the observation that genetically induced lipid storage phenocopies the effects of drug treatment. We hope that this conclusion is now sufficiently clear from the revised text.

      Each figure needs a title before the detailed legends for specific panels.

      RESPONSE:

Titles have now been added to all figure legends.

      Fig 8. Y axis labeling is missing.

      RESPONSE:

Axis labels have now been included.

Does U18666A rescue Golgi morphology in VPS13B-KO cells?

      RESPONSE:

      We thank the reviewer for this comment. U18666A indeed also corrects Golgi morphology. The result is now included in the new figure S5.

      Please do not repeat the result section in discussion. Focus on the most important points.

      RESPONSE:

We thank the reviewer for this comment. We have shortened the descriptive part of the Discussion, trying as much as possible to avoid repetition with the Results section and keeping only the information essential to the flow of the discussion.

      Reviewer #1 (Significance (Required)):

      This is an impressive study that identifies Cationic Amphiphilic Drugs (CADs) as potential therapeutics for Cohen syndrome (CS) and suggests sphingolipid upregulation as a general strategy for diseases driven by VPS13B loss-of-function. The unbiased approaches, notably the chemical screen and lipidomics, provide novel mechanistic insights into the underlying pathology of CS. This study will be of high interest to researchers in the fields of lipid biology and organelle homeostasis. It will also be highly valuable for clinical pediatricians managing CS patients.

      REVIEWER #2:

      This manuscript describes a compound screening aimed at identifying molecules that can restore Golgi organization in VPS13B knockout (KO) cells. The authors identify several compounds, most of which are lysosomotropic, and analyze their effects on Golgi morphology and lipid composition using multiple approaches. They report that VPS13B KO cells exhibit a reduction in C18-N-acyl sphingolipids, which can be restored by several of the identified compounds. Furthermore, two of these compounds, azelastine and raloxifene, promote neurite outgrowth in VPS13B KO cortical organoids. These findings are interesting and could potentially contribute to a better understanding of the pathophysiology of Cohen syndrome and the development of therapeutic strategies. However, despite the large number of analyses presented, the study remains largely descriptive, and there is no coherent mechanistic explanation for how these compounds restore Golgi structure in VPS13B KO cells. In addition to the reduction in C18-N-acyl sphingolipids, the KO cells display alterations in several other lipid species (LPC, LPE, PC40:1, PE42:1, TG, etc.), and treatment with the selected compounds induces further lipid accumulations, including cholesterol and BMP/LBPA. The relationship between these diverse lipid changes and the observed Golgi recovery lacks clarity and mechanistic consistency.

      MAJOR COMMENTS:

      The finding that compounds cannot prevent Golgi fragmentation caused by brefeldin A or nocodazole but can suppress statin-induced fragmentation is intriguing, but the underlying mechanism is not addressed. It is not evident whether this difference results from changes in membrane lipid composition or restoration of Rab/SNARE trafficking. The authors should examine Rab prenylation and SNARE localization by immunofluorescence or Western blotting to support their interpretation.

      RESPONSE:

We thank the reviewer for this suggestion and agree that the ability of these compounds to counteract statin-induced Golgi fragmentation is indeed intriguing. The primary reason we did not explore this aspect further is that we considered the effects of statins not to be a central focus of the present study. Nevertheless, we fully agree that this observation represents a valuable opportunity to gain additional insight into the mechanism underlying drug-induced Golgi recovery.

      To address this point, we plan to analyze Rab prenylation by Western blot and Rab localization by microscopy, focusing on a Golgi-associated Rab protein such as Rab6. In addition, we will employ downstream inhibitors of Rab prenylation, such as 3-PEHPC (an inhibitor of type II protein geranylgeranyltransferase (GGTase-II)), which should allow us to formally distinguish effects related to impaired Rab prenylation from those arising from inhibition of cholesterol biosynthesis.

      Although restoration of C18 sphingolipids (SM 36:1, CER 36:1) is observed upon compound treatment, its causal role in Golgi recovery or neurite outgrowth is not established. The authors should test whether blocking the increase of C18 SM/CER prevents the rescue of Golgi or neuronal phenotypes.

      RESPONSE:

We sincerely thank the reviewer for this comment. We agree that, based on the current data, a definitive cause–effect relationship between Golgi recovery and the increase in C18 sphingolipids cannot be firmly established, and we acknowledge that a deeper understanding of this issue will require further investigation. Furthermore, we believe that addressing this would not only provide a better mechanistic understanding of the biological processes behind the effect of the drugs, but would also point to a potential avenue for therapeutic intervention. For these reasons, we are strongly motivated to pursue this aspect further.

      With respect to the reviewer’s specific suggestion, we agree that preventing the increase in C18 sphingolipids would be an ideal experimental approach. However, the limited understanding of the regulatory mechanisms controlling C18 sphingolipid homeostasis currently precludes a fully informed strategy. In principle, if the observed increase were due to enhanced synthesis, one could envisage blocking it by silencing ceramide synthases with C18 selectivity, such as CERS1. The experiment shown in Fig. 7E (azelastine treatment in the presence of sphingolipid synthesis inhibitors) was designed with this rationale in mind. However, these results suggest that azelastine-induced C18 sphingolipid accumulation is unlikely to result from increased synthesis, and is instead more consistent with reduced degradation, in line with the proposed mechanism of action of CADs.

Based on these considerations, we propose to invert the experimental approach and test whether cellular re-complementation with C18 sphingolipids is sufficient to recapitulate the drug-induced Golgi recovery. We are aware of the technical challenges associated with the targeted delivery of exogenously supplied lipids, particularly given the likelihood that effective rescue would require lipid access to the Golgi apparatus. Based on current knowledge, we anticipate that externally supplied lipids would primarily traffic either to the ER via non-vesicular routes or to endosomes/lysosomes through endocytic uptake. From both locations they could eventually reach the Golgi to some extent. The route from endosomes to the Golgi in particular has been intensively studied in the past with the use of fluorescent sphingolipid analogs1,2 and may well work also with native lipids.

Since we are not able to predict in advance which lipid species would be more effective or what the optimal delivery strategy would be, we plan to test re-complementation using C18 sphingomyelin and some of its potential precursors, including C18 ceramide, as well as alternative delivery strategies such as incorporation into liposomes of different formulations and delivery at the plasma membrane with bovine serum albumin or cyclodextrins as carriers.

1. Puri et al. (2001). J Cell Biol. 154:535-47 (doi: 10.1083/jcb.200102084)
2. Koivusalo et al. (2007). Mol Biol Cell. 18:5113-23 (doi: 10.1091/mbc.e07-04-0330)

        In Figure 7D, comparisons should include the LM and HM fractions isolated from WT cells.

      RESPONSE:

Wild-type controls were included in the figure as requested.

      The subcellular fractionation experiment should be repeated using AZL and RAL, the compounds used in organoid experiments, rather than TFPZ, to assess whether similar results are obtained. The compounds used differ across experiments, making it difficult to draw consistent conclusions.

      RESPONSE:

We thank the reviewer for this comment and apologize for some inconsistencies in the selection of the compounds highlighted in the figures, which are mostly remnants of the drug prioritization history over the course of the project. We have tried to make this more consistent in the current version.

In the new version of Figure 7D, AZL replaces TFPZ, while the TFPZ data have been moved to Supplementary Figure S19.

      Golgi morphology in VPS13B KO cells is reported to recover in NPC1 KD and SMPD1 KD cells, but it is not shown whether SM 36:1, CER 36:1, or other lipid levels also increase or change in these conditions. If Golgi morphology recovery occurs via the same mechanism as with compound treatment, a similar lipid pattern should be observed.

      RESPONSE:

We thank the reviewer for this question, which allowed us to expand our study with new and interesting findings. We agree that this is an important point to strengthen the link between CAD and genetic perturbation effects. Given the availability of several published lipidomic datasets modelling LSDs in HeLa and in other cell lines, we decided to re-analyse those datasets with a specific focus on C18 sphingolipids. We found a relative increase of 36:1 species upon depletion of LSD genes in all analyzed datasets for NPC1 and SMPD1, but also for more than 15 other LSD genes including NPC2, recapitulating what we find with all the CAD molecules tested in our study. These changes were not noticed, or at least not discussed, by most of the authors. This is not surprising, since those studies focused on different biological questions. We believe that these findings, besides reinforcing our hypothesis of a common mechanism between CAD treatment and NPC1/SMPD1 KO, are of general interest for the regulation of C18 sphingolipids, which are among the relatively few lipid species with a bona fide specific protein binding partner and which have been proposed to play a crucial role in Golgi traffic.

      MINOR POINTS:

      The manuscript lacks sufficient information about the compound library used for screening (number and source of compounds, compound type).

      RESPONSE:

We apologize if this information was not sufficiently visible in the original version of the manuscript. The data on source, catalog number, formulation, and several additional identifiers are included in File S1. This is now clearly indicated in the Methods so that it is more easily visible to readers.

      Fig. 3A: a WT control image is required.

      RESPONSE:

      A WT control image is now included in the new version of Figure 3.

      Fig. 4: include representative images at concentrations higher than 1.25 µM.

      RESPONSE:

      Representative images are now included for all concentrations higher than 1.25 µM, as requested.

      Abbreviations such as BMP/LBPA should be defined when first mentioned.

      RESPONSE:

The abbreviation BMP/LBPA was already defined when first mentioned in the original version of the manuscript.

      The abbreviation for raloxifene is inconsistent (RLX vs RAL) and should be unified.

      RESPONSE:

Raloxifene is now abbreviated as RLX throughout the manuscript.

      Fig. 5C: the meaning of the green and magenta bars is not explained.

      RESPONSE:

The color code for Figure 5C has now been included.

      The definitions and centrifugation parameters for light and heavy membrane fractions should be clearly stated in the Methods.

      RESPONSE:

The centrifugation parameters were already defined in the original manuscript. It is not clear to us which parameter the Referee is referring to. Below is the relevant sentence from the Methods section:

      “Gradients were centrifuged at 165,000 g for 1.5 h at 4°C with a SW40Ti Swinging-Bucket rotor (Beckman-Coulter). The LM and HM fractions were collected at the 35%-HB and 35%-40.6% interfaces, respectively”

      The concentration and incubation times for BFA and nocodazole should be included in the main text or figure legends.

      RESPONSE:

Concentrations and incubation times for BFA and nocodazole were already provided in the legend of Figure 5.

      Fig. 8C, D, G, H: y-axes lack labels and must be defined.

      RESPONSE:

Axis labels have now been included.

      There are multiple typographical errors, including "VPS12" instead of "VPS13B", that should be corrected.

      RESPONSE:

      We corrected this specific mistake as well as others that we could identify after careful reading of the manuscript.

      Reviewer #2 (Significance (Required)):

      While the dataset is extensive and technically detailed, the manuscript lacks a clear mechanistic explanation connecting lipid changes to Golgi restoration. The choice and comparison of compounds are inconsistent across experiments, and the interpretation remains speculative. Substantial revision and additional experiments are required before the study can be considered for publication.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      This manuscript describes a compound screening aimed at identifying molecules that can restore Golgi organization in VPS13B knockout (KO) cells. The authors identify several compounds, most of which are lysosomotropic, and analyze their effects on Golgi morphology and lipid composition using multiple approaches. They report that VPS13B KO cells exhibit a reduction in C18-N-acyl sphingolipids, which can be restored by several of the identified compounds. Furthermore, two of these compounds, azelastine and raloxifene, promote neurite outgrowth in VPS13B KO cortical organoids. These findings are interesting and could potentially contribute to a better understanding of the pathophysiology of Cohen syndrome and the development of therapeutic strategies. However, despite the large number of analyses presented, the study remains largely descriptive, and there is no coherent mechanistic explanation for how these compounds restore Golgi structure in VPS13B KO cells. In addition to the reduction in C18-N-acyl sphingolipids, the KO cells display alterations in several other lipid species (LPC, LPE, PC40:1, PE42:1, TG, etc.), and treatment with the selected compounds induces further lipid accumulations, including cholesterol and BMP/LBPA. The relationship between these diverse lipid changes and the observed Golgi recovery lacks clarity and mechanistic consistency.

      Major comments

      The finding that compounds cannot prevent Golgi fragmentation caused by brefeldin A or nocodazole but can suppress statin-induced fragmentation is intriguing, but the underlying mechanism is not addressed. It is not evident whether this difference results from changes in membrane lipid composition or restoration of Rab/SNARE trafficking. The authors should examine Rab prenylation and SNARE localization by immunofluorescence or Western blotting to support their interpretation.

      Although restoration of C18 sphingolipids (SM 36:1, CER 36:1) is observed upon compound treatment, its causal role in Golgi recovery or neurite outgrowth is not established. The authors should test whether blocking the increase of C18 SM/CER prevents the rescue of Golgi or neuronal phenotypes.

      In Figure 7D, comparisons should include the LM and HM fractions isolated from WT cells.

      The subcellular fractionation experiment should be repeated using AZL and RAL, the compounds used in organoid experiments, rather than TFPZ, to assess whether similar results are obtained. The compounds used differ across experiments, making it difficult to draw consistent conclusions.

      Golgi morphology in VPS13B KO cells is reported to recover in NPC1 KD and SMPD1 KD cells, but it is not shown whether SM 36:1, CER 36:1, or other lipid levels also increase or change in these conditions. If Golgi morphology recovery occurs via the same mechanism as with compound treatment, a similar lipid pattern should be observed.

      Minor points

      The manuscript lacks sufficient information about the compound library used for screening (number and source of compounds, compound type).

      Fig. 3A: a WT control image is required.

      Fig. 4: include representative images at concentrations higher than 1.25 µM.

      Abbreviations such as BMP/LBPA should be defined when first mentioned.

      The abbreviation for raloxifene is inconsistent (RLX vs RAL) and should be unified.

      Fig. 5C: the meaning of the green and magenta bars is not explained.

      The definitions and centrifugation parameters for light and heavy membrane fractions should be clearly stated in the Methods.

      The concentration and incubation times for BFA and nocodazole should be included in the main text or figure legends.

      Fig. 8C, D, G, H: y-axes lack labels and must be defined.

      There are multiple typographical errors, including "VPS12" instead of "VPS13B", that should be corrected.

      Significance

      While the dataset is extensive and technically detailed, the manuscript lacks a clear mechanistic explanation connecting lipid changes to Golgi restoration. The choice and comparison of compounds are inconsistent across experiments, and the interpretation remains speculative. Substantial revision and additional experiments are required before the study can be considered for publication.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      VPS13B is a bridge-like lipid transfer protein, the loss or mutation of which is associated with Cohen syndrome (CS) involving Golgi fragmentation. In this study, the authors performed image-based chemical screens to identify compounds capable of rescuing the Golgi morphology in VPS13B-KO HeLa cells. They identified 50 compounds, the majority of which are lysosomotropic compounds or cationic amphiphilic drugs (CADs). Treatment of cells with several of these compounds causes lysosomal lipid storage, as assessed by BMP/LBPA staining, filipin staining, or LipidTOX staining. Interestingly, most LipidTOX puncta colocalized with transferrin receptor-positive compartments but not lysosomes. Similar to lysosomotropic compounds, knocking down NPC1 or SMPD1, mimicking lysosomal storage disease, also substantially rescued Golgi morphology. The authors show that VPS13B-KO cells have reduced C18 sphingolipids, which is reversed by treatment with CADs. Finally, the authors show that two CADs partially rescue neurite outgrowth in neuronal cultures. However, these drugs do not rescue the size of VPS13B KO organoids.

      Overall, this is an impressive study identifying CADs as potential therapeutics for CS and suggesting sphingolipid upregulation as a general strategy for CS treatment. The morphological and lipidomics analyses unravel an important molecular basis of CS pathology. This study will be of high interest to the field of lipid biology and organelle homeostasis. I have a few comments to help improve the quality of this study.

      1. The reversal of lipid changes in VPS13B-KO cells by CADs is intriguing. Are CAD-mediated benefits such as Golgi morphology recovery permanent or only transient within 24 hours of treatment? How do the CADs affect the Golgi morphology in WT HeLa cells?
      2. Is it surprising that azelastine induced lipid storage in transferrin receptor compartments (early and recycling endosomes)? I suggest more controls to examine LipidTOX overlap with Golgi markers or other late endosome/lysosome markers such as LBPA and CD63.
      3. Does the LipidTOX/TFRC overlap suggest potential roles of retrograde transport in supplying sphingolipids to the Golgi? The authors can quickly test if the knockdown of a retromer subunit (VPS35) blocks Azelastine-induced recovery of Golgi morphology.
      4. What is the rationale to use 500 nM to 1 uM azelastine and raloxifene for neuronal cultures and organoids? At such concentrations, no obvious changes in Golgi morphology or lipid storage were observed (Fig 4). Also, the lipidomics analysis was performed after 10 uM compound treatment. It might be worth trying dose-response experiments in organoid tests.

      Minor:

      1. It is important to know whether the authors used TGN or cis-Golgi markers for Golgi morphology analysis. Please label the two channels in Fig. 2C and throughout all figures. In many cases, it is not clear what is stained in the green channel to show the Golgi morphology. It was not even stated in the legend.
      2. The authors stated that recovery of Golgi morphology is dependent on lysosomal lipid storage. However, while the data show a positive correlation between the two, no causal relationship is established by the data. It seems true that in all conditions (CADs or genetic knockdown) where lysosomal lipid storage was observed, the authors detect recovery of Golgi morphology. However, budesonide did not depend on lysosomal lipid storage to recover the Golgi morphology. Thus, the recovery of Golgi morphology is NOT dependent on lysosomal lipid storage, but inducing lysosomal lipid storage appears sufficient to recover Golgi morphology in VPS13B-KO HeLa cells.
      3. Each figure needs a title before the detailed legends for specific panels.
      4. Fig. 8: Y-axis labeling is missing.
      5. Does U18666A rescue Golgi morphology in VPS13B-KO cells?
      6. Please do not repeat the Results section in the Discussion. Focus on the most important points.

      Significance

      This is an impressive study that identifies Cationic Amphiphilic Drugs (CADs) as potential therapeutics for Cohen syndrome (CS) and suggests sphingolipid upregulation as a general strategy for diseases driven by VPS13B loss-of-function. The unbiased approaches, notably the chemical screen and lipidomics, provide novel mechanistic insights into the underlying pathology of CS. This study will be of high interest to researchers in the fields of lipid biology and organelle homeostasis. It will also be highly valuable for clinical pediatricians managing CS patients.

    1. corrected

      There is one very small problem: when I enter an email address ending in ".co" instead of ".com", no error is displayed.

    1. alleles


      English — alleles (thorough explanation)

      1) What alleles are (core idea)

      Alleles are different versions of the same gene. They control variations of a trait, such as eye color or flower color.

      Allele = one version of a gene

      Each gene can have two or more alleles, but an individual organism usually carries two alleles per gene (one from each parent).


      2) Where alleles are found

      • Genes are located on chromosomes
      • Alleles sit at the same position (locus) on homologous chromosomes
      • One allele comes from the mother
      • One allele comes from the father

      📌 This is why offspring show traits from both parents.


      3) Example of alleles (simple)

      Trait: Seed color

      • Y = yellow
      • y = green

      Possible allele combinations:

      • YY
      • Yy
      • yy

      These combinations affect the trait that appears.


      4) Dominant vs recessive alleles (Science 10 focus)

      Dominant allele

      • Shown with a capital letter (A)
      • Expressed if at least one copy is present

      Recessive allele

      • Shown with a lowercase letter (a)
      • Expressed only if two copies are present

      📌 Example:

      • Aa → dominant trait shows
      • aa → recessive trait shows

      5) Alleles, genotype, and phenotype (key relationship)

      | Term      | Meaning            | Example    |
      | --------- | ------------------ | ---------- |
      | Allele    | Version of a gene  | A or a     |
      | Genotype  | Allele combination | Aa         |
      | Phenotype | Physical trait     | Brown eyes |

      Alleles determine the genotype, which determines the phenotype.


      6) Alleles in Punnett squares

      Punnett squares:

      • Show how alleles from parents combine
      • Predict possible offspring genotypes
      • Estimate trait probabilities

      📌 Example: Parents: Aa × Aa

      • Possible offspring: AA, Aa, Aa, aa
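
      A quick way to check ratios like this is to enumerate the cross programmatically. The following is a minimal Python sketch (the `punnett` helper is made up for illustration and is not part of these notes):

      ```python
      from collections import Counter
      from itertools import product

      def punnett(parent1: str, parent2: str) -> Counter:
          """Count offspring genotypes for a one-gene cross.

          Each parent is a two-letter genotype such as 'Aa'. Every allele of
          parent 1 is combined with every allele of parent 2, and genotypes
          are written with the dominant (uppercase) allele first, e.g. 'Aa'.
          """
          combos = ("".join(sorted(pair)) for pair in product(parent1, parent2))
          return Counter(combos)

      counts = punnett("Aa", "Aa")
      total = sum(counts.values())
      for genotype, n in sorted(counts.items()):
          print(f"{genotype}: {n}/{total} = {n/total:.0%}")
      # AA: 1/4 = 25%
      # Aa: 2/4 = 50%
      # aa: 1/4 = 25%
      ```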

      7) Why alleles are important

      Alleles:

      • Explain variation within a species
      • Help predict inheritance patterns
      • Are the basis of genetics and evolution
      • Allow populations to adapt over time

      One-sentence exam summary

      Alleles are different versions of the same gene that determine variations in traits.


      Chinese section: alleles (等位基因), detailed explanation

      1) What alleles are (core concept)

      Alleles are different versions of the same gene, determining different expressions of the same trait.

      Allele = a different form of the same gene


      2) Where alleles are located

      • Genes are located on chromosomes
      • Alleles sit at the same position on homologous chromosomes
      • One comes from the mother, one from the father

      3) Example of alleles

      Trait: pea plant height

      • T = tall
      • t = short

      Possible combinations:

      • TT
      • Tt
      • tt

      4) Dominant vs recessive alleles (exam essential)

      • Dominant allele: expressed as long as one copy is present
      • Recessive allele: expressed only when both copies are present

      📌 Tt → dominant trait shows 📌 tt → recessive trait shows


      5) Relationship between alleles and traits

      • Alleles → genotype
      • Genotype → phenotype

      One-sentence exam summary

      Alleles are different forms of a gene that control the same trait.

      If you like, I can organize alleles → genotype → phenotype → Punnett squares into bilingual Science 10 flashcards or interactive exercises, ready to use for review or teaching.

    1. karyotype

      English (thorough explanation with images)


      1) What a karyotype is

      A karyotype is an organized picture of all the chromosomes in a cell, arranged in pairs, from largest to smallest, showing their number and structure.

      In short:

      Karyotype = a chromosome map of a cell


      2) How a karyotype is made

      To create a karyotype, scientists:

      1. Take dividing cells (often blood cells)
      2. Stop cell division at metaphase → chromosomes are most condensed and visible
      3. Stain and photograph the chromosomes
      4. Arrange them into pairs based on:

      • Size
      • Shape
      • Centromere position
      • Banding patterns

      3) What information a karyotype shows

      A karyotype reveals:

      • Total chromosome number
      • Sex chromosomes (XX or XY)
      • Large chromosomal abnormalities

      For humans:

      • 46 chromosomes

      • 22 pairs of autosomes

      • 1 pair of sex chromosomes

      4) Autosomes vs sex chromosomes

      | Type            | Description                                 |
      | --------------- | ------------------------------------------- |
      | Autosomes       | Chromosomes 1–22, same in males and females |
      | Sex chromosomes | X and Y, determine biological sex           |

      Examples:

      • XX → typically female
      • XY → typically male

      5) Detecting chromosomal disorders (key use)

      Karyotypes are commonly used to detect numerical abnormalities, such as:

      • Trisomy 21 (Down syndrome) → three copies of chromosome 21
      • Turner syndrome → XO
      • Klinefelter syndrome → XXY

      ⚠️ Note: karyotypes detect large-scale changes, not small gene mutations.
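
      As a toy illustration of these numerical checks, here is a minimal Python sketch (the input format and the `screen_karyotype` helper are hypothetical, not part of the notes) that flags the abnormalities listed above from per-chromosome counts:

      ```python
      def screen_karyotype(counts: dict) -> list:
          """Flag common numerical abnormalities from chromosome counts.

          `counts` maps chromosome labels ('1'..'22', 'X', 'Y') to how many
          copies appear in the karyotype. Illustrative only, not a clinical tool.
          """
          findings = [f"total chromosomes: {sum(counts.values())}"]
          if counts.get("21", 2) == 3:
              findings.append("trisomy 21 (Down syndrome)")
          sex = "X" * counts.get("X", 0) + "Y" * counts.get("Y", 0)
          if sex == "X":
              findings.append("XO (Turner syndrome)")
          elif sex == "XXY":
              findings.append("XXY (Klinefelter syndrome)")
          return findings

      # A typical female karyotype plus one extra chromosome 21:
      example = {str(i): 2 for i in range(1, 23)}
      example.update({"21": 3, "X": 2, "Y": 0})
      print(screen_karyotype(example))
      # ['total chromosomes: 47', 'trisomy 21 (Down syndrome)']
      ```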


      6) Karyotype vs genotype (do not confuse)

      | Term      | What it shows                 |
      | --------- | ----------------------------- |
      | Karyotype | Chromosome number & structure |
      | Genotype  | Allele combinations of genes  |


      7) One-sentence exam definition

      A karyotype is a visual display of an organism’s complete set of chromosomes, arranged by size and shape.



      Chinese section (in-depth explanation)


      1) What a karyotype is

      A karyotype is an image of all the chromosomes in a cell, photographed and then systematically arranged by size, shape, and centromere position.

      In one sentence:

      Karyotype = a "family portrait" of the chromosomes


      2) How a karyotype is made

      The steps are:

      1) Take cells that are dividing
      2) Arrest the cells at metaphase
      3) Stain and photograph them
      4) Pair and order them by:

      • Length
      • Shape
      • Centromere position
      • Banding pattern

      3) What a karyotype can tell us

      A karyotype shows:

      • The total number of chromosomes
      • The sex chromosome composition
      • Obvious chromosomal abnormalities

      A normal human karyotype:

      • 46 chromosomes

      • 22 pairs of autosomes

      • 1 pair of sex chromosomes

      4) Autosomes vs sex chromosomes

      | Type            | Meaning                           |
      | --------------- | --------------------------------- |
      | Autosomes       | Chromosomes 1–22                  |
      | Sex chromosomes | X and Y, determine biological sex |


      5) Main uses of a karyotype (exam focus)

      Karyotypes are commonly used to detect abnormal chromosome numbers, for example:

      • Trisomy 21 (Down syndrome)
      • Turner syndrome (XO)
      • Klinefelter syndrome (XXY)

      ⚠️ A karyotype only reveals large-scale abnormalities; it cannot show small mutations at the gene level.


      6) Karyotype vs genotype (easily confused)

      | Concept   | Content                         |
      | --------- | ------------------------------- |
      | Karyotype | Chromosome number and structure |
      | Genotype  | Combination of alleles          |


      7) One-sentence exam summary

      A karyotype is a complete image of the chromosomes arranged by size and shape, used to analyze chromosome number and structure.


      If you like, I can organize chromatin → chromosome → homologous → allele → karyotype into a single bilingual Biology 10/11 review chart, ideal for quick pre-exam revision.

    1. https://web.archive.org/web/20260105183931/https://moultano.wordpress.com/2025/12/30/children-and-helical-time/ At first glance this graph seems thought-provoking. With E we regularly remark to Y that in our heads, our childhood and student years are much bigger than the period afterwards. More firsts. Cf. Gregory Bateson [[Informatie is verschil dat verschil maakt 20230905124229]], information is a difference that makes a difference, i.e. firsts, and make your time perception longer by doing new stuff [[Maak tijd langer met nieuwe dingen 20210418104515]] and Bateson's use of Korzybski's landscape as theory of mind: [[Steps to an Ecology of Mind by Gregory Bateson]] (1972):

    1. sperm

      Sperm (精子)



      English (thorough explanation)

      1) What sperm is (core definition)

      Sperm is the male gamete (sex cell) used in sexual reproduction. Its role is to deliver male genetic information to the egg during fertilization.

      In simple terms: 👉 Sperm = male sex cell that carries DNA to the egg


      2) Chromosome number

      • Sperm cells are haploid (n)
      • They contain half the normal number of chromosomes

      Example (humans):

      • Sperm: 23 chromosomes
      • Egg: 23 chromosomes
      • Fertilization → zygote with 46 chromosomes

      3) Structure of a sperm cell

      A sperm cell has three main parts, each with a specific function:

      • Head

        • Contains the nucleus (DNA)
        • Covered by the acrosome, which has enzymes to penetrate the egg

      • Midpiece

        • Packed with mitochondria
        • Produces energy (ATP) for movement

      • Tail (flagellum)

        • Enables the sperm to swim toward the egg

      This streamlined shape is adapted for speed and mobility.


      4) How sperm are produced

      Sperm are produced in the testes through spermatogenesis:

      • Starts from diploid cells
      • Uses meiosis
      • Produces four haploid sperm cells
      • Occurs continuously after puberty

      5) Role of sperm in fertilization

      During fertilization:

      1. Sperm swim toward the egg
      2. One sperm penetrates the egg membrane
      3. The nuclei fuse
      4. A diploid zygote forms

      Only one sperm normally fertilizes the egg.


      6) Why sperm are important

      Sperm:

      • Enable sexual reproduction
      • Contribute genetic diversity
      • Determine the biological sex of offspring in humans (X or Y chromosome)

      Chinese section (detailed explanation)

      1) What sperm is

      Sperm is the male gamete used in sexual reproduction. Its main role is to pass the father's genetic information to the egg cell.

      In one sentence: 👉 Sperm = the male sex cell that carries genetic information


      2) Chromosome number

      • Sperm are haploid (n)
      • They contain only half the chromosomes

      Human example:

      • Sperm: 23 chromosomes
      • Egg: 23 chromosomes
      • After fertilization: 46 chromosomes (zygote)

      3) Structure of the sperm cell

      The sperm consists of three main parts:

      • Head

        • Contains the nucleus (DNA)
        • Acrosome, which releases enzymes to help enter the egg

      • Midpiece

        • Rich in mitochondria
        • Provides energy for movement

      • Tail

        • Helps the sperm swim


      4) How sperm are produced

      Sperm are produced in the testes through spermatogenesis (meiosis):

      • Each original cell can form 4 sperm cells
      • Production continues from puberty onward

      5) Role of sperm in fertilization

      The fertilization process:

      • The sperm enters the egg cell
      • The nuclei fuse
      • A diploid zygote forms

      6) Why sperm matter

      Sperm:

      • Enable sexual reproduction
      • Increase genetic diversity
      • Determine the biological sex of offspring (X or Y)

      One-sentence summary (bilingual)

      • Sperm = the male haploid gamete
      • 精子 (sperm) = the haploid male gamete

      If you need it, I can also make ✔️ a sperm vs egg comparison table ✔️ quick notes on common Biology 10/11 pitfalls ✔️ bilingual flashcards (structure + function)

    1. Peirce is known for being one of the great logicians who brought formal logic to the state in which it currently stands. His articles Peirce (1880) and Peirce (1885)

      This is a test so that it does not remain orphaned.


    1. The information available responds, first of all, to sociodemographic characterization variables. Within this framework, there is information on sex, school of origin, comuna of residence, region of origin, nationality, type of admission, and socioeconomic level. Information on academic characterization is also available, such as the type of admission, the entry cohort, PSU-PAES percentile, academic standing and, most importantly, the raw grade average and the calculated average. Finally, the great bulk of the variables contained in the databases are the courses, both required and elective, of the five degree programs included in the analysis 2.

      A lot of clarity would be gained here with a dfsummary-type table

    2. The database containing the raw grade average is named nota_unica.rds (N = 2578) and has a total of 17 variables. This database contains information from 2020 through 2025. From this database, a subset is generated for each degree program, named nota_unica_nombre carrera.rds. The database containing students' grades by course came in long format, so after transforming it to wide format it ended up with N = 3638 and 1754 variables. This database only has the RUT variable, while all the others are the courses taught by the five degree programs.

      Also describe the source of the data
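
      For readers unfamiliar with the long-to-wide reshaping described above, a minimal pandas sketch might look like the following (column names are hypothetical and not taken from the actual .rds files):

      ```python
      import pandas as pd

      # Hypothetical long-format grades-by-course data: one row per student-course pair.
      long_df = pd.DataFrame({
          "RUT":   ["1-9", "1-9", "2-7"],
          "curso": ["SOC101", "SOC102", "SOC101"],
          "nota":  [5.5, 6.1, 4.8],
      })

      # One row per student, one column per course, as described above.
      wide_df = long_df.pivot(index="RUT", columns="curso", values="nota")
      print(wide_df)
      ```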

    1. Synthesis Dossier: État Limite

      Executive Summary

      This synthesis document analyzes the central themes of a corpus of texts describing everyday life in a psychiatric ward of a French public hospital.

      It highlights a deep systemic crisis, characterized by a critical shortage of resources and staff, leading to caregiver exhaustion and the dehumanization of care.

      Faced with this failing system, a psychiatrist, the central figure of the document, embodies a philosophy of care based on building human bonds and trust, challenging the institutional logics of quantification and restraint.

      Through a series of poignant case studies (suicidal, polydrug-using, and psychotic patients), the document exposes the complexity of psychological suffering and the constant ethical dilemmas practitioners face.

      Ultimately, the corpus presents public psychiatry not only as a medical field but as a metaphor for the tensions within society, where the questions of inclusion, productivity, and human interdependence are raised in crucial terms.

      --------------------------------------------------------------------------------

      1. The Systemic Crisis of Public Psychiatry

      The most prominent theme is the collapse of the public hospital system, particularly in the psychiatric sector.

      This crisis manifests itself at several interdependent levels.

      1.1. Shortage of Resources and Staff

      The lack of resources is a recurring and fundamental complaint.

      The institutions are described as "greatly impoverished" and operating with "very limited means".

      This situation has direct consequences for the quality of care and for tension within the teams.

      Chronic understaffing: One caregiver laments that understaffing has become "a habit", leading to a continuous reduction in staff because the system "works" in spite of everything.

      He estimates that the hospital would need "5, 6, 7 more psychiatrists".

      Material consequences: A trivial but revealing example is a tap in the emergency department that remains broken for two months, while graffiti is erased immediately.

      Impact on patient care: The lack of staff and stretchers leads to situations in which "choices" have to be made, to the detriment of vulnerable patients such as the homeless, which one caregiver finds unacceptable:

      "I didn't go into this profession to make choices and to not take care of a homeless guy."

      1.2. The Logic of Quantification Versus Relational Care

      The main psychiatrist expresses fierce opposition to the accounting-driven management logic imposed on the hospital, in particular activity-based pricing (T2A), which he considers incompatible with the very nature of psychiatric care.

      The unquantifiable nature of care: "The problem is that I am fighting against a logic in which what I do cannot be quantified [...] the treatment is the relationship, it is building relationships of trust [...] it is priceless."

      The devaluation of time: He compares his work to a cataract operation that takes 15 minutes, whereas his own work is "not predictable" and may require an hour just to read a file or calm a family.

      In the eyes of an administration focused on productivity, this unpredictability makes a psychiatrist's salary hard to justify.

      1.3. Caregiver Exhaustion and Burnout

      The systemic pressure creates deep fatigue and distress among caregivers, up to and including burnout.

      Widespread exhaustion: The psychiatrist senses that the police officers are "exhausted" during an intervention.

      He himself admits to being "quite anxious", although he hides it.

      Burnout as a symptom: An experienced nurse, described as "brilliant" and "a gem", ends up hospitalized as a patient following burnout.

      The identified cause is directly linked to "management" and the running of the ward; she felt she was "no longer useful" and "overwhelmed".

      Impact on the youngest: An intern confides that she has been "much more anxious" and sleeping less well since the start of her placement, as contact with suffering reactivates her own anxieties.

      1.4. Complicity and Dehumanization

      The staff find themselves caught in a moral dilemma, feeling complicit in a system that mistreats patients through structural negligence.

      The feeling of complicity: The psychiatrist asks himself: "Aren't we being somewhat complicit, you see, in keeping afloat something that leads to people coming to the hospital and, in fact, dying of our negligence, you see?"

      Loss of meaning: This situation leads to deep discouragement: "If it means doing the work like this, I won't hide from you that it discourages me too."

      Deconstruction of the public-service ideal: A colleague bluntly tells the psychiatrist a harsh truth:

      "They don't care, you know, if patients die in the public hospital [...] we now live in a society where public institutions no longer fulfil their purpose of serving the general interest at all."

      --------------------------------------------------------------------------------

      2. A Philosophy of Care Centered on the Human Being

      As a counterpoint to the failure of the system, the main psychiatrist develops and applies a resolutely humanist philosophy of care, based on the bond with patients and a critique of the institution.

      2.1. The Therapeutic Bond as Foundation

      At the heart of his practice lies the conviction that care resides above all in the interpersonal relationship.

      Fundamental interdependence: "The only thing that seems to me to be care, you see, is to say to ourselves that we need one another, that we are absolutely dependent on one another, and that we must not run away from it."

      Creating non-harmful conditions: He defines his role as creating "non-harmful conditions, I hope the least harmful possible, and then I hope something will happen, but it is not up to me to decide whether it will happen or not."

      Follow-up as the key: He insists to an alcoholic patient that the only advice he can give is that "what will work is the follow-up."

      2.2. A Critical Approach to the Psychiatric Institution

      His personal path and convictions lead him to question the foundations of traditional psychiatry.

      A vocation to "take psychiatry apart": He recounts having chosen this specialty in order to "take psychiatry apart", finding that "it was not self-evident [...] locking people up, treating them somewhat like idiots".

      A political and social vision: He wanted to be part of a "more political, more social dimension", which psychiatry allows him to do.

      The utopia of a psychiatry dissolved into society: His ultimate goal is that there should be "no more psychiatry", not through an absence of care, but because society would have learned to "take care of one another", assigning a place and a social existence to everyone, even to those who see the world in a "somewhat unreasonable" way.

      2.3. The Ethical Dilemmas of Practice

      The psychiatrist is constantly confronted with difficult choices that weigh the patient's freedom against their safety and the legal framework.

      Freedom vs. protection (the case of Solange): With a psychotic patient who wishes to remain in France against her family's wishes, he voices his dilemma:

      "It is the impossible position of the psychiatrist: he wants, a priori, to honor your freedom, and at the same time my power is such that, since you are not within the bounds of reason, I must deprive you of it in order to send you home, and that is a dilemma that is very difficult for me."

      Use of force and restraint: Faced with an intoxicated, agitated patient refusing care, he is forced to move from words to force. After trying to negotiate, he concludes:

      "We tried to handle things with words; I don't think we are going to manage [...] in my opinion we have no choice."

      He orders sedation and physical restraint, despite the patient's vehement opposition.

      --------------------------------------------------------------------------------

      3. Portraits of Psychological Suffering: Case Studies

      The document is punctuated by encounters with patients whose stories illustrate the diversity and depth of psychological distress.

      | Patient / Situation | Main Issue | Key Details |
      | --- | --- | --- |
      | Alienor | Suicide attempt, polytrauma, family relationships | Lost both legs and a forearm after falling from a bridge and being struck by a train. Her sister refuses to see her until she has "exact proof" that she will not do "anything stupid for at least 6 months". The psychiatrist qualifies her responsibility by pointing to an early, massive prescription of benzodiazepines. |
      | Patient with impulse phobias | Re-diagnosis of bipolar disorder (instead of schizophrenia) | Describes "impulse phobias": an obsessive fear of throwing himself under the metro or out of an open window, even though he has no desire to do so. He puts avoidance strategies in place. |
      | Solange, "the theologian" | Psychotic episode ("the journey of the mad"), autonomy | Says she is "on a journey" and refuses neuroleptics. The psychiatrist weighs the risk of leaving her "to her own devices" against her desire to remain in France within a religious community. He finally decides to let her go. |
      | Windy | Addictions, adolescent distress, therapeutic failure | A young patient followed by the psychiatrist, taking part in a theatre workshop. Despite the bond created, he is found dead of an overdose. His death is a shock for the psychiatrist, symbolizing the failure of his mission: "we have to stop them from killing themselves before they manage to find their place". |
      | Young man with pancreatitis | Chronic pain, existential anguish, dark thoughts | Suffers from intense pain and expresses his fear of falling into opioid addiction. He states: "I don't know how far I would be willing to go to make them stop [the pains]." His gaze is described as marked by "melancholy". |
      | Patient who was kidnapped and tortured | Severe psychological trauma | Recounts having been abducted by 15 to 20 people, beaten, and locked in a car trunk. The psychiatrist listens to him and offers him a safe setting for the night. |
      | Intoxicated, agitated patient | Polydrug use, refusal of care, self-endangerment | Claims his right to harm himself ("If it only concerns me, it's my business"). His state of intoxication makes any discussion impossible, forcing the team to resort to physical restraint and sedation to protect him. |
      | Patient and her birthday | Trauma, grief, guilt | The patient puts herself in danger as her birthday approaches. She reveals that it is the day a friend died in a car accident while coming to see her. She feels responsible: "I say it's my fault." |

      --------------------------------------------------------------------------------

      4. Psychiatry as a Social Issue

      Beyond the clinical cases, psychiatric practice is presented as a mirror of the values and dysfunctions of contemporary society.

      4.1. The Exclusion of the "Madman" from the Urban Environment

      The psychiatrist analyzes how the organization of modern society structurally marginalizes people with psychiatric disorders.

      The loss of the "village idiot": He contrasts the village, where one can show "solicitude" toward the person who is different, with the "urban environment", which is a risk factor for psychiatric illness.

      The logic of productivity: In an urban environment "designed to allow populations to be productive", the "madman" is excluded because "he is not productive, he is of no use, he produces nothing, and on top of that he has a cost".

      The institution then becomes the place where he is set aside under the pretext of treating him.

      4.2. Care as an Act of Resistance

      Given this observation, every act of care centered on the relationship becomes a political act of resistance against dehumanization and isolation.

      The organization of a theatre workshop with young patients, in which they perform Molière, illustrates this approach, which aims to recreate social bonds, self-confidence, and an existence beyond the status of patient.

      The ultimate aim is to reaffirm that, despite illness, a place in the community is not only possible but necessary.

    1. We found that spinosad application significantly reduced herbivory (untreated: 0.53%, treated: 0.15%; F1 = 174.06, p < 0.001; Figure 2a, Table S1).

      I am struggling with the Y-axis '% herbivory per leaf'. Firstly, I am confused about the range. It seems to go from around -2% to 3% damage. However, in the methods, damage values are explained as being the percentage of the leaf surface damaged by herbivory (averaged across four leaves). How can there be negative herbivory damage? Also, perhaps the Y-axis title should be something like '% leaf area damaged'.

    1. Abstract: High-throughput phenotyping is addressing the current bottleneck in phenotyping within breeding programs. Imaging tools are becoming the primary resource for improving the efficiency of phenotyping processes and providing large datasets for genomic selection approaches. The advent of AI brings new advantages by enhancing phenotyping methods using imaging, making them more accessible to breeding programs. In this context, we have developed an open Python workflow for analyzing morphology and heritable morphometric traits using AI, which can be applied to fruits and other plant organs. This workflow has been implemented in almond (Prunus dulcis), a species where efficiency is critical due to its long breeding cycle. Over 25,000 kernels, more than 20,000 nuts, and over 600 individuals have been phenotyped, making this the largest morphological study conducted in almond. As a result, new heritable morphometric traits of interest have been identified. These findings pave the way for more efficient breeding strategies, ultimately facilitating the development of improved cultivars with desirable traits. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://github.com/jorgemasgomez/almondcv2 Abbreviations: GPU, Graphics Processing Unit; YOLO, You Only Look Once; SAM, Segment Anything Model; ROI, Region of Interest. Funder Information Declared: Ministerio de Ciencia y Universidades, España, PID2021-127421OB-I00, FPU20/00614; Fundación Séneca

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf157), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 3: Yu Jiang

      The present study entitled "Open RGB Imaging Workflow for Morphological and Morphometric Analysis of Fruits using AI: A Case Study on Almonds" reported the development of a Python-based image analysis pipeline that extract morphological traits of almond nut shells and kernels. A case study was conducted to use the developed pipeline to analyze breeding populations of 665 genotypes and extract both general morphology traits such as height, length, area, aspect ratio, etc. and specialized traits for almond such as width at three heights, vertical and horizontal symmetry, etc. Further, each nut shell or kernel was weighed, so models were established to use the weight and morphological traits to predict the thickness of each nut shell or kernel. In addition to morphological traits, morphometric (or shape) was extracted for each nut shell or kernel. Clustering analysis was performed on the morphometric traits to identify variability among genotypes. To further validate the efficacy of the extracted traits, broad-sense heritability was calculated and used as a criterion.
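
      For reference (not stated in the review itself), broad-sense heritability is conventionally defined as the fraction of phenotypic variance attributable to total genetic variance:

      $$ H^2 = \frac{V_G}{V_P} = \frac{V_G}{V_G + V_E} $$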

      The major contribution of this study is the integration of different components (e.g., camera calibration, image segmentation, and morphological/morphometric trait extraction) into a user-accessible, open-source Python implementation for the plant breeding community, especially for almond breeders. However, there are several aspects that could be further improved.

      First, the present study phenotyped the largest number of samples among recent efforts on almond nut shell and/or kernel phenotyping. However, there was no clear evidence demonstrating direct benefits to ongoing almond breeding. Certain traits (e.g., aspect ratio, tip/top/side curvatures) could be included in a breeding program, but what is the significance of including these traits in breeding programs? Are they crucial for improving productivity, quality, or other management or processing practices for the almond industry, especially in a breeding context?

      Second, the pipeline uses deep learning-based segmentation, which is powerful for handling complex backgrounds. Based on the limited figures and example images in the GitHub repo, the background is mostly single-colored (e.g., white or black), without features that would confuse even conventional segmentation, especially if image color is calibrated. Assuming most of the almond nut shell and kernel analyses would be done under laboratory conditions, it is not clear why conventional segmentation methods would not be preferred if both illumination and camera configuration can be well controlled. Ultimately, the question is whether it is worth the effort of labeling hundreds of images to fine-tune a deep learning segmentation model, compared with a careful hardware-software design that makes operation more efficient. Or, with the simplified background, whether vision foundation models such as SAM would be sufficient.
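
      To make the reviewer's point concrete, conventional segmentation on a plain, well-lit background can be as simple as a global intensity threshold. The sketch below is purely illustrative (NumPy only, hypothetical function name) and is not part of the authors' pipeline:

      ```python
      import numpy as np

      def segment_on_plain_background(gray, dark_background=True, threshold=None):
          """Return a boolean foreground mask for objects on a uniform background.

          `gray` is a 2-D array of grayscale intensities. With controlled lighting,
          a single global threshold is usually enough; by default it is the
          midpoint between the darkest and brightest pixel values.
          """
          if threshold is None:
              threshold = (float(gray.min()) + float(gray.max())) / 2.0
          return gray > threshold if dark_background else gray < threshold

      # Tiny synthetic example: a bright "kernel" on a dark background.
      img = np.zeros((6, 6))
      img[2:4, 2:5] = 200.0
      mask = segment_on_plain_background(img)
      print(mask.sum(), "foreground pixels")   # 6 foreground pixels
      ```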

      Third, in the Introduction section, some technical statements should be revised to make them accurate. For example, image segmentation is a core computer vision task rather than something that relies on computer vision algorithms. One-stage and two-stage strategies are used to differentiate models for object detection, not image segmentation. Further, Faster RCNN is an object detection model and cannot do image segmentation. It is highly recommended that the authors find a computer science or engineering colleague to proofread the technical statements to ensure their accuracy.

      Last, the authors' effort in making open-source software available to the community is appreciated. However, the dataset can be equally important for advancing scientific discovery and technology development. Is there any plan to make the dataset publicly available to help facilitate the development of additional computer vision algorithms for almond phenotyping?

    2. Abstract: High-throughput phenotyping is addressing the current bottleneck in phenotyping within breeding programs. Imaging tools are becoming the primary resource for improving the efficiency of phenotyping processes and providing large datasets for genomic selection approaches. The advent of AI brings new advantages by enhancing phenotyping methods using imaging, making them more accessible to breeding programs. In this context, we have developed an open Python workflow for analyzing morphology and heritable morphometric traits using AI, which can be applied to fruits and other plant organs. This workflow has been implemented in almond (Prunus dulcis), a species where efficiency is critical due to its long breeding cycle. Over 25,000 kernels, more than 20,000 nuts, and over 600 individuals have been phenotyped, making this the largest morphological study conducted in almond. As a result, new heritable morphometric traits of interest have been identified. These findings pave the way for more efficient breeding strategies, ultimately facilitating the development of improved cultivars with desirable traits. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://github.com/jorgemasgomez/almondcv2 Abbreviations: GPU, Graphics Processing Unit; YOLO, You Only Look Once; SAM, Segment Anything Model; ROI, Region of Interest. Funder Information Declared: Ministerio de Ciencia y Universidades, España, PID2021-127421OB-I00, FPU20/00614; Fundación Séneca

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf157), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 2: Qi Wang

      We would like to thank you for submitting your manuscript to our journal. The manuscript proposes an AI-powered open RGB imaging workflow for morphological and morphometric analysis of fruits or other plant organs, with almonds as a case study. The workflow, developed in Python, covers the full pipeline from image pre-processing, segmentation model development and deployment, to trait measurement and analysis. It aims to improve the efficiency and accuracy of phenotyping in breeding programs by addressing the limitations of traditional methods, such as their time-consuming and labor-intensive nature. However, the following problems in this paper still need further improvement:

      1. Table formatting: Some of the tables in the manuscript do not follow the formatting standards of the journal. The authors are encouraged to revise them accordingly to ensure clarity, consistency, and ease of understanding.
      2. Formula presentation: Certain mathematical formulas are not clearly formatted and appear disorganized. The authors should re-typeset the equations to improve readability and provide clearer explanations for each formula.
      3. Introduction: The introduction could be strengthened by more thoroughly explaining the relationship between phenotypic data and breeding. The authors may also discuss how phenotyping data supports genomic selection and accelerates breeding via high-throughput workflows.
      4. Methods section: While the paper clearly explains how morphological traits and kernel thickness are measured, it does not sufficiently explain how these data contribute to breeding decisions. The authors should elaborate on how the extracted traits are applied in practical breeding or selection strategies.
      5. Lack of algorithmic novelty: While the integration of existing tools is commendable, the core methods used (e.g., YOLO, SAHI) are based on publicly available models, without introducing new algorithmic components or comparative ablation studies. The authors are advised to clarify the unique contribution of their workflow, especially in terms of engineering integration or practical usability.
      6. Limited evaluation metrics: The performance of segmentation models is only reported using error percentage. The inclusion of standard metrics such as IoU, Precision, Recall, and F1-score would allow for a more comprehensive evaluation and comparison across models (see LLRL methods).
      7. Figures and captions: Currently, figure images and their descriptions are placed separately, which may reduce readability. It is recommended to place figure captions immediately beneath or alongside the figures to enhance the paper's coherence and user-friendliness.
      8. Trait extension suggestions: To enhance the expressiveness and resolution of phenotypic trait modeling, the authors are advised to consult recent research on extracting fine-grained phenotypic features from plant images. For example, PlanText proposed a progressive visual guidance strategy to help improve the modeling quality of phenotypic traits in images.

      Therefore, I would like to give a "Major Revision" recommendation.
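
      As an illustration of the metrics requested in point 6, a minimal NumPy sketch (hypothetical helper, not part of the almondcv2 workflow) that computes them pixel-wise from two binary masks could look like this:

      ```python
      import numpy as np

      def mask_metrics(pred, truth):
          """Pixel-wise IoU, precision, recall and F1 for binary segmentation masks."""
          pred, truth = pred.astype(bool), truth.astype(bool)
          tp = np.logical_and(pred, truth).sum()
          fp = np.logical_and(pred, ~truth).sum()
          fn = np.logical_and(~pred, truth).sum()
          iou = tp / (tp + fp + fn)
          precision = tp / (tp + fp)
          recall = tp / (tp + fn)
          f1 = 2 * precision * recall / (precision + recall)
          return {"IoU": iou, "precision": precision, "recall": recall, "F1": f1}

      # Toy 1-D example: 3 true-positive, 1 false-positive, 1 false-negative pixel.
      pred  = np.array([1, 1, 1, 1, 0, 0])
      truth = np.array([1, 1, 1, 0, 1, 0])
      print(mask_metrics(pred, truth))
      # IoU = 0.6, precision = recall = F1 = 0.75
      ```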

    3. Abstract: High-throughput phenotyping is addressing the current bottleneck in phenotyping within breeding programs. Imaging tools are becoming the primary resource for improving the efficiency of phenotyping processes and providing large datasets for genomic selection approaches. The advent of AI brings new advantages by enhancing phenotyping methods using imaging, making them more accessible to breeding programs. In this context, we have developed an open Python workflow for analyzing morphology and heritable morphometric traits using AI, which can be applied to fruits and other plant organs. This workflow has been implemented in almond (Prunus dulcis), a species where efficiency is critical due to its long breeding cycle. Over 25,000 kernels, more than 20,000 nuts, and over 600 individuals have been phenotyped, making this the largest morphological study conducted in almond. As a result, new heritable morphometric traits of interest have been identified. These findings pave the way for more efficient breeding strategies, ultimately facilitating the development of improved cultivars with desirable traits. Competing Interest Statement: The authors have declared no competing interest. Footnotes: https://github.com/jorgemasgomez/almondcv2 Abbreviations: GPU, Graphics Processing Unit; YOLO, You Only Look Once; SAM, Segment Anything Model; ROI, Region of Interest. Funder Information Declared: Ministerio de Ciencia y Universidades, España, PID2021-127421OB-I00, FPU20/00614; Fundación Séneca

      This work has been peer reviewed in GigaScience (see https://doi.org/10.1093/gigascience/giaf157), which carries out open, named peer-review. These reviews are published under a CC-BY 4.0 license and were as follows:

      Reviewer 1: Yuvraj Chopra

      The methods described in this article represent a useful tool for fast and reliable morphometric analysis of almonds, with potential applications in fruits. The pipeline is technically sound, and the publicly available workflow will advance the adoption of this technology. However, there are critical concerns which need to be addressed before the manuscript can proceed further toward publication in the journal.

      Major Comments -

      1. The authors claim this technique as a new phenotyping tool with breakthrough implications; however, I object to this claim. Numerous studies have utilized this technique in plant phenotyping, to the extent that labeling it as a new phenotyping tool may not be ideal. Additionally, for kernel or seed morphometrics, a wide array of user-friendly, open-source tools have already been developed and are readily available, for example:

      • SeedExtractor https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2020.581546/full#h3
      • SmartGrain https://academic.oup.com/plphys/article/160/4/1871/6109568
      • GrainScan https://link.springer.com/article/10.1186/1746-4811-10-23
      • PlantCV https://peerj.com/articles/4088/

      These tools, and many other options, can be readily used for almond kernel morphometrics. The authors are requested to discuss/compare the advantages/performance of their model with SeedExtractor, SmartGrain, and GrainScan.

      2. Within horticultural crops, workflows and studies (not acknowledged in this article) are available that can be adapted or modified to do the same thing. For example, publications from as early as 2020 used machine learning models to measure the size and mass of almonds; however, this relevant study was not acknowledged by the authors: https://onlinelibrary.wiley.com/doi/full/10.1111/jfpe.13374 . Discuss how the presented method is better than the aforementioned article, and justify the claim of a 'breakthrough'.

      3. The authors claim to have successfully tested the pipeline for apples and strawberries. Information on fruit size can be extracted from 2D images; however, the example results show only length, width, circularity, and ellipse ratio. How do these parameters assist fruit breeders? Since it is segmentation-based classification using a reference scale, the aforementioned tools, particularly SeedExtractor, can generate similar results. Does this qualify the tool for integration into fruit crop breeding pipelines? Moreover, fruit breeders require on-tree analysis; recent advancements have enabled 3D sensing for significantly better detection, particularly using cost-effective RGB-D cameras.

      Minor Comments -

      1. Change the title. This study uses deep learning models, which are a type of AI; AI is a broader term. Additionally, the potential for utility in a fruit breeding pipeline appears to be limited. Suggested: "Open RGB imaging workflow for morphological and morphometric analysis of almond kernels using deep learning."
      2. The video tutorial showed the use of YOLOv8; mention it in the methods. Add information for all settings used in CVAT.

    1. Data governance

      This component is not as clear as the others in terms of concrete lines or actions for development. I think it would help to distinguish between developing governance protocols or models, and focusing on the use of information for decision-making through the development of strategic studies.

    2. Access to and use of secondary data: Provide recommendations on available secondary databases suited to the objectives of the study.

      This is underdeveloped. Access, development of user manuals, recommendations addressing specific study objectives, training activities, etc.

    3. This distrust is expressed in budget cuts, the closure or downsizing of academic programs, as well as a growing general skepticism about the findings, methods, and motivations of the scientific community

      Citations

    4. The gap between the two faculties

      I think the numbers confuse me: FACSO carried out 16 activities and received 970 million, while FAGOB carried out 99 and received 2.4 million. That much less? And in what sense would this be an opportunity to grow? In the number of activities, yes, but in $$, no?

    5. generation of evidence and methodologies, training of specialized human capacities, development of ethical and technical standards, incubation of innovative solutions, and quality assurance mechanisms that guarantee reliability and traceability.

      Same point about the order of appearance

    6. research, training (at undergraduate, graduate, and continuing education levels), governance and responsible data management, innovation and transfer, and certification of the quality of social data

      Keep the order in which the sections are presented later on.

    7. produce

      The concept of data production is strongly tied to data collection. To differentiate from that activity, I propose nuancing it or removing it for now. The focus would thus be on: transforming the way social data are managed, documented, and used, ensuring that the knowledge generated is rigorous, transparent, and socially relevant.

    8. The need for a robust social data infrastructure becomes even more urgent in the current context, given the technological, social, and political transformations facing contemporary societies. Scientific knowledge is threatened by the crisis of public trust, the proliferation of disinformation, and the growing complexity of social phenomena. In addition, populism and the democratic crises in the region underscore the importance of having rigorous and transparent social data that can inform public decisions and strengthen citizen participation. In this scenario, universities have a crucial role as producers and guardians of social knowledge, being responsible for ensuring that the data generated are of high quality, ethically managed, and accessible to society.

      This paragraph is very general. It works better as broad context. In addition, somewhat disconnected topics appear, such as populism or democratic crises. How do these broad topics connect with what is being proposed?

    1. 5.1 Sex x Type of school

      Before doing this cross-tabulation, we need descriptive statistics of grades for each of the variables, and from there we can evaluate which crosses are relevant. Why is the cross between sex and type of school important? That is, both are independent variables of the study, and every cross should include the dependent variable. It is also striking that this is the first cross presented.

      In addition, the x-axis has to be ordered in a meaningful direction to see the trends better: municipal, subsidized, private.

      And start with an overall plot, and then see whether a comparison by degree program makes sense. If the point is to compare, a tabset does not help much; something like a Cleveland plot, or something more visually efficient, would be better.
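
      For what it is worth, a Cleveland-style dot plot of per-program means takes only a few lines of matplotlib; the sketch below uses made-up program names and values purely to show the layout:

      ```python
      import matplotlib.pyplot as plt

      # Hypothetical mean grades per program, only to illustrate the layout;
      # real values would come from the nota_unica data.
      programs = ["Sociology", "Psychology", "Anthropology", "Social Work", "Political Science"]
      means = [5.4, 5.6, 5.1, 5.8, 5.3]

      # Sort so the dots read from lowest to highest mean.
      pairs = sorted(zip(means, programs))
      fig, ax = plt.subplots(figsize=(6, 3))
      ax.plot([m for m, _ in pairs], [p for _, p in pairs], "o")
      ax.set_xlabel("Mean grade")
      ax.grid(axis="x", linestyle=":", linewidth=0.5)
      fig.tight_layout()
      plt.show()
      ```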

    1. 4.1 Initial exercise - Sociology 2024 subset

      Part of this section should be something like "data quality check" and should go in the appendix. Section 4 should simply be Results; I doubt the audience's main interest is learning about discrepancies between the databases, and it takes a lot of weight away from the analysis. In the meeting we agreed to work with the database that has grades by course; whether or not this matches the other database is an issue to resolve, and, as I have already said a couple of times, it can be settled with the actual grades of some course, which you can request from me or from Daniel @tomas, and simply compared against those in the database.

    1. identify the factors that determine students' grades in the different FACSO degree programs.

      I think it is much more than this: to describe and analyze the academic performance of FACSO undergraduate students between 2020 and 2025, its changes over time, and its associated factors.

    2. a problem

      I would say several problems. Beyond the distinction between those who need support, there could be a situation of grade inflation. This could be problematic because ... -> and here the main research problem

    3. the high heterogeneity of its student composition.

      This point is decontextualized; it seems to assume that the heterogeneity has something to do with grades (?). And the planning of this grades study has nothing to do with that. Now, if it refers to the initial grades study from the undergraduate directorate for the tutoring issue, much better contextualization is needed. And with dates, so it can be properly understood.

    4. Introduction

      Overall, the introduction needs more development of the problem from the literature and from the university's documentation (regulations, educational model, admissions policies, etc.). It is important to do that review to contextualize the problem: the analysis of the grade distribution, its variation over time (when possible), and the associated factors. With better context, we can define the question better and guide the analyses.

      In the press, something about this discussion appeared in recent days: - https://www.lun.com/Pages/NewsDetail.aspx?dt=2025-12-30&NewsID=556972&BodyID=0&PaginaId=3

      In addition, there are interesting international sources to start with: - https://www.tandfonline.com/doi/full/10.1080/02602938.2020.1795617 - https://www.tandfonline.com/doi/full/10.1080/03075079.2025.2470297#:~:text=Generally%2C%20grade%20inflation%20is%20harmful,;%20Yeritsyan%20and%20Mjelde%202024).

    1. Venture capitalists are chasing younger and younger founders: the median age of the latest Y Combinator cohort is only 24, down from 30 just three years ago.

      Interesting metric. Is it bc of the chasing (capital, eagerness) or bc of the founders (ideas, surfing a new tech wave). AI people are younger I suppose.

    1. Storage template (Template): {{y}}/{{MM}}/{{exif.Model}}_{{filename}}

      The immich storage template has no option to distinguish by camera model. Current template: {{y}}/{{MM}}/{{y}}{{MM}}{{dd}}-{{HH}}{{mm}}{{ss}}_{{filename}}, which produces UPLOAD_LOCATION/library/admin/2022/02/20220203-045605_IMAGE_56437.jpg
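
      To make the template-to-path mapping explicit, here is a small, purely illustrative Python sketch (not Immich's actual implementation) that expands the tokens of the current template into the example path above:

      ```python
      from datetime import datetime

      def render_template(template: str, taken: datetime, filename: str) -> str:
          """Illustrative only: expand a handful of {{...}} tokens the way the
          example path above suggests. This is not Immich's actual code."""
          tokens = {
              "{{y}}": f"{taken:%Y}", "{{MM}}": f"{taken:%m}", "{{dd}}": f"{taken:%d}",
              "{{HH}}": f"{taken:%H}", "{{mm}}": f"{taken:%M}", "{{ss}}": f"{taken:%S}",
              "{{filename}}": filename,
          }
          for token, value in tokens.items():
              template = template.replace(token, value)
          return template

      tpl = "{{y}}/{{MM}}/{{y}}{{MM}}{{dd}}-{{HH}}{{mm}}{{ss}}_{{filename}}"
      print(render_template(tpl, datetime(2022, 2, 3, 4, 56, 5), "IMAGE_56437.jpg"))
      # 2022/02/20220203-045605_IMAGE_56437.jpg
      ```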

    1. orchids

      English (thorough explanation)


      1) What “orchids” literally means

      Orchids is the plural noun of orchid. It refers to plants in the orchid family (Orchidaceae)—one of the largest and most diverse plant families on Earth. Orchids are flowering plants known for their complex, often beautiful flowers.

      In simple terms: orchids = a very large group of flowering plants with specialized flowers


      2) Key characteristics of orchids

      • Flowers: Highly specialized shapes; often bilaterally symmetrical with a distinctive petal called the labellum (lip) that attracts pollinators.
      • Pollination: Frequently insect-pollinated, sometimes by a single specific species (very precise adaptation).
      • Growth forms:

      • Epiphytic (grow on trees, not parasitic)

      • Terrestrial (grow in soil)
      • Roots: Many have aerial roots with a spongy layer (velamen) that absorbs moisture from air.
      • Seeds: Extremely tiny and dust-like, produced in huge numbers.

      3) Orchids in ecosystems

      Orchids play important ecological roles:

      • Form specialized relationships with pollinators (bees, moths, flies, birds).
      • Many rely on mycorrhizal fungi to germinate and obtain nutrients when young.
      • Serve as indicators of healthy, stable ecosystems, especially forests.

      4) Where orchids are found

      • Found worldwide, but most diverse in tropical rainforests.
      • Common in forest canopies and understorey, where they use trees for support to reach light.
      • Some species grow in temperate regions, grasslands, or even near wetlands.

      5) Orchids and adaptation (science focus)

      Orchids are classic examples of evolutionary adaptation:

      • Flower shapes match specific pollinators
      • Some mimic insects or produce strong scents
      • High specialization increases pollination efficiency but also vulnerability

      6) How the word is used if it appears alone

      If you see “orchids” by itself, it usually functions as:

      • a plant group name,
      • an example of insect-pollinated plants,
      • a label under a picture or diagram,
      • a biology vocabulary term.

      7) Pronunciation

      • orchid /ˈɔːr.kɪd/
      • orchids /ˈɔːr.kɪdz/

      Chinese (detailed explanation)

      1) The basic meaning of “orchids”

      Orchids is the plural form of orchid and refers to plants of the orchid family (Orchidaceae). The orchid family is one of the most species-rich plant families on Earth, known for its structurally complex and highly varied flowers.

      In short: orchids = a very large group of flowering plants


      2) Main characteristics of orchids

      • Distinctive flower structure: bilaterally symmetrical, with a prominent labellum (lip) that attracts pollinators
      • Pollination: mostly insect-pollinated, sometimes highly species-specific
      • Growth forms: epiphytic (growing on trees, but not parasitic) or terrestrial (growing in soil)
      • Roots: often aerial roots that can absorb moisture directly from the air
      • Seeds: extremely small and produced in huge numbers

      3) Role in ecosystems

      • Form highly specialized pollination relationships with insects
      • Seedlings depend on symbiotic fungi to germinate
      • Often regarded as indicator plants of a healthy environment

      4) Distribution and habitat

      • Found worldwide, but most abundant in tropical rainforests
      • Common in the canopy and understorey layers
      • Some cold- and drought-tolerant species occur in temperate regions

      5) Adaptation and evolution (exam focus)

      Orchids are a classic example of natural selection and adaptive evolution:

      • Flower shapes closely match their pollinators
      • Some mimic insects or release strong scents
      • High specialization brings high efficiency, but also sensitivity to environmental change

      6) What “orchids” means when it appears alone

      If you only see “orchids”, it usually indicates:

      • a plant classification name
      • an example of an insect-pollinated plant
      • a label for a picture or diagram
      • a biology vocabulary term

      7) Pronunciation

      • orchids /ˈɔːrkɪdz/ (the original note adds an approximate Chinese rendering of the sound and glosses the word as “orchid”)

      If you want, paste the full sentence, diagram, or exam question (for example about pollination, coevolution, or forest layers), and I’ll explain exactly why orchids are mentioned and what concept they illustrate (with English and Chinese side by side, focused on the exam point).

    1. ntroducti

      In general, the introduction seems very good to me. It correctly synthesizes the problem and the aim of the article; my only reservation is with the last paragraph (where I left the corresponding specific comments).

    1. Author response:

      The following is the authors’ response to the current reviews.

      We would like to proceed with this paper as a Version of Record, but we will correct the mistake that we made in the Key resources table. As the reviewer noted, we had added the wrong guide RNA sequence here. We are very thankful to the reviewer and apologize for the mistake.


      The following is the authors’ response to the original reviews.

      eLife Assessment 

      This important study identifies a new key factor in orchestrating the process of glial wrapping of axons in Drosophila wandering larvae. The evidence supporting the claims of the authors is convincing and the EM studies are of outstanding quality.

      We are thankful for this kind and very positive judgment.

      However, the quantification of the wrapping index, the role of Htl/Uif/Notch signaling in differentiation vs growth/wrapping, and the mechanism of how Uif "stabilizes" a specific membrane domain capable of interacting with specific axons might require further clarification or discussion.

      This is now addressed.

      Reviewer #1 (Public review):

      Summary:

      A central function of glial cells is the ensheathment of axons. Wrapping of larger-diameter axons involves myelin-forming glial classes (such as oligodendrocytes), whereas smaller axons are covered by non-myelin-forming glial processes (such as olfactory ensheathing glia). While we have some insights into the underlying molecular mechanisms orchestrating myelination, our understanding of the signaling pathways at work in non-myelinating glia remains limited. As non-myelinating glial ensheathment of axons is highly conserved in both vertebrates and invertebrates, the nervous system of Drosophila melanogaster, and in particular the larval peripheral nerves, have emerged as a powerful model to elucidate the regulation of axon ensheathment by a class of glia called wrapping glia. Using this model, this study seeks to specifically address the question, as to which molecular mechanisms contribute to the regulation of the extent of glial ensheathment focusing on the interaction of wrapping glia with axons. 

      Strengths and Weaknesses:

      For this purpose, the study combines state-of-the-art genetic approaches with high-resolution imaging, including classic electron microscopy. The genetic methods involve RNAi-mediated knockdown, acute Crispr-Cas9 knock-outs, and genetic epistasis approaches to manipulate gene function with the help of cell-type specific drivers. The successful use of acute Crispr-Cas9 mediated knockout tools (which required the generation of new genetic reagents for this study) will be of general interest to the Drosophila community. 

      The authors set out to identify new molecular determinants mediating the extent of axon wrapping in the peripheral nerves of third-instar wandering Drosophila larvae. They could show that over-expressing a constitutively active version of the Fibroblast growth factor receptor Heartless (Htl) causes an increase in wrapping glial branching, leading to the formation of swellings in nerves close to the cell body (named bulges). To identify new determinants involved in axon wrapping acting downstream of Htl, the authors next conducted an impressive large-scale genetic interaction screen (which has become rare, but remains a very powerful approach), and identified Uninflatable (Uif) in this way. Uif is a large single-pass transmembrane protein that contains a whole series of extracellular domains, including Epidermal growth factor-like domains. Linking this protein to glial branch formation is novel, as it has so far been mostly studied in the context of tracheal maturation and growth. Intriguingly, a knock-down or knock-out of uif reduces branch complexity and also suppresses htl over-expression defects. Importantly, uif over-expression causes the formation of excessive membrane stacks. Together, these observations are in line with the notion that htl may act upstream of uif.

      Further epistasis experiments using this model also implicated the Notch signaling pathway as a crucial regulator of glial wrapping: reduction in Notch signaling reduces wrapping, whereas over-activation of the pathway increases axonal wrapping (but does not cause the formation of bulges). Importantly, defects caused by the over-expression of uif can be suppressed by activated Notch signaling. Knock-down experiments in neurons suggest further that neither Delta nor Serrate act as neuronal ligands to activate Notch signaling in wrapping glia, whereas knock-down of Contactin, a GPI-anchored Immunoglobulin domain-containing protein, led to reduced axon wrapping by glia, and thus it could act as an activating ligand in this context. 

      Based on these results the authors put forward a model proposing that Uif normally suppresses Notch signaling, and that activation of Notch by Contactin leads to suppression of Htl, to trigger the ensheathment of axons. While these are intriguing propositions, future experiments would need to conclusively address whether and how Uif could "stabilize" a specific membrane domain capable of interacting with specific axons.

      We absolutely agree with the reviewer that it would be fantastic to understand whether and how Uif could stabilize specific membrane domains that are capable of interacting with axons. To address this we need to be able to label such membrane domains and unfortunately we still cannot do so. We analyzed the distribution of PIP2/PIP3 but failed to detect any differences. Thus we still lack wrapping glial membrane markers that are able to label specific compartments.

      Moreover, to obtain evidence for Uif suppression by Notch to inhibit "precocious" axon wrapping and for a "gradual increase" of Notch signaling that silences uif and htl, (1) reporters for N and Htl signaling in larvae, (2) monitoring of different stages at a time point when branch extension begins, and (3) a reagent enabling to visualize Uif expression could be important next tools/approaches. Considering the qualitatively different phenotypes of reduced branching, compared to excessive membrane stacks close to cell bodies, it would perhaps be worthwhile to explore more deeply how membrane formation in wrapping glia is orchestrated at the subcellular level by Uif.

      In the revised version of the manuscript we have now included the use of Notch and RTK-signaling reporters.

      (1) reporters for N and Htl signaling in larvae,

      We had already employed the classic reporter generated by the Bray lab: Gbe-Su(H)-lacZ. This unfortunately failed to detect any activity in larval wrapping glia nuclei but was able to detect Notch activity in the adult wrapping glia (Figure S5C,F).

      As requested, we analyzed an RTK signaling reporter. The activity of sty-lacZ, which we had previously characterized in the lab (Sieglitz et al., 2013), increases by 22% when Notch is silenced. Given the normal distribution of the data points, this shows a trend that, however, does not reach statistical significance. We have not included this in the paper, but would be happy to do so if requested.

      Author response image 1.

       

      (2) monitoring of different stages at a time point when branch extension begins,

      The reviewer raises an important question; however, this is extremely difficult to tackle experimentally. It would require a detailed electron microscopic analysis of early larval stages, which cannot be done in a reasonable amount of time. We have, however, added additional information on wrapping glia growth, summarizing recently published work from the lab (Kautzmann et al., 2025).

      (3) a reagent enabling to visualize Uif expression could be important next tools/approaches.

      The final comment of the reviewer also addresses an extremely relevant and important issue. We employed antibodies generated by the lab of R. Ward, but they did not allow detection of the protein in larval nerves. We also attempted to generate anti-Uif peptide antibodies but these antibodies unfortunately do not work in tissue. We are still trying to generate suitable reagents but for the current revision cannot offer any solution.

      Lastly, we agree with the reviewer that it would be worthwhile to explore how Uif controls membrane formation at the subcellular level. This, however, is a completely new project and will require the identification of the binding partners of Uif in wrapping glia to start working on a link between Uif and membrane extension. The reduced branching phenotype might well be a direct consequence of excessive membrane formation, as it likely blocks resources needed for efficient growth of glial processes.

      Finally, in light of the importance of correct ensheathment of axons by glia for neuronal function, this study will be of general interest to the glial biology community. 

      We are very grateful for this very positive comment.

      Reviewer #2 (Public review): 

      The FGF receptor Heartless has previously been implicated in Drosophila peripheral glial growth and axonal wrapping. Here, the authors perform a large-scale screen of over 2600 RNAi lines to find factors that control the downstream signaling in this process. They identify the transmembrane protein Uninflatable as necessary for the formation of plasma membrane domains. They further find that a Uif regulatory target, Notch, is necessary for glial wrapping. Interestingly, additional evidence suggests Notch itself regulates uif and htl, suggesting a feedback system. Together, they propose that Uif functions as a "switch" to regulate the balance between glial growth and wrapping of axons. 

      Little is known about how glial cell properties are coordinated with axons, and the identification of Uif is a promising link to shed light on this orchestration. The manuscript is well-written, and the experiments are generally well-controlled. The EM studies in particular are of outstanding quality and really help to mechanistically dissect the consequences of Uif and Notch signaling in the regulation of glial processes. Together, this valuable study provides convincing evidence of a new player coordinating the interactions controlling the glial wrapping of axons.

      Reviewer #1 (Recommendations for the authors): 

      (1) To be reproducible and understandable, it would be important to provide detailed information about crosses and genotypes, as reagents are currently listed individually and genotypes are provided in rather simplified versions. 

      We have added the requested information to the text.

      (2) Neurons are inherently resistant to RNAi-mediated knockdown and it thus may be necessary to introduce the over-expression of UAS-dcr2 when assessing neuronal requirements and to specifically exclude Delta or Serrate as ligands. 

      We agree with the reviewer and have repeated the knockdown experiments using UAS-dcr2 and obtained the same results. To use an RNAi independent approach we also employed sgRNA expression in the presence of Cas9. The neuron specific gene knockout also showed no glial wrapping phenotype. These results are now added to the manuscript.

      (3) Throughout the manuscript, the authors use the terms "growth" and "differentiation" referring to the extent of branch formation versus axon wrapping. However glial differentiation and growth could have different meanings (for instance, growth could implicate changes in cell size or numbers, while differentiation could refer to a change from an immature precursor-like state to a mature cell identity). It may thus be useful to replace these general terms with more specific ones. 

      This is a very good point. When we use the term “growth” we refer only to glial cell growth and thus to the increase in cell mass. Proliferation is excluded, and this is now explicitly stated in the manuscript. The term “differentiation” is indeed ambiguous, and we have therefore replaced it with wording that refers either directly to the morphology or to axon wrapping.

      (4) Page 4. "remake" fibers should be Remak fibers. 

      We have corrected this typo.

      (5) Page 5. "Heartless controls glial growth but does promote axonal wrapping", this sentence is not clear in its message because of the "but".

      We have corrected this sentence.

      (6) Generally, many gene names are used as abbreviations without introductions (e.g. Sos, Rl, Msk on page 7). These would require an introduction.

      All genetic elements are now introduced.

      (7) Page 8. When Cas9 is expressed ubiquitously ... It would be helpful to add how this is done (nsyb-Gal4, nrv2-Gal4, or another Gal4 driver are used to express UAS-Cas9, as the listed Gal4 drivers seem to be specific to neurons or glia?).

      This is now added. We used the following genotype for the ubiquitous knockout using the four different uif-specific sgRNAs (UAS-uif<sup>sgRNA X</sup>): [w; UAS-Cas9/Df(2L)ED438; da-Gal4/UAS-uif<sup>sgRNA X</sup>]. We used the following genotype for the glial knockout in wrapping glia: [+/+; UAS-Cas9/+; nrv2-Gal4,UAS-CD8::mCherry/UAS-uif<sup>sgRNA X</sup>].

      We had previously shown that nrv2-Gal4 is a wrapping glia specific driver in the larval PNS (Kottmeier et al., 2020).

      Moreover, the authors mention that "This indicates that a putatively secreted version of Uif is not functional". This conclusion would need to be explained in detail.

      First, because it requires quite some detective work to understand the panels in Figure 1 on which this statement is based; second, since the acutely induced double-stranded breaks in the DNA and subsequent repair may cause variable defects, it may indeed be uncertain what changes have been induced in each cell; and third, considering that there is a putative cleavage site, would it not be expected that the protein is not functional when it is not cleaved and there is no secreted extracellular part (unless the cleavage site is not required)? The latter could probably only be addressed by rescue experiments with UAS transgenes with identified changes.

      We agree with the reviewer. The rescue experiments are unfortunately difficult, since even expression of a full length uif construct does not fully rescue the uif mutant phenotype (Loubéry et al., 2014). We therefore explained the conclusion taken from the different sgRNA knockout experiments better and also removed the statement that secreted Uif forms are non-functional.

      In the Star Method reagent table, it is not clear, why all 8 oligonucleotides are for "uif cleavage just before transmembrane domain" despite targeting different locations. 

      We are very sorry for this mistake and corrected it now. Thank you very much for spotting this.

      (8) Page 13. However, we expressed activated Notch,... the word "when" seems to be missing, and it would be helpful to specify how this was done (over-expression of N[ICD]).

      We have now corrected this accordingly.

      (9) To strengthen the point on the similarity of phenotypes caused by Htl pathway over-activation and Uif over-expression, it would be helpful to also show an electron micrograph of the former.

      We now added an extensive description of the phenotype caused by activated Heartless. This is shown as new Figure 2.

      (10) Figure 4C, the larval nerve seems to be younger, as many extracellular spaces between axons are detected.

      This perception is a misunderstanding and we are sorry for not explaining this better. The third instar larvae are all age matched. The particular specimen in Figure 4C shows some fixation artifacts that result in the loss of material. Importantly, however, membranes are not affected. Similar loss of material is also seen in Figure 6C. For further examples please see a study on nerve anatomy by (Kautzmann et al., 2025).

      (11) The model could be presented as a figure panel in the manuscript. To connect the recommendation section with the above public review, a step forward could be to adjust the model and the wording in the Result section and to move some of the less explored points and thoughts to the discussion.

      We are thankful for this advice and have moved an updated model figure to the end of the main text (now Figure 7).

      Reviewer #2 (Recommendations for the authors):

      (1) Screen and the interest in Uif: Out of the ~62 genes that came out of the RNAi screen, why did the authors prioritize and focus on Uif? What were the other genes that came out of the screen, and did any of those impinge on Notch signaling? 

      We have now more thoroughly described the results of the screen. We selected Uif as it was the only transmembrane/adhesion protein identified and, given the finding that Uif decorates apical membrane domains in epithelial cells, we hoped to identify a protein specific for a similar membrane domain in wrapping glia.

      Notch as well as its downstream transcription factors were not included in the initial screen and were only analyzed once we had seen the contribution of Notch. Interestingly, there is one single hit in our screen linked to Notch signaling: Gp150. Here, however, we tested additional dsRNA-expressing lines and were not able to reproduce the phenotype. This information is added to the discussion.

      The authors performed a large-scale screen of 2600 RNAi lines; it seems that more details about what came out of the screen, and why the focus fell on Uif, would benefit the manuscript. 

      See above comment.

      Relatedly, there should be a discussion of the limitations of the screen, and of the fact that it was really a screen looking to modify a gain-of-function phenotype from the activated Htl allele; it seems a screen of this design may lead to artifacts that may not reflect endogenous signaling.

      We have now added a short paragraph on suppressor screens, employing gain of function alleles to the introduction.

      “In Drosophila, such suppressor screens have been used successfully many times (Macagno et al., 2014; Rebay et al., 2000; Therrien et al., 2000). Possibly, such screens also uncover genes that are not directly linked to the signaling pathway under study but this can be tested in further experiments. Our screen led to the unexpected identification of the large transmembrane protein Uninflatable, which in epithelial cells localizes to the apical plasma membrane. Loss of uninflatable suppresses the phenotype caused by activated RTK signaling. In addition, we find that uif knockdown and uif knockout larvae show impaired glial growth while an excess of Uninflatable leads to the formation of ectopic wrapping membrane processes that, however, fail to interact with axons. uninflatable is also known to inhibit Notch.  “

      (2) In general this study relies on RNAi knockdown, and is generally well controlled in using multiple RNAi lines giving the same phenotype, and also controlled for by tissue-specific gene knockout. However, there is little in the way of antibody staining to directly confirm the target of interest is lost/reduced, which would obviously strengthen the study. 

      Lacking the tools or ability to assess RNAi efficiency (qPCR, antibody staining), some conclusions need to be tempered. For example, in the experiments in Figure S6 regarding canonical Notch signaling, the authors do not find a phenotype by Delta or Serrate knockdown, but there are no experiments that show Delta or Serrate are lost. Thus, if the authors cannot directly test for RNAi efficiency, these conclusions should be tempered throughout the manuscript. 

      We agree with the reviewer and now provide information on the use of Dicer in our RNAi experiments, and we conducted new sgRNA/Cas9 experiments. In addition, we tempered our wording, stating that Dl and/or Ser are still possible ligands.

      (3) More description is needed regarding how the authors are measuring and calculating the "wrapping index". In principle, the approach seems sound. However, are there cases where axons are "partially" wrapped of various magnitudes, and how are these cases treated in the analysis? Are there additional controls of previously characterized mutants to illustrate the dynamic range of the wrapping index in various conditions?

      This is now explained.

      Further, can the authors quantify the phenotypes in the axonal "bulges" in Figures 1, 3, and 5?

      This is a difficult question. Although we can easily quantify the number of bulges, we cannot quantify the severity of the phenotype, as this would require EM analysis. Sectioning nerves at a specific distance from the ventral nerve cord already requires very careful adjustments. Sectioning at the level of a bulge is far more difficult, and it is not possible to obtain the number of sections needed to quantify the bulge phenotype.

      The fact is that all wrapping glial cells develop swellings (bulges) at the position of the nucleus. As there are in general three wrapping glial cells per segmental nerve, the number of bulges is three.

      (4) It seems difficult to clearly untangle the functions of Htl/Uif/Notch in differentiation itself vs subsequent steps in growth/wrapping. For example, if the differentiation steps are not properly coordinated, couldn't this give rise to some observed differences in growth or wrapping at later stages? I'm not sure of any obvious experiments to pursue here, but at least a brief discussion of these issues in the manuscript would be of use.

      We have now discussed this more carefully in the Discussion, in order to discriminate the function of the three genes in either differentiation or in a stepwise mode of growth and differentiation.

      When comparing the different loss-of-function phenotypes, they all appear the same, which would argue that all three genes act in a common process.

      However, when we look at gain-of-function phenotypes, Htl and Uif behave differently from Notch. This would favor two distinct processes.

      We have now added activity markers for RTK signaling to directly show that Notch silences RTK activity. Unfortunately we were not able to do a similar reciprocal experiment.

      Minor:

      (1) The Introduction is too long, and would benefit from revisions to make it shorter and more concise.

      We have shortened the introduction and hopefully made it more concise.

      (2) A schematic illustrating the model the authors propose about Htl, Uif, and Notch in glial differentiation, growth, and wrapping would benefit the clarity of this work. 

      We had previously added the graphical abstract below, which we have updated and included as a figure in the main text.

      References

      Kautzmann, S., Rey, S., Krebs, A., and Klämbt, C. (2025). Cholinergic and glutamatergic axons differentially require glial support in the Drosophila PNS. Glia. 10.1002/glia.70011.

      Kottmeier, R., Bittern, J., Schoofs, A., Scheiwe, F., Matzat, T., Pankratz, M., and Klämbt, C. (2020). Wrapping glia regulates neuronal signaling speed and precision in the peripheral nervous system of Drosophila. Nature communications 11, 4491-4417. 10.1038/s41467-020-18291-1.

      Loubéry, S., Seum, C., Moraleda, A., Daeden, A., Fürthauer, M., and González-Gaitán, M. (2014). Uninflatable and Notch control the targeting of Sara endosomes during asymmetric division. Current biology : CB 24, 2142-2148. 10.1016/j.cub.2014.07.054.

      Macagno, J.P., Diaz Vera, J., Yu, Y., MacPherson, I., Sandilands, E., Palmer, R., Norman, J.C., Frame, M., and Vidal, M. (2014). FAK acts as a suppressor of RTK-MAP kinase signalling in Drosophila melanogaster epithelia and human cancer cells. PLoS Genet 10, e1004262. 10.1371/journal.pgen.1004262.

      Rebay, I., Chen, F., Hsiao, F., Kolodziej, P.A., Kuang, B.H., Laverty, T., Suh, C., Voas, M., Williams, A., and Rubin, G.M. (2000). A genetic screen for novel components of the Ras/Mitogen-activated protein kinase signaling pathway that interact with the yan gene of Drosophila identifies split ends, a new RNA recognition motif-containing protein. Genetics 154, 695-712. 10.1093/genetics/154.2.695.

      Sieglitz, F., Matzat, T., Yuva-Adyemir, Y., Neuert, H., Altenhein, B., and Klämbt, C. (2013). Antagonistic Feedback Loops Involving Rau and Sprouty in the Drosophila Eye Control Neuronal and Glial Differentiation. Science signaling 6, ra96. 10.1126/scisignal.2004651.

      Therrien, M., Morrison, D.K., Wong, A.M., and Rubin, G.M. (2000). A genetic screen for modifiers of a kinase suppressor of Ras-dependent rough eye phenotype in Drosophila. Genetics 156, 1231-1242.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      We appreciate the reviewer for the critical review of the manuscript and the valuable comments. We have carefully considered the reviewer’s comments and have revised our manuscript accordingly.

      The reviewer’s comments in this letter are in Bold and Italics.

      Summary:

      This study identified three independent components of glucose dynamics - "value," "variability," and "autocorrelation" - and reported important findings indicating that they play an important role in predicting coronary plaque vulnerability. Although the generalizability of the results needs further investigation due to the limited sample size and validation cohort limitations, this study makes several notable contributions: validation of autocorrelation as a new clinical indicator, theoretical support through mathematical modeling, and development of a web application for practical implementation. These contributions are likely to attract broad interest from researchers in both diabetology and cardiology and may suggest the potential for a new approach to glucose monitoring that goes beyond conventional glycemic control indicators in clinical practice.

      Strengths:

      The most notable strength of this study is the identification of three independent elements in glycemic dynamics: value, variability, and autocorrelation. In particular, the metric of autocorrelation, which has not been captured by conventional glycemic control indices, may bring a new perspective for understanding glycemic dynamics. In terms of methodological aspects, the study uses an analytical approach combining various statistical methods such as factor analysis, LASSO, and PLS regression, and enhances the reliability of results through theoretical validation using mathematical models and validation in other cohorts. In addition, the practical aspect of the research results, such as the development of a Web application, is also an important contribution to clinical implementation.

      We appreciate reviewer #1 for the positive assessment and for the valuable and constructive comments on our manuscript.

      Weaknesses:

      The most significant weakness of this study is the relatively small sample size of 53 study subjects. This sample size limitation leads to a lack of statistical power, especially in subgroup analyses, and to limitations in the assessment of rare events. 

      We appreciate the reviewer’s concern regarding the sample size. We acknowledge that a larger sample size would increase statistical power, especially for subgroup analyses and the assessment of rare events.

      We would like to clarify several points regarding the statistical power and validation of our findings. Our sample size determination followed established methodological frameworks, including the guidelines outlined by Muyembe Asenahabi, Bostely, and Peters Anselemo Ikoha. “Scientific research sample size determination.” (2023). These guidelines balance the risks of inadequate sample size with the challenges of unnecessarily large samples. For our primary analysis examining the correlation between CGM-derived measures and %NC, power calculations (a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4) indicated that a minimum of 47 participants was required. Our sample size of 53 exceeded this threshold and allowed us to detect statistically significant correlations, as described in the Methods section. Moreover, to provide transparency about the precision of our estimates, we have included confidence intervals for all coefficients. 
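      As a cross-check of the stated threshold, a minimal sketch using the standard Fisher z approximation for correlation sample-size calculations reproduces the figure of 47 participants; whether the cited guideline uses exactly this formula is an assumption here.

        from math import atanh, ceil
        from scipy.stats import norm

        alpha, power, r = 0.05, 0.80, 0.4
        z_alpha = norm.ppf(1 - alpha / 2)   # ~1.960 for a two-sided type I error of 0.05
        z_beta = norm.ppf(power)            # ~0.842 for 80% power
        c = atanh(r)                        # Fisher z-transform of the expected correlation
        n = ceil(((z_alpha + z_beta) / c) ** 2 + 3)
        print(n)  # 47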

      Furthermore, our sample size aligns with previous studies investigating the associations between glucose profiles and clinical parameters, including Torimoto, Keiichi, et al. “Relationship between fluctuations in glucose levels measured by continuous glucose monitoring and vascular endothelial dysfunction in type 2 diabetes mellitus.” Cardiovascular Diabetology 12 (2013): 1-7. (n=57), Hall, Heather, et al. “Glucotypes reveal new patterns of glucose dysregulation.” PLoS biology 16.7 (2018): e2005143. (n=57), and Metwally, Ahmed A., et al. “Prediction of metabolic subphenotypes of type 2 diabetes via continuous glucose monitoring and machine learning.” Nature Biomedical Engineering (2024): 1-18. (n=32).

      Furthermore, the primary objective of our study was not to assess rare events, but rather to demonstrate that glucose dynamics can be decomposed into three main factors - mean, variance and autocorrelation - whereas traditional measures have primarily captured mean and variance without adequately reflecting autocorrelation. We believe that our current sample size effectively addresses this objective. 

      Regarding the classification of glucose dynamics components, we have conducted additional validation across diverse populations including 64 Japanese, 53 American, and 100 Chinese individuals. These validation efforts have consistently supported our identification of three independent glucose dynamics components.

      However, we acknowledge the importance of further validation on a larger scale. To address this, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      To address the sample size considerations, we have added the following sentences in the Discussion section (lines 409-414): 

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      We appreciate the reviewer’s feedback and believe that these clarifications improve the manuscript.

      In terms of validation, several challenges exist, including geographical and ethnic biases in the validation cohorts, lack of long-term follow-up data, and insufficient validation across different clinical settings. In terms of data representativeness, limiting factors include the inclusion of only subjects with well-controlled serum cholesterol and blood pressure and the use of only short-term measurement data.

      We appreciate the reviewer’s comment regarding the challenges associated with validation. In terms of geographic and ethnic diversity, our study includes validation datasets from diverse populations, including 64 Japanese, 53 American and 100 Chinese individuals. These datasets include a wide range of metabolic states, from healthy individuals to those with diabetes, ensuring validation across different clinical conditions. In addition, we recognize the limited availability of publicly available datasets with sufficient sample sizes for factor decomposition that include both healthy individuals and those with type 2 diabetes (Zhao, Qinpei, et al. “Chinese diabetes datasets for data-driven machine learning.” Scientific Data 10.1 (2023): 35.). The main publicly available datasets with relevant clinical characteristics have already been analyzed in this study using unbiased approaches.

      However, we fully agree with the reviewer that expanding the geographic and ethnic scope, including long-term follow-up data, and validation in different clinical settings would further strengthen the robustness and generalizability of our findings. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      Regarding the validation considerations, we have added the following sentences to the Discussion section (lines 409-414, 354-361): 

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      Although our LASSO and factor analysis indicated that CGM-derived measures were strong predictors of %NC, this does not mean that other clinical parameters, such as lipids and blood pressure, are irrelevant in T2DM complications. Our study specifically focused on characterizing glucose dynamics, and we analyzed individuals with well-controlled serum cholesterol and blood pressure to reduce confounding effects. While we anticipate that inclusion of a more diverse population would not alter our primary findings regarding glucose dynamics, it is likely that a broader data set would reveal additional predictive contributions from lipid and blood pressure parameters.

      In terms of elucidation of physical mechanisms, the study is not sufficient to elucidate the mechanisms linking autocorrelation and clinical outcomes or to verify them at the cellular or molecular level.

      We appreciate the reviewer’s point regarding the need for further elucidation of the physical mechanisms linking glucose autocorrelation to clinical outcomes. We fully agree with the reviewer that the detailed molecular and cellular mechanisms underlying this relationship are not yet fully understood, as noted in our Discussion section.

      However, we would like to emphasize the theoretical basis that supports the clinical relevance of autocorrelation. Our results show that glucose profiles with identical mean and variability can exhibit different autocorrelation patterns, highlighting that conventional measures such as mean or variance alone may not fully capture inter-individual metabolic differences. Incorporating autocorrelation analysis provides a more comprehensive characterization of metabolic states. Consequently, incorporating autocorrelation measures alongside traditional diabetes diagnostic criteria - such as fasting glucose, HbA1c and PG120, which primarily reflect only the “mean” component - can improve predictive accuracy for various clinical outcomes. While further research at the cellular and molecular level is needed to fully validate these findings, it is important to note that the primary goal of this study was to analyze the characteristics of glucose dynamics and gain new insights into metabolism, rather than to perform molecular biology experiments.
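      To make the three components concrete, here is a minimal sketch of how a CGM trace could be decomposed into a mean, a variability measure, and an autocorrelation-based index. The exact definition of AC_Var is not restated in this response, so the lag range and the use of the variance of the autocorrelation coefficients below are illustrative assumptions.

        import numpy as np

        def cgm_components(glucose, max_lag=30):
            # Mean and standard deviation of the trace, plus the variance of the
            # sample autocorrelation coefficients over lags 1..max_lag (an assumed
            # stand-in for AC_Var).
            g = np.asarray(glucose, dtype=float)
            g_centered = g - g.mean()
            denom = np.sum(g_centered ** 2)
            acf = [np.sum(g_centered[:-k] * g_centered[k:]) / denom
                   for k in range(1, max_lag + 1)]
            return {"CGM_Mean": g.mean(), "CGM_Std": g.std(ddof=1),
                    "AC_Var": float(np.var(acf))}

        # Example on a synthetic trace sampled every 5 minutes for 24 hours:
        # cgm_components(np.random.default_rng(0).normal(120, 30, 288))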

      Furthermore, our previous research has shown that glucose autocorrelation reflects changes in insulin clearance (Sugimoto, Hikaru, et al. “Improved detection of decreased glucose handling capacities via continuous glucose monitoring-derived indices.” Communications Medicine 5.1 (2025): 103.). The relationship between insulin clearance and cardiovascular disease has been well documented (Randrianarisoa, Elko, et al. “Reduced insulin clearance is linked to subclinical atherosclerosis in individuals at risk for type 2 diabetes mellitus.” Scientific reports 10.1 (2020): 22453.), and the mechanisms described in this prior work may potentially explain the association between glucose autocorrelation and clinical outcomes observed in the present study.

      Rather than a limitation, we view these currently unexplored associations as an opportunity for further research. The identification of autocorrelation as a key glycemic feature introduces a new dimension to metabolic regulation that could serve as the basis for future investigations exploring the molecular mechanisms underlying these patterns.

      While we agree that further research at the cellular and molecular level is needed to fully validate these findings, we believe that our study provides a theoretical framework to support the clinical utility of autocorrelation analysis in glucose monitoring, and that this could serve as the basis for future investigations exploring the molecular mechanisms underlying these autocorrelation patterns, which adds to the broad interest of this study. Regarding the physical mechanisms linking autocorrelation and clinical outcomes, we have added the following sentences in the Discussion section (lines 331-339, 341-352): 

      This study also provided evidence that autocorrelation can vary independently from the mean and variance components using simulated data. In addition, simulated glucose dynamics indicated that even individuals with high AC_Var did not necessarily have high maximum and minimum blood glucose levels. This study also indicated that these three components qualitatively corresponded to the four distinct glucose patterns observed after glucose administration, which were identified in a previous study (Hulman et al., 2018). Thus, the inclusion of autocorrelation in addition to mean and variance may improve the characterization of inter-individual differences in glucose regulation and improve the predictive accuracy of various clinical outcomes.

      Despite increasing evidence linking glycemic variability to oxidative stress and endothelial dysfunction in T2DM complications (Ceriello et al., 2008; Monnier et al., 2008), the biological mechanisms underlying the independent predictive value of autocorrelation remain to be elucidated. Our previous work has shown that glucose autocorrelation is influenced by insulin clearance (Sugimoto et al., 2025), a process known to be associated with cardiovascular disease risk (Randrianarisoa et al., 2020). Therefore, the molecular pathways linking glucose autocorrelation to cardiovascular disease may share common mechanisms with those linking insulin clearance to cardiovascular disease. Although previous studies have primarily focused on investigating the molecular mechanisms associated with mean glucose levels and glycemic variability, our findings open new avenues for exploring the molecular basis of glucose autocorrelation, potentially revealing novel therapeutic targets for preventing diabetic complications.

      Reviewer #2 (Public review):

      We appreciate the reviewer for the critical review of the manuscript and the valuable comments. We have carefully considered the reviewer’s comments and have revised our manuscript accordingly. The reviewer’s comments in this letter are in Bold and Italics.

      Sugimoto et al. explore the relationship between glucose dynamics - specifically value, variability, and autocorrelation - and coronary plaque vulnerability in patients with varying glucose tolerance levels. The study identifies three independent predictive factors for %NC and emphasizes the use of continuous glucose monitoring (CGM)-derived indices for coronary artery disease (CAD) risk assessment. By employing robust statistical methods and validating findings across datasets from Japan, America, and China, the authors highlight the limitations of conventional markers while proposing CGM as a novel approach for risk prediction. The study has the potential to reshape CAD risk assessment by emphasizing CGM-derived indices, aligning well with personalized medicine trends.

      Strengths:

      (1) The introduction of autocorrelation as a predictive factor for plaque vulnerability adds a novel dimension to glucose dynamic analysis.

      (2) Inclusion of datasets from diverse regions enhances generalizability.

      (3) The use of a well-characterized cohort with controlled cholesterol and blood pressure levels strengthens the findings.

      (4) The focus on CGM-derived indices aligns with personalized medicine trends, showcasing the potential for CAD risk stratification.

      We appreciate reviewer #2 for the positive assessment and for the valuable and constructive comments on our manuscript.

      Weaknesses:

      (1) The link between autocorrelation and plaque vulnerability remains speculative without a proposed biological explanation. 

      We appreciate the reviewer’s point about the need for a clearer biological explanation linking glucose autocorrelation to plaque vulnerability. We fully agree with the reviewer that the detailed biological mechanisms underlying this relationship are not yet fully understood, as noted in our Discussion section.

      However, we would like to emphasize the theoretical basis that supports the clinical relevance of autocorrelation. Our results show that glucose profiles with identical mean and variability can exhibit different autocorrelation patterns, highlighting that conventional measures such as mean or variance alone may not fully capture inter-individual metabolic differences. Incorporating autocorrelation analysis provides a more comprehensive characterization of metabolic states. Consequently, incorporating autocorrelation measures alongside traditional diabetes diagnostic criteria - such as fasting glucose, HbA1c and PG120, which primarily reflect only the “mean” component - can improve predictive accuracy for various clinical outcomes.

      Furthermore, our previous research has shown that glucose autocorrelation reflects changes in insulin clearance (Sugimoto, Hikaru, et al. “Improved detection of decreased glucose handling capacities via continuous glucose monitoring-derived indices.” Communications Medicine 5.1 (2025): 103.). The relationship between insulin clearance and cardiovascular disease has been well documented (Randrianarisoa, Elko, et al. “Reduced insulin clearance is linked to subclinical atherosclerosis in individuals at risk for type 2 diabetes mellitus.” Scientific reports 10.1 (2020): 22453.), and the mechanisms described in this prior work may potentially explain the association between glucose autocorrelation and clinical outcomes observed in the present study. 

      Rather than a limitation, we view these currently unexplored associations as an opportunity for further research. The identification of autocorrelation as a key glycemic feature introduces a new dimension to metabolic regulation that could serve as the basis for future investigations exploring the molecular mechanisms underlying these patterns.

      While we agree that further research at the cellular and molecular level is needed to fully validate these findings, we believe that our study provides a theoretical framework to support the clinical utility of autocorrelation analysis in glucose monitoring, and that this could serve as the basis for future investigations exploring the molecular mechanisms underlying these autocorrelation patterns, which adds to the broad interest of this study. Regarding the physical mechanisms linking autocorrelation and clinical outcomes, we have added the following sentences in the Discussion section (lines 331-339, 341-352): 

      This study also provided evidence that autocorrelation can vary independently from the mean and variance components using simulated data. In addition, simulated glucose dynamics indicated that even individuals with high AC_Var did not necessarily have high maximum and minimum blood glucose levels. This study also indicated that these three components qualitatively corresponded to the four distinct glucose patterns observed after glucose administration, which were identified in a previous study (Hulman et al., 2018). Thus, the inclusion of autocorrelation in addition to mean and variance may improve the characterization of inter-individual differences in glucose regulation and improve the predictive accuracy of various clinical outcomes.

      Despite increasing evidence linking glycemic variability to oxidative stress and endothelial dysfunction in T2DM complications (Ceriello et al., 2008; Monnier et al., 2008), the biological mechanisms underlying the independent predictive value of autocorrelation remain to be elucidated. Our previous work has shown that glucose autocorrelation is influenced by insulin clearance (Sugimoto et al., 2025), a process known to be associated with cardiovascular disease risk (Randrianarisoa et al., 2020). Therefore, the molecular pathways linking glucose autocorrelation to cardiovascular disease may share common mechanisms with those linking insulin clearance to cardiovascular disease. Although previous studies have primarily focused on investigating the molecular mechanisms associated with mean glucose levels and glycemic variability, our findings open new avenues for exploring the molecular basis of glucose autocorrelation, potentially revealing novel therapeutic targets for preventing diabetic complications.

      (2) The relatively small sample size (n=270) limits statistical power, especially when stratified by glucose tolerance levels. 

      We appreciate the reviewer’s concern regarding sample size and its potential impact on statistical power, especially when stratified by glucose tolerance levels. We fully agree that a larger sample size would increase statistical power, especially for subgroup analyses.

      We would like to clarify several points regarding the statistical power and validation of our findings. Our sample size followed established methodological frameworks, including the guidelines outlined by Muyembe Asenahabi, Bostely, and Peters Anselemo Ikoha. “Scientific research sample size determination.” (2023). These guidelines balance the risks of inadequate sample size with the challenges of unnecessarily large samples. For our primary analysis examining the correlation between CGM-derived measures and %NC, power calculations (a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4) indicated that a minimum of 47 participants was required. Our sample size of 53 exceeded this threshold and allowed us to detect statistically significant correlations, as described in the Methods section. Moreover, to provide transparency about the precision of our estimates, we have included confidence intervals for all coefficients. 

      Furthermore, our sample size aligns with previous studies investigating the associations between glucose profiles and clinical parameters, including Torimoto, Keiichi, et al. “Relationship between fluctuations in glucose levels measured by continuous glucose monitoring and vascular endothelial dysfunction in type 2 diabetes mellitus.” Cardiovascular Diabetology 12 (2013): 1-7. (n=57), Hall, Heather, et al. “Glucotypes reveal new patterns of glucose dysregulation.” PLoS biology 16.7 (2018): e2005143. (n=57), and Metwally, Ahmed A., et al. “Prediction of metabolic subphenotypes of type 2 diabetes via continuous glucose monitoring and machine learning.” Nature Biomedical Engineering (2024): 1-18. (n=32).

      Regarding the classification of glucose dynamics components, we have conducted additional validation across diverse populations including 64 Japanese, 53 American, and 100 Chinese individuals. These validation efforts have consistently supported our identification of three independent glucose dynamics components.

      However, we acknowledge the importance of further validation on a larger scale. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      To address the sample size considerations, we have added the following sentences in the Discussion section (lines 409-414): 

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      (3) Strict participant selection criteria may reduce applicability to broader populations. 

      We appreciate the reviewer’s comment regarding the potential impact of strict participant selection criteria on the broader applicability of our findings. We acknowledge that extending validation to more diverse populations would improve the generalizability of our findings.

      Our study includes validation cohorts from diverse populations, including 64 Japanese, 53 American and 100 Chinese individuals. These cohorts include a wide range of metabolic states, from healthy individuals to those with diabetes, ensuring validation across different clinical conditions. However, we acknowledge that further validation in additional populations and clinical settings would strengthen our conclusions. To address this, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      We have added the following text to the Discussion section to address these considerations (lines 409-414, 354-361):

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      Although our LASSO and factor analysis indicated that CGM-derived measures were strong predictors of %NC, this does not mean that other clinical parameters, such as lipids and blood pressure, are irrelevant in T2DM complications. Our study specifically focused on characterizing glucose dynamics, and we analyzed individuals with well-controlled serum cholesterol and blood pressure to reduce confounding effects. While we anticipate that inclusion of a more diverse population would not alter our primary findings regarding glucose dynamics, it is likely that a broader data set would reveal additional predictive contributions from lipid and blood pressure parameters.

      (4) CGM-derived indices like AC_Var and ADRR may be too complex for routine clinical use without simplified models or guidelines. 

      We appreciate the reviewer’s concern about the complexity of CGM-derived indices such as AC_Var and ADRR for routine clinical use. We acknowledge that for these indices to be of practical use, they must be both interpretable and easily accessible to healthcare providers. 

      To address this concern, we have developed an easy-to-use web application that automatically calculates these measures, including AC_Var, mean glucose levels, and glucose variability (https://cgmregressionapp2.streamlit.app/). This tool eliminates the need for manual calculations, making these indices more practical for clinical implementation.

      Regarding interpretability, we acknowledge that establishing specific clinical guidelines would enhance the practical utility of these measures. For example, defining a cut-off value for AC_Var above which the risk of diabetes complications increases significantly would provide clearer clinical guidance. However, given our current sample size limitations and our predefined objective of investigating correlations among indices, we have taken a conservative approach by focusing on the correlation between AC_Var and %NC rather than establishing definitive cutoffs. This approach intentionally avoids problematic statistical practices like p-hacking. It is not realistic to expect a single study to accomplish everything from proposing a new concept to conducting large-scale clinical trials to establishing clinical guidelines. Establishing clinical guidelines typically requires the accumulation of multiple studies over many years. Recognizing this reality, we have been careful in our manuscript to make modest claims about the discovery of new “correlations” rather than exaggerated claims about immediate routine clinical use.

      To address this limitation, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which proposed clinically relevant cutoffs and reference ranges for AC_Var and other CGM-derived indices. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, by integrating automated calculation tools with clear clinical thresholds, we expect to make these measures more accessible for clinical use.

      We have added the following text to the Discussion section to address these considerations (lines 415-419):

      While CGM-derived indices such as AC_Var and ADRR hold promise for CAD risk assessment, their complexity may present challenges for routine clinical implementation. To improve usability, we have developed a web-based calculator that automates these calculations. However, defining clinically relevant thresholds and reference ranges requires further validation in larger cohorts.

      (5) The study does not compare CGM-derived indices to existing advanced CAD risk models, limiting the ability to assess their true predictive superiority. 

      We appreciate the reviewer’s comment regarding the comparison of CGM-derived indices with existing CAD risk models. Given that our study population consisted of individuals with well-controlled total cholesterol and blood pressure levels, a direct comparison with the Framingham Risk Score for Hard Coronary Heart Disease (Wilson, Peter WF, et al. “Prediction of coronary heart disease using risk factor categories.” Circulation 97.18 (1998): 1837-1847.) may introduce inherent bias, as these factors are key components of the score.

      Nevertheless, to further assess the predictive value of the CGM-derived indices, we performed additional analyses using linear regression to predict %NC. Using the Framingham Risk Score, we obtained an R² of 0.04 and an Akaike Information Criterion (AIC) of 330. In contrast, our proposed model incorporating the three glycemic parameters - CGM_Mean, CGM_Std, and AC_Var - achieved a substantially improved R² of 0.36 and a lower AIC of 321, indicating superior predictive accuracy.
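
      For readers who wish to reproduce this type of comparison, a minimal sketch of the model fitting and the R² and AIC calculations is shown below, using statsmodels and placeholder data rather than the study dataset.

```python
import pandas as pd
import statsmodels.api as sm

# One row per participant; the values below are placeholders, not the study data.
df = pd.DataFrame({
    "NC":         [55, 48, 62, 40, 58, 45, 66, 52],
    "CGM_Mean":   [128, 115, 142, 105, 135, 110, 150, 125],
    "CGM_Std":    [22, 18, 30, 15, 27, 17, 33, 21],
    "AC_Var":     [0.004, 0.002, 0.008, 0.001, 0.006, 0.002, 0.009, 0.003],
    "Framingham": [8, 5, 12, 4, 10, 6, 14, 7],
})

def fit_and_report(predictors):
    """Fit an ordinary least squares model and return its R-squared and AIC."""
    X = sm.add_constant(df[predictors])
    model = sm.OLS(df["NC"], X).fit()
    return model.rsquared, model.aic

print("CGM model  R2, AIC:", fit_and_report(["CGM_Mean", "CGM_Std", "AC_Var"]))
print("Framingham R2, AIC:", fit_and_report(["Framingham"]))
```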

      We have added the following text to the Result section (lines 115-122):

      The regression model including CGM_Mean, CGM_Std and AC_Var to predict %NC achieved an R² of 0.36 and an Akaike Information Criterion (AIC) of 321. Each of these indices showed statistically significant independent positive correlations with %NC (Fig. 1A). In contrast, the model using conventional glycemic markers (FBG, HbA1c, and PG120) yielded an R² of only 0.05 and an AIC of 340 (Fig. 1B). Similarly, the model using the Framingham Risk Score for Hard Coronary Heart Disease (Wilson et al., 1998) showed limited predictive value, with an R² of 0.04 and an AIC of 330 (Fig. 1C).

      (6) Varying CGM sampling intervals (5-minute vs. 15-minute) were not thoroughly analyzed for impact on results. 

      We appreciate the reviewer’s comment regarding the potential impact of different CGM sampling intervals on our results. To assess the robustness of our findings across different sampling frequencies, we performed a downsampling analysis by converting our 5-minute interval data to 15-minute intervals. The AC_Var value calculated from 15-minute intervals was significantly correlated with that calculated from 5-minute intervals (R = 0.99, 95% CI: 0.97-1.00). Furthermore, the regression model using CGM_Mean, CGM_Std, and AC_Var from 15-minute intervals to predict %NC achieved an R² of 0.36 and an AIC of 321, identical to the model using 5-minute intervals. These findings indicate that our results are robust to variations in CGM sampling frequency.
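
      The down-sampling step itself is straightforward; a minimal sketch is shown below, assuming that 15-minute sampling is mimicked by keeping every third reading of the 5-minute series.

```python
import numpy as np

def downsample_to_15min(glucose_5min):
    """Keep every third reading of a 5-minute CGM series to mimic 15-minute sampling."""
    return np.asarray(glucose_5min)[::3]

# AC_Var (and the other indices) are then recomputed on the down-sampled trace
# with the same routines used for the 5-minute data, and the per-individual values
# from the two resolutions are compared (e.g., with a Pearson correlation).
```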

      We have added this analysis to the Result section (lines 122-125):

      The AC_Var computed from 15-minute CGM sampling was nearly identical to that computed from 5-minute sampling (R = 0.99, 95% CI: 0.97-1.00) (Fig. S1A), and the regression using the 15‑min features yielded almost the same performance (R² = 0.36; AIC = 321; Fig. S1B).

      Reviewer #3 (Public review):

      We thank the reviewer for the critical review of the manuscript and the valuable comments. We have carefully considered the reviewer’s comments and have revised our manuscript accordingly. The reviewer’s comments in this letter are in Bold and Italics.

      Summary:

      This is a retrospective analysis of 53 individuals over 26 features (12 clinical phenotypes, 12 CGM features, and 2 autocorrelation features) to examine which features were most informative in predicting percent necrotic core (%NC) as a parameter for coronary plaque vulnerability. Multiple regression analysis demonstrated a better ability to predict %NC from 3 selected CGM-derived features than 3 selected clinical phenotypes. LASSO regularization and partial least squares (PLS) with VIP scores were used to identify 4 CGM features that most contribute to the precision of %NC. Using factor analysis they identify 3 components that have CGM-related features: value (relating to the value of blood glucose), variability (relating to glucose variability), and autocorrelation (composed of the two autocorrelation features). These three groupings appeared in the 3 validation cohorts and when performing hierarchical clustering. To demonstrate how these three features change, a simulation was created to allow the user to examine these features under different conditions.

      We appreciate reviewer #3 for the valuable and constructive comments on our manuscript.

      The goal of this study was to identify CGM features that relate to %NC. Through multiple feature selection methods, they arrive at 3 components: value, variability, and autocorrelation. While the feature list is highly correlated, the authors take steps to ensure feature selection is robust. There is a lack of clarity of what each component (value, variability, and autocorrelation) includes as while similar CGM indices fall within each component, there appear to be some indices that appear as relevant to value in one dataset and to variability in the validation. 

      We appreciate the reviewer’s comment regarding the classification of CGM-derived measures into the three components: value, variability, and autocorrelation. As the reviewer correctly points out, some measures may load differently between the value and variability components in different datasets. However, we believe that this variability reflects the inherent mathematical properties of these measures rather than a limitation of our study.

      For example, the HBGI clusters differently across datasets due to its dependence on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S3A). Conversely, in populations with a wider range of mean glucose levels, HBGI correlates more strongly with mean glucose levels (Fig. 3A). This context-dependent behaviour is expected given the mathematical properties of these measures and does not indicate an inconsistency in our classification approach.

      Importantly, our main findings remain robust: CGM-derived measures systematically fall into three components (value, variability, and autocorrelation). Traditional CGM-derived measures primarily reflect either value or variability, and this categorization is consistently observed across datasets. While specific indices such as HBGI may shift classification depending on population characteristics, the overall structure of CGM data remains stable.
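
      For transparency, a minimal sketch of the factor-analysis step is shown below, using scikit-learn's FactorAnalysis with varimax rotation on standardized indices and placeholder data; the number of factors reflects the three-component structure described above, and the 0.30 loading threshold follows the interpretation rule used in our analysis.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

# X: (n_individuals, n_indices) matrix of CGM-derived measures (placeholder data here)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 12))

Z = StandardScaler().fit_transform(X)
fa = FactorAnalysis(n_components=3, rotation="varimax", random_state=0).fit(Z)

loadings = fa.components_.T          # (n_indices, 3) factor loadings
# Indices with |loading| >= 0.30 on a factor are treated as contributing to it.
contributing = np.abs(loadings) >= 0.30
```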

      To address these considerations, we have added the following text to the Discussion section (lines 388-396):

      Some indices, such as HBGI, showed variation in classification across datasets, with some populations showing higher factor loadings in the “mean” component and others in the “variance” component. This variation occurs because HBGI calculations depend on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S5A). Conversely, in populations with a wider range of mean glucose levels, the HBGI correlates more strongly with mean glucose levels (Fig. 3A). Despite these differences, our validation analyses confirm that CGM-derived indices consistently cluster into three components: mean, variance, and autocorrelation.

      We are sceptical about statements of significance without documentation of p-values. 

      We appreciate the reviewer’s concern regarding statistical significance and the documentation of p values.

      First, given the multiple comparisons in our study, we used q values rather than p values, as shown in Figure 1D. Q values provide a more rigorous statistical framework for controlling the false discovery rate in multiple testing scenarios, thereby reducing the likelihood of false positives.
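
      As a concrete illustration, q values of the kind reported in Figure 1D can be obtained from a vector of raw p values with the Benjamini–Hochberg procedure; the sketch below uses placeholder p values and is intended only to illustrate the false discovery rate adjustment, not to restate our exact implementation.

```python
from statsmodels.stats.multitest import multipletests

# Placeholder p values from a set of correlation tests
p_values = [0.001, 0.012, 0.034, 0.21, 0.45]

# Benjamini-Hochberg control of the false discovery rate
reject, q_values, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")
print(q_values)
```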

      Second, our statistical reporting follows established guidelines, including those of the New England Journal of Medicine (Harrington, David, et al. “New guidelines for statistical reporting in the journal.” New England Journal of Medicine 381.3 (2019): 285-286.), which recommend that “reporting of exploratory end points should be limited to point estimates of effects with 95% confidence intervals” and to “replace p values with estimates of effects or association and 95% confidence intervals”. According to these guidelines, p values should not be reported in this type of study. We determined significance based on whether these 95% confidence intervals excluded zero, a method for determining whether an association is significantly different from zero (Tan, Sze Huey, and Say Beng Tan. “The correct interpretation of confidence intervals.” Proceedings of Singapore Healthcare 19.3 (2010): 276-278.).

      For the sake of transparency, we provide p values for readers who may be interested, although we emphasize that they should not be the basis for interpretation, as discussed in the referenced guidelines. Specifically, in Figure 1A-B, the p values for CGM_Mean, CGM_Std, and AC_Var were 0.02, 0.02, and <0.01, respectively, while those for FBG, HbA1c, and PG120 were 0.83, 0.91, and 0.25, respectively. In Figure 3C, the p values for factors 1–5 were 0.03, 0.03, 0.03, 0.24, and 0.87, respectively, and in Figure S8C, the p values for factors 1–3 were <0.01, <0.01, and 0.20, respectively.

      We appreciate the opportunity to clarify our statistical methodology and are happy to provide additional details if needed.

      While hesitations remain, the ability of these authors to find groupings of these many CGM metrics in relation to %NC is of interest. The believability of the associations is impeded by an obtuse presentation of the results with core data (i.e. correlation plots between CGM metrics and %NC) buried in the supplement while main figures contain plots of numerical estimates from models which would be more usefully presented in supplementary tables. 

      We appreciate the reviewer’s comment regarding the presentation of our results and recognize the importance of ensuring clarity and accessibility of the core data. 

      The central finding of our study is twofold: first, that the numerous CGM-derived measures can be systematically classified into three distinct components (mean, variance, and autocorrelation), and second, that each of these components is independently associated with %NC. This insight cannot be derived simply from examining scatter plots of individual correlations, which are provided in the Supplementary Figures. Instead, it emerges from our statistical analyses in the main figures, including multiple regression models that reveal the independent contributions of these components to %NC.

      We acknowledge the reviewer’s concern regarding the accessibility of key data. To improve clarity, we have moved several scatter plots from the Supplementary Figures to the main figures (Fig. 1D-J) to allow readers to more directly visualize the relationships between CGM-derived measures and %NC. We believe this revision improved the transparency and readability of our results while maintaining the rigor of our analytical approach.

      Given the small sample size in the primary analysis, there is a lot of modeling done with parameters estimated where simpler measures would serve and be more convincing as they require less data manipulation. A major example of this is that the pairwise correlation/covariance between CGM_mean, CGM_std, and AC_var is not shown and would be much more compelling in the claim that these are independent factors.

      We appreciate the reviewer’s feedback on our statistical analysis and data presentation. The correlations between CGM_Mean, CGM_Std, and AC_Var were documented in Figure S1B. However, to improve accessibility and clarity, we have moved these correlation analyses to the main figures (Fig. 1F). 

      Regarding our modeling approach, we chose LASSO and PLS methods because they are well-established techniques that are particularly suited to scenarios with many input variables and a relatively small sample size. These methods have been used in the literature as robust approaches for variable selection under such conditions (Tibshirani R. 1996. Regression shrinkage and selection via the lasso. J R Stat Soc 58:267–288. Wold S, Sjöström M, Eriksson L. 2001. PLS-regression: a basic tool of chemometrics. Chemometrics Intellig Lab Syst 58:109–130. Pei X, Qi D, Liu J, Si H, Huang S, Zou S, Lu D, Li Z. 2023. Screening marker genes of type 2 diabetes mellitus in mouse lacrimal gland by LASSO regression. Sci Rep 13:6862. Wang C, Kong H, Guan Y, Yang J, Gu J, Yang S, Xu G. 2005. Plasma phospholipid metabolic profiling and biomarkers of type 2 diabetes mellitus based on high-performance liquid chromatography/electrospray mass spectrometry and multivariate statistical analysis. Anal Chem 77:4108–4116.).
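
      A minimal sketch of this variable-selection workflow is shown below, combining cross-validated LASSO with PLS regression and VIP scores in scikit-learn; the data are placeholders and the code is illustrative rather than a verbatim excerpt of our analysis scripts.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.cross_decomposition import PLSRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(53, 26))          # 26 candidate indices (placeholder data)
y = rng.normal(size=53)                # %NC (placeholder data)

Xs = StandardScaler().fit_transform(X)

# LASSO: cross-validated penalty; non-zero coefficients indicate selected indices
lasso = LassoCV(cv=5, random_state=0).fit(Xs, y)
selected = np.flatnonzero(lasso.coef_)

# PLS with VIP scores: indices with VIP > 1 are considered influential
pls = PLSRegression(n_components=2).fit(Xs, y)
T = pls.transform(Xs)                  # scores, shape (n, A)
W = pls.x_weights_                     # weights, shape (p, A)
q = pls.y_loadings_.ravel()            # y loadings, shape (A,)
ss = (T ** 2).sum(axis=0) * q ** 2     # variance in y explained per component
vip = np.sqrt(W.shape[0] * ((W ** 2) / (W ** 2).sum(axis=0)) @ ss / ss.sum())
```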

      Lack of methodological detail is another challenge. For example, the time period of CGM metrics or CGM placement in the primary study in relation to the IVUS-derived measurements of coronary plaques is unclear. Are they temporally distant or proximal/ concurrent with the PCI? 

      We appreciate the reviewer’s important question regarding the temporal relationship between CGM measurements and IVUS-derived plaque assessments. As described in our previous work (Otowa‐Suematsu, Natsu, et al. “Comparison of the relationship between multiple parameters of glycemic variability and coronary plaque vulnerability assessed by virtual histology–intravascular ultrasound.” Journal of Diabetes Investigation 9.3 (2018): 610-615.), all individuals underwent continuous glucose monitoring for at least three consecutive days within the seven-day period prior to the PCI procedure. To improve clarity for readers, we have added the following text to the Methods section (lines 440-441):

      All individuals underwent CGM for at least three consecutive days within the seven-day period prior to the PCI procedure.

      A patient undergoing PCI for coronary intervention would be expected to have physiological and iatrogenic glycemic disturbances that do not reflect their baseline state. This is not considered or discussed. 

      We appreciate the reviewer’s concern regarding potential glycemic disturbances associated with PCI. As described in our previous work (Otowa‐Suematsu, Natsu, et al. “Comparison of the relationship between multiple parameters of glycemic variability and coronary plaque vulnerability assessed by virtual histology–intravascular ultrasound.” Journal of Diabetes Investigation 9.3 (2018): 610-615.), all CGM measurements were performed before the PCI procedure. This temporal separation ensures that the glycemic patterns analyzed in our study reflect the baseline metabolic state of the patients, rather than any physiological or iatrogenic effects of PCI. To avoid any misunderstanding, we have clarified this temporal relationship in the revised manuscript (lines 440-441):

      All individuals underwent CGM for at least three consecutive days within the seven-day period prior to the PCI procedure.

      The attempts at validation in external cohorts, Japanese, American, and Chinese are very poorly detailed. We could only find even an attempt to examine cardiovascular parameters in the Chinese data set but the outcome variables are unspecified with regard to what macrovascular events are included, their temporal relation to the CGM metrics, etc. Notably macrovascular event diagnoses are very different from the coronary plaque necrosis quantification. This could be a source of strength in the findings if carefully investigated and detailed but due to the lack of detail seems like an apples-to-oranges comparison. 

      We appreciate the reviewer’s comment regarding the validation cohorts and the need for greater clarity, particularly in the Chinese dataset. We acknowledge that our initial description lacked sufficient methodological detail, and we have expanded the Methods section to provide a more comprehensive explanation.

      For the Chinese dataset, the data collection protocol was previously documented (Zhao, Qinpei, et al. “Chinese diabetes datasets for data-driven machine learning.” Scientific Data 10.1 (2023): 35.). Briefly, trained research staff used standardized questionnaires to collect demographic and clinical information, including diabetes diagnosis, treatment history, comorbidities, and medication use. Physical examinations included anthropometric measurements, and body mass index was calculated using standard protocols. CGM was performed using the FreeStyle Libre H device (Abbott Diabetes Care, UK), which records interstitial glucose levels at 15-minute intervals for up to 14 days. Laboratory measurements, including metabolic panels, lipid profiles, and renal function tests, were obtained within six months of CGM placement. While previous studies have linked necrotic core to macrovascular events (Xie, Yong, et al. “Clinical outcome of nonculprit plaque ruptures in patients with acute coronary syndrome in the PROSPECT study.” JACC: Cardiovascular Imaging 7.4 (2014): 397-405.), we acknowledge the limitations of the cardiovascular outcomes in the Chinese data set. These outcomes were extracted from medical records rather than standardized diagnostic procedures or imaging studies. To address these concerns, we have added the following text to the Methods section (lines 496-504):

      The data collection protocol for the Chinese dataset was previously documented (Zhao et al., 2023). Briefly, trained research staff used standardized questionnaires to collect demographic and clinical information, including diabetes diagnosis, treatment history, comorbidities, and medication use. CGM records interstitial glucose levels at 15-minute intervals for up to 14 days. Laboratory measurements, including metabolic panels, lipid profiles, and renal function tests, were obtained within six months of CGM placement. While previous studies have linked necrotic core to macrovascular events, we acknowledge the limitations of the cardiovascular outcomes in the Chinese data set. These outcomes were extracted from medical records rather than from standardized diagnostic procedures or imaging studies.

      Finally, the simulations at the end are not relevant to the main claims of the paper and we would recommend removing them for the coherence of this manuscript. 

      We appreciate the reviewer’s feedback regarding the relevance of the simulation component of our manuscript. The primary contribution of our study goes beyond demonstrating correlations between CGM-derived measures and %NC; it highlights three fundamental components of glycemic patterns (mean, variability, and autocorrelation) and their independent relationships with coronary plaque characteristics. The simulations are included to illustrate how glycemic patterns with identical means and variability can have different autocorrelation structures. Because temporal autocorrelation can be conceptually difficult to interpret, these visualizations were intended to provide intuitive examples for readers.

      However, we agree with the reviewer’s concern about the coherence of the manuscript. In response, we have streamlined the simulation section by removing simulations that do not directly support our primary conclusions (old version of the manuscript, lines 239-246, 502-526), while retaining only those that enhance understanding of the three glycemic components. Regarding reviewer 2’s minor comment #4, we acknowledge that autocorrelation can be challenging to understand intuitively. To address this, we kept Fig. 4A with a brief description.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      Summary:

      The study by Sugimoto et al. investigates the association between components of glucose dynamics (value, variability, and autocorrelation) and coronary plaque vulnerability (%NC) in patients with varying glucose tolerance levels. The research identifies three key factors that independently predict %NC and highlights the potential of continuous glucose monitoring (CGM)-derived indices in risk assessment for coronary artery disease (CAD). Using robust statistical methods and validation across diverse populations, the study emphasizes the limitations of conventional diagnostic markers and suggests a novel, CGM-based approach for improved predictive performance. While the study demonstrates significant novelty and potential impact, several issues must be addressed by the authors.

      Major Comments:

      (1) The study demonstrates originality by introducing autocorrelation as a novel predictive factor in glucose dynamics, a perspective rarely explored in prior research. While the innovation is commendable, the biological mechanisms linking autocorrelation to plaque vulnerability remain speculative. Providing a hypothesis or potential pathways would enhance the scientific impact and practical relevance of this finding.

      We appreciate the reviewer’s point about the need for a clearer biological explanation linking glucose autocorrelation to plaque vulnerability. Our previous research has shown that glucose autocorrelation reflects changes in insulin clearance (Sugimoto, Hikaru, et al. “Improved detection of decreased glucose handling capacities via continuous glucose monitoring-derived indices.” Communications Medicine 5.1 (2025): 103.). The relationship between insulin clearance and cardiovascular disease has been well documented (Randrianarisoa, Elko, et al. “Reduced insulin clearance is linked to subclinical atherosclerosis in individuals at risk for type 2 diabetes mellitus.” Scientific reports 10.1 (2020): 22453.), and the mechanisms described in this prior work may potentially explain the association between glucose autocorrelation and clinical outcomes observed in the present study. We have added the following sentences to the Discussion section (lines 341-352):

      Despite increasing evidence linking glycemic variability to oxidative stress and endothelial dysfunction in T2DM complications (Ceriello et al., 2008; Monnier et al., 2008), the biological mechanisms underlying the independent predictive value of autocorrelation remain to be elucidated. Our previous work has shown that glucose autocorrelation is influenced by insulin clearance (Sugimoto et al., 2025), a process known to be associated with cardiovascular disease risk (Randrianarisoa et al., 2020). Therefore, the molecular pathways linking glucose autocorrelation to cardiovascular disease may share common mechanisms with those linking insulin clearance to cardiovascular disease. Although previous studies have primarily focused on investigating the molecular mechanisms associated with mean glucose levels and glycemic variability, our findings open new avenues for exploring the molecular basis of glucose autocorrelation, potentially revealing novel therapeutic targets for preventing diabetic complications.

      (2) The inclusion of datasets from Japan, America, and China adds a valuable cross-cultural dimension to the study, showcasing its potential applicability across diverse populations. Despite the multi-regional validation, the sample size (n=270) is relatively small, especially when stratified by glucose tolerance categories. This limits the statistical power and applicability to diverse populations. A larger, multi-center cohort would strengthen conclusions.

      We appreciate the reviewer’s concern regarding sample size and its potential impact on statistical power, especially when stratified by glucose tolerance levels. We fully agree that a larger sample size would increase statistical power, especially for subgroup analyses.

      We would like to clarify several points regarding the statistical power and validation of our findings. Our study adheres to established methodological frameworks for sample size determination, including the guidelines outlined by Muyembe Asenahabi, Bostely, and Peters Anselemo Ikoha (“Scientific research sample size determination,” 2023). These guidelines balance the risks of inadequate sample size with the challenges of unnecessarily large samples. For our primary analysis examining the correlation between CGM-derived measures and %NC, power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4 indicated that a minimum of 47 participants was required. Our sample size of 53 exceeded this threshold and allowed us to detect statistically significant correlations, as described in the Methods section.
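
      The calculation uses the standard Fisher z approximation for testing a correlation coefficient; the short sketch below reproduces the threshold of 47 participants.

```python
from math import atanh, ceil
from scipy.stats import norm

def n_for_correlation(r, alpha=0.05, power=0.8):
    """Required sample size to detect a correlation r (two-sided test),
    using the Fisher z approximation:
    n = ((z_{1-alpha/2} + z_{power}) / atanh(r))**2 + 3."""
    z_alpha = norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    return ceil(((z_alpha + z_beta) / atanh(r)) ** 2 + 3)

print(n_for_correlation(0.4))  # -> 47
```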

      Furthermore, our sample size aligns with previous studies investigating the associations between glucose profiles and clinical parameters, including Torimoto, Keiichi, et al. “Relationship between fluctuations in glucose levels measured by continuous glucose monitoring and vascular endothelial dysfunction in type 2 diabetes mellitus.” Cardiovascular Diabetology 12 (2013): 1-7. (n=57), Hall, Heather, et al. “Glucotypes reveal new patterns of glucose dysregulation.” PLoS biology 16.7 (2018): e2005143. (n=57), and Metwally, Ahmed A., et al. “Prediction of metabolic subphenotypes of type 2 diabetes via continuous glucose monitoring and machine learning.” Nature Biomedical Engineering (2024): 1-18. (n=32). Moreover, to provide transparency about the precision of our estimates, we have included confidence intervals for all coefficients.

      Regarding the classification of glucose dynamics components, we have conducted additional validation across diverse populations including 64 Japanese, 53 American, and 100 Chinese individuals. These validation efforts have consistently supported our identification of three independent glucose dynamics components. Furthermore, the primary objective of our study was not to assess rare events, but rather to demonstrate that glucose dynamics can be decomposed into three main factors - mean, variance and autocorrelation - whereas traditional measures have primarily captured mean and variance without adequately reflecting autocorrelation. We believe that our current sample size effectively addresses this objective. 

      However, we acknowledge the importance of further validation on a larger scale. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      To address the sample size considerations, we have added the following sentences to the Discussion section (lines 409-414):

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      (3) The study focuses on a well-characterized cohort with controlled cholesterol and blood pressure levels, reducing confounding variables. However, this stringent selection might exclude individuals with significant variability in these parameters, potentially limiting the study's applicability to broader, real-world populations. The authors should discuss how this may affect generalizability and potential bias in the results.

      We appreciate the reviewer’s comment regarding the potential impact of strict participant selection criteria on the broader applicability of our findings. We acknowledge that extending validation to more diverse populations would improve the generalizability of our findings.

      Our validation strategy included multiple cohorts from different regions, specifically 64 Japanese, 53 American and 100 Chinese individuals. These cohorts represent a clinically diverse population, including both healthy individuals and those with diabetes, allowing for validation across a broad spectrum of metabolic conditions. However, we recognize that further validation in additional populations and clinical settings would strengthen our conclusions. To address this, we conducted a large follow-up study of over 8,000 individuals with two years of follow-up (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which confirmed our main finding that glucose dynamics consist of mean, variance, and autocorrelation. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, it provides further support for the clinical relevance and generalizability of our findings.

      We have added the following text to the Discussion section to address these considerations (lines 409-414, 354-361):

      Although our analysis included four datasets with a total of 270 individuals, and our sample size of 53 met the required threshold based on power calculations with a type I error of 0.05, a power of 0.8, and an expected correlation coefficient of 0.4, we acknowledge that the sample size may still be considered relatively small for a comprehensive assessment of these relationships. To further validate these findings, larger prospective studies with diverse populations are needed.

      Although our LASSO and factor analysis indicated that CGM-derived measures were strong predictors of %NC, this does not mean that other clinical parameters, such as lipids and blood pressure, are irrelevant in T2DM complications. Our study specifically focused on characterizing glucose dynamics, and we analyzed individuals with well-controlled serum cholesterol and blood pressure to reduce confounding effects. While we anticipate that inclusion of a more diverse population would not alter our primary findings regarding glucose dynamics, it is likely that a broader data set would reveal additional predictive contributions from lipid and blood pressure parameters.

      (4) The study effectively highlights the potential of CGM-derived indices as a tool for CAD risk assessment, a concept that aligns with contemporary advancements in personalized medicine. Despite its potential, the complexity of CGM-derived indices like AC_Var and ADRR may hinder their routine clinical adoption. Providing simplified models or actionable guidelines would facilitate their integration into everyday practice.

      We appreciate the reviewer’s concern about the complexity of CGM-derived indices such as AC_Var and ADRR for routine clinical use. We recognize that for these indices to be of practical use, they must be both interpretable and easily accessible to healthcare providers.

      To address this, we have developed an easy-to-use web application that automatically calculates these measures, including AC_Var, mean glucose levels, and glucose variability. By eliminating the need for manual calculations, this tool streamlines the process and makes these indices more practical for clinical use.

      Regarding interpretability, we acknowledge that establishing specific clinical guidelines would enhance the practical utility of these measures. For example, defining a cut-off value for AC_Var above which the risk of diabetes complications increases significantly would provide clearer clinical guidance. However, given our current sample size limitations and our predefined objective of investigating correlations among indices, we have taken a conservative approach by focusing on the correlation between AC_Var and %NC rather than establishing definitive cutoffs. This approach intentionally avoids problematic statistical practices like p-hacking. It is not realistic to expect a single study to accomplish everything from proposing a new concept to conducting large-scale clinical trials to establishing clinical guidelines. Establishing clinical guidelines typically requires the accumulation of multiple studies over many years. Recognizing this reality, we have been careful in our manuscript to make modest claims about the discovery of new “correlations” rather than exaggerated claims about immediate routine clinical use.

      To address this limitation, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which proposed clinically relevant cutoffs and reference ranges for AC_Var and other CGM-derived indices. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper; however, by integrating automated calculation tools with clear clinical thresholds, we expect to make these measures more accessible for clinical use.

      We have added the following text to the Discussion section to address these considerations (lines 415-419):

      While CGM-derived indices such as AC_Var and ADRR hold promise for CAD risk assessment, their complexity may present challenges for routine clinical implementation. To improve usability, we have developed a web-based calculator that automates these calculations. However, defining clinically relevant thresholds and reference ranges requires further validation in larger cohorts.

      (5) The exclusion of TIR from the main analysis is noted, but its relevance in diabetes management warrants further exploration. Integrating TIR as an outcome measure could provide additional clinical insights.

      We appreciate the reviewer’s comment regarding the potential role of time in range (TIR) as an outcome measure in our study. Because TIR is primarily influenced by the mean and variance of glucose levels, it does not fully capture the distinct role of glucose autocorrelation, which was the focus of our investigation.

      To clarify this point, we have expanded the Discussion section as follows (lines 380-388):

      Although time in range (TIR) was not included in the main analyses due to the relatively small number of T2DM patients and the predominance of participants with TIR >70%, our results demonstrate that CGM-derived indices outperformed conventional markers such as FBG, HbA1c, and PG120 in predicting %NC. Furthermore, multiple regression analysis between factor scores and TIR revealed that only factor 1 (mean) and factor 2 (variance) were significantly associated with TIR (Fig. S8C, D). This finding confirms the presence of three distinct components in glucose dynamics and highlights the added value of examining AC_Var as an independent glycemic feature beyond conventional CGM-derived measures.

      (6) While the study reflects a commitment to understanding CAD risks in a global context by including datasets from Japan, America, and China, the authors should provide demographic details (e.g., age, gender, socioeconomic status) and discuss how these factors might influence glucose dynamics and coronary plaque vulnerability.

      We appreciate the reviewer’s comment regarding the potential influence of demographic factors on glucose dynamics and coronary plaque vulnerability. We examined these relationships and found that age and sex had minimal effects on glucose dynamics characteristics, as shown in Figure S8A and S8B. These findings suggest that our primary conclusions regarding glucose dynamics and coronary risk remain robust across demographic groups within our data set.

      To address the reviewer’s suggestion, we have added the following discussion (lines 361-368):

      In our analysis of demographic factors, we found that age and gender had minimal influence on glucose dynamics characteristics (Fig. S8A, B), suggesting that our findings regarding the relationship between glucose dynamics and coronary risk are robust across different demographic groups within our dataset. Future studies involving larger and more diverse populations would be valuable to comprehensively elucidate the potential influence of age, gender, and other demographic factors on glucose dynamics characteristics and their relationship to cardiovascular risk.

      (7) While the article shows CGM-derived indices outperform traditional markers (e.g., HbA1c, FBG, PG120), it does not compare these indices against existing advanced risk models (e.g., Framingham Risk Score for CAD). A direct comparison would strengthen the claim of superiority.

      We appreciate the reviewer’s comment regarding the comparison of CGM-derived indices with existing CAD risk models. Given that our study population consisted of individuals with well-controlled total cholesterol and blood pressure levels, a direct comparison with the Framingham Risk Score for Hard Coronary Heart Disease (Wilson, Peter WF, et al. “Prediction of coronary heart disease using risk factor categories.” Circulation 97.18 (1998): 1837-1847.) may introduce inherent bias, as these factors are key components of the score.

      Nevertheless, to further assess the predictive value of the CGM-derived indices, we performed additional analyses using linear regression to predict %NC. Using the Framingham Risk Score, we obtained an R² of 0.04 and an Akaike Information Criterion (AIC) of 330. In contrast, our proposed model incorporating the three glycemic parameters - CGM_Mean, CGM_Std, and AC_Var - achieved a substantially improved R² of 0.36 and a lower AIC of 321, indicating superior predictive accuracy. We have updated the Result section as follows (lines 115-122):

      The regression model including CGM_Mean, CGM_Std and AC_Var to predict %NC achieved an R² of 0.36 and an Akaike Information Criterion (AIC) of 321. Each of these indices showed statistically significant independent positive correlations with %NC (Fig. 1A). In contrast, the model using conventional glycemic markers (FBG, HbA1c, and PG120) yielded an R² of only 0.05 and an AIC of 340 (Fig. 1B). Similarly, the model using the Framingham Risk Score for Hard Coronary Heart Disease (Wilson et al., 1998) showed limited predictive value, with an R² of 0.04 and an AIC of 330 (Fig. 1C).

      (8) The study mentions varying CGM sampling intervals across datasets (5-minute vs. 15-minute). Authors should employ sensitivity analysis to assess the impact of these differences on the results. This would help clarify whether higher-resolution data significantly improves predictive performance.

      We appreciate the reviewer’s comment regarding the potential impact of different CGM sampling intervals on our results. To assess the robustness of our findings across different sampling frequencies, we performed a downsampling analysis by converting our 5-minute interval data to 15-minute intervals. The AC_Var value calculated from 15-minute intervals was significantly correlated with that calculated from 5-minute intervals (R = 0.99, 95% CI: 0.97-1.00). Consequently, the main findings remained consistent across both sampling frequencies, indicating that our results are robust to variations in temporal resolution. We have added this analysis to the Result section (lines 122-126):

      The AC_Var computed from 15-minute CGM sampling was nearly identical to that computed from 5-minute sampling (R = 0.99, 95% CI: 0.97-1.00) (Fig. S1A), and the regression using the 15‑min features yielded almost the same performance (R² = 0.36; AIC = 321; Fig. S1B).

      (9) The identification of actionable components in glucose dynamics lays the groundwork for clinical stratification. The authors could explore the use of CGM-derived indices to develop a simple framework for stratifying risk into certain categories (e.g., low, moderate, high). This could improve clinical relevance and utility for healthcare providers.

      We appreciate the reviewer’s suggestion regarding the potential for CGM-derived indices to support clinical stratification. We completely agree with the idea that establishing risk categories (e.g., low, moderate, high) based on specific thresholds would enhance the clinical utility of these measures. However, given our current sample size limitations and our predefined objective of investigating correlations among indices, we have taken a conservative approach by focusing on the correlation between AC_Var and %NC rather than establishing definitive cutoffs. This approach intentionally avoids problematic statistical practices like p-hacking. It is not realistic to expect a single study to accomplish everything from proposing a new concept to conducting large-scale clinical trials to establishing clinical thresholds. Establishing clinical thresholds typically requires the accumulation of multiple studies over many years. Recognizing this reality, we have been careful in our manuscript to make modest claims about the discovery of new “correlations” rather than exaggerated claims about immediate routine clinical use.

      To address this limitation, we conducted a large follow-up study of over 8,000 individuals (Sugimoto, Hikaru, et al. “Stratification of individuals without prior diagnosis of diabetes using continuous glucose monitoring” medRxiv (2025)), which proposed clinically relevant cutoffs and reference ranges for AC_Var and other CGM-derived indices. As this large study was beyond the scope of the present manuscript due to differences in primary objectives and analytical approaches, it was not included in this paper. However, we expect to make these measures more actionable in clinical use by integrating automated calculation tools with clear clinical thresholds.

      We have added the following text to the Discussion section to address these considerations (lines 415-419):

      While CGM-derived indices such as AC_Var and ADRR hold promise for CAD risk assessment, their complexity may present challenges for routine clinical implementation. To improve usability, we have developed a web-based calculator that automates these calculations. However, defining clinically relevant thresholds and reference ranges requires further validation in larger cohorts.

      (10) While the study acknowledges several limitations, authors should also consider explicitly addressing the potential impact of inter-individual variability in glucose metabolism (e.g., age-related changes, hormonal influences) on the findings.

      We appreciate the reviewer’s comment regarding the potential impact of interindividual variability in glucose metabolism, including age-related changes and hormonal influences, on our results. In our analysis, we found that age had minimal effects on glucose dynamics characteristics, as shown in Figure S8A. In addition, CGM-derived measures such as ADRR and AC_Var significantly contributed to the prediction of %NC independent of insulin secretion (I.I.) and insulin sensitivity (Composite index) (Fig. 2). These results suggest that our primary conclusions regarding glucose dynamics and coronary risk remain robust despite individual differences in glucose metabolism.

      To address the reviewer’s suggestion, we have added the following discussion (lines 186-188, 361-368):

      Conventional indices, including FBG, HbA1c, PG120, I.I., Composite index, and Oral DI, did not contribute significantly to the prediction compared to these CGM-derived indices.

      In our analysis of demographic factors, we found that age and gender had minimal influence on glucose dynamics characteristics (Fig. S8A, B), suggesting that our findings regarding the relationship between glucose dynamics and coronary risk are robust across different demographic groups within our dataset. Future studies involving larger and more diverse populations would be valuable to comprehensively elucidate the potential influence of age, gender, and other demographic factors on glucose dynamics characteristics and their relationship to cardiovascular risk.

      (11) It's unclear whether the identified components (value, variability, and autocorrelation) could serve as proxies for underlying physiological mechanisms, such as beta-cell dysfunction or insulin resistance. Please clarify.

      We appreciate the reviewer’s comment regarding the physiological underpinnings of the glucose components we identified. The mean, variance, and autocorrelation components we identified likely reflect specific underlying physiological mechanisms related to glucose regulation. In our previous research (Sugimoto, Hikaru, et al. “Improved detection of decreased glucose handling capacities via continuous glucose monitoring-derived indices.” Communications Medicine 5.1 (2025): 103.), we explored the relationship between glucose dynamics characteristics and glucose control capabilities using clamp tests and mathematical modelling. These investigations revealed that autocorrelation specifically shows a significant correlation with the disposition index (the product of insulin sensitivity and insulin secretion) and insulin clearance parameters.

      Furthermore, our current study demonstrates that CGM-derived measures such as ADRR and AC_Var significantly contributed to the prediction of %NC independent of established metabolic parameters including insulin secretion (I.I.) and insulin sensitivity (Composite index), as shown in Figure 2. These results suggest that the components we identified capture distinct physiological aspects of glucose metabolism beyond traditional measures of beta-cell function and insulin sensitivity. Further research is needed to fully characterize these relationships, but our results imply that these characteristics of glucose dynamics offer supplementary insight into the underlying beta-cell dysregulation that contributes to coronary plaque vulnerability.

      To address the reviewer’s suggestion, we have added the following discussion to the Result section (lines 186-188):

      Conventional indices, including FBG, HbA1c, PG120, I.I., Composite index, and Oral DI, did not contribute significantly to the prediction compared to these CGM-derived indices.

      Minor Comments:

      (1) The use of LASSO and PLS regression is appropriate, but the rationale for choosing these methods over others (e.g., Ridge regression) should be explained in greater detail.

      We appreciate the reviewer’s comment and have added the following discussion to the Methods section (lines 578-585):

      LASSO regression was chosen for its ability to perform feature selection by identifying the most relevant predictors. Unlike Ridge regression, which simply shrinks coefficients toward zero without reaching exactly zero, LASSO produces sparse models, which is consistent with our goal of identifying the most critical features of glucose dynamics associated with coronary plaque vulnerability. In addition, we implemented PLS regression as a complementary approach due to its effectiveness in dealing with multicollinearity, which was particularly relevant given the high correlation among several CGM-derived measures.

      (2) While figures are well-designed, adding annotations to highlight key findings (e.g., significant contributors in factor analysis) would improve clarity.

      We appreciate the reviewer’s suggestion to improve the clarity of our figures. In the factor analysis, we decided not to include annotations because indicators such as ADRR and J-index can be associated with multiple factors, which could lead to misleading or confusing interpretations. However, in response to the suggestion, we have added annotations to the PLS analysis, specifically highlighting items with VIP values greater than 1 (Fig. 2D, S2D) to emphasize key contributors.

      (3) The term "value" as a component of glucose dynamics could be clarified. For instance, does it strictly refer to mean glucose levels, or does it encompass other measures?

      We appreciate the reviewer’s question regarding the term “value” in the context of glucose dynamics. Factor 1 was predominantly influenced by CGM_Mean, with a factor loading of 0.99, indicating that it primarily represents mean glucose levels. Given this strong correlation, we have renamed Factor 1 to “Mean” (Fig. 3A) to more accurately reflect its role in glucose dynamics.

      (4) The concept of autocorrelation may be unfamiliar to some readers. A brief, intuitive explanation with a concrete example of how it manifests in glucose dynamics would enhance understanding.

      We appreciate the reviewer’s suggestion. Autocorrelation refers to the relationship between a variable and its past values over time. In the context of glucose dynamics, it reflects how current glucose levels are influenced by past levels, capturing patterns such as sustained hyperglycemia or recurrent fluctuations. For example, if an individual experiences sustained high glucose levels after a meal, the strong correlation between successive glucose readings indicates high autocorrelation. We have included this explanation in the revised manuscript (lines 519-524) to improve clarity for readers unfamiliar with the concept. Additionally, Figure 4A shows an example of glucose dynamics with different autocorrelation.
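
      A small numerical illustration of this point is given below: two hypothetical glucose series share the same mean and standard deviation but differ markedly in lag-1 autocorrelation.

```python
import numpy as np

def lag1_autocorr(x):
    """Lag-1 autocorrelation coefficient of a series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(x[1:] * x[:-1]) / np.sum(x ** 2)

# Slowly drifting glucose: consecutive readings stay close to each other
smooth = np.array([100, 110, 120, 130, 140, 150, 140, 130, 120, 110])
# Rapidly fluctuating glucose: the same values, reordered so neighbours differ sharply
spiky = np.array([100, 140, 110, 150, 120, 130, 140, 110, 130, 120])

print(smooth.mean(), smooth.std(), lag1_autocorr(smooth))  # high autocorrelation
print(spiky.mean(), spiky.std(), lag1_autocorr(spiky))     # same mean/SD, low autocorrelation
```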

      (5) Ensure consistent use of terms like "glucose dynamics," "CGM-derived indices," and "plaque vulnerability." For instance, sometimes indices are referred to as "components," which might confuse readers unfamiliar with the field.

      We appreciate the reviewer’s comment about ensuring consistency in terminology. To avoid confusion, we have reviewed and standardized the use of terms such as “CGM-derived indices,” and “plaque vulnerability” throughout the manuscript. Additionally, while many of our measures are strictly CGM-derived indices, several “components” in our analysis include fasting blood glucose (FBG) and glucose waveforms during the OGTT. For these measures, we retained the descriptors “glucose dynamics” and “components” rather than relabelling them as CGM-derived indices.

      (6) Provide a more detailed overview of the supplementary materials in the main text, highlighting their relevance to the key findings.

      We appreciate the reviewer’s suggestion. We revised the manuscript by integrating the supplementary text into the main text (lines 129-160), which provides a clearer overview of the supplementary materials. Consequently, the Supplementary Information section now only contains supplementary figures, while their relevance and key details are described in the main text. 

      Reviewer #3 (Recommendations for the authors):

      Other Concerns:

      (1) The text states the significance of tests, however, no p-values are listed: Lines 118-119: Significance is cited between CGM indices and %NC, however, neither the text nor supplementary text have p-values. Need p-values for Figure 3C, Figure S10. When running the https://cgm-basedregression.streamlit.app/ multiple regression analysis, a p-value should be given as well. Do the VIP scores (Line 142) change with the inclusion of SBP, DBP, TG, LDL, and HDL? Do the other datasets have the same well-controlled serum cholesterol and BP levels?

      We appreciate the reviewer’s concern regarding statistical significance and the documentation of p values.

      First, given the multiple comparisons in our study, we used q values rather than p values, as shown in Figure 1D. Q values provide a more rigorous statistical framework for controlling the false discovery rate in multiple testing scenarios, thereby reducing the likelihood of false positives.

      Second, our statistical reporting follows established guidelines, including those of the New England Journal of Medicine (Harrington, David, et al. “New guidelines for statistical reporting in the journal.” New England Journal of Medicine 381.3 (2019): 285-286.), which recommend that “reporting of exploratory end points should be limited to point estimates of effects with 95% confidence intervals” and to “replace p values with estimates of effects or association and 95% confidence intervals”. According to these guidelines, p values should not be reported in this type of study. We determined significance based on whether these 95% confidence intervals excluded zero, a statistical method for determining whether an association is significantly different from zero (Tan, Sze Huey, and Say Beng Tan. “The correct interpretation of confidence intervals.” Proceedings of Singapore Healthcare 19.3 (2010): 276-278.).

      For the sake of transparency, we provide p values for readers who may be interested, although we emphasize that they should not be the basis for interpretation, as discussed in the referenced guidelines. Specifically, in Figure 1A-B, the p values for CGM_Mean, CGM_Std, and AC_Var were 0.02, 0.02, and <0.01, respectively, while those for FBG, HbA1c, and PG120 were 0.83, 0.91, and 0.25, respectively. In Figure 3C, the p values for factors 1–5 were 0.03, 0.03, 0.03, 0.24, and 0.87, respectively, and in Figure S8C, the p values for factors 1–3 were <0.01, <0.01, and 0.20, respectively. We appreciate the opportunity to clarify our statistical methodology and are happy to provide additional details if needed.

      We confirmed that the results of the variable importance in projection (VIP) analysis remained stable after including additional covariates, such as systolic blood pressure (SBP), diastolic blood pressure (DBP), triglycerides (TG), low-density lipoprotein cholesterol (LDL-C), and high-density lipoprotein cholesterol (HDL-C). The VIP values for ADRR, MAGE, AC_Var, and LI consistently exceeded one even after these adjustments, suggesting that the primary findings are robust in the presence of these clinical variables. We have added the following sentences in the Results and Methods section (lines 188-191, 491-494):

      Even when SBP, DBP, TG, LDL-C, and HDL-C were included as additional input variables, the results remained consistent, and the VIP scores for ADRR, AC_Var, MAGE, and LI remained greater than 1 (Fig. S2D).

      Of note, as the original reports document, the validation datasets did not specify explicit cutoffs for blood pressure or cholesterol. Consequently, they included participants with suboptimal control of these parameters.

      (2) Negative factor loadings have not been addressed and consistency in components: Figure 3, Figure S7. All the main features for value in Figure 3A are positive. However, MVALUE in S7B is very negative for value whereas the other features highlighted for value are positive. What is driving this difference? Please explain if the direction is important. Line 480 states that variables with factor loadings >= 0.30 were used for interpretation, but it appears in the text (Line 156, Figure 3) that oral DI was used for value, even though it had a -0.61 loading. Figure 3, Figure S7. HBGI falls within two separate components (value and variability). There is not a consistent component grouping. Removal of MAG (Line 185) and only MAG does not seem scientific. Did the removal of other features also result in similar or different Cronbach's ⍺? It is unclear what Figure S8B is plotting. What does each point mean?

      We appreciate the reviewer’s comment regarding the classification of CGM-derived measures into the three components: value, variability, and autocorrelation. As the reviewer correctly points out, some measures may load differently between the value and variability components in different datasets. However, we believe that this variability reflects the inherent mathematical properties of these measures rather than a limitation of our study.

      For example, the HBGI clusters differently across datasets due to its dependence on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S3A). Conversely, in populations with a wider range of mean glucose levels, HBGI correlates more strongly with mean glucose levels (Fig. 3A). This context-dependent behaviour is expected given the mathematical properties of these measures and does not indicate an inconsistency in our classification approach.
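      To make this threshold behaviour concrete, the sketch below applies the commonly published HBGI risk transformation (after Kovatchev and colleagues) to two synthetic glucose traces; the coefficients and traces are illustrative only and do not come from the study’s implementation.

      ```python
      import numpy as np

      def hbgi(glucose_mgdl):
          """High blood glucose index: only readings whose risk value is positive
          (glucose above roughly 112.5 mg/dL) contribute to the average."""
          bg = np.clip(np.asarray(glucose_mgdl, dtype=float), 1.0, None)  # guard for synthetic data
          f = 1.509 * (np.log(bg) ** 1.084 - 5.381)   # symmetrizing risk transform
          return float(np.mean(10.0 * np.maximum(f, 0.0) ** 2))

      rng = np.random.default_rng(2)
      # Mean below the threshold: HBGI is driven mainly by upward excursions (variability).
      low_mean_trace = rng.normal(100, 25, size=288)
      # Mean above the threshold: most readings contribute, so HBGI tracks the mean.
      high_mean_trace = rng.normal(180, 25, size=288)
      print(hbgi(low_mean_trace), hbgi(high_mean_trace))
      ```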

      Importantly, our main findings remain robust: CGM-derived measures systematically fall into three components: value, variability, and autocorrelation. Traditional CGM-derived measures primarily reflect either value or variability, and this categorization is consistently observed across datasets. While specific indices such as HBGI may shift classification depending on population characteristics, the overall structure of CGM data remains stable.

      With respect to negative factor loadings, we agree that they may appear confusing at first. However, in the context of exploratory factor analysis, the magnitude, or absolute value, of the loading is most critical for interpretation, rather than its sign. Following established practice, we considered variables with absolute loadings of at least 0.30 to be meaningful contributors to a given component. Accordingly, although the oral DI had a negative loading of –0.61, its absolute magnitude exceeded the threshold of 0.30, so it was considered in our interpretation of the “value” component. Regarding the reviewer’s observation that MVALUE in Figure S7B shows a strongly negative loading while other indices in the same component show positive loadings, we believe this reflects the relative orientation of the factor solution rather than a substantive difference in interpretation. In factor analysis, the direction of factor loadings is arbitrary: multiplying all the loadings for a given factor by –1 would not change the factor’s statistical identity. Therefore, what matters is not whether a variable loads positively or negatively, but the strength of its association with the latent component (i.e., the absolute value of the loading).
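      This sign indeterminacy can be checked directly: in the common factor model the implied correlation structure depends on the loadings only through the product of the loading matrix with its transpose plus the uniquenesses, so reversing the sign of an entire column leaves it unchanged. The toy check below uses arbitrary illustrative numbers, not values from the study.

      ```python
      import numpy as np

      loadings = np.array([[0.70, 0.10],
                           [0.65, 0.05],
                           [-0.61, 0.20]])         # e.g. an index loading negatively on "value"
      uniqueness = np.diag([0.40, 0.50, 0.45])

      flipped = loadings.copy()
      flipped[:, 0] *= -1                          # reverse the orientation of factor 1

      implied = loadings @ loadings.T + uniqueness
      implied_flipped = flipped @ flipped.T + uniqueness
      print(np.allclose(implied, implied_flipped))  # True: the factor's statistical identity is unchanged
      ```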

      The rationale for removing MAG was based on statistical and methodological considerations. As is common practice in reliability analyses, we examined whether Cronbach’s α would improve if we excluded items with low factor loadings or weak item–total correlations. In the present study, we recalculated Cronbach’s α after removing the MAG item because it had a low loading. Its exclusion did not substantially affect the theoretical interpretation of the factor, which we conceptualize as “secretion” (without CGM). MAG’s removal alone is scientifically justified because it was the only item whose exclusion improved Cronbach's α while preserving interpretability. In contrast, removing other items would have undermined the conceptual clarity of the factor or would not have meaningfully improved α. Furthermore, the MAG item has a high factor 2 loading.
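      For transparency about this step, a minimal sketch of Cronbach’s α together with an “α if item deleted” check of the kind described above is given below; the item matrix is a hypothetical placeholder, not the study data.

      ```python
      import numpy as np

      def cronbach_alpha(items):
          """items: (n_subjects, n_items) matrix of item scores."""
          items = np.asarray(items, dtype=float)
          k = items.shape[1]
          item_var = items.var(axis=0, ddof=1).sum()
          total_var = items.sum(axis=1).var(ddof=1)
          return k / (k - 1) * (1.0 - item_var / total_var)

      def alpha_if_item_deleted(items):
          """Recompute alpha after dropping each item in turn (e.g. dropping MAG)."""
          items = np.asarray(items, dtype=float)
          return [cronbach_alpha(np.delete(items, j, axis=1)) for j in range(items.shape[1])]

      rng = np.random.default_rng(5)
      items = rng.normal(size=(50, 6))   # hypothetical item scores
      print(cronbach_alpha(items), alpha_if_item_deleted(items))
      ```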

      Each point in Figure S8B (old version) corresponds to an individual participant.

      To address these considerations, we have added the following text to the Discussion, Methods, (lines 388-396, 600-601) and Figure S6B (current version) legend:

      Some indices, such as HBGI, showed variation in classification across datasets, with some populations showing higher factor loadings in the “mean” component and others in the “variance” component. This variation occurs because HBGI calculations depend on the number of glucose readings above a threshold. In populations where mean glucose levels are predominantly below this threshold, the HBGI is more sensitive to glucose variability (Fig. S5A). Conversely, in populations with a wider range of mean glucose levels, the HBGI correlates more strongly with mean glucose levels (Fig. 3A). Despite these differences, our validation analyses confirm that CGM-derived indices consistently cluster into three components: mean, variance, and autocorrelation.

      Variables with absolute factor loadings of ≥ 0.30 were used in interpretation.

      Box plots comparing factors 1 (Mean), 2 (Variance), and 3 (Autocorrelation) between individuals without (-) and with (+) diabetic macrovascular complications. Each point corresponds to an individual. The boxes represent the interquartile range, with the median shown as a horizontal line. Mann–Whitney U tests were used to assess differences between groups, with P values < 0.05 considered statistically significant.

      Minor Concerns:

      (1) NGT is not defined.

      We appreciate the reviewer for pointing out that the term “NGT” was not clearly defined in the original manuscript. We have added the following text to the Methods section (lines 447-451):

      T2DM was defined as HbA1c ≥ 6.5%, fasting plasma glucose (FPG) ≥ 126 mg/dL or 2‑h plasma glucose during a 75‑g OGTT (PG120) ≥ 200 mg/dL. IGT was defined as HbA1c 6.0– 6.4%, FPG 110–125 mg/dL or PG120 140–199 mg/dL. NGT was defined as values below all prediabetes thresholds (HbA1c < 6.0%, FPG < 110 mg/dL and PG120 < 140 mg/dL).

      (2) Is it necessary to list the cumulative percentage (Line 173), it could be clearer to list the percentage explained by each factor instead.

      We appreciate the reviewer’s suggestion to list the percentage explained by each factor rather than the cumulative percentage for improved clarity. According to the reviewer’s suggestion, we have revised the results to show the individual contribution of each factor (39%, 21%, 10%, 5%, 5%) rather than the cumulative percentages (39%, 60%, 70%, 75%, 80%) that were previously listed (lines 220-221).

      (3) Figure S10. How were the coefficients generated for Figure S10? No methods are given.

      We conducted a multiple linear regression analysis in which time in range (TIR) was the dependent variable and the factor scores corresponding to the first three latent components (factor 1 representing the mean, factor 2 representing the variance, and factor 3 representing the autocorrelation) were the independent variables. We have added the following text to the figure legend (Fig. S8C) to provide a more detailed description of how the coefficients were generated:

      Comparison of predicted Time in range (TIR) versus measured TIR using multiple regression analysis between TIR and factor scores in Figure 3. In this analysis, TIR was the dependent variable, and the factor scores corresponding to the first three latent components (factor 1 representing the mean, factor 2 representing the variance, and factor 3 representing the autocorrelation) were the independent variables. Each point corresponds to the values for a single individual.
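      A minimal sketch of such a regression, with synthetic factor scores and TIR values standing in for the real data, might look like the following.

      ```python
      import numpy as np
      from sklearn.linear_model import LinearRegression

      rng = np.random.default_rng(3)
      factor_scores = rng.normal(size=(150, 3))           # columns: mean, variance, autocorrelation factors
      tir = 70 - 15 * factor_scores[:, 0] - 8 * factor_scores[:, 1] + rng.normal(0, 5, 150)

      model = LinearRegression().fit(factor_scores, tir)   # TIR as the dependent variable
      predicted_tir = model.predict(factor_scores)
      print(model.coef_, model.score(factor_scores, tir))  # coefficients and R^2
      ```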

      (4) In https://cgm-basedregression.streamlit.app/, more explanation should be given about the output of the multiple regression. Regression is spelled incorrectly on the app.

      We appreciate the reviewer for pointing out the need for a clearer explanation of the multiple regression analysis presented in the online tool (https://cgmregressionapp2.streamlit.app/). We have added a description of the regression analysis and corrected the typographical error in the spelling of “regression” within the app.

      (5) The last section of results (starting at line 225) appears to be unrelated to the goal of predicting %NC.

      We appreciate the reviewer’s feedback regarding the relevance of the simulation component of our manuscript. The primary contribution of our study goes beyond demonstrating correlations between CGM-derived measures and %NC; it highlights three fundamental components of glycemic patterns (mean, variance, and autocorrelation) and their independent relationships with coronary plaque characteristics. The simulations are included to illustrate how glycemic patterns with identical means and variability can have different autocorrelation structures. As Reviewer 2 pointed out in minor comment #4, temporal autocorrelation can be difficult to interpret, so these visualizations were intended to provide intuitive examples for readers.

      However, we agree with the reviewer’s concern about the coherence of the manuscript. In response, we have streamlined the simulation section by removing technical simulations that do not directly support our primary conclusions (old version of the manuscript, lines 239-246, 502-526), while retaining only those that enhance understanding of the three glycemic components (Fig. 4A).

      (6) Figure S2. The R2 should be reported.

      We appreciate the reviewer for suggesting that we report R² in Figure S2. In the revised version, we have added the correlation coefficients and their 95% confidence intervals to Figure 1E.

      (7) Multiple panels have a correlation line drawn with a slope of 1 which does not reflect the data or r^2 listed. this should be fixed.

      We appreciate the reviewer’s concern that several panels included regression lines with a fixed slope of one that did not reflect the associated R² values. We have corrected Figures 1A–C and 3C to display regression lines representing the estimated slopes derived from the regression analyses.
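      As a general illustration of the corrected plotting approach, the sketch below draws the line with the estimated slope and intercept rather than a fixed identity line; the data are synthetic placeholders, not the study’s figures.

      ```python
      import numpy as np
      import matplotlib.pyplot as plt

      rng = np.random.default_rng(4)
      x = rng.normal(size=60)
      y = 0.5 * x + rng.normal(0, 0.8, 60)

      slope, intercept = np.polyfit(x, y, 1)      # estimated slope/intercept, not forced to y = x
      xs = np.linspace(x.min(), x.max(), 100)
      plt.scatter(x, y, s=12)
      plt.plot(xs, slope * xs + intercept)
      plt.show()
      ```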

    1. Homer had not only told the truth about Troy and the war that had unfolded there, but also about its protagonists. And with great enthusiasm he set out to search, among the ruins of Mycenae, for the tomb and the corpse of Agamemnon.

      What is interesting is the gap between this civilization and Homer. Considering that it took place in the 18th century BC and that Homer composed the Iliad around 750 BC, one can infer how important oral tradition was for all those stories to be preserved.

    2. Ariadne is obviously linked to Crete through the myth of Theseus and the Minotaur. However, Nilsson observes that no other heroine died in so many different ways, and that this can only be explained by a cult of Ariadne's death. The Ariadne festivals on Naxos resemble the vegetation cults well known from Eastern religions but absent from the religion of classical Greece. In the Eastern religions, however, the cults are devoted to the god in question, whereas on Naxos it is a goddess. Her death was celebrated every year, a consequence of the culture of annual rebirth. This divine form is neither Greek nor Eastern, and may have been inherited directly from Minoan traditions.

    1. answer key

      The link to the answer key was left out. You can access it by clicking the link "à onotre projet" further up. You have to look in the table of contents / drop-down menu.

    2. create a subject variable and a message variable

      I think there is a mistake, because in the answer key you do not mention these two variables.

  2. Dec 2025
    1. This reduces its potential to generate robust evidence that strengthens research, public policy-making, and social innovation

      Does it reduce it?

    2. are produced and circulated in a highly fragmented manner, with few common standards and weak coordination mechanisms

      Add a citation to support this claim.

    3. Such a context introduces significant challenges for the social sciences: problems of representativeness and bias in sources not designed for research, risks to privacy and data protection, difficulties in the interpretability of complex models, and the need to adapt theoretical and methodological frameworks in order to integrate algorithmic results with robust social explanations.

      These challenges end up disconnected from the first part of the following paragraph. I think those two parts should speak to each other better.

      It seems to me it should be something like: Such a context introduces significant challenges for the social sciences: the need to adapt theoretical and methodological frameworks in order to integrate the diverse data sources with robust social explanations, turning data into social data.

    4. massive ... of large volumes of “big data”

      This phrase is a bit redundant: massive, large volumes, and big data. Something like this works better: the availability of large volumes of data, commonly called big data, ...

    1. Information limits

      Focus on what there is, not on what there could be. We do not know which of this can actually be obtained, so I suggest not dwelling on it. The same elements were mentioned in the final section of the introduction.

    2. The information available corresponds, first of all, to sociodemographic characterization variables. Within this framework, there is information on sex, school of origin, commune of residence, region of origin, nationality, type of admission, and socioeconomic level. In addition, academic characterization information is available, such as the type of admission, admission cohort, PSU-PAES percentile, academic standing and, most importantly, the raw grade average and the calculated grade average. Finally, the bulk of the variables contained in the databases are the courses, both compulsory and elective, of the five degree programs included in the analysis 2.

      Detail the variables.

    3. Methodology

      A detailed variables section is missing: which variables are relevant for each database? - Define each variable in detail (original variables and created variables). - A codebook needs to be built to keep the available information in view.

    4. Considering this, a master database (base_madre.rds) was generated by merging the single-grade database with the database containing grades per course. The master database ended up with N = 1983 and 1772 variables 1. This database contains information on the courses taught from 2021 to 2024. In this database the variable "promedio_calculado" was created, which is the average of all course grades recorded for a case. From this database, dataframes were generated for each degree program (df_nombre carrera.rds) in order to reduce the size of the data and make it more efficient to analyze.

      Describe the combined database, in the terms above. The variables are described in the variables section.

    5. The database containing the raw grade average is called nota_unica.rds (N = 2578) and has a total of 17 variables. This database contains information from 2020 to 2025. From this database, a subset is generated for each degree program, named nota_unica_nombre carrera.rds.

      Describe the data: n, distribution of sex, age, or other relevant variables; time period, generation procedure, etc.

    1. this logic

      Does "this logic" refer to the relationship between admission score and grades? If so, refer to it as a result rather than as a logic.

    2. exogenous factors and endogenous factors

      It is good that we are thinking about and making these distinctions from the start. But it seems to me that the definitions of the types of variables belong not so much in the introduction as in the methodology.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Wang, Po-Kai, et al., utilized the de novo polarization of MDCK cells cultured in Matrigel to assess the interdependence between polarity protein localization, centrosome positioning, and apical membrane formation. They show that the inhibition of Plk4 with Centrinone does not prevent apical membrane formation, but does result in its delay, a phenotype the authors attribute to the loss of centrosomes due to the inhibition of centriole duplication. However, the targeted mutagenesis of specific centrosome proteins implicated in the positioning of centrosomes in other cell types (CEP164, ODF2, PCNT, and CEP120) did not affect centrosome positioning in 3D cultured MDCK cells. A screen of proteins previously implicated in MDCK polarization revealed that the polarity protein Par-3 was upstream of centrosome positioning, similar to other cell types.

      Strengths:

      The investigation into the temporal requirement and interdependence of previously proposed regulators of cell polarization and lumen formation is valuable to the community. Wang et al., have provided a detailed analysis of many of these components at defined stages of polarity establishment. Furthermore, the generation of PCNT, p53, ODF2, Cep120, and Cep164 knockout MDCK cell lines is likely valuable to the community.

      Weaknesses:

      Additional quantifications would highly improve this manuscript, for example it is unclear whether the centrosome perturbation affects gamma tubulin levels and therefore microtubule nucleation, it is also not clear how they affect the localization of the trafficking machinery/polarity proteins. For example, in Figure 4, the authors measure the intensity of Gp134 at the apical membrane initiation site following cytokinesis, but there is no measure of Gp134 at the centrosome prior to this.

      We thank the reviewer for this important suggestion. Previous studies have shown that genes encoding appendage proteins and CEP120 do not regulate γ-tubulin recruitment to centrosomes (Betleja, Nanjundappa, Cheng, & Mahjoub, 2018; Vasquez-Limeta & Loncarek, 2021). Although the loss of PCNT reduces γ-tubulin levels, this reduction is partially compensated by AKAP450. Even in the case of PCNT/AKAP450 double knockouts, low levels of γ-tubulin remain at the centrosome (Gavilan et al., 2018), suggesting that it is difficult to completely eliminate γ-tubulin by perturbing centrosomal genes alone.

      To directly address this question, in the revised manuscript (Page 8, Paragraph 4; Figure 4—figure supplement 3), we employed a recently reported method to block γ-tubulin recruitment by co-expressing two constructs: the centrosome-targeting carboxy-terminal domain (C-CTD) of CDK5RAP2 and the γ-tubulin-binding domain of NEDD1 (N-gTBD). This approach effectively depleted γ-tubulin and abolished microtubule nucleation at the centrosome (Vinopal et al., 2023). Interestingly, despite the reduced efficiency of apical vesicle trafficking, these cells were still able to establish polarity, with centrioles positioned apically. These results suggest that microtubule nucleation at the centrosomes (centrosomal microtubules) facilitates—but is not essential for—polarity establishment.

      Regarding Figure 4, we assume the reviewer was referring to Gp135 rather than Gp134. In the revised manuscript (Page 8, Paragraph 2; Figure 4I), we observed a slight decrease in Gp135 intensity near PCNT-KO centrosomes at the pre-Abs stage. However, its localization at the AMIS following cytokinesis remained unaffected. These results suggest that the loss of PCNT has a limited impact on Gp135 localization. 

      Reviewer #2 (Public review):

      Summary:

      The authors decoupled several players that are thought to contribute to the establishment of epithelial polarity and determined their causal relationship. This provides a new picture of the respective roles of junctional proteins (Par3), the centrosome, and endomembrane compartments (Cdc42, Rab11, Gp135) from upstream to downstream.

      Their conclusions are based on live imaging of all players during the early steps of polarity establishment and on the knock-down of their expression in the simplest ever model of epithelial polarity: a cell doublet surrounded by ECM.

      The position of the centrosome is often taken as a readout for the orientation of the cell polarity axis. There is a long-standing debate about the actual role of the centrosome in the establishment of this polarity axis. Here, using a minimal model of epithelial polarization, a doublet of daughter MDCK cells cultured in Matrigel, the authors made several key observations that bring new light to our understanding of a mechanism that has been studied for many years without being fully explained:

      (1) They showed that centriole can reach their polarized position without most of their microtubule-anchoring structures. These observations challenge the standard model according to which centrosomes are moved by the production and transmission of forces along microtubules.

      (2) (However) they showed that epithelial polarity can be established in the absence of a centriole.

      (3) (Somehow more expectedly) they also showed that epithelial polarity can't be established in the absence of Par3.

      (4) They found that most other polarity players that are transported through the cytoplasm in lipid vesicles, and finally fused to the basal or apical pole of epithelial cells, are moved along an axis which is defined by the position of centrosome and orientation of microtubules.

      (5) Surprisingly, two non-daughter cells that were brought in contact (for 6h) could partially polarize by recruiting a few Par3 molecules but not the other polarity markers.

      (6) Even more surprisingly, in the absence of ECM, Par 3 and centrosomes could move to their proper position close to the intercellular junction after cytokinesis but other polarity markers (at least GP135) localized to the opposite, non-adhesive, side. So the polarity of the centrosome-microtubule network could be dissociated from the localisation of GP135 (which was believed to be transported along this network).

      Strengths:

      (1) The simplicity and reproducibility of the system allow a very quantitative description of cell polarity and protein localisation.

      (2) The experiments are quite straightforward, well-executed, and properly analyzed.

      (3) The writing is clear and conclusions are convincing.

      Weaknesses:

      (1) The simplicity of the system may not capture some of the mechanisms involved in the establishment of cell polarity in more physiological conditions (fluid flow, electrical potential, ion gradients,...).

      We agree that certain mechanisms may not be captured by this simplified system. However, the model enables us to observe intrinsic cellular responses, minimize external environmental variables, and gain new insights into how epithelial cells position their centrosomes and establish polarity. 

      (2) The absence of centriole in centrinone-treated cells might not prevent the coalescence of centrosomal protein in a kind of MTOC which might still orient microtubules and intracellular traffic. How are microtubules organized in the absence of centriole? If they still form a radial array, the absence of a centriole at the center of it somehow does not conflict with classical views in the field.

      Previous studies have shown that in the absence of centrioles, centrosomal proteins can relocate to alternative microtubule-organizing centers (MTOCs), such as the Golgi apparatus (Gavilan et al., 2018). Furthermore, centriole loss leads to increased nucleation of non-centrosomal microtubules (Martin, Veloso, Wu, Katrukha, & Akhmanova, 2018). However, these microtubules typically do not form the classical radial array or a distinct star-like organization. 

      While this non-centrosomal microtubule network can still support polarity establishment, it does so less efficiently—similar to what is observed in p53-deficient cells undergoing centriole-independent mitosis (Meitinger et al., 2016). Thus, although the absence of centrioles does not completely prevent microtubule-based organization or polarity establishment, it impairs their spatial coordination and reduces overall efficiency compared to a centriole-centered microtubule-organizing center (MTOC). 

      (3) The mechanism is still far from clear and this study shines some light on our lack of understanding. Basic and key questions remain:

      (a) How is the centrosome moved toward the Par3-rich pole? This is particularly difficult to answer if the mechanism does not imply the anchoring of MTs to the centriole or PCM.

      Previous studies have shown that Par3 interacts with dynein, potentially anchoring it at the cell cortex (Schmoranzer et al., 2009). This interaction enables dynein, a minus-end-directed motor, to exert pulling forces on microtubules, thereby promoting centrosome movement toward the Par3-enriched pole.

      In our experiments (Figure 4), we attempted to disrupt centrosomal microtubule nucleation by knocking out multiple genes involved in centrosome structure and function, including ODF2 and PCNT. Under these perturbations, γ-tubulin still remained detectable at the centrosome, and we were unable to completely eliminate centrosomal microtubules. 

      To address this question more directly, we employed a strategy to deplete γ-tubulin from centrosomes by co-expressing the centrosome-targeting C-terminal domain (C-CTD) of CDK5RAP2 and the γ-tubulin-binding domain of NEDD1 (N-gTBD). As shown in the new data of the revised manuscript (Page 8, Paragraph 4; Figure 4—figure supplement 3), this approach effectively depleted γ-tubulin from centrosomes, thereby abolishing microtubule nucleation at the centrosome. 

      Surprisingly, even under these conditions, centrioles remained apically positioned (Page 8, Paragraph 4; Figure 4—figure supplement 3), indicating that centrosomal microtubules are not essential for centrosome movement during polarization.

      Given these findings, we agree that the precise mechanism by which the Par3-enriched cortex attracts or guides centrosome movement remains unclear. Although dynein–Par3 interactions may contribute, further studies are needed to elucidate how centrosome repositioning occurs in the absence of microtubule-based pulling forces from the centrosome itself.

      (b) What happens during cytokinesis that organises Par3 and intercellular junction in a way that can't be achieved by simply bringing two cells together? In larger epithelia cells have neighbours that are not daughters, still, they can form tight junctions with Par3 which participates in the establishment of cell polarity as much as those that are closer to the cytokinetic bridge (as judged by the overall cell symmetry). Is the protocol of cell aggregation fully capturing the interaction mechanism of non-daughter cells?

      We speculate that a key difference between cytokinesis and simple cell-cell contact lies in the presence or absence of actomyosin contractility during the process of cell division. Specifically, contraction of the cytokinetic ring generates mechanical forces between the two daughter cells, which are absent when two non-daughter cells are simply brought together. While adjacent epithelial cells can indeed form tight junctions and recruit Par3, the lack of shared cortical tension and contractile actin networks between non-daughter cells may lead to differences in how polarity is initiated. This mechanical input during cytokinesis may serve as an organizing signal for centrosome positioning. This idea is supported by recent work showing that the actin cytoskeleton can influence centrosome positioning (Jimenez et al., 2021), suggesting that contractile actin structures formed during cytokinesis may contribute to spatial organization in a manner that cannot be replicated by simple aggregation. 

      In our experiments, we simply captured two cells that were in contact within Matrigel. We cannot say for sure that it captures all the interaction mechanisms of non-daughter cells, but it does provide a contrast to daughter cells produced by cytokinesis. 

      Reviewer #3 (Public review):

      Here, Wang et al. aim to clarify the role of the centrosome and conserved polarity regulators in apical membrane formation during the polarization of MDCK cells cultured in 3D. Through well-presented and rigorous studies, the authors focused on the emergence of polarity as a single MDCK cell divided in 3D culture to form a two-cell cyst with a nascent lumen. Focusing on these very initial stages, rather than in later large cyst formation as in most studies, is a real strength of this study. The authors found that conserved polarity regulators Gp135/podocalyxin, Crb3, Cdc42, and the recycling endosome component Rab11a all localize to the centrosome before localizing to the apical membrane initiation site (AMIS) following cytokinesis. This protein relocalization was concomitant with a repositioning of centrosomes towards the AMIS. In contrast, Par3, aPKC, and the junctional components E-cadherin and ZO1 localize directly to the AMIS without first localizing to the centrosome. Based on the timing of the localization of these proteins, these observational studies suggested that Par3 is upstream of centrosome repositioning towards the AMIS and that the centrosome might be required for delivery of apical/luminal proteins to the AMIS.

      To test this hypothesis, the authors generated numerous new cell lines and/or employed pharmacological inhibitors to determine the hierarchy of localization among these components. They found that removal of the centrosome via centrinone treatment severely delayed and weakened the delivery of Gp135 to the AMIS and single lumen formation, although normal lumenogenesis was apparently rescued with time. This effect was not due to the presence of CEP164, ODF2, CEP120, or Pericentrin. Par3 depletion perturbed the repositioning of the centrosome towards the AMIS and the relocalization of the Gp135 and Rab11 to the AMIS, causing these proteins to get stuck at the centrosome. Finally, the authors culture the MDCK cells in several ways (forced aggregation and ECM depleted) to try and further uncouple localization of the pertinent components, finding that Par3 can localize to the cell-cell interface in the absence of cell division. Par3 localized to the edge of the cell-cell contacts in the absence of ECM and this localization was not sufficient to orient the centrosomes to this site, indicating the importance of other factors in centrosome recruitment.

      Together, these data suggest a model where Par3 positions the centrosome at the AMIS and is required for the efficient transfer of more downstream polarity determinants (Gp135 and Rab11) to the apical membrane from the centrosome. The authors present solid and compelling data and are well-positioned to directly test this model with their existing system and tools. In particular, one obvious mechanism here is that centrosome-based microtubules help to efficiently direct the transport of molecules required to reinforce polarity and/or promote lumenogenesis. This model is not really explored by the authors except by Pericentrin and subdistal appendage depletion and the authors do not test whether these perturbations affect centrosomal microtubules. Exploring the role of microtubules in this process could considerably add to the mechanisms presented here. In its current state, this paper is a careful observation of the events of MDCK polarization and will fill a knowledge gap in this field. However, the mechanism could be significantly bolstered with existing tools, thereby elevating our understanding of how polarity emerges in this system.

      We agree that further exploration of microtubule dynamics could strengthen the mechanistic framework of our study. In our initial experiments, we disrupted centrosome function through genetic perturbations (e.g., knockout of PCNT, CEP120, CEP164, and ODF2). However, consistent with previous reports (Gavilan et al., 2018; Tateishi et al., 2013), we found that single-gene deletions did not completely eliminate centrosomal microtubules. Furthermore, imaging microtubule organization in 3D culture presents technical challenges. Due to the increased density of microtubules during cell rounding, we were unable to obtain clear microtubule filament structures—either using α-tubulin staining in fixed cells or SiR-tubulin labeling in live cells. Instead, the signal appeared diffusely distributed throughout the cytosol.

      To overcome this, we employed a recently reported approach by co-expressing the centrosome-targeting carboxy-terminal domain (C-CTD) of CDK5RAP2 and the γ-tubulin-binding domain (gTBD) of NEDD1 to completely deplete γ-tubulin and abolish centrosomal microtubule nucleation (Vinopal et al., 2023). In our new data presented in the revised manuscript (Page 8, Paragraph 4; Figure 4—figure supplement 3), we found that cells lacking centrosomal microtubules were still able to polarize and position the centrioles apically. However, the efficiency of polarized transport of Gp135 vesicles to the apical membrane was reduced. These findings suggest that centrosomal microtubules are not essential for polarity establishment but may contribute to efficient apical transport.

      Reference

      Betleja, E., Nanjundappa, R., Cheng, T., & Mahjoub, M. R. (2018). A novel Cep120-dependent mechanism inhibits centriole maturation in quiescent cells. Elife, 7. doi:10.7554/eLife.35439

      Gavilan, M. P., Gandolfo, P., Balestra, F. R., Arias, F., Bornens, M., & Rios, R. M. (2018). The dual role of the centrosome in organizing the microtubule network in interphase. EMBO Rep, 19(11). doi:10.15252/embr.201845942

      Jimenez, A. J., Schaeffer, A., De Pascalis, C., Letort, G., Vianay, B., Bornens, M., . . . Thery, M. (2021). Acto-myosin network geometry defines centrosome position. Curr Biol, 31(6), 1206-1220 e1205. doi:10.1016/j.cub.2021.01.002

      Martin, M., Veloso, A., Wu, J., Katrukha, E. A., & Akhmanova, A. (2018). Control of endothelial cell polarity and sprouting angiogenesis by non-centrosomal microtubules. Elife, 7. doi:10.7554/eLife.33864

      Meitinger, F., Anzola, J. V., Kaulich, M., Richardson, A., Stender, J. D., Benner, C., . . . Oegema, K. (2016). 53BP1 and USP28 mediate p53 activation and G1 arrest after centrosome loss or extended mitotic duration. J Cell Biol, 214(2), 155-166. doi:10.1083/jcb.201604081

      Schmoranzer, J., Fawcett, J. P., Segura, M., Tan, S., Vallee, R. B., Pawson, T., & Gundersen, G. G. (2009). Par3 and dynein associate to regulate local microtubule dynamics and centrosome orientation during migration. Curr Biol, 19(13), 1065-1074. doi:10.1016/j.cub.2009.05.065

      Tateishi, K., Yamazaki, Y., Nishida, T., Watanabe, S., Kunimoto, K., Ishikawa, H., & Tsukita, S. (2013). Two appendages homologous between basal bodies and centrioles are formed using distinct Odf2 domains. J Cell Biol, 203(3), 417-425. doi:10.1083/jcb.201303071

      Vasquez-Limeta, A., & Loncarek, J. (2021). Human centrosome organization and function in interphase and mitosis. Semin Cell Dev Biol, 117, 30-41. doi:10.1016/j.semcdb.2021.03.020

      Vinopal, S., Dupraz, S., Alfadil, E., Pietralla, T., Bendre, S., Stiess, M., . . . Bradke, F. (2023). Centrosomal microtubule nucleation regulates radial migration of projection neurons independently of polarization in the developing brain. Neuron, 111(8), 1241-1263 e1216. doi:10.1016/j.neuron.2023.01.020.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Figures:

      (1) Figure 3 B+C - Although in comparison to Figure 2 it appears the p53 mutation does not affect θN-C or Lo-c, the figure would benefit from direct comparison to control cells.

      We appreciate your suggestion to improve the clarity of the figure. In response, we have revised Figure 3B+C to include control cell data, allowing for clearer side-by-side comparisons in the updated figures. 

      (2) Figure 3D - Clarify if both were normalized to time point 0:00 of the p53 KO. The image used appears that Gp135 intensity increases substantially between 0:00 and 0:15 in the figure, but the graph suggests that the intensity is the same if not slightly lower.

      Figure 3D – The data were normalized to the respective 0:00 time point for each condition. Because the intensity profile was measured along a line connecting the two nuclei, Gp135 signal could only be detected if it appeared along this line. However, the images shown are maximum-intensity projections, meaning that Gp135 signals from peripheral regions are projected onto the center of the image. This may create the appearance of increased intensity at certain time points (e.g., Figure 3A, p53-KO + CN, 0:00–0:15). 

      (3) Figure 4A: The diagram does not accurately represent the effect of the mutations, for example, PCNT mutation likely doesn't completely disrupt PCM (given gamma-tubulin is still visible in the staining), but instead results in its disorganization, Cep164 also wouldn't be expected to completely ablate distal appendages.

      Thank you for your comment. We have modified the figure in the revised manuscript (Figure 4A) to more clearly depict the defective DAs. 

      (4) Figure 4 + Supplements: A more in-depth characterization of the mutations would help address the previous comment and strengthen the manuscript. Especially as these components have previously been implicated in centrosome transport.

      Thank you for your valuable suggestion. As noted in previous studies, CEP164 is essential for distal appendage function and basal body docking, with its loss resulting in blocked ciliogenesis (Tanos et al., 2013); CEP120 is required for centriole elongation and distal appendage formation, and its loss also results in blocked ciliogenesis (Comartin et al., 2013; Lin et al., 2013; Tsai, Hsu, Liu, Chang, & Tang, 2019); ODF2 functions upstream in the formation of subdistal appendages, and its loss eliminates these structures and impairs microtubule anchoring (Tateishi et al., 2013); and PCNT functions as a PCM scaffold, necessary for the recruitment of PCM components and for microtubule nucleation at the centrosome (Fong, Choi, Rattner, & Qi, 2008; Zimmerman, Sillibourne, Rosa, & Doxsey, 2004). 

      Given that the phenotypes of these mutants have been well characterized in the literature, here we focus further on their roles in centrosome migration and polarized vesicle trafficking within the specific context of our study.

      (5) Figure 4: It would be interesting to measure the Gp135 intensity at the centrosomes, given that the model proposes it is trafficked from the centrosomes to the AMIS.

      Thank you for your suggestion. We have included measurements of Gp135 intensity at the centrosomes during the Pre-Abs stage in the revised figure (Figure 4I). Our data show no significant differences in Gp135 intensity between wild-type (WT) and CEP164-, ODF2-, or CEP120-knockout (KO) cell lines. However, a slight decrease in Gp135 intensity was observed in PCNT-KO cells. 

      (6) Figure 6F shows that in suspension culture polarity is reversed, however, in Figure 6G gp135 still localizes to the cytokinetic furrow prior to polarity reversal. Given this paper demonstrates Par-3 is upstream of centrosome positioning, it would be important to have temporal data of how Par-3 localizes prior to the ring observed in 6F.

      Thank you for your comment. We have included a temporal analysis of Par3 localization using fixed-cell staining in the revised figure (Figure 6—figure supplement 1D). This analysis shows that Par3 also localizes to the cytokinesis site during the Pre-Abs stage, prior to ring formation observed during the Post-CK stage (Figure 6F). Interestingly, during the Pre-Abs stage, the centrosomes also migrate toward the center of the cell doublets in suspension culture, and Gp135 surrounding the centrosomes is also recruited to a region near the center (Figure 6—figure supplement 1E). These data suggest that Par3 also is initially recruited to the cytokinesis site before polarity reversal, potentially promoting centrosome migration. The main difference from Matrigel culture is the peripheral localization of Par3 and Gp135 in suspension, which is likely due to the lack of external ECM signaling. 

      Results:

      (1) Page 7 Paragraph 1 - consistently use AMIS (Apical membrane initiation site) rather than "the apical site".

      Thank you for your helpful comment. We have revised the manuscript (Page 7, Paragraph 1) and will now use "AMIS" (Apical Membrane Initiation Site) instead of "the apical site" throughout the text. 

      (2) Page 7 Paragraph 4 - A single sentence explaining why the p53 background had to be used for the Cep120 deletion would be beneficial. Did the cell line have a reduced centrosome number? Does this effect apical membrane initiation similar to centrinone?

      We have revised the text (Page 7, Paragraph 4) to clarify that we were unable to generate a CEP120 KO line in p53-WT cells for unknown reasons. CEP120-KO cells have a normal number of centrosomes, but their centrioles are shorter. Because this KO line still contains centrioles, the effect is different from centrinone treatment, which results in a complete loss of centrioles.

      (3) Page 10 paragraph 4 - This paragraph is confusing to read. I understand that in the cysts and epithelial sheet the cytokinetic furrow is apical, therefore a movement towards the AMIS could be due to its coincidence with the furrow. However, the phrasing "....we found that centrosomes move towards the apical membrane initiation site direction before bridge abscission. Taken together these findings indicate the position is strongly associated with the site of cytokinesis but not with the apical membrane" is confusing to the reader.

      We have revised the manuscript (Page 11, Paragraph 4) to describe the AMIS as the center of the cell doublet. During de novo epithelial polarization, the apical membrane has not yet formed at the Pre-Abs stage. However, at the Pre-Abs stage, the centrosome has already migrated toward the site of cytokinesis, suggesting that centrosome positioning is correlated with the site of cell division. A similar phenomenon occurs in fully polarized epithelial cysts and sheets, where the centrosomes also migrate before bridge abscission. Thus, we propose that the position of the centrosome is closely associated with the site of cytokinesis and is independent of apical membrane formation.

      Discussion

      (1) Page 11, Paragraph 2 - citations needed when discussing previous studies.

      Thank you for your suggestion. We have included the necessary references to the discussion of the previous studies in the revised manuscript (Page 12, Paragraph 2). 

      (2) Page 12, Paragraph 2 - This section of the discussion would be strengthened by discussing the role of the actomyosin network in defining centrosome position (Jimenez et al., 2021). It seems plausible that the differences observed in the different conditions could be due to altered actomyosin architecture. Especially where the cells haven't undergone cytokinesis.

      We appreciate the suggestion of a role for the actomyosin network in determining centrosome positioning. Recent studies have indeed highlighted the role of the actomyosin network in regulating centrosome centering and off-centering (Jimenez et al., 2021). During the pre-abscission stage of cell division, the actomyosin network undergoes significant dynamic changes, with the contractile ring forming at the center and actin levels decreasing at the cell periphery. In contrast, under aggregated cell conditions—meaning cells that have not undergone division—the actomyosin network does not exhibit such dynamic changes. The loss of actomyosin remodeling may therefore influence whether the centrosome moves. Thus, alterations in actomyosin architecture may contribute to the differences observed under various conditions, particularly when cells have not yet completed cytokinesis. We have revised Paragraph 2 on Page 13 to briefly mention the referenced study and to propose that the actomyosin network may influence centrosome positioning, contributing to our observed results. This addition strengthens the discussion and clarifies our findings. 

      (3) Page 12 paragraph 3 - Given that centrosome translocation during cytokinesis in MDCK cells (this study) appears to be similar to that observed in HeLa cells and the zebrafish Kupffers vesicle (Krishnan et al., 2022) it would be interesting to discuss why Rab11a and PCNT may not be essential to centrosome positioning in MDCK cells.

      Thank you for your insightful comment. We agree that it is interesting that centrosome translocation during cytokinesis in MDCK cells (as observed in our study) is similar to that observed in HeLa cells and zebrafish Kupffer's vesicle (Krishnan et al., 2022). However, there are notable differences between these systems that may help explain why Rab11a and PCNT are not essential for centrosome positioning in MDCK cells.

      Our study used 3D culture of MDCK cells, while the reference study examined adherent culture of HeLa cells. In the adherent culture, cells attached to the culture surface form large actin stress fibers on their basal side, which weakens the actin networks in the apical and intercellular regions. In contrast, the 3D culture system used in our study better preserves cell polarity and the integrity of the actin network, which might contribute to centrosome positioning independent of Rab11a and PCNT. Differences in culture conditions and actin network architecture may explain why Rab11a and PCNT are not required for centrosome positioning in MDCK cells.

      Furthermore, the referenced study focused on Rab11a and PCNT in zebrafish embryos at 3.3–5 hours post-fertilization (hpf), a time point before the formation of the Kupffer’s vesicle. At this stage, the cells they examined may not yet have become epithelial cells, which may also influence the requirement of Rab11a and PCNT for centrosome positioning. We hypothesize that during the pre-abscission stage, centrosome migration toward the cytokinetic bridge occurs primarily in epithelial cells, and that the polarity and centrosome positioning mechanisms in these cells may differ from those in other cell types, such as zebrafish embryos.

      Furthermore, data from Krishnan et al. (2022) suggest that cytokinesis failure in pcnt+/- heterozygous embryos and in embryos in which Rab11a function was blocked may be due to the presence of supernumerary centrosomes. Consistent with this, our data show that blocking cytokinesis inhibits centrosome movement in MDCK cells. However, in our MDCK cell lines with PCNT or Rab11a knockdown, we did not observe significant cytokinesis failure, and centrosome migration proceeded normally.

      Reviewer #2 (Recommendations for the authors):

      Suggestions for experiments:

      (1) A description of the organization of microtubules in the absence of centriole, or in the absence of ECM would be interesting to understand how polarity markers end up where you observed them. This easy experiment may significantly improve our understanding of this system.

      Previous studies have shown that in the absence of centrioles, microtubule organization undergoes significant changes. Specifically, the number of non-centrosomal microtubules increases, and these microtubules are not radially arranged, leading to the absence of focused microtubule organizing centers in centriolar-deficient cells (Martin, Veloso, Wu, Katrukha, & Akhmanova, 2018). This disorganized microtubule network reduces the efficiency of vesicle transport during de novo epithelial polarization at the mitotic preabscission stage. 

      In contrast, the organization of microtubules under ECM-free conditions remains less well characterized. Here, we show that while the ECM plays a critical role in establishing the direction of epithelial polarity, it does not influence the positioning of the centrosome, the microtubule-organizing center (MTOC).  

      (2) Would it be possible to knock down ODF2 and pericentrin to completely disconnect the centrosome from microtubules?

      ODF2 is the base of subdistal appendages. When ODF2 is knocked out, it affects the recruitment of all downstream proteins to the subdistal appendages (Mazo, Soplop, Wang, Uryu, & Tsou, 2016). One study has shown that ODF2 knockout cells almost completely lost subdistal appendage structures and significantly reduced the microtubule asters surrounding the centrioles (Tateishi et al., 2013). However, although pericentrin (PCNT) is the main scaffold of the pericentriolar matrix (PCM) of centrosomes, the microtubule organization ability of centrosomes can be compensated by AKAP450, a paralog of PCNT, after PCNT knockout. A previous study has even shown that in cells with a double knockout of PCNT and AKAP450, γ-tubulin can still be recruited to the centrosomes, and centrosomes can still nucleate microtubules (Gavilan et al., 2018). This suggests that there are other proteins or pathways that promote microtubule nucleation on centrosomes. We are unsure whether the triple knockout of ODF2, PCNT, and AKAP450 can completely disconnect the centrosome from microtubules. However, a recent study reported a simpler approach involving the expression of dominant-negative fragments of the γ-tubulin-binding protein NEDD1 and the activator CDK5RAP2 at the centrosome (Vinopal et al., 2023). In our revised manuscript (Page 8, Paragraph 4; Figure 4—figure supplement 3), we applied this strategy, which resulted in the depletion of nearly all γ-tubulin from the centrosome. This indicates a strong suppression of centrosomal microtubule nucleation and an effective disconnection of the centrosome from the microtubule network.

      (3) The study does not distinguish the role of cytokinesis from the role of tight junctions, which form only after cytokinesis and not simply by bringing cells into contact. Would it be feasible and interesting to study the polarization after cytokinesis in cells that could not form tight junctions (due to the absence of Ecad or ZO1 for example)?

      Studying cell polarization after cytokinesis in cells unable to form tight junctions is a promising area of research.

      Recent studies have shown that mouse embryonic stem cells (mESCs) cultured in Matrigel can form ZO-1-labelled tight junctions at the midpoint of cell–cell contact even in the absence of cell division. However, in the absence of E-cadherin, ZO-1 localization is significantly impaired. Interestingly, despite the loss of E-cadherin, the Golgi apparatus and centrosomes remain oriented toward the cell–cell interface (Liang, Weberling, Hii, Zernicka-Goetz, & Buckley, 2022). These findings suggest that cell polarity can be maintained independently of tight junction formation, highlighting the potential value of studying cell polarization in cells that lack tight junctions.

      Furthermore, while studies have explored the effects of knocking down tight junction components such as JAM-A and Cingulin on lumen formation in MDCK 3D cultures (Mangan et al., 2016; Tuncay et al., 2015), the role of ZO-1 in this context remains underexplored. Cingulin knockdown has been shown to disrupt endosome targeting and the formation of the AMIS, while both JAM-A and Cingulin knockdown result in actin accumulation at multiple points, leading to the formation of multi-lumen structures rather than a reversal of polarity. However, previous research has not specifically investigated centrosome positioning in JAM-A and Cingulin knockdown cells, an area that could provide valuable insights into how polarity is maintained in the absence of tight junctions. 

      Writing details:

      (1) The migration of the centrosome in the absence of appendages or PCM is proposed to be ensured by compensatory mechanisms ensuring the robustness of microtubule anchoring to the centrosome. It could also be envisaged that the centrosome motion does not require this anchoring and that other yet unknown moving mechanisms, based on an actin network for example, might exist.

      Thank you for your valuable comments. We agree that there may indeed be some unexpected mechanisms that allow centrosomes to move independently of microtubule anchoring to the centrosome, such as mechanisms based on actin filaments or noncentrosomal microtubules; these mechanisms are worth further investigation.

      In response to your suggestion, in Paragraph 5 of the Discussion section we further clarified that, while a microtubule-anchoring mechanism might be one explanation, other mechanisms could also influence centrosome movement in the absence of appendages or PCM. Additionally, we revised Paragraph 4 regarding the possibility of actin-network-driven centrosome movement and emphasized the importance of future research for a deeper understanding of these processes.

      (2) The actual conclusion of the study of Martin et al (eLife 2018) is not simply that centrosome is not involved in cell polarization but that it hinders cell polarization!

      Thank you for your valuable feedback. We agree with the findings of Martin et al. (eLife 2018) that the centrosome is not merely uninvolved in cell polarity but can actually hinder cell polarization. Therefore, we have revised the manuscript (Page 2, Paragraph 2) to reflect this viewpoint more accurately.

      (3) This study recalls some conclusions of the study by Burute et al (Dev Cell 2017), in particular the role of Par3 in driving centrosome toward the intercellular junction of daughter cells after cytokinesis. It would be welcome to comment on the results of this study in light of their work.

      Thank you for your valuable feedback. The study by Burute et al. (Dev Cell, 2017) showed that in micropattern-cultures of MCF10A cells, the cells exhibit polarity and localize their centrosomes towards the intercellular junction, while downregulation of Par3 gene expression disrupts this centrosome positioning. This result is similar to our findings in 3D cultured MDCK cells and consistent with previous studies in C. elegans intestinal cells and migrating NIH 3T3 cells (Feldman & Priess, 2012; Schmoranzer et al., 2009), indicating that Par3 indeed influences centrosome positioning in different cellular systems. However, Par3 does not directly localize to the centrosome; rather, it localizes to the cell cortex or cell-cell junctions. Therefore, Par3 likely regulates centrosome positioning through other intermediary molecules or mechanisms, but the specific mechanism remains unclear and requires further investigation. 

      (4) Could the term apico-basal be used in the absence of a basement membrane to form a basal pole?

      We understand that using the term "apico-basal" in the absence of a basement membrane might raise some questions. Traditionally, the apico-basal axis refers to the polarity of epithelial cells, where the apical surface faces the lumen or external environment, and the basal surface is oriented toward the basement membrane. However, in the absence of a basement membrane, such as in certain in vitro systems or under specific experimental conditions, polarity along a similar axis can still be observed. In such cases, the term "apico-basal" can still be used to describe the polarity between the apical domain and the region where it contacts the substrate or adjacent cells. 

      (5) The absence of centrosome movement to the intercellular bridge in spread cells in culture is not so surprising considering the work of Lafaurie-Janvore et al (Science 2018) about the role of cell spreading in the regulation of bridge tension and abscission delay.

      Thank you for your valuable comment. Indeed, previous studies have shown that in some cell types, the centrosome does move toward the intercellular bridge in spread cells (Krishnan et al., 2022; Piel, Nordberg, Euteneuer, & Bornens, 2001), but other studies have suggested that this movement may not be significant and may not be universally observed across all cell types (Jonsdottir et al., 2010). In our study, we aim to demonstrate that this phenomenon is more pronounced in 3D culture systems compared to 2D spread cell culture systems. Previous studies and our work have observed that centrosome migration occurs during the pre-abscission stage, but whether this migration is directly related to cytokinetic bridge tension or the time of abscission remains an open question. Further research is needed to explore the potential relationship between centrosome positioning, cytokinetic bridge tension, and the timing of abscission.

      (6) GP135 (podocalyxin) has been proposed to have anti-adhesive/lubricant properties (hence its pro-invasive effect). Could it be possible that once localized at the cell surface it is systematically moved away from regions that are anchored to either the ECM or adjacent cells? So its localization away from the centrosome in an ECM-free experiment would not be a consequence of defective targeting but relocalization after reaching the plasma membrane?

      Thank you for your valuable comment. We agree that GP135 may indeed move directly across the cell surface, away from the region where it interacts with the ECM or adjacent cells. This re-localization could be due to its anti-adhesive or lubricating properties, which may facilitate its displacement from these adhesive sites. To validate this, it would be necessary to employ a higher-resolution real-time imaging system to observe the dynamic behavior of GP135 on the cell surface.

      However, this does not contradict our main conclusion. Under suspension culture conditions without ECM, the centrosome positioning in cell doublets is indeed decoupled from apical membrane orientation. This suggests that the localization of the centrosome and the apical membrane is regulated by different mechanisms. Specifically, the GP135 protein tends to accumulate away from areas of contact with the ECM or adjacent cells, possibly through movement within the cell membrane or by recycling endosome transport. In contrast, centrosome positioning is closely related to the cytokinesis site. Our study clearly elucidates the differences between these two polarity properties. 

      Reviewer #3 (Recommendations for the authors):

      Major:

      (1) To me, a clear implication of these studies is that Gp135, Rab11, etc. are delivered to the AMIS on centrosomal microtubules. The authors do not explore this model except to say that depletion of SD appendage or pericentrin has no effect on the protein relocalization to the AMIS. However, the authors do not observe microtubule association with the centrosome in these KO conditions. This analysis is imperative to interpret existing results since these are new KO conditions in this cell/culture system and parallel pathways (e.g. CDK5RAP2) are known to contribute to microtubule association with the centrosome. An ability to comment on the mechanism by which the centrosome contributes to the efficiency of polarization would greatly enhance the paper.

      Microtubule requirement could also be tested in numerous additional ways requiring varying degrees of new experiments:

      (a) faster live cell imaging at abscission to see if the deposition of those components appears to traffic on MTs;

      (b) live cell imaging with microtubules (e.g. SPY-tubulin) and/or EB1 to determine the origin and polarity of microtubules at the pertinent stages;

      For (a) and (b), because the cells were cultured in Matrigel, they tended to round up, with a dense internal structure that made observation difficult. In contrast, under adherent culture conditions, the cells were flattened, with more dispersed internal structures, making them easier to observe. We had previously used SPY-tubulin to label microtubules for live cell imaging; however, due to the dense microtubule structure in 3D culture, the image contrast was reduced, and we could not clearly observe the microtubule network within the cells. 

      (c) acute nocodazole treatment at abscission to determine the effect on protein localization.

      Regarding the method of using nocodazole to study microtubule requirements at the abscission stage, we believe that nocodazole treatment may lead to cytokinesis failure. Cell division failure results in the formation of binucleated cells, which are unable to establish cell polarity. Furthermore, nocodazole treatment cannot distinguish between centrosomal and non-centrosomal microtubules, making it unsuitable for studying the specific role of centrosomal microtubules in this process.

      In our new data (Figure 4-figure supplementary 3) presented in the revised manuscript, we employed a recently reported method, co-expressing the centrosome-targeting carboxy-terminal domain (C-CTD) of CDK5RAP2 and the γ-tubulin-binding domain (gTBD) of NEDD1, to completely deplete γ-tubulin and abolish centrosomal microtubule nucleation (Vinopal et al., 2023). We found that cells lacking centrosomal microtubules were still able to polarize and position the centrioles apically. However, the efficiency of polarized transport of Gp135 vesicles to the apical membrane was reduced. These findings suggest that centrosomal microtubules are not essential for polarity establishment but may contribute to efficient apical transport. 

      (2) Similar to the expanded analysis of the role of microtubules in this system, it would be excellent if the author could expand on the role of Par3 and the centrosome, although this reviewer recognizes that the authors have already done substantial work. For example, what are the consequences of Gp135 and/or Rab11 getting stuck at the centrosome? Do the authors have any later images to determine when and if these components ever leave the centrosome? Existing literature focuses on the more downstream consequence of Par3 removal on single-lumen formation. 

      Similarly, could the authors expand on the description of polarity disruption following centrinone treatment? It is clear that Gp135 recruitment is disrupted, but how and when do things get fixed and what else is disrupted at the very earliest stages of AMIS formation? The authors have an excellent opportunity to really expand on what is known about the requirements for these conserved components.

      Regarding the centrinone treatment, we speculate that Gp135 can still accumulate at the AMIS over time, although the efficiency of its recruitment may be reduced.

      Furthermore, under similar conditions, other apical membrane components (such as the Crumbs3 protein) may exhibit similar characteristics to Gp135 protein. 

      (3) Perhaps satisfying both of the above asks, could the authors do a faster time-lapse at the relevant time points, i.e. as proteins are being recruited to the AMIS (time points between 1Aiv and v)? This type of imaging again might help shed light on the mechanism.

      We believe the above questions are very important and may require further experimental verification in the future. 

      Minor:

      (1) What is the green patch of Gp135 in Figure 2A that does not colocalize with the centrosome? Is this another source of Gp135 that is being delivered to the AMIS? This type of patch is also visible in Figure 3A 15 and 30-minute panels.

      During mitosis, membrane-bound organelles such as the Golgi apparatus are typically dispersed throughout the cytoplasm. However, during the pre-abscission stage, these organelles begin to reassemble and cluster around the centrosome. Furthermore, they also accumulate in the region between the nucleus and the cytokinetic bridge, corresponding to the “patch” mentioned in Figure 2A. 

      Live cell imaging results showed that this Gp135 patch initially appears in a region not associated with the centrosome. Subsequently, it was either transported directly to the AMIS or fused with the centrosome-associated Gp135 and transported together with it. Notably, this patch was only observed when Gp135 was overexpressed in cells. No such distinct protein patches were observed when staining endogenous Gp135 protein (Figure 1A), suggesting that overexpression of Gp135 may lead to a localized increase in its concentration in that region. 

      (2) I am confused by the "polarity index" quantification as this appears to just be a nucleus centrosome distance measurement and wouldn't, for example, distinguish if the centrosomes separated from the nucleus but were on the basal side of the cell.

      The position of the centrosome within the cell (i.e., its distance from the nucleus) can indeed serve as an indicator of cell polarity (Burute et al., 2017). We acknowledge that this quantitative method does not directly capture the specific direction in which the centrosome deviates from the cell center. To address this limitation, we have incorporated information about the angle between the nucleus and the centrosome, which allows for a more accurate description of changes in cell polarity (Rodriguez-Fraticelli, Auzan, Alonso, Bornens, & Martin-Belmonte, 2012). 
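
      To make the two metrics concrete, the sketch below illustrates how a nucleus-centrosome polarity index and axis angle could be computed from 3D coordinates. This is only an illustration of the described quantification, not the authors' actual analysis code; the normalization by cell radius and the choice of reference axis are assumptions.

      ```python
      # Illustrative sketch: nucleus-centrosome distance ("polarity index") and
      # the angle of the nucleus-centrosome axis relative to a reference axis.
      # Variable names and the normalization by cell radius are assumptions.
      import numpy as np

      def polarity_metrics(nucleus_xyz, centrosome_xyz, cell_radius, reference_axis):
          """Return (normalized nucleus-centrosome distance, angle in degrees)."""
          nucleus = np.asarray(nucleus_xyz, dtype=float)
          centrosome = np.asarray(centrosome_xyz, dtype=float)
          vec = centrosome - nucleus                      # nucleus -> centrosome vector
          distance = np.linalg.norm(vec)
          polarity_index = distance / cell_radius         # dimensionless index

          ref = np.asarray(reference_axis, dtype=float)
          ref = ref / np.linalg.norm(ref)
          cos_theta = np.dot(vec / distance, ref)
          angle_deg = np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))
          return polarity_index, angle_deg

      # Example: centrosome displaced 4 µm along the reference axis in a 10 µm-radius cell
      pi, ang = polarity_metrics([0, 0, 0], [4, 0, 0], cell_radius=10.0, reference_axis=[1, 0, 0])
      print(pi, ang)   # 0.4, 0.0
      ```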

      (3) How is GP135 "at AMIS" measured? Is an arbitrary line drawn? This is important later when comparing to centrinone treatment in Figure 3D where the quantification does not seem to accurately capture the enrichment of Gp135 that is seen in the images.

      To measure the expression level of Gp135 in the "AMIS" region of the cell, we first connected the centers of the two cell nuclei in three-dimensional space to form a straight line. Then, we used the Gp135 expression intensity at the midpoint of this line as the representative value for the AMIS region. This method is based on the assumption that the AMIS region is most likely located between the centers of the two cell nuclei. Therefore, this quantitative method provides a standardized assessment tool for comparing Gp135 expression levels under different conditions. 
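
      As an illustration of this measurement (not the authors' pipeline; the array layout and variable names are assumptions), the Gp135 intensity at the putative AMIS could be read out at the midpoint of the two nuclear centers as follows:

      ```python
      # Illustrative sketch: estimate Gp135 intensity at the AMIS as the image
      # intensity at the midpoint of the line connecting the two nuclear centers.
      import numpy as np

      def gp135_at_amis(image_zyx, nucleus1_zyx, nucleus2_zyx):
          """image_zyx: 3D intensity array (z, y, x); nuclei centers in voxel coordinates."""
          midpoint = (np.asarray(nucleus1_zyx, float) + np.asarray(nucleus2_zyx, float)) / 2.0
          z, y, x = np.round(midpoint).astype(int)        # nearest-voxel lookup
          return image_zyx[z, y, x]

      # Example with a synthetic volume
      rng = np.random.default_rng(0)
      vol = rng.random((20, 64, 64))
      print(gp135_at_amis(vol, (5, 10, 12), (15, 40, 44)))
      ```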

      (4) The authors reference cell height (p.7) but no data for this measurement are shown

      Thank you for the comment. Although we did not perform quantitative measurements, the differences in cell height are clearly visible in Figure 3E (p53-KO + CN), which visually illustrates this phenomenon. 

      (5) Can the authors comment on the seeming reduction of Par3 in p53 KO cells?

      We did not observe a reduction of Par3 in p53-KO cells in our experiments.

      (6) Can the authors make sense of the E-cad localization: Figure 5, Supplement 2.

      Our study revealed that E-cadherin begins to accumulate at the cell-cell contact sites during the pre-abscission stage. Its appearance is similar to that of ZO-1, which also appears near the cell division site during this phase. Therefore, the behavior of E-cadherin contrasts sharply with that of Gp135, further highlighting the unique trafficking mechanisms of apical membrane proteins during this process. 

      (7) I find the results in Figure 6G puzzling. Why is ECM signaling required for Gp135 recruitment to the centrosome? Could the authors discuss what this means?

      We appreciate the reviewer’s valuable comments and thank you for the opportunity to clarify this point. The data in Figure 6G do not indicate that ECM signaling is required for the recruitment of Gp135 to the centrosome. Rather, our findings suggest that even in the absence of ECM, the centrosomes can migrate to a polarized position similar to that in Matrigel culture. This suggests that centrosome migration and the orientation of the nucleus–centrosome axis may be independent of ECM signaling and are primarily driven by cytokinesis alone. 

      Regarding the localization of Gp135, previous studies have shown that ECM signaling through integrin promotes endocytosis, which is crucial for the internalization of Gp135 from the cell membrane and its subsequent transport to the AMIS (Buckley & St Johnston, 2022). Our study found that, prior to its accumulation at the AMIS, Gp135 transiently localizes around the centrosome. In the absence of ECM, due to reduced endocytosis, Gp135 primarily remains on the cell membrane and does not undergo intracellular trafficking.  

      (8) The authors end the Discussion stating that these studies may have implication for in vivo settings, yet do not discuss the striking similarities to the C. elegans and Drosophila intestine or the findings from any other more observational studies of tubular epithelial systems in vivo (e.g. mouse kidney polarization, zebrafish neuroepithelium, etc.). These models should be discussed.

      Thank you for your valuable comment. Indeed, all types of epithelial tissues or tubular epithelial systems in vivo share some common features during cell division, which have been well-documented across various species. 

      These features include: during interphase, the centrosome is located at the apical surface of the cells; after the cell enters mitosis, the centrosome moves to the lateral side of the cell to regulate spindle orientation; and during cytokinesis, the cleavage furrow ingresses asymmetrically from the basal to the apical side, with the cytokinetic bridge positioned at the apical surface. Our study using MDCK 3D culture and transwell culture systems successfully mimicked these key features, demonstrating that these in vitro models are of significant value for studying cell polarization dynamics. 

      Based on our observations, we speculate that the centrosome may return to the apical surface after anaphase, just before bridge abscission. This is consistent with our findings from studies using MDCK 3D cultures and transwell systems, which showed that the centrosome relocates prior to the final stages of cytokinesis.

      Additionally, we propose that de novo polarization of the kidney tubule in vivo may not solely depend on the aggregation and mesenchymal-epithelial transition (MET) of the metanephric mesenchyme. It may also be related to the cell division process, which triggers centrosome migration and polarized vesicle trafficking. These processes likely contribute to enhancing cell polarization, as we observed in our in vitro models.

      We hope this further clarifies the potential implications of our findings for in vivo model studies, as well as their broader impact on the field of tubular epithelial cell polarization research. 

      (9) There are several grammatical issues/typos throughout the paper. A careful readthrough is required. For example:

      this sentence makes no sense "that the centrosome acts as a hub of apical recycling endosomes and centrosome migration during cytokinetic pre-abscission before apical membrane components are targeted to the AMIS"

      We carefully reviewed the paper and made necessary revisions to address the issues raised. In particular, we revised certain sentences to improve clarity and readability (Page 5, Paragraph 3). 

      (10) P.8: have been previously reported [to be] involved in MDCK...

      We appreciate the reviewer's valuable suggestions. We have revised the sentence accordingly (Page 9, Paragraph 2). 

      (11) This sentence seems misplaced: "Cultured conditions influence cellular polarization preferences."

      The sentence itself is fine, but to improve the coherence and clarity of the paragraph, we adjusted the paragraph structure and added some transitional phrases (Page 13, Paragraph 1).  

      (12) "Play a downstream role in Par3 recruitment" doesn't make sense, this should just be downstream of Par3 recruitment.

      Thank you for your suggestion. We have revised the wording accordingly, changing it to "downstream of Par3 recruitment" (Page 10, Paragraph 2).  

      Reference

      Buckley, C. E., & St Johnston, D. (2022). Apical-basal polarity and the control of epithelial form and function. Nat Rev Mol Cell Biol, 23(8), 559-577. doi:10.1038/s41580-022-00465-y

      Burute, M., Prioux, M., Blin, G., Truchet, S., Letort, G., Tseng, Q., . . . Thery, M. (2017). Polarity Reversal by Centrosome Repositioning Primes Cell Scattering during Epithelial-to-Mesenchymal Transition. Dev Cell, 40(2), 168-184. doi:10.1016/j.devcel.2016.12.004

      Comartin, D., Gupta, G. D., Fussner, E., Coyaud, E., Hasegan, M., Archinti, M., . . . Pelletier, L. (2013). CEP120 and SPICE1 cooperate with CPAP in centriole elongation. Curr Biol, 23(14), 1360-1366. doi:10.1016/j.cub.2013.06.002

      Feldman, J. L., & Priess, J. R. (2012). A role for the centrosome and PAR-3 in the hand-off of MTOC function during epithelial polarization. Curr Biol, 22(7), 575-582. doi:10.1016/j.cub.2012.02.044

      Fong, K. W., Choi, Y. K., Rattner, J. B., & Qi, R. Z. (2008). CDK5RAP2 is a pericentriolar protein that functions in centrosomal attachment of the gamma-tubulin ring complex. Mol Biol Cell, 19(1), 115-125. doi:10.1091/mbc.e07-04-0371

      Gavilan, M. P., Gandolfo, P., Balestra, F. R., Arias, F., Bornens, M., & Rios, R. M. (2018). The dual role of the centrosome in organizing the microtubule network in interphase. EMBO Rep, 19(11). doi:10.15252/embr.201845942

      Jimenez, A. J., Schaeffer, A., De Pascalis, C., Letort, G., Vianay, B., Bornens, M., . . . Thery, M. (2021). Acto-myosin network geometry defines centrosome position. Curr Biol, 31(6), 1206-1220 e1205. doi:10.1016/j.cub.2021.01.002

      Jonsdottir, A. B., Dirks, R. W., Vrolijk, J., Ogmundsdottir, H. M., Tanke, H. J., Eyfjord, J. E., & Szuhai, K. (2010). Centriole movements in mammalian epithelial cells during cytokinesis. BMC Cell Biol, 11, 34. doi:10.1186/1471-2121-11-34

      Krishnan, N., Swoger, M., Rathbun, L. I., Fioramonti, P. J., Freshour, J., Bates, M., . . . Hehnly, H. (2022). Rab11 endosomes and Pericentrin coordinate centrosome movement during preabscission in vivo. Life Sci Alliance, 5(7). doi:10.26508/lsa.202201362

      Liang, X., Weberling, A., Hii, C. Y., Zernicka-Goetz, M., & Buckley, C. E. (2022). E-cadherin mediates apical membrane initiation site localisation during de novo polarisation of epithelial cavities. EMBO J, 41(24), e111021. doi:10.15252/embj.2022111021

      Lin, Y. N., Wu, C. T., Lin, Y. C., Hsu, W. B., Tang, C. J., Chang, C. W., & Tang, T. K. (2013). CEP120 interacts with CPAP and positively regulates centriole elongation. J Cell Biol, 202(2), 211-219. doi:10.1083/jcb.201212060

      Mangan, A. J., Sietsema, D. V., Li, D., Moore, J. K., Citi, S., & Prekeris, R. (2016). Cingulin and actin mediate midbody-dependent apical lumen formation during polarization of epithelial cells. Nat Commun, 7, 12426. doi:10.1038/ncomms12426

      Martin, M., Veloso, A., Wu, J., Katrukha, E. A., & Akhmanova, A. (2018). Control of endothelial cell polarity and sprouting angiogenesis by non-centrosomal microtubules. Elife, 7. doi:10.7554/eLife.33864

      Mazo, G., Soplop, N., Wang, W. J., Uryu, K., & Tsou, M. F. (2016). Spatial Control of Primary Ciliogenesis by Subdistal Appendages Alters Sensation-Associated Properties of Cilia. Dev Cell, 39(4), 424-437. doi:10.1016/j.devcel.2016.10.006

      Piel, M., Nordberg, J., Euteneuer, U., & Bornens, M. (2001). Centrosome-dependent exit of cytokinesis in animal cells. Science, 291(5508), 1550-1553. doi:10.1126/science.1057330

      Rodriguez-Fraticelli, A. E., Auzan, M., Alonso, M. A., Bornens, M., & Martin-Belmonte, F. (2012). Cell confinement controls centrosome positioning and lumen initiation during epithelial morphogenesis. J Cell Biol, 198(6), 1011-1023. doi:10.1083/jcb.201203075

      Schmoranzer, J., Fawcett, J. P., Segura, M., Tan, S., Vallee, R. B., Pawson, T., & Gundersen, G. G. (2009). Par3 and dynein associate to regulate local microtubule dynamics and centrosome orientation during migration. Curr Biol, 19(13), 1065-1074. doi:10.1016/j.cub.2009.05.065

      Tanos, B. E., Yang, H. J., Soni, R., Wang, W. J., Macaluso, F. P., Asara, J. M., & Tsou, M. F. (2013). Centriole distal appendages promote membrane docking, leading to cilia initiation. Genes Dev, 27(2), 163-168. doi:10.1101/gad.207043.112

      Tateishi, K., Yamazaki, Y., Nishida, T., Watanabe, S., Kunimoto, K., Ishikawa, H., & Tsukita, S. (2013). Two appendages homologous between basal bodies and centrioles are formed using distinct Odf2 domains. J Cell Biol, 203(3), 417-425. doi:10.1083/jcb.201303071

      Tsai, J. J., Hsu, W. B., Liu, J. H., Chang, C. W., & Tang, T. K. (2019). CEP120 interacts with C2CD3 and Talpid3 and is required for centriole appendage assembly and ciliogenesis. Sci Rep, 9(1), 6037. doi:10.1038/s41598-019-42577-0

      Tuncay, H., Brinkmann, B. F., Steinbacher, T., Schurmann, A., Gerke, V., Iden, S., & Ebnet, K. (2015). JAM-A regulates cortical dynein localization through Cdc42 to control planar spindle orientation during mitosis. Nat Commun, 6, 8128. doi:10.1038/ncomms9128

      Vinopal, S., Dupraz, S., Alfadil, E., Pietralla, T., Bendre, S., Stiess, M., . . . Bradke, F. (2023). Centrosomal microtubule nucleation regulates radial migration of projection neurons independently of polarization in the developing brain. Neuron, 111(8), 1241-1263 e1216. doi:10.1016/j.neuron.2023.01.020

      Zimmerman, W. C., Sillibourne, J., Rosa, J., & Doxsey, S. J. (2004). Mitosis-specific anchoring of gamma tubulin complexes by pericentrin controls spindle organization and mitotic entry. Mol Biol Cell, 15(8), 3642-3657. doi:10.1091/mbc.e03-11-0796.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      This paper focuses on understanding how covalent inhibitors of peroxisome proliferator-activated receptor-gamma (PPARg) show improved inverse agonist activities. This work is important because PPARg plays essential roles in metabolic regulation, insulin sensitization, and adipogenesis. Like other nuclear receptors, PPARg is a ligand-responsive transcriptional regulator. Its important role, coupled with its ligand-sensitive transcriptional activities, makes it an attractive therapeutic target for diabetes, inflammation, fibrosis, and cancer. Traditional non-covalent ligands like thiazolidinediones (TZDs) show clinical benefit in metabolic diseases, but utility is limited by off-target effects and transient receptor engagement. In previous studies, the authors characterized and developed covalent PPARg inhibitors with improved inverse agonist activities. They also showed that these molecules engage unique PPARg ligand binding domain (LBD) conformations whereby the C-terminal helix 12 penetrates into the orthosteric binding pocket to stabilize a repressive state. In the nuclear receptor superclass of proteins, helix 12 is an allosteric switch that governs pharmacologic responses, and this new conformation was highly novel. In this study, the authors did a more thorough analysis of how two covalent inhibitors, SR33065 and SR36708, influence the structural dynamics of the PPARg LBD. 

      Strengths: 

      (1) The authors employed a compelling integrated biochemical and biophysical approach.  

      (2) The cobinding studies are unique for the field of nuclear receptor structural biology, and I'm not aware of any similar structural mechanism described for this class of proteins.  

      (3) Overall, the results support their conclusions.  

      (4) The results open up exciting possibilities for the development of new ligands that exploit the potential bidirectional relationship between the covalent versus non-covalent ligands studied here. 

      Weaknesses: 

      (1) The major weakness in this work is that it is hard to appreciate what these shifting allosteric ensembles actually look like on the protein structure. Additional graphical representations would really help convey the exciting results of this study. 

      We thank the reviewer for the comments. In response to the specific recommendations below, we added two new figures—Figure 1 and Figure 8 in this resubmission—that we hope address the weakness identified by the reviewer.

      Reviewer #2 (Public review): 

      Summary: 

      The authors use ligands (inverse agonists, partial agonists) for PPAR, and coactivators and corepressors, to investigate how ligands and cofactors interact in a complex manner to achieve functional outcomes (repressive vs. activating). 

      Strengths: 

      The data (mostly biophysical data) are compelling from well-designed experiments. Figures are clearly illustrated. The conclusions are supported by these compelling data. These results contribute to our fundamental understanding of the complex ligand-cofactor-receptor interactions. 

      Weaknesses: 

      This is not the weakness of this particular paper, but the general limitation in using simplified models to study a complex system. 

      We appreciate the reviewer’s comments. Breaking down a complex system into a simpler model system, when possible, provides a unique lens with which to probe systems with mechanistic insight. While simplified models may not always explain the complexity of systems in cells, our recently published work showed that a simplified model system — biochemical assays using reconstituted PPARγ ligand-binding domain (LBD) protein and peptides derived from coregulator proteins (similar to the assays in this current work) and protein NMR structural biology studies using the PPARγ LBD — can explain the activity of ligand-induced PPARγ activation and repression to a high degree (Pearson/Spearman correlation coefficients ~0.7-0.9):

      MacTavish BS, Zhu D, Shang J, Shao Q, He Y, Yang ZJ, Kamenecka TM, Kojetin DJ. Ligand efficacy shifts a nuclear receptor conformational ensemble between transcriptionally active and repressive states. Nat Commun. 2025 Feb 28;16(1):2065. doi: 10.1038/s41467-025-57325-4. PMID: 40021712; PMCID: PMC11871303.

      Recommendations for the authors

      Reviewer #1 (Recommendations for the authors): 

      (1) More set-up is needed in the results section. The first paragraph is unclear on what is new to this study versus what was done previously. Likewise, a brief description of the assays used and the meaning behind differences in signals would help the general reader along. 

      We modified the last paragraph of the introduction and first results section to hopefully better set the stage for what was done previously vs. what is new/recollected in this study. In our results section, we also include more description about what the assays measure.

      (2) Since this paper is building on previous work, additional figures are needed in the introduction and discussion. Graphical depictions of what was found in the first study on how these ligands uniquely influence PPARg LBD conformation. A new model/depiction in the discussion for what was learned and its context with the rest of the field. 

      Our revised manuscript includes a new Figure 1 describing the possible allosteric mechanism by which a covalent ligand inhibits binding of other non-covalent ligands that was inferred from our previous study; and a new Figure 8 with a model for what has been learned.

      (3) It is stated that the results shown are representative data for at least two biological replicates. However, I do not see the other replicates shown in the supplementary information. 

      We appreciate the Reviewer’s emphasis on data reproducibility and rigor. We confirm that the biochemical and cellular assay data presented are indeed representative of consistent findings observed across two or more biological replicates; consistent with standard practice, we show representative data in our figures rather than the extensive replicate data in the supplementary information.

      (4) Figure 1a could benefit from labels of antagonists, inverse agonist, etc., next to each chemical structure. Likewise, if any co-crystal or other models are available it would be helpful to include those for comparison. 

      We added the pharmacological labels to Figure 2a (old Figure 1a).

      (5) The figure legends don't seem to match up completely with the figures. For example, Figure 2b states that fitted Ki values +/- standard deviation. are stated in the legend, but it's shown as the log Ki. 

      We revised the figure legends to ensure they display the appropriate errors as reported from the data fitting.

      (6) EC50, IC50, Ki, and Kd values alongside reported errors and R2 values for the fits should be reported in a table. 

      Our revised manuscript now includes a Source Data file (Figure 5—source data 1.xlsx) of the data (n=2) plotted in Figure 5 (old Figure 4) so that readers can regenerate the plots and calculate the errors and R2 values if desired. Otherwise, fitted values and errors are reported in the figures when fitting in Prism permitted error estimation; when Prism was unable to fit the data or the error, n.d. (not determined) is specified.

      (7) Statistical analysis is missing in some places, for example, Figure 1b. 

      We revised Figure 2b (old Figure 1b) to include statistical testing.

      Reviewer #2 (Recommendations for the authors): 

      I suggest that the authors discuss the following points to broaden the significance of the results: 

      (1) The two partial agonists (MRL24 and nTZDpa) are "partial" in the coactivator and corepressor recruitment assays, but are "complete" in the TR-FRET ligand displacement assay (Figure 2). Please explain that a partial agonist is defined based on the functional outcome (cofactor recruitment in this study) but not binding affinity/efficacy. 

      We added the following sentence to describe the partial agonist activity of these compounds: “These high affinity ligands are partial agonists as defined on their functional outcome in coregulator recruitment and cellular transcription; i.e., they are less efficacious than full agonists at recruiting peptides derived from coactivator proteins in biochemical assays (Chrisman et al., 2018; Shang et al., 2019; Shang and Kojetin, 2024) and increasing PPARγ-mediated transcription (Acton et al., 2005; Berger et al., 2003).“

      (2) Will the discovery reported here be broadly applicable? 

      (a) Applicable if other partial agonists and inhibitors are used? 

      (b) Applicable if different coactivators/corepressors, or different segments of the same cofactor, are used?

      (c) Applicable to other NRs (their AF-2 are similar but with sequence variation)?

      (d) The term "allosteric" might mean different things to different people - many readers might think that it means a "distal and unrelated" binding pocket. It might be helpful to point out that in this study, the allosteric site is actually "proximal and related". 

      We expanded our introduction and/or discussion sections to expand upon these concepts; specific answers as follows:

      (a) Orthosteric partial agonists?—yes, because helix 12 would clash with an orthosteric ligand; other covalent inhibitors?—it depends on whether the covalent inhibitor stabilizes helix 12 in the orthosteric pocket.

      (b) yes with some nuanced exceptions where certain segments of the same coregulator protein bind with high affinity and others apparently do not bind or bind with low affinity

      (c) it is not clear yet if other NRs share a similar ligand-induced conformational ensemble to PPARγ

      (d) we addressed this point in the 4th paragraph of the introduction “...the non-covalent ligand binding event we previously described at the alternate/allosteric site, which is proximal to the orthosteric ligand-binding pocket, …”

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors use various genomics approaches to examine nucleosome acetylation, phosphorylation, and PolII-CTD phosphorylation marks. The results are synthesized into a hypothesis that 'fragile' nucleosomes are associated with active regions of PolII transcription.

      Strengths:

      The manuscript contains a lot of genome-wide analyses of histone acetylation, histone phosphorylation, and PolII-CTD phosphorylation.

      Weaknesses:

      This reviewer's main research expertise is in the in vitro study of transcription and its regulation in purified, reconstituted systems. I am not an expert at the genomics approaches and their interpretation, and overall, I had a very hard time understanding and interpreting the data that are presented in this manuscript. I believe this is due to a problem with the manuscript, in that the presentation of the data is not explained in a way that's understandable and interpretable to a non-expert. For example:

      (1) Figure 1 shows genome-wide distributions of H3K9ac, H4K12ac, Ser2ph-PolII, mRNA, H3S10ph, and H4S1ph, but does not demonstrate correlations/coupling - it is not clear from these data that pan-acetylation and pan-phosphorylation are coupled with Pol II transcription.

      (2) Figure 2 - It's not clear to me what Figure 2 is supposed to be showing.

      (A) Needs better explanation - what is the meaning of the labels at the top of the gel lanes?

      (B) This reviewer is not familiar with this technique, its visualization, or its interpretation - more explanation is needed. What is the meaning of the quantitation graphs shown at the top? How were these calculated (what is on the y-axis)?

      (3) To my knowledge, the initial observation of DRB effects on RNA synthesis also concluded that DRB inhibited initiation of RNA chains (pmid:982026) - this needs to be acknowledged.

      (4) Again, Figures 4B, 4C, 5, and 6 are very difficult to understand - what is shown in these heat maps, and what is shown in the quantitation graphs on top?

    2. Author response:

      Public Reviews: 

      Reviewer #1 (Public review): 

      Summary: 

      This study aims to explore how different forms of "fragile nucleosomes" facilitate RNA Polymerase II (Pol II) transcription along gene bodies in human cells. The authors propose that pan-acetylated, pan-phosphorylated, tailless, and combined acetylated/phosphorylated nucleosomes represent distinct fragile states that enable efficient transcription elongation. Using CUT&Tag-seq, RNA-seq, and DRB inhibition assays in HEK293T cells, they report a genome-wide correlation between histone pan-acetylation/phosphorylation and active Pol II occupancy, concluding that these modifications are essential for Pol II elongation. 

      Strengths: 

      (1) The manuscript tackles an important and long-standing question about how Pol II overcomes nucleosomal barriers during transcription. 

      (2) The use of genome-wide CUT&Tag-seq for multiple histone marks (H3K9ac, H4K12ac, H3S10ph, H4S1ph) alongside active Pol II mapping provides a valuable dataset for the community. 

      (3) The integration of inhibition (DRB) and recovery experiments offers insight into the coupling between Pol II activity and chromatin modifications. 

      (4) The concept of "fragile nucleosomes" as a unifying framework is potentially appealing and could stimulate further mechanistic studies. 

      We really appreciate the reviewer’s positive and affirmative comments.

      Weaknesses: 

      (1)  Misrepresentation of prior literature 

      The introduction incorrectly describes findings from Bintu et al., 2012. The cited work demonstrated that pan-acetylated or tailless nucleosomes reduce the nucleosomal barrier for Pol II passage, rather than showing no improvement. This misstatement undermines the rationale for the current study and should be corrected to accurately reflect prior evidence. 

      What we said is in accordance with the original report (Bintu et al., Cell, 2012). Here is the relevant passage from that report:

      Page 739, (Bintu, L. et al., Cell, 2012) (PMID: 23141536)

      “Overall transcription through tailless and acetylated nucleosomes is slightly faster than through unmodified nucleosomes (Figure 1C), with crossing times that are generally under 1 min (39.5 ± 5.7 and 45.3 ± 7.6 s, respectively). Both the removal and acetylation of the tails increase efficiency of NPS passage: 71% for tailless nucleosomes and 63% for acetylated nucleosomes (Figures 1C and S1), in agreement with results obtained using bulk assays of transcription (Ujvári et al., 2008).”

      We will cite this original sentence in our revision.

      (2) Incorrect statement regarding hexasome fragility

      The authors claim that hexasome nucleosomes "are not fragile," citing older in vitro work. However, recent studies clearly showed that hexasomes exist in cells (e.g., PMID 35597239) and that they markedly reduce the barrier to Pol II (e.g., PMID 40412388). These studies need to be acknowledged and discussed. 

      The term “hexasome” was introduced in the transcription field four decades ago. Later, several groups claimed that the “hexasome” is fragile and could be generated during transcription elongation by Pol II. However, their original definition was based on the in vivo detection of ~100 bp, MNase-resistant DNA fragments by micrococcal nuclease sequencing (MNase-seq), which is the right length to wrap one hexasome particle (two H3/H4 dimers and one H2A/H2B dimer) and thereby form the sub-nucleosome of a hexasome. We should all agree that acetylation or phosphorylation of the histone tails compromises the interaction between DNA and the histone subunits, which could make the intact naïve nucleosome fragile, easy to disassemble, and easy for MNase to access. Fragile nucleosomes give MNase better access to the DNA that wraps around the histone octamer, producing shorter DNA fragments (~100 bp instead of ~140 bp). In this regard, we believe that these ~100 bp fragments are the products of fragile nucleosomes (fragile nucleosome --> hexasome), rather than the other way around (hexasome --> fragile). 

      Actually, two early reports from Dr. David J. Clark’s group at the NIH raised questions about the existence of hexasomes in vivo (PMID: 28157509) (PMID: 25348398).

      In the report of PMID: 35597239, depletion of INO80 leads to a reduction of “hexasomes” for a group of genes, and the distribution of both “nucleosomes” and “hexasomes” within the gene bodies becomes fuzzier (lower signal to noise). In a recent theoretical model (PMID: 41425263), the corresponding PI proposed that chromatin remodelers could act as drivers of histone modification complexes to carry out different modifications along gene bodies. Specifically, INO80 could drive NuA3 (an H3 acetyltransferase) to carry out pan-acetylation of H3, and possibly H2B as well, during the later rounds of Pol II transcription of a group of (SAGA-dependent) genes. This suggests that depletion of INO80 would reduce the pan-acetylation of nucleosomes, leading to a drop in pan-acetylated fragile nucleosomes and, subsequently, a drop in “hexasomes”. This explains why depletion of INO80 leads to fuzzier signals for both nucleosomes and “hexasomes” in PMID: 35597239. The result of PMID: 35597239 could therefore be a strong piece of evidence supporting the model proposed by the corresponding PI (PMID: 41425263).

      In a recent report (PMID: 40412388), the authors claimed that FACT could bind to nucleosomes to generate “hexasomes”, which are fragile enough for Pol II to overcome the resistance of nucleosomes. It is well established that FACT enhances the processivity of Pol II in vivo through its chaperone activity; however, the exact working mechanism of FACT remains ambiguous. A report from Dr. Cramer’s group showed that FACT enhances the elongation of regular genes but has the opposite effect on pausing-regulated genes (PMID: 38810649). An excellent review by Drs. Tim Formosa and Fred Winston showed that FACT is not required for the survival of a group of differentiated cells (PMID: 33104782), suggesting that FACT is not always required for transcription. It is also quite tricky to generate naïve hexasomes in vitro, according to early reports from the late Dr. Widom’s group. Most importantly, in the new data of PMID: 40412388 the speed of Pol II (the best measurement, on bare DNA) is ~27 bp/s, which is much slower than the speed of Pol II in vivo (~2.5 kb/min, or ~40 bp/s). In our recovery experiments (Fig. 4C, as mentioned by reviewer #3), over the 20 minutes between the 10-minute and 30-minute time points, all Pol IIs move at a uniform speed of ~2.5 kb/min in vivo. (Because of the nature of CUT&Tag-seq, Pol II remains active after the cells are collected, so there is a substantial delay before Pol II stops completely during the CUT&Tag procedure; the first time period therefore does not reflect the true speed of Pol II, which would appear to be ~5 kb/min.) Interestingly, a recent report from Dr. Shixin Liu’s group (PMID: 41310264) showed that when SPT4/5 is added to an in vitro transcription system with bare DNA, the speed of Pol II reaches ~2.5 kb/min, exactly the same as we derived in vivo. Like the original report (PMID: 23141536), the current report (PMID: 40412388) does not exactly mimic the conditions in vivo.
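
      For readers following the rate comparison, the unit conversions behind the numbers quoted above are straightforward:

      ```latex
      \[
      2.5\ \mathrm{kb/min} = \frac{2500\ \mathrm{bp}}{60\ \mathrm{s}} \approx 42\ \mathrm{bp/s}\ (\sim 40\ \mathrm{bp/s}),
      \qquad
      27\ \mathrm{bp/s} \times 60\ \mathrm{s/min} \approx 1.6\ \mathrm{kb/min}.
      \]
      ```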

      There is an urgent need to revisit the current definition of the “hexasome”, which is claimed to be fragile and to be generated during Pol II elongation in vivo. MNase is an enzyme that only works when its substrate is accessible. In inactive regions of the genome, the tight packing of chromatin means that MNase cannot access individual nucleosomes within gene bodies or upstream of promoters, which is why we only see phased, well-spaced, clearly distributed nucleosomes at transcription start sites, whereas the signal becomes fuzzy downstream or upstream of promoters. For fragile nucleosomes, on the other hand, accessibility to MNase should increase dramatically, which produces the ~100 bp fragments. Based on the uniform rate (~2.5 kb/min) of Pol II across all genes derived from human 293T cells and the similar rate (~2.5 kb/min) of Pol II on bare DNA in vitro, it is unlikely that Pol II pauses in the middle of nucleosomes to generate “hexasomes” before continuing elongation along gene bodies. As with RNAPs in bacteria (no nucleosomes) and Archaea (tailless nucleosomes), there should be no resistance when Pol II transcribes across fragile nucleosomes within gene bodies in all eukaryotes, as we characterized in this manuscript. 

      (3)  Inaccurate mechanistic interpretation of DRB 

      The Results section states that DRB causes a "complete shutdown of transcription initiation (Ser5-CTD phosphorylation)." DRB is primarily a CDK9 inhibitor that blocks Pol II release from promoter-proximal pausing. While recent work (PMID: 40315851) suggests that CDK9 can contribute to CTD Ser5/Ser2 di-phosphorylation, the manuscript's claim of initiation shutdown by DRB should be revised to better align with the literature. The data in Figure 4A indicate that 1 µM DRB fully inhibits Pol II activity, yet much higher concentrations (10-100 µM) are needed to alter H3K9ac and H4K12ac levels. The authors should address this discrepancy by discussing the differential sensitivities of CTD phosphorylation versus histone modification turnover. 

      Yes, it has been reported that DRB is also an inhibitor of CDK9. However, as the reviewer and the current view in the field would likely agree, phosphorylation of the Ser5-CTD of Pol II marks the initiation of transcription for all Pol II-regulated genes in eukaryotes. CDK9 is only required to act on the already Ser5-phosphorylated CTD of Pol II to release paused Pol II, which happens only in metazoans. From a series of works by us and others, CDK9 is unique to metazoans and is required only for pausing-regulated genes, not for regular genes. We found that CDK9 works on initiated Pol II (Ser5-CTD-phosphorylated Pol II) and generates a unique phosphorylation pattern on the CTD of Pol II (Ser2ph-Ser2ph-Ser5ph-CTD), which is required to recruit JMJD5 (via its CID domain) to generate a tailless nucleosome at the +1 position from the TSS and release paused Pol II (PMID: 32747552). Interestingly, a report from Dr. Jesper Svejstrup’s group (PMID: 40315851) showed that CDK9 could generate a unique phosphorylation pattern (Ser2ph-Ser5ph-CTD of Pol II) that is not recognized by the popular 3E10 antibody, which detects the single Ser2ph-CTD of Pol II. This interesting result is consistent with our early report showing that the unique phosphorylation pattern (Ser2ph-Ser2ph-Ser5ph-CTD of Pol II) is specifically generated by CDK9 in animals and is likewise not recognized by 3E10 (PMID: 32747552). Indeed, an early report from Dr. Dick Eick’s group (PMID: 26799765) showed the difference in the phosphorylation pattern of the CTD of Pol II between animal cells and yeast cells. We have characterized how CDK9 is released from the 7SK snRNP and recruited onto paused Pol II via the coupling of JMJD6 and BRD4 (PMID: 32048991), which was published in eLife. It is well established that CDK9 works after CDK7 or CDK8. From our PRO-seq data (Fig. 3) and CUT&Tag-seq data of active Pol II (Fig. 4), adding DRB completely shuts down all genes by inhibiting the initiation of Pol II (generation of Ser5ph-CTD of Pol II). Because CDK9 is unique to metazoans, it is not required for the activation of CDK12 or CDK13 (the orthologs of CTK1 in yeast), as we demonstrated recently (PMID: 41377501). Instead, we found that CDK11/10 acts as the ortholog of the yeast Bur1 kinase and is essential for the phosphorylation of Spt5, the linker of the CTD of Pol II, and CDK12 (PMID: 41377501). 

      (4) Insufficient resolution of genome-wide correlations 

      Figure 1 presents only low-resolution maps, which are insufficient to determine whether pan-acetylation and pan-phosphorylation correlate with Pol II at promoters or gene bodies. The authors should provide normalized metagene plots (from TSS to TTS) across different subgroups to visualize modification patterns at higher resolution. In addition, the genome-wide distribution of another histone PTM with a different localization pattern should be included as a negative control. 

      A popular view in the field is that the majority of the genome is inactive, since only a small portion of it contains coding RNAs, which account for the ~20,000 protein candidates characterized in animals. However, our genome-wide characterization using the four histone modification marks, active Pol II, and RNA-seq tells a different story. Figure 1 shows that most of the genome of human HEK293T cells is active, producing not only protein-coding RNAs but also, in the majority, non-coding RNAs. We believe that Figure 1 could change our current view of the activity of the entire genome and should be of great interest to general readers as well as genomics researchers. Furthermore, it is the basis for Figure 2, which is a zoom-in of Figure 1.  

      (5) Conceptual framing 

      The manuscript frequently extrapolates correlative genome-wide data to mechanistic conclusions (e.g., that pan-acetylation/phosphorylation "generate" fragile nucleosomes) without direct biochemical or structural evidence. Such causality statements should be toned down.  

      The reviewer is right; we should tone down the strong sentences. However, we believe that our data are strong enough to support the general conclusion. The reviewer may agree with us that the entire field of transcription and epigenetics has been stagnant in recent decades and that there is an urgent need for fresh ideas to change the current situation. Our novel discoveries (although additional supporting data are certainly needed) should open up a brand new avenue for people to explore. We believe that a new era of transcription research will emerge based on these discoveries. We hope that this manuscript will attract more people to these topics. As Reviewer #3 pointed out, this story establishes the connection between transcription and epigenetics in the field. 

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, the authors use various genomics approaches to examine nucleosome acetylation, phosphorylation, and PolII-CTD phosphorylation marks. The results are synthesized into a hypothesis that 'fragile' nucleosomes are associated with active regions of PolII transcription. 

      Strengths: 

      The manuscript contains a lot of genome-wide analyses of histone acetylation, histone phosphorylation, and PolII-CTD phosphorylation. 

      Weaknesses: 

      This reviewer's main research expertise is in the in vitro study of transcription and its regulation in purified, reconstituted systems. 

      Actually, the pioneering work establishing in vitro transcription assays in Dr. Robert Roeder’s group led to numerous groundbreaking discoveries in the transcription field. The contributions of in vitro work were key to exploring the complexity of eukaryotic transcription in the early days and remain important today.

      I am not an expert at the genomics approaches and their interpretation, and overall, I had a very hard time understanding and interpreting the data that are presented in this manuscript.  I believe this is due to a problem with the manuscript, in that the presentation of the data is not explained in a way that's understandable and interpretable to a non-expert.

      Thanks for your suggestions. You are right that we have problems expressing our ideas clearly in this manuscript, which could cause confusion. We will make modifications accordingly, per your suggestions.

      For example: 

      (1) Figure 1 shows genome-wide distributions of H3K9ac, H4K12ac, Ser2ph-PolII, mRNA, H3S10ph, and H4S1ph, but does not demonstrate correlations/coupling - it is not clear from these data that pan-acetylation and pan-phosphorylation are coupled with Pol II transcription. 

      Figure 1 shows the overall genome-wide distribution of the four major histone modifications, active Pol II, and mRNA in human HEK293T cells. It tells general readers that the entire genome is far more active than predicted; most of the genome has been assumed to be inactive, since only a small portion of it expresses coding RNAs (~20,000 in animals). Figure 1 shows that the majority of the genome is active and expresses not only coding mRNAs but also non-coding RNAs. It is also the basis for Figure 2, which is a zoom-in of Figure 1. However, discussing the non-coding RNAs is beyond the scope of this manuscript. 

      (2) Figure 2 - It's not clear to me what Figure 2 is supposed to be showing. 

      (A) Needs better explanation - what is the meaning of the labels at the top of the gel lanes? 

      Figure 2 is a zoom-in on an individual gene, showing how histone modifications are coupled with Pol II activity at the single-gene level. We will give a more detailed explanation of the figure, per the reviewer’s suggestions.

      (B) This reviewer is not familiar with this technique, its visualization, or its interpretation - more explanation is needed. What is the meaning of the quantitation graphs shown at the top? How were these calculated (what is on the y-axis)? 

      Good suggestions; we will make some modifications.

      (3) To my knowledge, the initial observation of DRB effects on RNA synthesis also concluded that DRB inhibited initiation of RNA chains (pmid:982026) - this needs to be acknowledged. 

      Thanks for the reference, which is the first report showing that DRB inhibits the initiation of Pol II in vivo. We will cite it in the revision.  

      (4) Again, Figures 4B, 4C, 5, and 6 are very difficult to understand - what is shown in these heat maps, and what is shown in the quantitation graphs on top? 

      Thanks for the suggestions; we will give a more detailed description of the figures.  

      Reviewer #3 (Public review): 

      Summary: 

      Li et al. investigated the prevalence of acetylated and phosphorylated histones (using H3K9ac, H4K12ac, H3S10ph & H4S1ph as representative examples) across the gene body of human HEK293T cells, as well as mapping elongating Pol II and mRNA. They found that histone acetylation and phosphorylation were dominant in gene bodies of actively transcribing genes. Genes with acetylation/phosphorylation restricted to the promoter region were also observed. Furthermore, they investigated and reported a correlation between histone modifications and Pol II activity, finding that inhibition of Pol II activity reduced acetylation/phosphorylation levels, while resuming Pol II activity restored them. The authors then proposed a model in which pan-acetylation or pan-phosphorylation of histones generates fragile nucleosomes; the first round of transcription is accompanied by pan-acetylation, while subsequent rounds are accompanied by pan-phosphorylation. 

      Strengths: 

      This study addresses a highly significant problem in gene regulation. The author provided riveting evidence that certain histone acetylation and/or phosphorylation within the gene body is correlated with Pol II transcription. The author furthermore made a compelling case that such transcriptionally correlated histone modification is dynamic and can be regulated by Pol II activity. This work has provided a clearer view of the connection between epigenetics and Pol II transcription. 

      Thanks for the insightful comments, which are exactly what we want to present in this manuscript. 

      Weaknesses: 

      The title of the manuscript, "Fragile nucleosomes are essential for RNA Polymerase II to transcribe in eukaryotes", suggests that fragile nucleosomes lead to transcription. While this study shows a correlation between histone modifications in gene bodies and transcription elongation, a causal relationship between the two has not been demonstrated. 

      Thanks for the suggestions. What we want to express is that the generation of fragile nucleosomes precedes transcription, or, more specifically, transcription elongation. The corresponding PI wrote a hypothetical model on how pan-acetylation is generated by the coupling of chromatin remodelers and acetyltransferase complexes along gene bodies, in which chromatin remodelers act as drivers carrying acetyltransferases along gene bodies to generate pan-acetylation of nucleosomes (PMID: 41425263). We have a series of works showing how “tailless nucleosomes” at the +1 position from transcription start sites are generated to release paused Pol II in metazoans (PMID: 28847961) (PMID: 29459673) (PMID: 32747552) (PMID: 32048991). We still do not know how pan-phosphorylation along gene bodies is generated; it should be one of the focuses of our future research.

  3. prototype-dot.web.app
    1. Finally, let us note that the observations we make may at first lead us to want capitalist degrowth or digital sobriety, but that would amount to reducing the problem of capitalism to a question of quantity

      Yes, it is this reduction of the world to a "quantity" that must be questioned. Not in order to answer it, but to affirm that the political subject is by definition a "quality" horizon.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      In this well-written and timely manuscript, Rieger et al. introduce Squidly, a new deep learning framework for catalytic residue prediction. The novelty of the work lies in the aspect of integrating per-residue embeddings from large protein language models (ESM2) with a biology-informed contrastive learning scheme that leverages enzyme class information to rationally mine hard positive/negative pairs. Importantly, the method avoids reliance on the use of predicted 3D structures, enabling scalability, speed, and broad applicability. The authors show that Squidly outperforms existing ML-based tools and even BLAST in certain settings, while an ensemble with BLAST achieves state-of-the-art performance across multiple benchmarks. Additionally, the introduction of the CataloDB benchmark, designed to test generalization at low sequence and structural identity, represents another important contribution of this work.

      We thank the reviewer for their constructive and encouraging assessment of the manuscript. We appreciate the recognition of Squidly’s biology-informed contrastive learning framework with ESM2 embeddings, its scalability through the avoidance of predicted 3D structures, and the contribution of the CataloDB benchmark. We are pleased that the reviewer finds these aspects to be of value, and their comments will help us in further clarifying the strengths and scope of the work.

      The manuscript acknowledges biases in EC class representation, particularly the enrichment for hydrolases. While CataloDB addresses some of these issues, the strong imbalance across enzyme classes may still limit conclusions about generalization. Could the authors provide per-class performance metrics, especially for underrepresented EC classes?

      We thank the reviewer for raising this point. We agree that per-class performance metrics provide important insight into generalizability across underrepresented EC classes. In response, we have updated Figure 3 to include two additional panels: (i) per-EC F1, precision and recall scores, and (ii) a relative display of true positives against the total number of predictable catalytic residues. These additions allow the class imbalance to be more directly interpretable. We have also revised the text between lines 316-321 to better contextualize our generalizability claims in light of these results.
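
      For readers who wish to reproduce this kind of breakdown, a minimal, dependency-free sketch of per-EC-class precision, recall, and F1 is given below; the data layout (one record per residue with its EC class, true label, and predicted label) is an assumption for illustration, not Squidly's internal format.

      ```python
      # Illustrative sketch: per-EC-class precision, recall, and F1 for
      # binary per-residue catalytic predictions.
      from collections import defaultdict

      def per_ec_metrics(records):
          """records: iterable of (ec_class, y_true, y_pred) with binary labels per residue."""
          counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
          for ec, y_true, y_pred in records:
              c = counts[ec]
              if y_pred and y_true:
                  c["tp"] += 1          # correctly predicted catalytic residue
              elif y_pred and not y_true:
                  c["fp"] += 1          # predicted catalytic, actually not
              elif y_true and not y_pred:
                  c["fn"] += 1          # missed catalytic residue
          metrics = {}
          for ec, c in counts.items():
              precision = c["tp"] / (c["tp"] + c["fp"]) if (c["tp"] + c["fp"]) else 0.0
              recall = c["tp"] / (c["tp"] + c["fn"]) if (c["tp"] + c["fn"]) else 0.0
              f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
              metrics[ec] = {"precision": precision, "recall": recall, "f1": f1}
          return metrics

      # Example usage with three toy residues from two EC classes
      print(per_ec_metrics([("EC3", 1, 1), ("EC3", 0, 1), ("EC1", 1, 0)]))
      ```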

      An ablation analysis would be valuable to demonstrate how specific design choices in the algorithm contribute to capturing catalytic residue patterns in enzymes.

      We agree that an ablation analysis is valuable for demonstrating the benefit of a specific approach. We consider the main design choice in Squidly to be how the training pairs are selected; the rest of the contrastive learning model follows a standard design. We tested the effect of different pair schemes on performance and report the results in Figure 2A and lines 244-258. These results constitute a targeted ablation in which we evaluate Squidly against AEGAN using the AEGAN training and test datasets, while systematically varying the ESM2 model size and pair-mining scheme. As a baseline, we included the LSTM trained directly on ESM2 embeddings and random pair selection. We showed that the choice of pairs indeed has a large impact on performance, which is significantly improved compared to naïve pairing. This comparison suggests that the performance gains are attributable to reaction-informed pair-mining strategies. We recognize that the way these results were originally presented made this ablation less clear. We have revised the wording in the Results section (lines 244-247) and updated the caption of Figure 2A to emphasize the purpose of this section of the paper.

      The statement that users can optionally use uncertainty to filter predictions is promising but underdeveloped. How should predictive entropy values be interpreted in practice? Is there an empirical threshold that separates high- from low-confidence predictions? A demonstration of how uncertainty filtering shifts the trade-off between false positives and false negatives would clarify the practical utility of this feature.

      Thank you for the suggestion. Your comment prompted us to consider how best to represent the uncertainty, which metric to return to users, and how to visualize the results. Based on this, we included several new figures (Figure 3H and Supplementary Figures S3-5). We used these figures to select the cutoffs (mean prediction of 0.6 and variance < 0.225), which were then set as the defaults in Squidly and used in all subsequent analyses. The effect of these cutoffs is most evident in the trade-off between precision and recall. Users may therefore opt to select their own filters based on the mean prediction and the variance across predictions, and these cutoffs can be passed as command-line parameters to Squidly. The choice of a consistent default cutoff selected using the Uni3175 benchmark slightly improved the reported performance on the benchmarks in Table 1 and Figure 3C. However, our interpretation remains the same.
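      To make the filtering concrete, below is a minimal sketch of how such cutoffs could be applied to per-residue outputs. The array names and example values are hypothetical and are not Squidly's actual interface; the thresholds are the defaults mentioned above.

      ```python
      import numpy as np

      # Hypothetical per-residue ensemble outputs for one sequence
      mean_pred = np.array([0.82, 0.55, 0.71, 0.40])   # ensemble mean probability
      var_pred = np.array([0.05, 0.30, 0.10, 0.02])    # ensemble variance

      MEAN_CUTOFF = 0.6    # default mean-prediction cutoff described above
      VAR_CUTOFF = 0.225   # default variance cutoff described above

      # Keep residues that are both confidently and consistently predicted
      is_catalytic = (mean_pred >= MEAN_CUTOFF) & (var_pred < VAR_CUTOFF)
      print(np.where(is_catalytic)[0])   # -> [0 2]
      ```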

      The excerpt highlights computational efficiency, reporting substantial runtime improvements (e.g., 108 s vs. 5757 s). However, the comparison lacks details on dataset size, hardware/software environment, and reproducibility conditions. Without these details, the speedup claim is difficult to evaluate. Furthermore, it remains unclear whether the reported efficiency gains come at the expense of predictive performance

      Thank you for pointing out this limitation in how we presented the runtime results. We have rerun the tests and updated the table. An additional comment underneath details the hardware/software environment used to run both tools and notes that the Squidly model is the ensemble version. As for the relationship between efficiency gains and predictive performance, both the 3B and 15B models are benchmarked side by side throughout the paper.

      Compared to the tools we were able to comprehensively benchmark, the efficiency gain does not come at a cost. However, we note that the reported runtime benefit assumes that a structure must be folded, which is not the case for enzymes already present in the PDB. If a structure is already available, the enzyme is likely already annotated and, in those cases, we recommend using BLAST, which is faster than either Squidly or a structure-based tool and highly accurate for homologous or annotated sequences.

      Given the well-known biases in public enzyme databases, the dataset is likely enriched for model organisms (e.g., E. coli, yeast, human enzymes) and underrepresents enzymes from archaea, extremophiles, and diverse microbial taxa. Would this limit conclusions about Squidly's generalizability to less-studied lineages?

      The enrichment for model organisms in public enzyme databases may indeed affect both ESM2 and Squidly when applied to underrepresented lineages such as archaea, extremophiles, and diverse microbial taxa. We agree that this limitation is significant and have adjusted and expanded the previous discussion of benchmarking limitations accordingly (lines 358, 369). We thank the reviewer for highlighting this issue, which has helped us to improve the transparency and balance of the manuscript.

      Reviewer #2:

      The authors aim to develop Squidly, a sequence-only catalytic residue prediction method. By combining protein language model (ESM2) embedding with a biologically inspired contrastive learning pairing strategy, they achieve efficient and scalable predictions without relying on three-dimensional structure. Overall, the authors largely achieved their stated objectives, and the results generally support their conclusions. This research has the potential to advance the fields of enzyme functional annotation and protein design, particularly in the context of screening large-scale sequence databases and unstructured data. However, the data and methods are still limited by the biases of current public databases, so the interpretation of predictions requires specific biological context and experimental validation.

      Strengths:

      The strengths of this work include the innovative methodological incorporation of EC classification information for "reaction-informed" sample pairing, thereby enhancing the discriminative power of contrastive learning. Results demonstrate that Squidly outperforms existing machine learning methods on multiple benchmarks and is significantly faster than structure prediction tools, demonstrating its practicality.

      Weaknesses:

      Disadvantages include the lack of a systematic evaluation of the impact of each strategy on model performance. Furthermore, some analyses, such as PCA visualization, exhibit low explained variance, which undermines the strength of the conclusions.

      We thank the reviewer for their comments and feedback. 

      The authors state that "Notably, the multiclass classification objective and benchmarks used to evaluate EasIFA made it infeasible to compare performance for the binary catalytic residue prediction task." However, EasIFA has also released a model specifically for binary catalytic site classification. The authors should include EasIFA in their comparisons in order to provide a more comprehensive evaluation of Squidly's performance.

      We thank the reviewer for raising this point. EasIFA’s binary classification task includes catalytic, binding, and “other” residues, which differs from Squidly’s strict catalytic residue prediction. This makes a direct comparison non-trivial, which is why we had originally opted not to benchmark against EasIFA and instead highlighted it in our discussion.

      Given your comment, we did our best to include a benchmark that could give an indication of how the two tools compare. To do this, we filtered EasIFA’s multiclass classification test dataset for a subset that does not overlap with the Squidly and AEGAN training data and has <40% sequence identity to all training sets. This left only 66 catalytic residue–containing sequences that we could use as a held-out test set for both tools. We note the comparison is not entirely equal, as Squidly and AEGAN had lower average identity to this subset (8.2%) than EasIFA (23.8%), placing them at a relative disadvantage.

      We also identified a potential limitation in EasIFA’s original recall calculation, where sequences lacking catalytic residues were assigned a recall of 0. We adapted this to instead consider only the sequences which do have catalytic residues, which increased recall across all models. With the updated evaluation, EasIFA continues to show strong performance, consistent with it being SOTA if structural inputs are available. Squidly remains competitive given it operates solely from sequence and has a lower sequence identity to this specific test set.
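      For illustration, a minimal sketch of the adapted recall calculation described above (our own illustration, not the evaluation code used in the paper): sequences without annotated catalytic residues are excluded from the average rather than contributing a recall of 0.

      ```python
      import numpy as np

      def mean_recall(true_positives, n_catalytic):
          """Average per-sequence recall over sequences that have catalytic residues."""
          tp = np.asarray(true_positives, dtype=float)
          pos = np.asarray(n_catalytic, dtype=float)
          keep = pos > 0                      # drop sequences with no catalytic residues
          return np.mean(tp[keep] / pos[keep])

      # Three example sequences; the last has no annotated catalytic residues
      print(mean_recall([2, 1, 0], [3, 2, 0]))   # ~0.58, instead of ~0.39 if counted as 0
      ```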

      Due to the small and imbalanced benchmark size, differences in training data overlap, and differences in our analysis compared with the original EasIFA analysis, we present this comparison in a new section (A.4) of the supplementary information rather than in the main text. References to this section have been added in the manuscript at lines 265-268. Additionally, we do update the discussion and emphasize the potential benefits of using EasIFA at lines (353-356).

      The manuscript proposes three schemes for constructing positive and negative sample pairs to reduce dataset size and accelerate training, with Schemes 2 and 3 guided by reaction information (EC numbers) and residue identity. However, two issues remain:

      (a) The authors do not systematically evaluate the impact of each scheme on model performance.

      (b) In the benchmarking results, it is not explicitly stated which scheme was used for comparison with other models (e.g., Table 1, Figure 6, Figure 8). This lack of clarity makes it difficult to interpret the results and assess reproducibility.

      (c) Regarding the negative samples in Scheme 3 in Figure 1, no sampling patterns are shown for residue pairs with the same amino acid, different EC numbers, and both being catalytic residues.

      We thank the reviewer for these suggestions, which enabled us to improve the clarity and presentation of the manuscript. Please find our point by point response:

      (a) We thank the reviewer for highlighting the lack of clarity in how we presented our evaluation in the section describing the Uni3175 benchmark. We aimed to systematically evaluate the impact of each scheme using the Uni3175 benchmark and refer to these results at lines 244-258. Additionally, in line with related comments from Reviewer 1, we have adjusted the presentation of this section at lines 244-247 to make clear that the intention of this section and benchmark is to compare each scheme against the baseline models and AEGAN. These results led us to use Scheme 3 in both models for the other benchmarks in Figures 2 and 3. Please let us know if there is anything we can do to further improve the interpretability of Squidly’s performance.

      (b) We thank the reviewer for highlighting this issue and helping us improve the clarity of our manuscript. We agree that after the Uni3175 benchmark was used to evaluate the schemes, we did not clearly state in the other benchmarks that Scheme 3 was chosen for both the 3B and 15B models. We have made changes in Table 1 and the figure legends of Figures 2 and 3 to state that Scheme 3 was used. In addition, we integrated related results into panel figures (e.g., Figures 2 and 3 now show models trained and tested on consistent benchmark datasets) and standardized figure colors and legend formatting throughout. Furthermore, we suspect that the switch between the individual and ensembled Squidly models in the previous version of the paper was not clearly indicated and was likely to confuse the reader. We therefore decided to consistently report the ensembled Squidly models for all benchmarks except the ablation study (Figure 2A). In line with this, we altered the overview Figure 1A so that it is clearer that the default and intended version of Squidly is the ensemble.

      (c) We appreciate the reviewer pointing this out. You’re correct, we explicitly did not sample the negatives described by the reviewer in scheme 3 as our focus was on the hard negatives that relate most to the binary objective.  We do think this is a great idea and would be worth exploring further in future versions of Squidly, where we will be expanding the label space used for hard-negative sampling and including binding sites in our prediction. We have updated the discussion at lines 395-396 to highlight this potential direction.

      The PCA visualization (Figure 3) explains very little variance (~5% + 1.8%), but its use to illustrate the separability of embedding and catalytic residues may overinterpret the meaning of the low-dimensional projection. We question whether this figure is appropriate for inclusion in the main text and suggest that it be moved to the Supporting Information.

      We thank the reviewer for this suggestion. We had discussed this as well and, in the end, decided to include the figure in the main manuscript. We agree that the explained variance is low. However, when we first saw the PCA we were surprised that there was any separation at all. This prompted us to investigate further, so we kept it in the manuscript to be true to the scientific story. However, we agree that our interpretation could be read as overly conclusive given the minimal variance explained by the top two PCs. We therefore agree that the figure, alongside the accompanying results section, is more appropriately placed in the supplementary information. We moved this section (A.1) to the appendix so that the exploratory data analysis process we used to tackle this problem, and the general thought process behind Squidly, remains available for further reading.

      Minor Comments:

      (1) Figure Quality and Legends

      (a) In Figure 4, the legend is confusing: "Schemes 2 and 3 (S1 and S2) ..." appears inconsistent, and the reference to Scheme 3 (S3) is not clearly indicated.

      (b) In Figure 6, the legend overlaps with the y-axis labels, reducing readability. The authors should revise the figures to improve clarity and ensure consistent notation.

      The reviewer correctly notes inconsistencies in the figure presentation. We have revised the legend of Figure 4 (now 2A) to ensure the schemes are referred to consistently and Scheme 3 (S3) is clearly indicated. We also adjusted Figure 6 (now 2C) to remove the overlap between the legend and the y-axis labels.

      Conclusion

      We thank the reviewers and editor again for their constructive input. We believe the revisions and clarifications have substantially strengthened the manuscript and the resource.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript reports a prospective longitudinal study examining whether infants with high likelihood (HL) for autism differ from low-likelihood (LL) infants in two levels of word learning: brain-to-speech cortical entrainment and implicit word segmentation. The authors report reduced syllable tracking and post-learning word recognition in the HL group relative to the LL group. Importantly, both the syllable-tracking entrainment measure and the word recognition ERP measure are positively associated with verbal outcomes at 18-20 months, as indexed by the Mullen Verbal Developmental Quotient. Overall, I found this to be a thoughtfully designed and carefully executed study that tackles a difficult and important set of questions. With some clarifications and modest additional analyses or discussion on the points below, the manuscript has strong potential to make a substantial contribution to the literature on early language development and autism.

      Strengths:

      This is an important study that addresses a central question in developmental cognitive neuroscience: what mechanisms underlie variability in language learning, and what are the early neural correlates of these individual differences? While language development has a relatively well-defined sensitive period in typical development, the mechanisms of variability - particularly in the context of neurodevelopmental conditions - remain poorly understood, in part because longitudinal work in very young infants and toddlers is rare. The present study makes a valuable contribution by directly targeting this gap and by grounding the work in a strong theoretical tradition on statistical learning as a foundational mechanism for early language acquisition.

      I especially appreciate the authors' meticulous approach to data quality and their clear, transparent description of the methods. The choice of partial least squares correlation (PLS-c) is well motivated, given the multidimensional nature of the data and collinearity among variables, and the manuscript does a commendable job explaining this technique to readers who may be less familiar with it.

      The results reveal interesting developmental changes in syllable tracking and word segmentation from birth to 2 years in both HL and LL infants. Simply mapping these trajectories in both groups is highly valuable. Moreover, the associations between neural indices of brain-to-speech entrainment and word segmentation with later verbal outcomes in the LL group support a critical role for speech perception and statistical learning in early language development, with clear implications for understanding autism. Overall, this is a rich dataset with substantial potential to inform theory.

      Weaknesses:

      (1) Clarifying longitudinal vs. concurrent associations

      Because the current analytical approach incorporates all time points, including the final visit, it is challenging to determine to what extent the brain-language associations are driven by longitudinal relationships vs. concurrent correlations at the last time point. This does not undermine the main findings, but clarifying this issue could significantly enhance the impact of the individual-differences results. If feasible, the authors might consider (a) showing that a model excluding the final visit still predicts verbal outcomes at the last visit in a similar way, or (b) more explicitly acknowledging in the discussion that the observed associations may be partly or largely driven by concurrent correlations. Either approach would help readers interpret the strength and nature of the longitudinal claims.

      (2) Incorporating sleep status into longitudinal models

      Sleep status changes systematically across developmental stages in this cohort. Given that some of the papers cited to justify the paradigm also note limitations in speech entrainment and word segmentation during sleep or in patients with impaired consciousness, it would be helpful to account for sleep more directly. Including sleep status as a factor or covariate in the longitudinal models, or at least elaborating more fully on its potential role and limitations, would further strengthen the conclusions and reassure readers that these effects are not primarily driven by differences in sleep-wake state.

      (3) Use of PLS-c and potential group × condition interactions

      I am relatively new to PLS-c. One question that arose is whether PLS-c could be extended to handle a two-way interaction between group and condition contrasts (STR vs. RND). If so, some of the more complex supplementary models testing developmental trajectories within each group (Page 8, Lines 258-265) might be more directly captured within a single, unified framework. Even a brief comment in the methods or discussion about the feasibility (or limitations) of modeling such interactions within PLS-c would be informative for readers and could streamline the analytic narrative.

      (4) STR-only analyses and the role of RND

      Page 8, Lines 241-245: This analysis is conducted only within the STR condition. The lack of group difference observed here appears consistent with the lack of group difference in word-level entrainment (Page 9, Lines 292-294), suggesting that HL and LL groups may not differ in statistical learning per se, but rather in syllabic-level entrainment. As a useful sanity check and potential extension, it might be informative to explore whether syllable-level entrainment in the RND condition differs between groups to a similar extent as in Figure 2C-D. In other work (e.g., adults vs. children; Moreau et al., 2022), group differences can be more pronounced for syllable-level than for word-level entrainment. Figure S6 seems to hint that a similar pattern may exist here. If feasible, including or briefly reporting such an analysis could help clarify the asymmetry between the two learning measures and further support the interpretation of syllabic-level differences.

      (5) Multi-speaker input and voice perception (Page 15, Lines 475-483)

      The multi-speaker nature of the speech input is an interesting and ecologically relevant feature of the design, but it does add interpretive complexity. The literature on voice perception in autism is still mixed: for example, Boucher et al. (2000) reported no differences in voice recognition and discrimination between children with autism and language-matched non-autistic peers, whereas behavioral work in autistic adults suggests atypical voice perception (e.g., Schelinski et al., 2016; Lin et al., 2015). I found the current interpretation in this paragraph somewhat difficult to follow, partly because the data do not directly test how HL and LL infants integrate or suppress voice information. I think the authors could strengthen this section by slightly softening and clarifying the claims.

      (6) Asymmetry between EEG learning measures

      Page 16, Lines 502-507 touch on the asymmetry between the two EEG learning measures but leave some questions for the reader. The presence of word recognition ERPs in the LL group suggests that a failure to suppress voice information during learning did not prevent successful word learning. At the same time, there is an interesting complementary pattern in the HL group, who show LL-like word-level entrainment but do not exhibit robust word recognition. Explicitly discussing this asymmetry - why HL infants might show relatively preserved word-level entrainment yet reduced word recognition ERPs, whereas LL infants show both - would enrich the theoretical contribution of the manuscript.

      References:

      (1) Moreau, C. N., Joanisse, M. F., Mulgrew, J., & Batterink, L. J. (2022). No statistical learning advantage in children over adults: Evidence from behaviour and neural entrainment. Developmental Cognitive Neuroscience, 57, 101154. https://doi.org/10.1016/j.dcn.2022.101154

      (2) Boucher, J., Lewis, V., & Collis, G. M. (2000). Voice processing abilities in children with autism, children with specific language impairments, and young typically developing children. Journal of Child Psychology and Psychiatry, 41(7), 847-857. https://doi.org/10.1111/1469-7610.00672

      (3) Schelinski, S., Borowiak, K., & von Kriegstein, K. (2016). Temporal voice areas exist in autism spectrum disorder but are dysfunctional for voice identity recognition. Social Cognitive and Affective Neuroscience, 11(11), 1812-1822. https://doi.org/10.1093/scan/nsw089

      (4) Lin, I.-F., Yamada, T., Komine, Y., Kato, N., Kato, M., & Kashino, M. (2015). Vocal identity recognition in autism spectrum disorder. PLOS ONE, 10(6), e0129451. https://doi.org/10.1371/journal.pone.0129451

    1. Free Access to Get Before I Forget

      <span style="color: #0000ff;">by Tory Henwood Hoen</span>

       

      <span style="background-color: #3366ff;"> Download Now </span>

       

       

      Overview:

      A funny, heartfelt, late coming-of-age story that examines the role of memory in holding us back—and in moving us forward. Call it inertia. Call it a quarter-life crisis. Whatever you call it, Cricket Campbell is stuck. Despite working at a zeitgeist-y wellness company, the twenty-six-year-old feels anything but well. Still adrift after a tragedy that upended her world a decade ago, she has entered early adulthood under the weight of a new burden: her father’s Alzheimer’s diagnosis.

      When Cricket’s older sister Nina announces it is time to move Arthur from his beloved Adirondack lake house into a memory-care facility, Cricket has a better idea. In returning home to become her father’s caretaker, she hopes to repair their strained relationship and shake herself out of her perma-funk. But even deeply familiar places can hold surprises.

      As Cricket settles back into the family house at Catwood Pond―a place she once loved, but hasn’t visited since she was a teenager―she discovers that her father possesses a rare gift: as he loses his grasp of the past, he is increasingly able to predict the future. Before long, Arthur cements his reputation as an unlikely oracle, but for Cricket, believing in her father’s prophecies might also mean facing the most painful parts of her history. As she begins to remember who she once was, she uncovers a vital truth: the path forward often starts by going back.

      The themes explored throughout the book are introduced gradually, allowing readers to absorb them naturally. Rather than overwhelming the audience, the author integrates these ideas into the story’s progression. This subtlety enhances impact. The book’s conclusion feels well-aligned with the journey that precedes it. Rather than offering an abrupt or exaggerated ending, it provides a resolution that reflects the themes and character development established earlier. This thoughtful closure enhances overall satisfaction.

      Available formats: PDF, e-book, book, ePub

    1. Universidad Autónoma De San Luis Potosí: Situación estudiantil, matrículas y graduaciones [Internet]. [cited 2025 Mar 16]. Available from: https://www.economia.gob.mx/datamexico/es/profile/institution/universidad-autonoma-de-san-luis-potosi 99. Zulfiqar H, Sankari A, Rahman O. Vaping-Associated Pulmonary Injury. In: StatPearls. Treasure Island (FL): StatPearls Publishing; 2025 [cited 2025 Mar 13]. Available from: http://www.ncbi.nlm.nih.gov/books/NBK560656/

      Aren't references in Vancouver style supposed to be numbered in order of appearance?

    1. igura 1../. Diagrama de Ishikawa con posibles problemáticas y vías de solución planteadas

      Check the figure numbers; also, this figure cannot be seen at all.

  4. www.scratchapixel.com www.scratchapixel.com
    1. If B is a unit vector, the operation A⋅B yields ||A||cos⁡(θ), signifying the magnitude of A's projection in B's direction, with a negative sign if the direction is reversed. This is termed the scalar projection of A onto B.

      Chat with Gemini:

      Lee:

      I'm taking an online course of 3D graphics. Geometry (as it pertains to 3D graphics on computers) is the current focus. The text just said:

      If B is a unit vector, the operation A⋅B yields ∥A∥cos(θ), signifying the magnitude of A's projection in B's direction, with a negative sign if the direction is reversed. This is termed the scalar projection of A onto B. Can you explain this better to me? What does "magnitude of A's (a non-unit vector) projection in B's (a unit vector) direction " mean?

      Gemini:

      This is one of the most fundamental concepts in 3D graphics, but the math jargon can sometimes obscure the simple visual reality.

      To understand "the magnitude of A's projection in B's direction," the best tool is the Shadow Analogy.

      1. The Shadow Analogy

      Imagine Vector A is a stick floating in space. Vector B represents the ground (or a specific direction along the ground).

      If you were to shine a light directly from above (perpendicular to the ground B), the stick (A) would cast a shadow onto the ground.

      • The Projection: The "shadow" itself is the projection.
      • The Magnitude: The length of that shadow is the "magnitude of the projection."

      So, when the text says "magnitude of A's projection in B's direction," it is simply asking: "How much of Vector A is pointing in the same direction as Vector B?"

      2. Why does B have to be a Unit Vector?

      This is the crucial part that often trips people up.

      In 3D graphics, a Unit Vector has a length of exactly 1. It is used purely to point the way; it contains direction information but no "scale" or "weight."

      The formula for the dot product is:

      A⋅B = ∥A∥ ∥B∥ cos(θ)

      If B is a unit vector, then ∥B∥ = 1. The formula simplifies to:

      A⋅B = ∥A∥ cos(θ)

      If B wasn't a unit vector (say, it was 5 units long), the result would be the length of the shadow multiplied by 5. That gives you a useless number if you just wanted to measure the shadow. By ensuring B is length 1, the math gives you the exact, unscaled length of the shadow of A.

      3. Visualizing the Angles (The "Negative" Sign)

      The text mentioned a "negative sign if the direction is reversed." This happens based on the angle (θ) between the two vectors.

      • Acute Angle (θ < 90°): The vectors point generally in the same direction. The shadow falls forward along B. The result is Positive.
      • Right Angle (θ = 90°): The vectors are perpendicular. There is no shadow (it's just a dot). The result is Zero.
      • Obtuse Angle (θ > 90°): The vectors point generally away from each other. The shadow falls backward behind the start of B. The result is Negative.

      Summary Table

      | Concept | The Visual | The Math |
      | ----------------- | ---------------------------------------- | ------------- |
      | Vector A | The object casting the shadow. | ∥A∥ |
      | Vector B (Unit) | The ground/direction we measure against. | ∥B∥ = 1 |
      | Dot Product (A⋅B) | The length of the shadow. | A⋅B = ∥A∥cos(θ) |

      Why do you need this in 3D Graphics?

      You will use this constantly. Here are two immediate examples:

      1. Lighting: If B is the surface normal (pointing out of a polygon) and A is the light direction, the dot product tells you how directly the light hits the surface. High projection = Bright light. Negative projection = The light is hitting the back of the polygon (it's dark).
      2. Mario running up a slope: If B is the direction of the slope and A is Mario's velocity, the projection tells you how fast he is actually moving up the hill vs just moving forward.

      Lee:

      So if B was pointing directly along the +X axis (X=1, Y=0, Z=0) and A was angled up from that and longer (X=2, Y=2, Z=0), A's magnitude would be ~2.83. But A⋅B would be 2 (its X coordinate in this specific scenario). If you made a triangle with A being one side and B specifying the direction of the adjacent side, and drew a straight line from the end of A straight down to and perpendicular to the direction B, the length of that "B" side would be A⋅B assuming ∥B∥ was 1. Does that sound correct?

      Gemini:

      Yes, that is exactly correct. You have perfectly grasped the geometric interpretation of the dot product.

      Your example scenario is spot on. Here is the breakdown of why your visualization works, mapping your description to the standard trigonometry terms.

      1. The Math Check

      Using your numbers:

      • Vector A: (2, 2, 0) (Magnitude ∥A∥ = √(2² + 2²) ≈ 2.83)
      • Vector B: (1, 0, 0) (Unit vector along X)

      The calculation is:

      A⋅B = (2×1) + (2×0) + (0×0) = 2

      You correctly identified that because B is aligned perfectly with the X-axis, the "shadow" of A is simply its X-coordinate (2).

      2. The Triangle Visualization

      You described making a triangle. This is the exact geometric definition used in physics and graphics.

      • The Hypotenuse: This is your Vector A (length ≈ 2.83).
      • The "Drop" Line: The line you drew "straight down to and perpendicular to B" is the opposite side of the triangle.
      • The Adjacent Side: The side lying along the direction of B is the Scalar Projection. Its length is exactly the dot product (2).

      In trigonometry terms:

      adjacent = hypotenuse × cos(θ), i.e., A⋅B = ∥A∥ cos(θ) ≈ 2.83 × cos(45°) ≈ 2

      3. One Crucial Detail

      You added the condition: "assuming ∥B∥ was 1."

      This is the most important part of your insight. If Vector B had been (2, 0, 0) (length of 2), the dot product would have been:

      (2×2) + (2×0) + (0×0) = 4

      The math would return 4, but the "shadow" is physically still only 2 units long. The dot product scales the result by the magnitude of B. That is why we must normalize B (make it a unit vector) before using it as a "measuring stick."


      You have the concept of Scalar Projection (the length of the shadow) down cold.
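      A small NumPy sketch (ours, not part of the chat above) that reproduces these numbers and the normalization point:

      ```python
      import numpy as np

      A = np.array([2.0, 2.0, 0.0])   # the "stick" casting the shadow
      B = np.array([2.0, 0.0, 0.0])   # a direction vector of length 2 (not unit length)

      B_hat = B / np.linalg.norm(B)   # normalize so B carries direction only

      print(np.dot(A, B_hat))         # 2.0  -> the shadow length (scalar projection)
      print(np.dot(A, B))             # 4.0  -> scaled by |B| = 2, not the shadow length
      print(np.linalg.norm(A))        # ~2.83 -> |A|, the hypotenuse of the triangle
      ```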

  5. milenio-nudos.github.io milenio-nudos.github.io
    1. ation.

      I think that, between this paragraph and the next, we should point out that the self-efficacies have a distinct effect on CIL that is consistent across waves and across countries. This is the most important and most solid finding, and it fits well with the narrative; otherwise, the reason for this separation is not clear. I would devote the last paragraph exclusively to the relationships the constructs have with other variables: SES, gender, and so on. That way they also do not get mixed in with this relationship, which is the main one for this type of study.

    1. Reviewer #2 (Public review):

      In this manuscript, the authors built upon the Connectome Model literature and proposed SynaptoGen, a differentiable model that explicitly takes into account multiplicity and conductance in neural connectivity. The authors evaluated SynaptoGen through simulated reinforcement learning tasks and established its performance as often superior to two considered baselines. This work is a valuable addition to the field, supported by a solid methodology with some details and limitations missing.

      Major points:

      (1) The genetic features in the X and Y matrices in the CM were originally introduced as combinatorial gene expression patterns that correspond to the presence and even absence of a subset of genes. The authors oversimplify this original scope by only considering single-gene expression features. While this was arguably a reasonable first approximation for a case study of gap junctions in C. elegans, it is by no means expected to be a plausible expectation for chemical synapses. As the authors appear to motivate their model by chemical synapses that have polarities, they should either consider combinatorial rules in the model or at least present this explicitly as a key limitation of the model. Omitting combinatorial effects also renders the presented "bioplausible" baseline much less bioplausible, likely calling for a different name.

      (2) It is not fully explained how Equation (11) is obtained, even conceptually. It is unclear why \bar{B} and \bar{G} should be element-wise multiplied together, both already being expected values. Moreover, the authors acknowledged in lines 147-149 that the components of \bar{G} actually depend on gene expression X, which is a component in \bar{B}, so the logic here seems circular.

      (3) The authors considered two baselines, namely SNES and a bioplausible control. However, it would be of interest to also investigate: a) Vanilla DQN with the same size trained on the same MLP, to judge whether the biological insights behind SynaptoGen parameterization add value to performance. b) Using Equation (7) instead of Equation (11) to construct the weight matrices, to judge whether incorporating the conductance adds value to performance.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary: 

      Zhang and colleagues examine neural representations underlying abstract navigation in the entorhinal cortex (EC) and hippocampus (HC) using fMRI. This paper replicates a previously identified hexagonal modulation of navigation vectors in abstract space in the EC in a novel task involving navigating a conceptual Greeble space. In the HC, the authors claim to identify a three-fold signal of the navigation angle. They also use a novel analysis technique (spectral analysis) to look at spatial patterns in these two areas and identify phase coupling between HC and EC. Finally, the authors propose an EC-HPC PhaseSync Model to understand how the EC and HC construct cognitive maps. While the wide array of techniques used is impressive and their creativity in analysis is admirable, overall, I found the paper a bit confusing and unconvincing. I recommend a significant rewrite of their paper to motivate their methods and clarify what they actually did and why. The claim of three-fold modulation in HC, while potentially highly interesting to the community, needs more background to motivate why they did the analysis in the first place, more interpretation as to why this would emerge in biology, and more care taken to consider alternative hypotheses steeped in existing models of HC function. I think this paper does have the potential to be interesting and impactful, but I would like to see these issues improved first.

      General comments:

      (1) Some of the terminology used does not match the terminology used in previous relevant literature (e.g., sinusoidal analysis, 1D directional domain).

      We thank the reviewer for this valuable suggestion, which helps to improve the consistency of our terminology with previous literature and to reduce potential ambiguity. Accordingly, we have replaced “sinusoidal analysis” with “sinusoidal modulation” (Doeller et al., 2010; Bao et al., 2019; Raithel et al., 2023) and “1D directional domain” with “angular domain of path directions” throughout the manuscript.

      (2) Throughout the paper, novel methods and ideas are introduced without adequate explanation (e.g., the spectral analysis and three-fold periodicity of HC).

      We thank the reviewer for raising this important point. In the revised manuscript, we have substantially extended the Introduction (paragraphs 2–4) to clarify our hypothesis, explicitly explaining why the three primary axes of the hexagonal grid cell code may manifest as vector fields. We have also revised the first paragraph of the “3-fold periodicity in the HPC” section in the Results to clarify the rationale for using spectral analysis. Please refer to our responses to comment 2 and 3 below for details.

      Reviewer #2 (Public review):

      The authors report results from behavioral data, fMRI recordings, and computer simulations during a conceptual navigation task. They report 3-fold symmetry in behavioral and simulated model performance, 3-fold symmetry in hippocampal activity, and 6-fold symmetry in entorhinal activity (all as a function of movement directions in conceptual space). The analyses are thoroughly done, and the results and simulations are very interesting.

      We sincerely thank the reviewer for the positive and encouraging comments on our study.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) This paper has quite a few spelling and grammatical mistakes, making it difficult to understand at times.

      We apologize for the wording and grammatical errors. We have thoroughly re-read and carefully edited the entire manuscript to correct typographical and grammatical errors, ensuring improved clarity and readability.

      (2) Introduction - It's not clear why the three primary axes of hexagonal grid cell code would manifest as vector fields.

      We thank the reviewer for raising this important point. In the revised Introduction (paragraphs 2, 3, and 4), we now explicitly explain the rationale behind our hypothesis that the three primary axes of the hexagonal grid cell code manifest as vector fields.

      In paragraph 2, we present empirical evidence from rodent, bat, and human studies demonstrating that mental simulation of prospective paths relies on vectorial representations in the hippocampus (Sarel et al., 2017; Ormond and O’Keefe, 2022; Muhle-Karbe et al., 2023).

      In paragraphs 3 and 4, we introduce our central hypothesis: vectorial representations may originate from population-level projections of entorhinal grid cell activity, based on three key considerations:

      (1) The EC serves as the major source of hippocampal input (Witter and Amaral, 1991; van Groen et al., 2003; Garcia and Buffalo, 2020).

      (2) Grid codes exhibit nearly invariant spatial orientations (Hafting et al., 2005; Gardner et al., 2022), which makes it plausible that their spatially periodic activity can be detected using fMRI.

      (3) A model-based inference: for example, in the simplest case, when one mentally simulates a straight pathway aligned with the grid orientation, a subpopulation of grid cells would be activated. The resulting population activity would form a near-perfect vectorial representation, with constant activation strength along the path. In contrast, if the simulated path is misaligned with the grid orientation, the population response becomes a distorted vectorial code. Consequently, simulating all possible straight paths spanning 0°–360° results in 3-fold periodicity in the activity patterns—due to the 180° rotational symmetry of the hexagonal grid, orientations separated by 180° are indistinguishable.

      We therefore speculate that vectorial representations embedded in grid cell activity exhibit 3-fold periodicity across spatial orientations and serve as a periodic structure to represent spatial direction. Supporting this view, reorientation paradigms in both rodents and young children have shown that subjects search equally in two opposite directions, reflecting successful orientation encoding but a failure to integrate absolute spatial direction (Hermer and Spelke, 1994; Julian et al., 2015; Gallistel, 2017; Julian et al., 2018).

      (3) It took me a few reads to understand what the spectral analysis was. After understanding, I do think this is quite clever. However, this paper needs more motivation to understand why you are performing this analysis. E.g., why not just take the average regressor at the 10º, 70º, etc. bins and compare it to the average regressor at 40º, 100º bins? What does the Fourier transform buy you?

      We are sorry for the confusion. Below, we outline the rationale for employing Fast Fourier Transform (FFT) analysis to identify neural periodicity. In the revised manuscript, we have added these clarifications to the first paragraph of the “3-fold periodicity in the HPC” subsection of the Results.

      First, FFT serves as an independent approach to cross-validate the sinusoidal modulation results, providing complementary evidence for the 6-fold periodicity in EC and the 3-fold periodicity in HPC.

      Second, FFT enables unbiased detection of multiple candidate periodicities (e.g., 3–7-fold) simultaneously without requiring prior assumptions about spatial phase (orientation). By contrast, directly comparing “aligned” versus “misaligned” angular bins (e.g., 10°/70° vs. 40°/100°) would implicitly assume knowledge of the phase offset, which was not known a priori.

      Finally, FFT uniquely allows periodicity analysis of behavioral performance, which is not feasible with standard sinusoidal GLM approaches. This methodological consistency makes it possible to directly compare periodicities across neural and behavioral domains.

      (4) A more minor point: at one point, you say it’s a spectral analysis of the BOLD signals, but the methods description makes it sound like you estimated regressors at each of the bins before performing FFT. Please clarify. 

      We apologize for the confusion. In our manuscript, we use the term spectral analysis to distinguish this approach from sinusoidal modulation analysis. Conceptually, our spectral analysis involves a three-level procedure:

      (1) First level: We estimated direction-dependent activity maps using a general linear model (GLM), which included 36 regressors corresponding to path directions, down-sampled in 10° increments.

      (2) Second level: We applied a Fast Fourier Transform (FFT) to the direction-dependent activity maps derived from the GLM to examine the spectral magnitude of potential spatial periodicities.

      (3) Third level: We conducted group-level statistical analyses across participants to assess the consistency of the observed periodicities.

      We have revised the “Spectral analysis of MRI BOLD signals” subsection in the Methods to clarify this multi-level procedure.
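      For concreteness, a minimal sketch of the second-level step under simplifying assumptions of our own (a single voxel with simulated direction-tuned betas); this illustrates the approach rather than reproducing the analysis code used in the study.

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # 36 path-direction bins in 10° steps, as in the first-level GLM described above
      directions = np.deg2rad(np.arange(0, 360, 10))

      # Simulated direction-dependent betas for one voxel: 6-fold modulation plus noise
      betas = np.cos(6 * directions) + 0.3 * rng.standard_normal(directions.size)

      # Magnitude spectrum across candidate periodicities (cycles per 360°)
      spectrum = np.abs(np.fft.rfft(betas - betas.mean()))
      for fold in (3, 4, 5, 6, 7):
          print(f"{fold}-fold magnitude: {spectrum[fold]:.2f}")   # peaks at 6-fold
      ```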

      (5) Figure 4a:

      Why do the phases go all the way to 2*pi if periodicity is either three-fold or six-fold? 

      When performing correlation between phases, you should perform a circular-circular correlation instead of a Pearson's correlation.

      We thank the reviewer for raising this important point. In the original Figure 4a, both EC and HPC phases spanned 0–2π because their sinusoidal phase estimates were projected into a common angular space by scaling them according to their symmetry factors (i.e., multiplying the 3-fold phase by 3 and the 6-fold phase by 6), followed by taking the modulo 2π. However, this projection forced signals with distinct intrinsic periodicities (120° vs. 60° cycles) into a shared 360° space, thereby distorting their relative angular distances and disrupting the one-to-one correspondence between physical directions and phase values. Consequently, this transformation could bias the estimation of their phase relationship.

      In the revised analysis and Figure 4a, we retained the original phase estimates derived from the sinusoidal modulation within their native periodic ranges (0–120° for 3-fold and 0–60° for 6-fold) by applying modulo operations directly. Following your suggestion, the relationship between EC and HPC phases was then quantified using circular–circular correlation (Jammalamadaka & Sengupta, 2001), as implemented in the CircStat MATLAB toolbox. This updated analysis avoids the rescaling artifact and provides a statistically stronger and conceptually clearer characterization of the phase correspondence between EC and HPC.
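      For reference, a minimal NumPy re-implementation of the circular–circular correlation coefficient of Jammalamadaka & Sengupta (2001); the per-participant phase values below are made up, and mapping each native period onto its own full circle before correlating is our assumption for illustration, not a quote of the analysis code.

      ```python
      import numpy as np

      def circ_corrcc(alpha, beta):
          """Circular-circular correlation (Jammalamadaka & Sengupta, 2001)."""
          a = alpha - np.angle(np.mean(np.exp(1j * alpha)))   # deviations from circular mean
          b = beta - np.angle(np.mean(np.exp(1j * beta)))
          num = np.sum(np.sin(a) * np.sin(b))
          den = np.sqrt(np.sum(np.sin(a) ** 2) * np.sum(np.sin(b) ** 2))
          return num / den

      # Hypothetical phase estimates kept in their native periodic ranges
      hpc_phase_deg = np.array([15.0, 40.0, 95.0, 70.0, 110.0])   # 3-fold phases, 0-120°
      ec_phase_deg = np.array([8.0, 21.0, 47.0, 36.0, 55.0])      # 6-fold phases, 0-60°

      # Place each variable on its own circle: one native period = one full revolution
      hpc_rad = np.deg2rad(hpc_phase_deg * 360.0 / 120.0)
      ec_rad = np.deg2rad(ec_phase_deg * 360.0 / 60.0)

      print(circ_corrcc(hpc_rad, ec_rad))
      ```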

      (6) Figure 4d needs additional clarification:

      Phase-locking is typically used to describe data with a high temporal precision. I understand you adopted an EEG analysis technique to this reconstructed fMRI time-series data, but it should be described differently to avoid confusion. This needs additional control analyses (especially given that 3 is a multiple of 6) to confirm that this result is specific to the periodicities found in the paper.

      We thank the reviewer for this insightful comment. We have extensively revised the description of Figure 4 to avoid confusion with EEG-based phase-locking techniques. The revised text now explicitly clarifies that our approach quantifies spatial-domain periodic coupling across path directions, rather than temporal synchronization of neural signals.

      To further address the reviewer’s concern about potential effects of the integer multiple relationship between the 3-fold HPC and 6-fold EC periodicities, we additionally performed two control analyses using the 9-fold and 12-fold EC components, both of which are also integer multiples of the 3-fold HPC periodicity. Neither control analysis showed significant coupling (p > 0.05), confirming that the observed 3-fold–6-fold coupling was specific and not driven by their harmonic relationship.

      The description of the revised Figure 4 has been updated in the “Phase Synchronization Between HPC and EC Activity” subsection of the Results.

      (7) Figure 5a is misleading. In the text, you say you test for propagation to egocentric cortical areas, but I don’t see any analyses done that test this. This feels more like a possible extension/future direction of your work that may be better placed in the discussion.

      We are sorry for the confusion. Figure 5a was intended as a hypothesis-driven illustration to motivate our analysis of behavioral periodicity based on participants’ task performance. However, we agree with the reviewer that, on its own, Figure 5a could be misleading, as it does not directly present supporting analyses.

      To provide empirical support for the interpretation depicted in Figure 5a, we conducted a whole-brain analysis (Figure S8), which revealed significant 3-fold periodic signals in egocentric cortical regions, including the parietal cortex (PC), precuneus (PCU), and motor regions.

      To avoid potential misinterpretation, we have revised the main text to include these results and explicitly referenced Figure S8 in connection with Figure 5a.

      The updated description in the “3-fold periodicity in human behavior” subsection in the Results is as follows:

      “Considering the reciprocal connectivity between the medial temporal lobe (MTL), where the EC and HPC reside, and the parietal cortex implicated in visuospatial perception and action, together with the observed 3-fold periodicity within the DMN (including the PC and PCu; Fig. S8), we hypothesized that the 3-fold periodic representations of path directions extend beyond the MTL to the egocentric cortical areas, such as the PC, thereby influencing participants' visuospatial task performance (Fig. 5a)”.

      Additionally, Figure 5a has been modified to more clearly highlight the hypothesized link between activity periodicity and behavioral periodicity, rather than suggesting a direct anatomical pathway.

      (8) PhaseSync model: I am not an expert in this type of modeling, so please put a lower weight on this comment (especially compared to some of the other reviewers). While the PhaseSync model seems interesting, it’s not clear from the discussion how this compares to current models. E.g., Does it support them by adding the three-fold HC periodicity? Does it demonstrate that some of them can't be correct because they don't include this three-fold periodicity?

      We thank the reviewer for the insightful comment regarding the PhaseSync model. We agree that further clarifying its relationship to existing computational frameworks is important.

      The EC–HPC PhaseSync model is not intended to replace or contradict existing grid–place cell models of navigation (e.g., Bicanski and Burgess, 2019; Whittington et al., 2020; Edvardsen et al., 2020). Instead, it offers a hierarchical extension by proposing that vectorial representations in the hippocampus emerge from the projections of periodic grid codes in the entorhinal cortex. Specifically, the model suggests that grid cell populations encode integrated path information, forming a vectorial gradient toward goal locations.

      To simplify the theoretical account, our model was implemented in an idealized square layout. In more complex real-world environments, hippocampal 3-fold periodicity may interact with additional spatial variables, such as distance, movement speed, and environmental boundaries.

      We have revised the final two paragraphs of the Discussion to clarify this conceptual framework and emphasize the importance of future studies in exploring how periodic activity in the EC–HPC circuit interacts with environmental features to support navigation.

      Reviewer #2 (Recommendations for the authors):

      (1) Please show a histogram of movement direction sampling for each participant.

      We thank the reviewer for this helpful suggestion. We have added a new supplementary figure (Figure S2) showing histograms of path direction sampling for each participant (36 bins of 10°). The figure is also included. Rayleigh tests for circular uniformity revealed no significant deviations from uniformity (all ps > 0.05, Bonferroni-corrected across participants), confirming that path directions were sampled evenly across 0°–360°.

      (2) Why didn’t you use participants’ original trajectories (instead of the trajectories inferred from the movement start and end points) for the hexadirectional analyses? 

      In our paradigm, participants used two MRI-compatible 2-button response boxes (one for each hand) to adjust the two features of the greebles. As a result, the raw adjustment path contained only four cardinal directions (up, down, left, right). If we were to use the raw stepwise trajectories, the analysis would be restricted to these four directions, which would severely limit the angular resolution. By instead defining direction as the vector from the start to the end position in feature space, we can expand the effective range of directions to the full 0–360°. This approach follows previous literature on abstract grid-like coding in humans (e.g., Constantinescu et al., 2016), where direction was similarly defined by the relative change between two feature dimensions rather than the literal stepwise path. We have added this clarification in the “Sinusoidal modulation” subsection of the revised method.

      (3) Legend of Figure 2: the statement "localizing grid cell activity" seems too strong because it is still not clear whether hexadirectional signals indeed result from grid-cell activity (e.g., Bin Khalid et al., eLife, 2024). I would suggest rephrasing this statement (here and elsewhere). 

      Thank you for this helpful suggestion. We have removed the statement “localizing grid cell activity” to avoid ambiguity and revised the legend of Figure 2a to more explicitly highlight its main purpose—defining how path directions and the aligned/misaligned conditions were constructed in the 6-fold modulation. We have also modified similar expressions throughout the manuscript to ensure consistency and clarity.

      (4) Legend of Figure 2: “cluster-based SVC correction for multiple comparisons” - what is the small volume you are using for the correction? Bilateral EC?

      For both Figure 2 and Figure 3, the anatomical mask of the bilateral medial temporal lobe (MTL), as defined by the AAL atlas, was used as the small volume for correction. This has been clarified in the revised Statistical Analysis section of the Methods as “… with small-volume correction (SVC) applied within the bilateral MTL”.

      (5) Legend of Figure 2: "ROI-based analysis" - what kind of ROI are you using? "corrected for multiple comparisons" - which comparisons are you referring to? Different symmetries and also the right/left hemisphere?

      In Figure 2b, the ROI was defined as a functional mask derived from the significant activation cluster in the right entorhinal cortex (EC). Since no robust clusters were observed in the left EC, the functional ROI was restricted to the right hemisphere. We indeed included Figure 2c to illustrate this point; however, we recognize that our description in the text was not sufficiently clear.

      Regarding the correction for multiple comparisons, this refers specifically to the comparisons across different rotational symmetries (3-, 4-, 5-, 6-, and 7-fold). Only the 6-fold symmetry survived correction, whereas no significant effects were detected for the other symmetries.

      We have clarified these points in the “6-fold periodicity in the EC” subsection of the result as “… The ROI was defined as a functional mask of the right EC identified in the voxel-based analysis and further restricted within the anatomical EC. These analyses revealed significant periodic modulation only at 6-fold (Figure  2c; t(32) = 3.56, p = 0.006, two-tailed, corrected for multiple comparisons across rotational symmetries; Cohen’s d = 0.62) …”.

      We have also revised the “3-fold periodicity in the HPC” subsection of the result as “… ROI analysis, using a functional mask of the HPC identified in the spectral analysis and further restricted within the anatomical HPC, indicated that HPC activity selectively fluctuated at 3-fold periodicity (Figure 3e; t(32) = 3.94, p = 0.002, corrected for multiple comparisons across rotational symmetries; Cohen’s d = 0.70) …”.

      (6) Figure 2d: Did you rotationally align 0{degree sign} across participants? Please state explicitly whether (or not) 0{degree sign} aligns with the x-axis in Greeble space.

      We thank the reviewer for this helpful question. Yes, before reconstructing the directional tuning curve in Figure 2d, path directions were rotationally aligned for each participant by subtracting the participant-specific grid orientation (ϕ) estimated from the independent dataset (odd sessions). We have now made this description explicit in the revised manuscript in the “6-fold periodicity in the EC” subsection of the Results, stating “… To account for individual difference in spatial phase, path directions were calibrated by subtracting the participant-specific grid orientation estimated from the odd sessions ...”.

      (7) Clustering of grid orientations in 30 participants: What does “Bonferroni corrected” refer to? Also, the Rayleigh test is sensitive to the number of voxels - do you obtain the same results when using pair-wise phase consistency? 

      “Bonferroni corrected” here refers to correction across participants. We have clarified this in the first paragraph of the “6-fold periodicity in the EC” subsection of the Result and in the legend of Supplementary Figure S5 as “Bonferroni-corrected across participants.”

      To examine whether our findings were sensitive to the number of voxels, we followed the reviewer’s guidance and computed pairwise phase consistency (PPC; Vinck et al., 2010) for each participant. The PPC results replicated those obtained with the Rayleigh test. We have added these new results to Supplementary Figure S5. We also updated the “Statistical Analysis” subsection of the Methods to describe PPC as “For the PPC (Vinck et al., 2010), significance was tested using 5,000 permutations of uniformly distributed random phases (0–2π) to generate a null distribution for comparison with the observed PPC”.
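      For illustration, a generic numpy sketch of the PPC statistic and this kind of permutation test (voxel-wise grid phases are assumed to be supplied in radians; this is not the exact analysis code) could look as follows:

      ```python
      import numpy as np

      def ppc(phases):
          """Pairwise phase consistency (Vinck et al., 2010): the mean cosine of
          all pairwise differences between phases given in radians."""
          phases = np.asarray(phases, dtype=float)
          i, j = np.triu_indices(len(phases), k=1)
          return np.cos(phases[i] - phases[j]).mean()

      def ppc_permutation_test(observed_phases, n_perm=5000, seed=0):
          """Compare the observed PPC against a null of uniformly random phases."""
          rng = np.random.default_rng(seed)
          n = len(observed_phases)
          obs = ppc(observed_phases)
          null = np.array([ppc(rng.uniform(0.0, 2.0 * np.pi, n)) for _ in range(n_perm)])
          p_value = (null >= obs).mean()
          return obs, p_value
      ```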

      (8) 6-fold periodicity in the EC: Do you compute an average grid orientation across all EC voxels, or do you compute voxel-specific grid orientations?

      Following the protocol originally described by Doeller et al. (2010), we estimated voxel-wise grid orientations within the EC and then obtained a participant-specific orientation by averaging across voxels within a hand-drawn bilateral EC mask. The procedure is described in detail in the “Sinusoidal modulation” subsection of the Methods.

      (9) Hand-drawn bilateral EC mask: What was your procedure for drawing this mask? What results do you get with a standard mask, for example, from Freesurfer or SPM? Why do you perform this analysis bilaterally, given that the earlier analysis identified 6-fold symmetry only in the right EC? What do you mean by "permutation corrected for multiple comparisons"?

      We thank the reviewer for raising these important methodological points. To our knowledge, no standard volumetric atlas provides an anatomically defined entorhinal cortex (EC) mask. For example, the built-in Harvard–Oxford cortical structural atlas in FSL contains only a parahippocampal region that encompasses, but does not isolate, the EC. The AAL atlas likewise does not contain an EC region. In FreeSurfer, an EC label is available, but only in the fsaverage surface space, which is not directly compatible with MNI-based volumetric group-level analyses.

      Therefore, we constructed a bilateral EC mask by manually delineating the EC according to the detailed anatomical landmarks described by Insausti et al. (1998). Masks were created using ITK-SNAP (Version 3.8, www.itksnap.org). For transparency and reproducibility, the mask has been made publicly available at the Science Data Bank (link: https://www.scidb.cn/s/NBriAn), as indicated in the revised Data and Code availability section.

      Regarding the use of a bilateral EC mask despite voxel-wise effects being strongest in the right EC: first, we did not have any a priori hypothesis regarding the laterality of EC involvement before performing the analyses. Second, previous studies estimated grid orientation using a bilateral EC mask in their sinusoidal analyses (Doeller et al., 2010; Constantinescu et al., 2016; Bao et al., 2019; Wagner et al., 2023; Raithel et al., 2023). We therefore followed this established approach to estimate grid orientation.

      By “permutation corrected for multiple comparisons” we refer to the family-wise error correction applied to the reconstructed directional tuning curves (Figure 2d for the EC, Figure 3f for the HPC). Specifically, directional labels were randomly shuffled 5,000 times, and an FFT was applied to each shuffled dataset to compute spectral power at each fold. This procedure generated null distributions of spectral power for each symmetry. For each fold, the 95th percentile of the maximal power across permutations was used as the uncorrected threshold. To correct across folds, the 95th percentile of the maximal suprathreshold power across all symmetries was taken as the family-wise error–corrected threshold. We have clarified this procedure in the revised “Statistical Analysis” subsection of the Methods.
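      The correction is perhaps easiest to see in code. Below is a compact numpy sketch of a max-statistic permutation threshold in the same spirit (an illustration only; it simplifies the suprathreshold step described above and assumes the direction bins span 0–360° so that FFT index k corresponds to k-fold periodicity):

      ```python
      import numpy as np

      def directional_fft_thresholds(tuning_signal, folds=(3, 4, 5, 6, 7),
                                     n_perm=5000, seed=0):
          """Permutation thresholds for directional FFT power.
          `tuning_signal` is mean activity per direction bin (at least 16 bins)."""
          rng = np.random.default_rng(seed)
          x = np.asarray(tuning_signal, dtype=float)
          folds = np.asarray(folds)
          null_power = np.empty((n_perm, folds.size))
          for p in range(n_perm):
              shuffled = rng.permutation(x)                    # shuffle direction labels
              spectrum = np.abs(np.fft.rfft(shuffled - shuffled.mean()))
              null_power[p] = spectrum[folds]                  # power at 3- to 7-fold
          uncorrected = np.percentile(null_power, 95, axis=0)  # per-fold threshold
          fwe = np.percentile(null_power.max(axis=1), 95)      # corrected across folds
          return uncorrected, fwe
      ```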

      (10) Figures 3b and 3d: Why do different hippocampal voxels show significance for the sinusoidal versus spectral analysis? Shouldn’t the analyses be redundant and, thus, identify the same significant voxels? 

      We thank the reviewer for this insightful question. Although both sinusoidal modulation and spectral analysis aim to detect periodic neural activity, the two approaches are methodologically distinct and are therefore not expected to identify exactly the same significant voxels.

      Sinusoidal modulation relies on a GLM with sine and cosine regressors to test for phase-aligned periodicity (e.g., 3-fold or 6-fold), calibrated according to the estimated grid orientation. This approach is highly specific but critically depends on accurate orientation estimation. In contrast, spectral analysis applies Fourier decomposition to the directional tuning profile, enabling the detection of periodic components without requiring orientation calibration.

      Accordingly, the two analyses are not redundant but complementary. The FFT approach allows for an unbiased exploration of multiple candidate periodicities (e.g., 3–7-fold) without predefined assumptions, thereby providing a critical cross-validation of the sinusoidal GLM results. This strengthens the evidence for 6-fold periodicity in EC and 3-fold periodicity in HPC. Furthermore, FFT uniquely facilitates the analysis of periodicities in behavioral performance data, which is not feasible with standard sinusoidal GLM approaches. This methodological consistency enables direct comparison of periodicities across neural and behavioral domains.
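      To make the contrast between the two estimators concrete, here is a small synthetic illustration (toy data only; not the study’s preprocessing or statistics):

      ```python
      import numpy as np

      rng = np.random.default_rng(1)
      theta = rng.uniform(0.0, 2.0 * np.pi, 400)                          # path directions (radians)
      y = 0.5 * np.cos(6.0 * (theta - 0.1)) + rng.normal(0.0, 1.0, 400)   # toy trial-wise signal

      # (1) Sinusoidal GLM: sine/cosine regressors at the 6-fold frequency; the fitted
      #     betas give the grid orientation and the phase-aligned 6-fold amplitude.
      X = np.column_stack([np.sin(6.0 * theta), np.cos(6.0 * theta), np.ones_like(theta)])
      beta, *_ = np.linalg.lstsq(X, y, rcond=None)
      orientation = np.arctan2(beta[0], beta[1]) / 6.0                    # estimated grid phase
      amplitude = np.hypot(beta[0], beta[1])

      # (2) Spectral (FFT) analysis: bin by direction and read off power at each fold,
      #     with no orientation calibration required.
      bins = np.linspace(0.0, 2.0 * np.pi, 37)                            # 10-degree bins
      idx = np.digitize(theta, bins) - 1
      tuning = np.array([y[idx == k].mean() if np.any(idx == k) else 0.0 for k in range(36)])
      power = np.abs(np.fft.rfft(tuning - tuning.mean()))
      sixfold_power = power[6]                                            # 6 cycles per 360 degrees
      ```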

      Additionally, the anatomical distributions of the HPC clusters appear more similar between Figure 3b and Figure 3d after re-plotting Figure 3d using the peak voxel coordinates (x = –24, y = –18), which are closer to those used for Figure 3b (x = –24, y = –20), as shown in the revised Figure 3.

      Taken together, the two analyses serve distinct but complementary purposes.

      (11) 3-fold sinusoidal analysis in hippocampus: What kind of small volume are you using to correct for multiple comparisons?

      We thank the reviewer for this comment. The same small volume correction procedure was applied as described in R4. Specifically, the anatomical mask of the bilateral medial temporal lobe (MTL), as defined by the AAL atlas, was used as the small volume for correction. This procedure has been clarified in the revised Statistical Analysis section of the Methods as follows: “… with small-volume correction (SVC) applied within the bilateral MTL.”

      (12) Figure S5: “right HPC” – isn’t the cluster in the left hippocampus? 

      We are sorry for the confusion. The brain image was presented in radiological orientation (i.e., the left and right orientations are flipped). We also checked the figure and confirmed that the cluster shown in the original Figure S5 (i.e., Figure S6 in the revised manuscript) is correctly labeled as the right hippocampus, as indicated by the MNI coordinate (x = 22), where positive x values denote the right hemisphere. To avoid potential confusion, we have explicitly added the statement “Volumetric results are displayed in radiological orientation” to the figure legends of all volume-based results.

      (13) Figure S5: Why are the significant voxels different from the 3-fold symmetry analysis using 10° bins?

      As shown in R10, the apparent differences largely reflect variation in MNI coordinates. After adjusting for display coordinates, the anatomical locations of the significant clusters are in fact highly similar between the 10°-binned (Figure 3d, shown above) and the 20°-binned results (Figure S6).

      Although both analyses rely on sinusoidal modulation, they differ in the resolution of the input angular bins (10° vs. 20°). Combined with the inherent noise in fMRI data, this makes it unlikely that the two approaches would yield exactly the same set of significant voxels. Importantly, both analyses consistently reveal robust 3-fold periodicity in the hippocampus, indicating that the observed effect is not dependent on angular bin size.

      (14) Figure 4a and corresponding text: What is the unit? Phase at which frequency? Are you using a circular-circular correlation to test for the relationship?

      We thank the reviewer for raising this important point. In the revised manuscript, we have clarified that the unit of the phase values is radians, corresponding to the 6-fold periodic component in the EC and the 3-fold periodic component in the HPC. In the original Figure 4a, both EC and HPC phases, estimated from sinusoidal modulation, were analyzed using Pearson correlation. We have since realized issues with this approach, as also noted in R5 to Reviewer #1.

      In the revised analysis and Figure 4a (as shown above), we re-evaluated the relationship between EC and HPC phases using a circular–circular correlation (Jammalamadaka & Sengupta, 2001), implemented in the CircStat MATLAB toolbox. The “Phase synchronization between the HPC and EC activity” subsection of the Results has been updated accordingly as follows:

      “To examine whether the spatial phase structure in one region could predict that in another, we tested whether the orientations of the 6-fold EC and 3-fold HPC periodic activities, estimated from odd-numbered sessions using sinusoidal modulation with rotationally symmetric parameters (in radians), were correlated across participants. A cross-participant circular–circular correlation was conducted between the spatial phases of the two areas to quantify the spatial correspondence of their activity patterns (EC: purple dots; HPC: green dots) (Jammalamadaka & Sengupta, 2001). The analysis revealed a significant circular correlation (Figure 4a; r = 0.42, p < 0.001) …”.

      In the “Statistical analysis” subsection of the Methods:

      “… The relationship between EC and HPC phases was evaluated using the circular–circular correlation (Jammalamadaka & Sengupta, 2001) implemented in the CircStat MATLAB toolbox …”.
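      For readers who prefer a formula, the circular–circular correlation has a compact closed form; a generic numpy equivalent of CircStat’s circ_corrcc (assuming the fold-specific phases have already been mapped onto a common 0–2π range) is sketched below:

      ```python
      import numpy as np

      def circ_mean(angles):
          """Circular mean of angles given in radians."""
          return np.angle(np.mean(np.exp(1j * np.asarray(angles, dtype=float))))

      def circ_corr(alpha, beta):
          """Circular-circular correlation coefficient (Jammalamadaka & SenGupta, 2001),
          the quantity computed by circ_corrcc in the CircStat toolbox."""
          alpha = np.asarray(alpha, dtype=float)
          beta = np.asarray(beta, dtype=float)
          sa = np.sin(alpha - circ_mean(alpha))
          sb = np.sin(beta - circ_mean(beta))
          return np.sum(sa * sb) / np.sqrt(np.sum(sa ** 2) * np.sum(sb ** 2))
      ```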

      (15) Paragraph following “We further examined amplitude-phase coupling...” - please clarify what data goes into this analysis.

      We thank the reviewer for this helpful comment. In this analysis, the input data consisted of hippocampal (HPC) phase and entorhinal (EC) amplitude, both extracted using the Hilbert transform from the reconstructed BOLD signals of the EC and HPC derived through sinusoidal modulation. We have substantially revised the description of the amplitude–phase coupling analysis in the third paragraph of the “Phase Synchronization Between HPC and EC Activity” subsection of the Results to clarify this procedure.
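      As a rough illustration of what goes into this analysis, a minimal scipy sketch of the Hilbert-based extraction is given below; the names `ec_profile` and `hpc_profile` are placeholders for the reconstructed signals, and the modulation-index line follows the general form in Canolty et al. (2006) rather than our exact pipeline:

      ```python
      import numpy as np
      from scipy.signal import hilbert

      def amplitude_and_phase(profile):
          """Instantaneous amplitude and phase of a 1-D response profile via the
          Hilbert transform (mean removed before the transform)."""
          x = np.asarray(profile, dtype=float)
          analytic = hilbert(x - x.mean())
          return np.abs(analytic), np.angle(analytic)

      # Hypothetical usage: couple EC amplitude to HPC phase across direction bins.
      # ec_amp, _ = amplitude_and_phase(ec_profile)
      # _, hpc_phase = amplitude_and_phase(hpc_profile)
      # modulation_index = np.abs(np.mean(ec_amp * np.exp(1j * hpc_phase)))
      ```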

      (16) Alignment between EC 6-fold phases and HC 3-fold phases: Why don't you simply test whether the preferred 6-fold orientations in EC are similar to the preferred 3-fold phases in HC? The phase-amplitude coupling analyses seem sophisticated but are complex, so it is somewhat difficult to judge to what extent they are correct. 

      We thank the reviewer for this thoughtful comment. We employed two complementary analyses to examine the relationship between EC and HPC activity. In the revised Figure 4 (as shown in Figure 4 for Reviewer #1), Figure 4a provides a direct and intuitive measure of the phase relationship between the two regions using circular–circular correlation. Figure 4b–c examines whether the activity peaks of the two regions are aligned across path directions using cross-frequency amplitude–phase coupling, given our hypothesis that the spatial phase of the HPC depends on EC projections. These two analyses are complementary: a phase correlation does not necessarily imply peak-to-peak alignment, and conversely, peak alignment does not always yield a statistically significant phase correlation. We therefore combined multiple analytical approaches as a cross-validation across methods, providing convergent evidence for robust EC–HPC coupling.

      (17) Figure 5: Do these results hold when you estimate performance just based on “deviation from the goal to ending locations” (without taking path length into account)? 

      We thank the reviewer for this thoughtful suggestion. Following the reviewer’s advice, we re-estimated behavioral performance using the deviation between the goal and ending locations (i.e., error size) and path length independently. As shown in the new Figure S9, no significant periodicity was observed in error size (p > 0.05), whereas a robust 3-fold periodicity was found for path length (p < 0.05, corrected for multiple comparisons).

      We employed two behavioral metrics,(1) path length and (2) error size, for complementary reasons. In our task, participants navigated using four discrete keys corresponding to the cardinal directions (north, south, east, and west). This design inherently induces a 4-fold bias in path directions, as described in the “Behavioral performance” subsection of the Methods. To minimize this artifact, we computed the objectively optimal path length and used it to calibrate participants’ path lengths. However, error size could not be corrected in the same manner and retained a residual 4-fold tendency (see Figure S9d).

      Given that both path length and error size are behaviorally relevant and capture distinct aspects of task performance, we decided to retain both measures when quantifying behavioral periodicity. This clarification has been incorporated into the “Behavioral performance” subsection of the Methods, and the 2<sup>nd</sup> paragraph of the “3-fold periodicity in human behavior” subsection of the Results.

      (18) Phase locking between behavioral performance and hippocampal activity: What is your way of creating surrogates here?

      We thank the reviewer for this helpful question. Surrogate datasets were generated by circularly shifting the signal series along the direction axis across all possible offsets (following Canolty et al., 2006). This procedure preserves the internal phase structure within each domain while disrupting consistent phase alignment, thereby removing any systematic coupling between the two signals. Each surrogate dataset underwent identical filtering and coherence computation to generate a null distribution, and the observed coherence strength was compared with this distribution using paired t-tests across participants. The statistical analysis section has been systematically revised to incorporate these methodological details.
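      In code, the surrogate construction amounts to the following generic sketch (the filtering and coherence steps applied to each surrogate are omitted here):

      ```python
      import numpy as np

      def circular_shift_surrogates(signal):
          """Surrogates obtained by circularly shifting a direction-binned signal
          across all possible non-zero offsets (cf. Canolty et al., 2006). Each
          shift preserves the signal's internal phase structure while breaking
          its alignment with the signal from the other domain."""
          x = np.asarray(signal)
          return np.stack([np.roll(x, k) for k in range(1, len(x))])
      ```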

      (19) I could not follow why the authors equate 3-fold symmetry with vectorial representations. This includes statements such as “these empirical findings provide a potential explanation for the formation of vectorial representation observed in the HPC.” Please clarify.

      We thank the reviewer for raising this point. Please refer to our response to R2 for Reviewer #1 and the revised Introduction (paragraphs 2–4), where we explicitly explain why the three primary axes of the hexagonal grid cell code can manifest as vector fields.

      (20) It was unclear whether the sentence “The EC provides a foundation for the formation of periodic representations in the HPC” is based on the authors’ observations or on other findings. If based on the authors’ findings, this statement seems too strong, given that no other studies have reported periodic representations in the hippocampus to date (to the best of my knowledge).

      We thank the reviewer for this comment. We agree that the original wording lacked sufficient rigor. We have extensively revised the 3rd paragraph of the Discussion section with more cautious language by reducing overinterpretation and emphasizing the consistency of our findings with prior empirical evidence, as follows: “The EC–HPC PhaseSync model demonstrates how a vectorial representation may emerge in the HPC from the projections of populations of periodic grid codes in the EC. The model was motivated by two observations. First, the EC intrinsically serves as the major source of hippocampal input (Witter and Amaral, 1991; van Groen et al., 2003; Garcia and Buffalo, 2020), and grid codes exhibit nearly invariant spatial orientations (Hafting et al., 2005; Gardner et al., 2022). Second, mental planning, characterized by “forward replay” (Dragoi and Tonegawa, 2011; Pfeiffer, 2020), has the capacity to activate populations of grid cells that represent sequential experiences in the absence of actual physical movement (Nyberg et al., 2022). We hypothesize that an integrated path code of sequential experiences may eventually be generated in the HPC, providing a vectorial gradient toward the goal location. The path code exhibits regular, vector-like representations when the path direction aligns with the orientations of grid axes, and becomes irregular when they misalign. This explanation is consistent with the band-like representations observed in the dorsomedial EC (Krupic et al., 2012) and the irregular activity fields of trace cells in the HPC (Poulter et al., 2021). ”

    1. Author response:

      The following is the authors’ response to the original reviews

      A point-by-point response is included below. Before we turn to that, we want to note one change we decided to introduce, related to generalization to unseen tissues/cell types (Figure 3a in the original submission and the related question by Reviewer #2 below). This analysis was based on adding a latent “RBP state” representation during learning of condition/tissue-specific splicing. The “RBP state” per condition is captured by a dedicated encoder. Our original plan was to have a paper describing a new RBP-AE model we developed in parallel, which also served as the basis for capturing this “RBP state”. However, we got delayed in finalizing this second paper (it was led by other lab members, some of whom have already left the lab). This delay affected the TrASPr manuscript, as TrASPr’s code should be available and its analysis reproducible upon publication. After much deliberation, and in order to comply with reproducibility standards while not self-scooping the RBP-AE paper, we eventually decided to take out the RBP-AE and replace it with a vanilla PCA-based embedding for the “RBP state”. The PCA approach is simpler and reproducible, based on a linear transformation of the RBP expression vector into a lower dimension. The qualitative results included in Figure 3a still hold, and we also produced the new results suggested by Reviewer #2 in other GTEx tissues with this PCA-based embedding (below).

      We don’t believe the switch to the PCA-based embedding should have any bearing on the current manuscript evaluation, but we wanted to take this opportunity to explain the reasoning behind this additional change.
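      For readers curious what such an embedding amounts to, the idea can be sketched in a few lines (a generic illustration only; the matrix shapes, log transform, and component count below are placeholders, not the values used in the manuscript):

      ```python
      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      # Placeholder expression matrix: 20 conditions x 1500 RBPs
      rbp_expr = np.log1p(rng.lognormal(size=(20, 1500)))

      pca = PCA(n_components=8)
      rbp_state = pca.fit_transform(rbp_expr)        # low-dimensional state per condition

      # An unseen condition/tissue is embedded with the same fitted linear transform:
      unseen = np.log1p(rng.lognormal(size=(1, 1500)))
      unseen_state = pca.transform(unseen)
      ```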

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The authors propose a transformer-based model for the prediction of condition - or tissue-specific alternative splicing and demonstrate its utility in the design of RNAs with desired splicing outcomes, which is a novel application. The model is compared to relevant existing approaches (Pangolin and SpliceAI) and the authors clearly demonstrate its advantage. Overall, a compelling method that is well thought out and evaluated.

      Strengths:

      (1) The model is well thought out: rather than modeling a cassette exon using a single generic deep learning model as has been done e.g. in SpliceAI and related work, the authors propose a modular architecture that focuses on different regions around a potential exon skipping event, which enables the model to learn representations that are specific to those regions. Because each component in the model focuses on a fixed length short sequence segment, the model can learn position-specific features. Another difference compared to Pangolin and SpliceAI which are focused on modeling individual splice junctions is the focus on modeling a complete alternative splicing event.

      (2) The model is evaluated in a rigorous way - it is compared to the most relevant state-of-the-art models, uses machine learning best practices, and an ablation study demonstrates the contribution of each component of the architecture.

      (3) Experimental work supports the computational predictions.     

      (4) The authors use their model for sequence design to optimize splicing outcomes, which is a novel application.

      We wholeheartedly thank Reviewer #1 for these positive comments regarding the modeling approach we took to this task and the evaluations we performed. We have put a lot of work and thought into this and it is gratifying to see the results of that work acknowledged like this.

      Weaknesses:

      No weaknesses were identified by this reviewer, but I have the following comments:

      (1) I would be curious to see evidence that the model is learning position-specific representations.

      This is an excellent suggestion to further assess what the model is learning. To get a better sense of the position-specific representation we performed the following analyses:

      (1) Switching the transformers’ relative order: All transformers are pretrained on 3’ and 5’ splice site regions before fine-tuning for the PSI and dPSI prediction task. We hypothesized that if relative position is important, switching the order of the transformers would make a large difference in prediction accuracy. Indeed, if we switch the 3’ and 5’ transformers we see, as expected, a severe drop in performance, with the Pearson correlation on test data dropping from 0.82 to 0.11. Next, we switched the two 5’ and the two 3’ transformers, observing drops to 0.65 and 0.78, respectively. When focusing only on changing events, the drop was from 0.66 to 0.54 (for the 3’ SS transformers), 0.48 (for the 5’ SS transformers), and 0.13 (when the 3’ and 5’ transformers flanking the alternative exon were switched).

      (2) Position-specific effect of RBPs: We wanted to test whether the model is able to learn position-specific effects for RBPs. For this we focused on two RBPs, FOX (a family of three highly related RBPs) and QKI, both of which have a relatively well-defined motif and known condition- and position-specific effects identified via RBP KD experiments combined with CLIP experiments (e.g. PMID: 23525800, PMID: 24637117, PMID: 32728246). For each, we randomly selected 40 highly and 40 lowly included cassette exon sequences. We then ran in-silico mutagenesis experiments in which we replaced small sequence windows with the RBP motif in each of these sequences (80 for RBFOX and 80 for QKI), and compared TrASPr’s prediction with the average prediction for 5 random sequences inserted at the same location (an illustrative sketch of this scanning procedure is given after point (3) below). The results are now shown in Figure 4 Supp 3, where the y-axis represents the dPSI effect per position (x-axis), and the color represents the percentile of the effect of inserting the motif at that position across all 80 sequences tested. We see that both RBPs have strong positional preferences for exerting a strong effect on the alternative exon. We also see differences between binding upstream and downstream of the alternative exon. These results, learned by the model from natural tissue-specific variations, recapitulate nicely the results derived from high-throughput experimental assays. However, we also note that effects were highly sequence-specific. For example, RBFOX is generally expected to increase inclusion when binding downstream of the alternative exon and decrease inclusion when binding upstream. While we do observe such a trend, we also see cases where the opposite effect is observed. These sequence-specific effects have been reported in the literature but may also represent cases where the model errs in the effect’s direction. We discuss these new results in the revised text.

      (3) Assessing BOS sequence edits to achieve tissue-specific splicing: Here we decided to test whether BOS edits in intronic regions (at least 8 bases away from the nearest splice site) are important for the tissue-specific effect. The results are now included in Figure 6 Supp 1, clearly demonstrating that most of the neuronal-specific changes achieved by BOS were based on changing the introns, with a strong effect observed for both upstream and downstream intron edits.
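      For illustration, the motif-insertion scan from point (2) above could be sketched as follows; the `predict_dpsi` wrapper is purely hypothetical (it stands in for a call to the trained model) and is not part of the released code:

      ```python
      import numpy as np

      def motif_insertion_scan(seq, motif, predict_dpsi, n_random=5, seed=0):
          """At each position, insert the motif and contrast its effect with the
          mean effect of random sequences of the same length."""
          rng = np.random.default_rng(seed)
          w = len(motif)
          baseline = predict_dpsi(seq)
          effects = []
          for pos in range(len(seq) - w + 1):
              with_motif = predict_dpsi(seq[:pos] + motif + seq[pos + w:]) - baseline
              random_effects = [
                  predict_dpsi(seq[:pos] + "".join(rng.choice(list("ACGT"), w)) + seq[pos + w:]) - baseline
                  for _ in range(n_random)
              ]
              effects.append(with_motif - float(np.mean(random_effects)))
          return np.array(effects)
      ```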

      (2) The transformer encoders in TrASPr model sequences with a rather limited sequence size of 200 bp; therefore, for long introns, the model will not have good coverage of the intronic sequence. This is not expected to be an issue for exons.

      The reviewer is raising a good question here. On one hand, one may hypothesize that, as the reviewer seems to suggest, TrASPr may not do well on long introns as it lacks the full intronic sequence.

      Conversely, one may also hypothesize that for long introns, where the flanking exons are outside the window of SpliceAI/Pangolin, TrASPr may have an advantage.

      Given this good question and a related one by Reviewer #2, we divided prediction accuracy by intron length and the alternative exon length.

      For short exons (<100 bp) we find TrASPr and Pangolin perform similarly, but for longer exons, especially those >200 bp, TrASPr’s results are better. When dividing samples by the total length of the upstream and downstream introns, we find TrASPr outperforms all other models for introns of combined length up to 6K, but Pangolin gets better results when the combined intron length is over 10K. This latter result is interesting as it means that, contrary to the second hypothesis laid out above, Pangolin’s performance did not degrade for events where the flanking exons were outside its field of view. We note that all of the above holds whether we assess all events or just cases of tissue-specific changes. It is interesting to think about the mechanistic causes for this. For example, it is possible that cassette exons involving very long introns evoke a different splicing mechanism where the flanking exons are not as critical, and/or there is more signal in the introns which is missed by TrASPr. We now include these new results as Figure 2 - Supp 1,2 and discuss them in the main text.

      (3) In the context of sequence design, creating a desired tissue- or condition-specific effect would likely require disrupting or creating motifs for splicing regulatory proteins. In your experiments for neuronal-specific Daam1 exon 16, have you seen evidence for that? Most of the edits are close to splice junctions, but a few are further away.

      That is another good question. Regarding Daam1 exon 16, in the original paper describing the mutation locations, some motif similarities to PTB (CU) and CUG/Mbnl-like elements were noted (Barash et al., Nature 2010). To explore this question beyond this specific case, we assessed the importance of the intronic edits made by BOS to achieve a tissue-specific splicing profile (see above).

      (4) For sequence design of a tissue- or condition-specific effect in the neuronal-specific Daam1 exon 16, the upstream exonic splice junction had the most sequence edits. Is that a general observation? How about the relative importance of the four transformer regions in TrASPr prediction performance?

      This is another excellent question. Please see new experiments described above for RBP positional effect and BOS edits in intronic regions which attempt to give at least partial answers to these questions. We believe a much more systematic analysis can be done to explore these questions but such evaluation is beyond the scope of this work.

      (5) The idea of lightweight transformer models is compelling, and is widely applicable. It has been used elsewhere. One paper that came to mind in the protein realm:

      Singh, Rohit, et al. "Learning the language of antibody hypervariability." Proceedings of the National Academy of Sciences 122.1 (2025): e2418918121.

      We certainly do not claim that this approach of using lighter, dedicated models instead of a large ‘foundation’ model has not been taken before. We believe Singh et al., mentioned above, represents a somewhat different approach, where their model (AbMAP) fine-tunes large general protein foundation models (PLMs) for antibody-sequence inputs by supervising on antibody structure and binding specificity examples. We added a description of this modeling approach, citing the above work and another one which specifically handles RNA splicing (intron retention, PMID: 39792954).

      Reviewer #2 (Public review):

      Summary:

      The authors present a transformer-based model, TrASPr, for the task of tissue-specific splicing prediction (with experiments primarily focused on the case of cassette exon inclusion) as well as an optimization framework (BOS) for the task of designing RNA sequences for desired splicing outcomes.

      For the first task, the main methodological contribution is to train four transformer-based models on the 400bp regions surrounding each splice site, the rationale being that this is where most splicing regulatory information is. In contrast, previous work trained one model on a long genomic region. This new design should help the model capture more easily interactions between splice sites. It should also help in cases of very long introns, which are relatively common in the human genome.

      TrASPr's performance is evaluated in comparison to previous models (SpliceAI, Pangolin, and SpliceTransformer) on numerous tasks including splicing predictions on GTEx tissues, ENCODE cell lines, RBP KD data, and mutagenesis data. The scope of these evaluations is ambitious; however, significant details on most of the analyses are missing, making it difficult to evaluate the strength of the evidence. Additionally, state-of-the-art models (SpliceAI and Pangolin) are reported to perform extremely poorly in some tasks, which is surprising in light of previous reports of their overall good prediction accuracy; the reasoning for this lack of performance compared to TrASPr is not explored.

      In the second task, the authors combine Latent Space Bayesian Optimization (LSBO) with a Transformer-based variational autoencoder to optimize RNA sequences for a given splicing-related objective function. This method (BOS) appears to be a novel application of LSBO, with promising results on several computational evaluations and the potential to be impactful on sequence design for both splicing-related objectives and other tasks.

      We thank Reviewer #2 for this detailed summary and positive view of our work. It seems the main issue raised in this summary regards the evaluations: the reviewer finds details of the evaluations missing and finds it surprising that SpliceAI and Pangolin perform poorly on some of the tasks. We made a concerted effort to include the required details, including code and data tables. In short, some of the concerns were addressed by adding evaluations, some by clarifying missing details, and some by better explaining the settings where Pangolin and SpliceAI may excel vs. settings where they may not do as well. More details are given below.

      Strengths:

      (1) A novel machine learning model for an important problem in RNA biology with excellent prediction accuracy.

      (2) Instead of being based on a generic design as in previous work, the proposed model incorporates biological domain knowledge (that regulatory information is concentrated around splice sites). This way of using inductive bias can be important to future work on other sequence-based prediction tasks.

      Weaknesses:

      (1) Most of the analyses presented in the manuscript are described in broad strokes and are often confusing. As a result, it is difficult to assess the significance of the contribution.

      We made an effort to make the tasks specific and detailed, including making the corresponding code and data available. We believe this helped improve clarity in the revised version.

      (2) As more and more models are being proposed for splicing prediction (SpliceAI, Pangolin, SpliceTransformer, TrASPr), there is a need for establishing standard benchmarks, similar to those in computer vision (ImageNet). Without such benchmarks, it is exceedingly difficult to compare models. For instance, Pangolin was apparently trained on a different dataset (Cardoso-Moreira et al. 2019), and using a different processing pipeline (based on SpliSER) than the ones used in this submission. As a result, the inferior performance of Pangolin reported here could potentially be due to subtle distribution shifts. The authors should add a discussion of the differences in the training set, and whether they affect your comparisons (e.g., in Figure 2). They should also consider adding a table summarizing the various datasets used in their previous work for training and testing. Publishing their training and testing datasets in an easy-to-use format would be a fantastic contribution to the community, establishing a common benchmark to be used by others.

      There are several good points to unpack here. Starting from the last one, we very much agree that a standard benchmark would be useful. For tissue-specific splicing quantification we used the GTEx dataset, from which we selected six representative human tissues (heart, cerebellum, lung, liver, spleen, and EBV-transformed lymphocytes). In total, we collected 38,394 cassette exon events quantified across 15 samples (here a ‘sample’ is a cassette exon quantified in two tissues) from the GTEx dataset with high-confidence quantification of their PSIs based on MAJIQ. A detailed description of how this data was derived is now included in the Methods section, and the data itself is made available via the bitbucket repository with the code.

      Next, regarding the usage of different data and distribution shifts for Pangolin: the reviewer is right to note there are many differences between how Pangolin and TrASPr were trained. This makes it hard to determine whether the improvements we saw are not just a result of different training data/labels. To address this issue, we first tried to fine-tune the pre-trained Pangolin with MAJIQ’s PSI dataset: we used the subset of the GTEx dataset described above, focusing on the three tissues analyzed in Pangolin’s paper (heart, cerebellum, and liver) for a fair comparison. In total, we obtained 17,218 events, and we followed the same training and test split as reported in the Pangolin paper. We obtained a Pearson correlation of 0.78 and a Spearman correlation of 0.68, values similar to what we got without this extra fine-tuning. Next, we retrained Pangolin from scratch with the full set of tissues and the training set used for TrASPr, which was derived from MAJIQ’s quantifications. Since our model is trained only on human data, with 6 tissues at the same time, we modified Pangolin from its original 4 splice-site-usage outputs to 6 PSI outputs. We tried taking the sequence centered on either the first or the second splice site of the middle exon. This test resulted in low performance (3’ SS: Pearson 0.21; 5’ SS: 0.26).

      The above tests are obviously not exhaustive but their results suggest that the differences we observe are unlikely to be driven by distribution shifts. Notably, the original Pangolin was trained on much more data (four species, four tissues each, and sliding windows across the entire genome). This training seems to be important for performance while the fact we switched from Pangolin’s splice site usage to MAJIQ’s PSI was not a major contributor. Other potential reasons for the improvements we observed include the architecture, target function, and side information (see below) but a complete delineation of those is beyond the scope of this work. 

      (3) Related to the previous point, as discussed in the manuscript, SpliceAI, and Pangolin are not designed to predict PSI of cassette exons. Instead, they assign a "splice site probability" to each nucleotide. Converting this to a PSI prediction is not obvious, and the method chosen by the authors (averaging the two probabilities (?)) is likely not optimal. It would be interesting to see what happens if an MLP is used on top of the four predictions (or the outputs of the top layers) from SpliceAI/Pangolin. This could also indicate where the improvement in TrASPr comes from: is it because TrASPr combines information from all four splice sites? Also, consider fine-tuning Pangolin on cassette exons only (as you do for your model).

      Please see the above response. We did not investigate more sophisticated models that adjust Pangolin’s architecture further as such modifications constitute new models which are beyond the scope of this work.

      (4) L141, "TrASPr can handle cassette exons spanning a wide range of window sizes from 181 to 329,227 bases - thanks to its multi-transformer architecture." This is reported to be one of the primary advantages compared to existing models. Additional analysis should be included on how TrASPr performs across varying exon and intron sizes, with comparison to SpliceAI, etc.

      This was a good suggestion, related to another comment made by Reviewer #1. Please see above our response to them with a breakdown by exon/intron length.

      (5) L171, "training it on cassette exons". This seems like an important point: previous models were trained mostly on constitutive exons, whereas here the model is trained specifically on cassette exons. This should be discussed in more detail.

      Previous models were not trained exclusively on constitutive exons, and Pangolin specifically was trained with its version of junction usage across tissues. That said, the reviewer’s point (similar to ones made above) about the need for matched training/testing sets and about potential distribution shifts is valid. Please see the responses and evaluations described above.

      (6) L214, ablations of individual features are missing.

      These have now been added to the table, which we moved to the main text (see the table also below).

      (7) L230, "ENCODE cell lines", it is not clear why other tissues from GTEx were not included.

      Good question. The task here was to assess predictions in unseen conditions, hence we opted to test on completely different data from human cell lines rather than additional tissue samples. Following the reviewer’s suggestion, we also evaluated predictions on two additional GTEx tissues, Cortex and Adrenal Gland. These new results, as well as the previous ones for ENCODE, were updated to use the PCA-based embedding of the “RBP state” as described above. We also compared the predictions using the PCA-based embedding of the “RBP state” to training directly on data (not the test data, of course) from these tissues. See the updated Figure 3a,b and Figure 3 Supp 1,2.

      (8) L239, it is surprising that SpliceAI performs so badly, and might suggest a mistake in the analysis. Additional analysis and possible explanations should be provided to support these claims. Similarly, the complete failure of SpliceAI and Pangolin is shown in Figure 4d.

      Line 239 refers to predicting relative inclusion levels between competing 3’ and 5’ splice sites. We admit we too expected this to be better for SpliceAI and Pangolin, but we were not able to find bugs in our analysis (which is all made available for readers and reviewers alike). Regarding this expectation to perform better, we first note that we are not aware of a similar assessment being done for either of those algorithms (i.e. relative inclusion for 3’ and 5’ alternative splice site events). Instead, our initial expectation, and likely the reviewer’s as well, was based on their detection of splice site strengthening/weakening due to mutations, including cryptic splice site activation. More generally though, it is worth noting in this context that, given how SpliceAI, Pangolin and other algorithms have been presented in papers/media/scientific discussions, we believe there is a potential misperception regarding tasks that SpliceAI and Pangolin excel at vs. other tasks where they should not necessarily be expected to excel. Both algorithms focus on cryptic splice site creation/disruption. This has been the focus of those papers and subsequent applications. While Pangolin added tissue specificity to SpliceAI training, the authors themselves admit “...predicting differential splicing across tissues from sequence alone is possible but remains a considerable challenge and requires further investigation”. The actual performance on this task is not included in Pangolin’s main text, but we refer Reviewer #2 to supplementary figure S4 in the Pangolin manuscript to get a sense of Pangolin’s reported performance on this task. Similar to that, Figure 4d in our manuscript is for predicting ‘tissue-specific’ regulators. We do not think it is surprising that SpliceAI (tissue agnostic) and Pangolin (a slight improvement over SpliceAI in tissue-specific predictions) do not perform well on this task. Similarly, we do not find the results in Figure 4c surprising either. These are for mutations that slightly alter the inclusion level of an exon, not something SpliceAI was trained on - SpliceAI was trained on genomic splice sites with yes/no labels across the genome. As noted elsewhere in our response, re-training Pangolin on this mutagenesis dataset results in performance much closer to that of TrASPr. That is to be expected as well - Pangolin is constructed to capture changes in PSI (or splice site usage as defined by the authors); those changes are not even tissue-specific for the CD19 data, and the model has no problem generalizing from the training set just like TrASPr does. In fact, if one only uses combinations of known mutations seen during training, a simple regression model gives a correlation of ~92-95% (Cortés-López et al 2022). In summary, we believe that a better understanding of what one can realistically expect from models such as SpliceAI, Pangolin, and TrASPr will go a long way toward having them better understood and used effectively. We have tried to make this clearer in the revision.

      (9) BOS seems like a separate contribution that belongs in a separate publication. Instead, consider providing more details on TrASPr.

      We thank the reviewer for the suggestion. We agree those are two distinct contributions/algorithms and we indeed considered having them as two separate papers. However, there is strong coupling between the design algorithm (BOS) and the predictor that enables it (TrASPr). This coupling is both conceptual (TrASPr as a “teacher”) and practical in terms of evaluations. While we use experimental data (experiments done involving Daam1 exon 16, CD19 exon 2) we still rely heavily on evaluations by TrASPr itself. A completely independent evaluation would have required a high-throughput experimental system to assess designs, which is beyond the scope of the current paper. For those reasons we eventually decided to make it into what we hope is a more compelling combined story about generative models for prediction and design of RNA splicing.

      (10) The authors should consider evaluating BOS using Pangolin or SpliceTransformer as the oracle, in order to measure the contribution to the sequence generation task provided by BOS vs TrASPr.

      We can definitely see the logic behind trying BOS with different predictors. That said, as we note above, most of BOS’s evaluations are based on the “teacher”. As such, it is unclear what value replacing the teacher would bring. We also note that, given this limitation, we focus mostly on evaluations in comparison to existing approaches (a genetic algorithm or random mutations as a strawman).

      Recommendations for the authors: 

      Reviewer #1 (Recommendations for the authors):

      Additional comments:

      (1) Is your model picking up transcription factor binding sites in addition to RBPs? TFs have been recently shown to have a role in splicing regulation:

      Daoud, Ahmed, and Asa Ben-Hur. "The role of chromatin state in intron retention: A case study in leveraging large scale deep learning models." PLOS Computational Biology 21.1 (2025): e1012755.

      We agree this is an interesting point to explore, especially given the series of works from Ben-Hur’s group. We note, though, that these works focus on intron retention (IR), which we have not focused on here, and that we only cover short intronic regions flanking the exons. We leave this as a future direction, as we believe the scope of this paper is already quite extensive.

      (2) SpliceNouveau is a recently published algorithm for the splicing design problem:

      Wilkins, Oscar G., et al. "Creation of de novo cryptic splicing for ALS and FTD precision medicine." Science 386.6717 (2024): 61-69.

      Thank you for pointing out Wilkins et al.’s recent publication; we now refer to it as well.

      (3) Please discuss the relationship between your model and this deep learning model. You will also need to change the following sentence: "Since the splicing sequence design task is novel, there are no prior implementations to reference."

      We revised this statement and now refer to several recent publications that propose similar design tasks.  

      (4) I would suggest adding a histogram of PSI values - they appear to be mostly close to 1 or 0.

      PSI values are indeed typically close to either 0 or 1. This is a known phenomenon illustrated in previous studies of splicing (e.g. Shen et al., NAR 2012). We are not entirely sure what is meant by the comment to add a histogram, but we made sure to point this out in the main text:

      “...Still, those statistics are dominated by extreme values, such that 33.2\% are smaller than 0.15 and 56.0\% are higher than 0.85. Furthermore, most cassette exons do not change between a given tissue pair (only 14.0\% of the samples in the dataset, \ie a cassette exon measured across two tissues, exhibit |ΔΨ| ≥ 0.15).”

      (5) Part of the improvement of TrASPr over Pangolin could be the result of a more extensive dataset.

      Please see above responses and new analysis.

      (6) In the discussion of the roles of alternative splicing, protein diversity is mentioned, but I suggest you also mention the importance of alternative splicing as a regulatory mechanism:

      Lewis, Benjamin P., Richard E. Green, and Steven E. Brenner. "Evidence for the widespread coupling of alternative splicing and nonsense-mediated mRNA decay in humans." Proceedings of the National Academy of Sciences 100.1 (2003): 189-192.

      Thank you for the suggestion. We added that point and citation. 

      (7) Line 96: You use dPSI without defining it (although quite clear that it should be Delta PSI).

      Fixed.

      (8) Pretrained transformers: Have you trained separate transformers on acceptor and donor sites, or a single splice junction transformer?

      Single splice junction pre-training.

      (9) "TrASPr measures the probability that the splice site in the center of Se is included in some tissue" - that's not my understanding of what TrASPr is designed to do.

      We revised the above sentence to make it more precise: “Given a genomic sequence context S_e = (s_1, ..., s_n), made of a cassette exon e and flanking intronic/exonic regions, TrASPr predicts for tissue c the fraction of transcripts where exon e is included or skipped over, Ψ_{e,c}, and the change between tissues c and c’, ΔΨ_{e,c,c’}.”

      (10) Please include the version of the human genome annotations that you used. 

      We used GENCODE v40 annotations with the human genome hg38; this is now included in the Data section.

      (11) I did not see a description of the RBP-AE component in the methods section. A bit more detail on the model would be useful as well.

      Please see above details about replacing RBP-AE with a simpler linear PCA “RBP-State” encoding. We added details about how the PCA was performed to the Methods section.

      (12) Typos, grammar:

      -   Fix the following sentence: ATP13A2, a lysosomal transmembrane cation transporter, linked to an early-onset form of Parkinson's Disease (PD) when 306 loss-of-function mutations disrupt its function.

      The sentence was fixed to now read: “The first example is of a brain cerebellum-specific cassette exon skipping event predicted by TrASPr in the ATP13A2 gene (aka PARK9). ATP13A2 is a lysosomal transmembrane cation transporter, for which loss-of-function mutations have been linked to early-onset Parkinson’s Disease (PD)”.

      -   Line 501: "was set to 4e−4"(the - is a superscript). 

      Fixed

      -   A couple of citations are missing in lines 580 and 581.

      Thank you for catching this error. Citations in line 580, 581 were fixed.

      (13) Paper title: Generative modeling for RNA splicing predictions and design - it would read better as "Generative modeling for RNA splicing prediction and design", as you are solving the problems of splicing prediction and splicing design.  

      Thank you for the suggestion. We updated the title and removed the plural form.

      Reviewer #2 (Recommendations for the authors):

      (1) Appendices are not very common in biology journals. It is also not clear what purpose the appendix serves exactly - it seems to repeat some of the things said earlier. Consider merging it into the methods or the main text. 

      We merged the appendices into the Methods section and removed redundancy.

      (2) L112, "For instance, the model could be tasked with designing a new version of the cassette exon, restricted to no more than N edit locations and M total base changes." How are N and M different? Is there a difference between an edit location and a base change? 

      Yes, N is the number of edit locations (one can think of each as the start position of an edit of some length, e.g. a SNP is of length 1), while M is the total number of positions edited. The text now reads “For instance, the model could be tasked with designing a new version of the cassette exon, restricted to no more than $N$ edit locations (\ie start positions of one or more consecutive bases) and $M$ total base changes.”

      (3) L122: "DEN was developed for a distinct problem". What prevents one from adapting DEN to your sequence design task? The method should be generic. I do not see what "differs substantially" means here. (Finally, wasn't DEN developed for the task you later refer to as "alternative splice site" (as opposed to "splice site selection")? Use consistent terminology. And in L236 you use "splice site variation" - is that also the same?).

      Indeed, our original description was not clear/precise enough. DEN was designed and trained for two tasks: APA and 5’ alternative splice site usage. The terms “selection”, “usage”, and “variation” were indeed used interchangeably in different locations, and the reviewer was right to note the lack of precision. We have now revised the text to make sure the term “relative usage” is used.

      Nonetheless, we hold that DEN was indeed defined for different tasks. See Figures 2A and 6A of Linder et al. 2020 (the reference was also incorrect, as we cited the preprint and not the final paper):

      In both cases DEN is trying to optimize a short region for selecting an alternative PA site (left) or a 5’ splice site (right). This work focused on an MPRA dataset of short synthetic sequences inserted in the designated region for train/test. We hold this is indeed a different type of data and task than the one we focus on here. Yes, one can potentially adapt DEN to our task, but this is beyond the scope of this paper. Finally, we note that a more closely related algorithm recently proposed is Ledidi (Schreiber et al 2025), which was posted as a pre-print. Similar to BOS, Ledidi tries to optimize a given sequence and adapt it with a few edits for a given task. Regardless, we updated the main text to make the differences between DEN and the task we define here for BOS clearer, and we also added a reference to Ledidi and other recent works in the Discussion section.

      (4) L203, exons with DeltaPSI very close to 0.15 are going to be nearly impossible to classify (or even impossible, considering that the DeltaPSI measurements are not perfect). Consider removing such exons to make the task more feasible.

      Yes, this is how it was done. As described in more detail below, we defined changing samples as those where the change in PSI was >= 0.15 and non-changing samples as those where the change in PSI was < 0.05, to avoid ambiguous cases affecting the classification task.

      (5) L230, RBP-AE is not explained in sufficient detail (and does not appear in the methods, apparently). It is not clear how exactly it is trained on each new cellular condition.

      Please see the response in the opening of this document and Q11 from Reviewer #1.

      (6) L230, "significantly improving": the r value actually got worse; it is therefore not clear you can claim any significant improvement. Please mention that fact in the text.

      This is a fair point. We note that we view the “a” statistic as potentially more interesting/relevant here, as the Pearson “r” is dominated by points being generally close to 0/1. Regardless, revisiting this we realized one can also argue that the term “significant” is imprecise/misplaced, since no statistical test is done here (side note: given the number of points, a simple null of same distribution yes/no would pass significance, but we don’t think this is an interesting/relevant test here). Also, we note that with the transition to PCA instead of RBP-AE we actually get improvements in both the a and r values, both for the ENCODE samples shown in Figure 3a and for the two new GTEx tissues we tested (see above). We have now changed the text to simply state:

      “...As shown in Figure 3a, this latent space representation allows TrASPr to generalize from the six GTEx tissues to unseen conditions, including unseen GTEx tissues (top row) and ENCODE cell lines (bottom row). It improves prediction accuracy compared to TrASPr lacking the PCA (e.g. a=88.5% vs a=82.3% for ENCODE cell lines), though naturally training on the additional GTEx and ENCODE conditions can lead to better performance (e.g. a=91.7% for ENCODE, Figure 3a left column).”

      (7) L233, "Notably, previous splicing codes focused solely on cassette exons", Rosenberg et al. focused solely on alternative splice site choice.

      Right - we removed that sentence.

      (8) L236, "trained TrASPr on datasets for 3' and 5' splice site variations". Please provide more details on this task. What is the input to TrASPr and what is the prediction target (splice site usage, PSI of alternative isoforms)? What datasets are used for this task?

      The data for this task was the same processed GTEx tissue data, just for alternative 3’ and 5’ splice site events. We revised the description of this task in the main text and added information to the Methods section. The data is also included in the repo.

      (9) L243, "directly from genomic sequences", and conservation?

      Yes, we changed the sentence to read “...directly from genomic sequences combined with related features” 

      (10) L262, what is the threshold for significant splicing changes?

      The threshold is 0.15. We updated the main text to read as follows:

      The total number of mutations hitting each of the 1198 genomic positions across the 6106 sequences is shown in \FIG{mut_effect}b (left), while the distribution of effects ($|\Delta \Psi|$) observed across those 6106 samples is shown in \FIG{mut_effect}b (right). To this data we applied three testing schemes. The first is a standard 5-fold CV where 20\% of the combinations of point mutations were hidden in every fold, while the second test involved 'unseen mutation' (UM), where we hide any sample that includes mutations in specific positions, for a total of 1480 test samples. As illustrated by the CDF in \FIG{mut_effect}b, most samples (each sample may involve multiple mutated positions) do not involve significant splicing changes. Thus, we also performed a third test using only the 883 samples where mutations cause significant changes ($|\Delta \Psi| \geq 0.15$).
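      As an illustration of the 'unseen mutation' split described above, the following generic sketch assigns any sample carrying a mutation at one of the held-out positions to the test set; `sample_positions` (a list of sets of mutated positions, one per sample) and `held_out` are placeholder names for this example only:

      ```python
      def unseen_mutation_split(sample_positions, held_out):
          """Split samples so that mutations at held-out positions never appear in training."""
          held_out = set(held_out)
          test_idx = [i for i, pos in enumerate(sample_positions) if pos & held_out]
          train_idx = [i for i, pos in enumerate(sample_positions) if not (pos & held_out)]
          return train_idx, test_idx
      ```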

      (11) L266, Pangolin performance is only provided for one of the settings (and it is not clear which). Please provide details of its performance in all settings.

      The description was indeed not clear. Pangolin’s performance was similar to SpliceAI’s, as mentioned above, but retraining it on the CD19 data yielded performance much closer to TrASPr’s. We include all the matching tests for Pangolin after retraining in Figure 4 Supp 1.

      (12) Please specify "n=" in all relevant plots. 

      Fixed.

      (13) Figure 3a, "The tissues were first represented as tokens, and new cell line results were predicted based on the average over conditions during training." Please explain this procedure in more detail. What are these tokens and how are they provided to the model? Are the cell line predictions the average of the predictions for the training tissues?

      Yes, as a baseline to assess improvements we simply used the average over the predictions for the training tissues for that specific event (see related work pointing to the need for such baselines in DL for genomics: https://pubmed.ncbi.nlm.nih.gov/33213499/). Regarding the tokens - we encode each tissue type as a possible value and feed the two tissues as two tokens to the transformer.
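
      As a rough illustration of both points, a minimal sketch (the PSI matrix and the tissue names/ids are hypothetical placeholders, not the actual GTEx data or the model's token vocabulary):

        import numpy as np

        # Hypothetical PSI matrix: rows = cassette events, columns = the six training tissues.
        psi_train = np.random.rand(1000, 6)

        # Baseline for an unseen condition: the per-event average over the training tissues.
        baseline_pred = psi_train.mean(axis=1)

        # Condition tokens: each tissue/cell line is encoded as an integer id; the two
        # conditions of a sample are fed to the transformer as two extra tokens.
        tissue_vocab = {"brain": 0, "heart": 1, "liver": 2, "lung": 3, "muscle": 4, "spleen": 5}
        condition_tokens = (tissue_vocab["brain"], tissue_vocab["liver"])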

      (14) Figure 4b, the total count in the histogram is much greater than 6106. Please explain the dataset you're using in more detail, and what exactly is shown here.

      We updated the text to read: 

      “...we used 6106 sequence samples where each sample may have multiple positions mutated (\ie mutation combinations) in exon 2 of CD19 and its flanking introns and exons (Cortes et al 2022). The total number of mutations hitting each of the 1198 genomic positions across the 6106 sequences is shown in Figure 4b (left).”

      (15) Figure 5a, how are the prediction thresholds (TrASPr passed, TrASPr stringent, and TrASPr very stringent) defined?

      Passed: dPSI > 0.1; Stringent: dPSI > 0.15; Very stringent: dPSI > 0.2. This is now included in the main text.

      (16) L417, please include more detail on the relative size of TrASPr compared to other models (e.g. number of parameters, required compute, etc.).

      SpliceAI is a general-purpose splicing predictor with a 32-layer deep residual neural network to capture long-range dependencies in genomic sequences. Pangolin is a deep learning model specifically designed for predicting tissue-specific splicing, with a similar architecture to SpliceAI. The implementation of SpliceAI found here https://huggingface.co/multimolecule/spliceai involves an ensemble of 5 such models for a total of ~3.5M parameters. TrASPr has 4 BERT transformers (each 6 layers and 12 heads) and an MLP on top of those, for a total of ~189M parameters. Evo 2, a genomic ‘foundation’ model, has 40B parameters, DNABERT has ~86M (a single BERT with 12 layers and 12 heads), and Borzoi has 186M parameters (as stated in https://www.biorxiv.org/content/10.1101/2025.05.26.656171v2). We note that the difference here is not just in model size but also in the amount of data used to train each model. We edited the original L417 to reflect that.
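
      As a rough sanity check on the parameter counts above, encoder size can be approximated from layer count and hidden width alone (the hidden size of 768 and the DNABERT-style 6-mer vocabulary are assumptions for illustration, not numbers taken from the manuscript):

        def bert_encoder_params(layers, hidden, vocab):
            # ~12*H^2 weights per encoder layer (attention ~4*H^2 + feed-forward ~8*H^2,
            # biases ignored) plus the token embedding table.
            return layers * 12 * hidden ** 2 + vocab * hidden

        # Four 6-layer BERTs with assumed hidden size 768 and a ~4^6 k-mer vocabulary.
        total = 4 * bert_encoder_params(layers=6, hidden=768, vocab=4 ** 6 + 5)
        print(f"~{total / 1e6:.0f}M parameters before the MLP head")  # on the order of 180M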

      (17) L546, please provide more detail on the VAE. What is the dimension of the latent representation?

      We added more details in the Methods section like the missing dimension (256) and definitions for P(Z) and P(S). 

      (18) Consider citing (and possibly comparing BOS to) Ghari et al., NeurIPS 2024 ("GFlowNet Assisted Biological Sequence Editing").

      Added.

      (19) Appendix Figure 2, and corresponding main text: it is not clear what is shown here. What is dPSI+ and dPSI-? What pairs of tissues are you comparing? Spearman correlation is reported instead of Pearson, which is the primary metric used throughout the text.

      The dPSI+ and dPSI- sets were indeed not well defined in the original submission. Moreover, we found our own code lacked consistency due to different tests executed at different times/by different people. We apologize for this lack of consistency and clarity, which we worked to remedy in the revised version. To answer the reviewer’s question, given two tissues ($c_1, c_2$), dPSI+ and dPSI- refer to correctly classifying the exons that are significantly differentially included or excluded. Specifically, differentially included exons are those for which $\Delta \Psi_{e,c_1,c_2} = \Psi_{e,c_1} - \Psi_{e,c_2} \geq 0.15$, compared to those that are not ($\Delta \Psi_{e,c_1,c_2} < 0.05$). Similarly, dPSI- is for correctly classifying the exons that are significantly differentially excluded in the first tissue or included in the second tissue ($\Delta \Psi_{e,c_1,c_2} = \Psi_{e,c_1} - \Psi_{e,c_2} \leq -0.15$), compared to those that are not ($\Delta \Psi_{e,c_1,c_2} > -0.05$). This means dPSI+ and dPSI- depend on the order of $c_1, c_2$. In addition, we also define a direction/order agnostic test for changing vs non-changing events, i.e. $|\Delta \Psi_{e,c_1,c_2}| \geq 0.15$ vs $|\Delta \Psi_{e,c_1,c_2}| < 0.05$. These test definitions are consistent with previous publications (e.g. Barash et al Nature 2010, Jha et al 2017) and also answer different biological questions: for example, “Exons that go up in brain” and “Exons that go up in liver” can reflect distinct mechanisms, while changing exons capture a model’s ability to identify regulated exons even if the direction of prediction may be wrong. The updated Appendix Figure 2 is now in the main text as Figure 2d and uses Pearson, while AUPRC and AUROC refer to the changing vs non-changing classification task described above, such that we avoid dPSI+ and dPSI- when summarizing over the 3 tissue pairs in this table. Finally, we note that making sure all tests comply with the above definitions also resulted in an update to the Figure 2b/c labels and values, where TrASPr’s improvement over Pangolin reaches up to 1.8-fold in AUPRC compared to 2.4-fold in the earlier version. We again apologize for the lack of clarity and consistent evaluations in the original submission.
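
      To make the three classification tasks concrete, a small sketch of how events could be labeled from PSI values (the PSI arrays are hypothetical; the thresholds follow the definitions above):

        import numpy as np

        # Hypothetical PSI values for the same cassette events quantified in tissues c1 and c2.
        psi_c1 = np.array([0.90, 0.40, 0.55, 0.10])
        psi_c2 = np.array([0.60, 0.45, 0.80, 0.05])
        dpsi = psi_c1 - psi_c2

        # dPSI+: significantly more included in c1, versus the clearly unchanged background set.
        dpsi_plus_positives = dpsi >= 0.15
        dpsi_plus_negatives = dpsi < 0.05

        # dPSI-: significantly more included in c2 (excluded in c1), versus its background set.
        dpsi_minus_positives = dpsi <= -0.15
        dpsi_minus_negatives = dpsi > -0.05

        # Direction/order-agnostic task: changing vs non-changing events.
        changing = np.abs(dpsi) >= 0.15
        non_changing = np.abs(dpsi) < 0.05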

      (20) Minor typographical comments:

      -   Some plots could use more polishing (e.g., thicker stroke, bigger font size, consistent style (compare 4a to the other plots)...).

      Agreed. While not critical for the science itself, we worked to improve figure polishing in the revision to make the figures more readable and pleasant. 

      -   Consider using 2-dimensional histograms instead of the current kernel density plots, which tend to over-smooth the data and hide potentially important details. 

      We were not sure what the exact suggestion is here and opted to leave the plots as is.

      -   L53: dPSI_{e, c, c'} is never formally defined. Is it PSI_{e, c} - PSI_{e, c'} or vice versa?  

      Definition now included (see above).

      -   L91: Define/explain "transformer" and provide reference. 

      We added an explanation of the transformer architecture with a related reference in the Introduction section, and of BERT in the Methods section.  

      -   L94: exons are short. Are you referring here to the flanking introns? Please explain. 

      We apologize for the lack of clarity. We are referring to a cassette exon alternative splicing event as commonly defined by the splice junctions involved, that is, from the 5’ SS of the upstream exon to the 3’ SS of the downstream exon. The text now reads:

      “...In contrast, 24% of the cassette exons analyzed in this study span a region between the flanking exons' upstream 3' and downstream 5' splice sites that is larger than 10 kb.”

      -   L132: It's unclear whether a single, shared transformer or four different transformers (one for each splice site) are being pre-trained. One would at least expect 5' and 3' splice sites to have a different transformer. In Methods, L506, it seems that each transformer is pre-trained separately. 

      We updated the text to read:

      “We then center a dedicated transformer around each of the splice sites of the cassette exon and its upstream and downstream (competing) exons (four separate transformers for four splice sites in total).”

      -   L471: You explain here that it is unclear what tasks 'foundation' models are good for. Also in L128, you explain that you are not using a 'foundation' model. But then in L492, you describe the BERT model you're using as a foundation model! 

      Line 492 was simply a poor choice of wording as “foundation” is meant here simply as the “base component”. We changed it accordingly.

      -   L169, "pre-training ... BERT", explain what exactly this means. Is it using masking? Is it self-supervised learning? How many splice sites do you provide? Also explain more about the BERT architecture and provide references. 

      We added more details about the BERT architecture and training in the Methods section.

      -   L186 and later, the values for a and r provided here and in the below do not correspond to what is shown in Figure 2. 

      Fixed, thank you for noticing this.

      -   L187,188: What exactly do you mean by "events" and "samples"? Are they the same thing? If so, are they (exon, tissue) pairs? Please use consistent terminology. Moreover, when you say "changing between two conditions": do you take all six tissues whenever there is a 0.15 spread in PSI among them? Or do you take just the smallest PSI tissue and the largest PSI tissue when there is a 0.15 spread between them? Or something else altogether?

      Reviewer #2 is yet again correct that the definitions were not precise. A “sample” involves a specific exon skipping “event” measured in two tissues.  The text now reads: 

      “....most cassette exons do not change between a given tissue pair (only 14.0% of the samples in the dataset, i.e., a cassette exon measured across two tissues, exhibit |∆Ψ| ≥ 0.15). Thus, when we repeat this analysis only for samples involving exons that exhibited a change in inclusion (|∆Ψ| ≥ 0.15) between at least two tissues, performance degrades for all three models, but the differences between them become more striking (Figure 2a, right column).”

      -   Figure 1a, explain the colors in the figure legend. The 3D effect is not needed and is confusing (ditto in panel C).

      Color explanation is now added: “exons and introns are shown as blue rectangles and black lines. The blue dashed line indicates the inclusive pattern and the red junction indicates an alternative splicing pattern.” 

      These are not 3D effects but stacks to indicate multiple events/cases. We agree these are not needed in Fig1a to illustrate types of AS and removed them. However, in Fig1c and the matching caption we use the stacks to indicate that HT data capture many such LSVs over which ML algorithms can be trained. 

      -   Figure 1b, this cartoon seems unnecessary and gives the wrong impression that this paper explores mechanistic aspects of splicing. The only relevant fact (RBPs serving as splicing factors) can be explained in the text (and is anyway not really shown in this figure).

      We removed Figure 1b cartoon.

      -   Figure 1c, what is being shown by the exon label "8"? 

      This was meant to convey exon ID, now removed to simplify the figure. 

      -   Figure 1e, left, write "Intron Len" in one line. What features are included under "..."? Based on the text, I did not expect more features.

      Also, the arrows emanating from the features do not make sense. Is "Embedding" a layer? I don't think so. Do not show it as a thin stripe. Finally, what are dPSI'+ and dPSI'-? are those separate outputs? are those logits of a classification task?

      We agree this description was not good and have updated it in the revised version. 

      -   Figure 1e, the right-hand side should go to a separate figure much later, when you introduce BOS.

      We appreciate the suggestion. However, we feel that Figure 1e serves as a visual representation of the entire framework. Just like we opted to not turn this work into two separate papers (though we fully agree it is a valid option that would also increase our publication count), we also prefer to leave this unified visual representation as is.

      -   Figure 2, does the n=2456 refer to the number of (exons, tissues) pairs? So each exon contributes potentially six times to this plot? Typo "approximately". 

      The “n” refers to the number of samples, where a sample is a cassette event measured in two tissues. The same cassette event may appear in multiple samples if it was confidently quantified in more than two tissues. We updated the caption to reflect this and corrected the typo.

      -   Figure 2b, typo "differentially included (dPSI+) or excluded" .

      Fixed.

      -   L221, "the DNABERT" => "DNABERT".

      Fixed.

      -   L232, missing percent sign.

      Fixed.

      -   L246, "see Appendix Section 2 for details" seems to instead refer to the third section of the appendix.

      We do not have this as an Appendix; the reference has been updated.

      -   Figure 3, bottom panels, PSI should be "splice site usage"? 

      PSI is correct here - we hope the revised text/definitions make it more clear now.

      -   Figure 3b: typo: "when applied to alternative alternative 3'".

      Fixed.

      -   p252, "polypyrimidine" (no capitalization).

      Fixed.

      -   Strange capitalization of tissue names (e.g., "Brain-Cerebellum"). The tissue is called "cerebellum" without capitalization.

      We used EBV (capital) for the abbreviation and lower case for the rest.

      -   Figure 4c: "predicted usage" on the left but "predicted PSI" on the right. 

      Right. We opted to leave it as is since Pangolin and SpliceAI do predict their definition of “usage” and not directly PSI; we just measure correlations to observed PSI, as many works have done in the past. 

      -   Figure 4 legend typo: "two three".

      Fixed.

      -   L351, typo: "an (unsupervised)" (and no need to capitalize Transformer).

      Fixed.

      -   L384, "compared to other tissues at least" => "compared to other tissues of at least".

      Fixed.

      -   L549, P(Z) and P(S) are not defined in the text.

      Fixed.

      -   L572, remove "Subsequently". Add missing citations at the end of the paragraph.

      Fixed.

      -   L580-581, citations missing.

      Fixed.

      -   L584-585, typo: "high confidince predictions"

      Fixed.

      -   L659-660, BW-M and B-WM are both used. Typo?

      Fixed.

      -   L895, "calculating the average of these two", not clear; please rewrite.

      Fixed.

      -   L897, "Transformer" and "BERT", do these refer to the same thing? Be consistent.  

      BOS is a transformer but not a BERT, while TrASPr uses the BERT architecture. BERT is a type of transformer, as the reviewer is surely well aware, so the sentence is correct. Still, to follow the reviewer’s recommendation for consistency/clarity we changed it here to state BERT.

      -   Appendix Figure 5: The term dPSI appears to be overloaded to also represent the difference between predicted PSI and measured PSI, which is inconsistent with previous definitions. 

      Indeed! We thank the reviewer again for their sharp eye and attention to details that we missed. We changed Supp Figure 5, now Figure 4 Supplementary Figure 2, to |PSI’-PSI| and defined this as the difference between TrASPr’s predictions (PSI’) and MAJIQ-based PSI quantifications.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      Liu et al. provided evidence of the interplay between endocytosis and VAMP8-mediated endocytic recycling of clathrin-mediated endocytosis (CME) cargo through a knockdown approach combined with total internal reflection fluorescence (TIRF) microscopy, western blotting, and functional assays in a mammalian cell line system. They demonstrated that VAMP8 depletion impairs the initial stages of CME, such as the initiation, stabilization, and invagination of clathrin-coated pits (CCPs). VAMP8 indirectly regulates CME by facilitating endocytic recycling. The depletion of VAMP8 shifts endosomal recycling, as shown here for the transferrin receptor, towards lysosomal degradation, thereby inhibiting clathrin-coated vesicle (CCV) formation. Overall, I found this study to be highly engaging because of its elucidation of the unexpected role of an R-SNARE in influencing the levels of cargo proteins within the context of clathrin-mediated endocytosis (CME). This MS will be helpful for researchers in the endocytosis and protein trafficking fields. It appears to me that VAMP8 interacts with multiple targets within the endo-lysosomal pathway, collectively influencing clathrin-mediated endocytosis (CME). Therefore, the contribution of lysosomes in this context should be evaluated. This matter should be addressed experimentally and discussed in the MS before publication is considered.

      Major comments:

      1. Figure 4D demonstrates that the knockdown of VAMP8 leads to an increase in lysosome numbers and lysosomal perinuclear clustering, as evidenced by LAMP1 staining (Figure 5A). Additionally, the knockdown of VAMP8 results in the downregulation of most surface receptors, as illustrated in Figure 3A, which typically follows the lysosomal degradation pathway. The observed reduction in TfR cargo could be attributable to the decreased presence of the Tfn receptor in siVAMP8-treated cells compared to that in control cells. How do the authors explain this phenomenon? Upon reviewing these observations, I suggest that the mechanism outlined in the manuscript (specifically, "Depletion of VAMP8 skews endosomal recycling of CME cargo, exemplified here by transferrin receptor, toward lysosomal degradation, thereby inhibiting CCV formation") may serve as a secondary rather than a primary cause. This can be ruled out by the following experiments:
        • Assessment of lysosomal biogenesis markers through RT-PCR or Western blotting following VAMP8 knockdown.
        • Assessment of transferrin receptor stability under VAMP8 knockdown conditions using cycloheximide.
        • Previous studies have indicated that perinuclear clustering of lysosomes is correlated with increased degradative activity. Therefore, assessing the lysosomal perinuclear index in the images presented in Figure 5A (LAMP1) effectively determines the presence or absence of this phenomenon.
      2. Given that VAMP8 is implicated in lysosomal fusion events, I hypothesized that VAMP8 undergoes degradation via the lysosomal pathway. However, Figure 4F indicates that there was no restoration of VAMP8 following leupeptin treatment. Could you please provide an explanation for this discrepancy, or is it trafficked to the proteasomal degradation pathway?
      3. Figure 5A and 5C demonstrate that the restoration of TfnR in siVAMP8 under leupeptin conditions was similar to the levels observed in the sicontrol without leupeptin. However, no enhancement in TfnR uptake (Figure 5F) was detected in cells treated with siVAMP8 under leupeptin treatment conditions. How can these observations be reconciled with each other?

      Minor comments:

      1. The manuscript does not provide details of the western blotting method and quantification criteria.
      2. Fig 1A & B - The siVAMP8 #1 blot indicates a reduction exceeding 90%, whereas the bar graph depicts a reduction of 70-80%. It is advisable to elucidate the quantification criteria in the Methods section to prevent potential confusion. Were the protein levels normalized to the loading control?
      3. Enhancing the readability of the CME analysis graphs could be achieved by labeling the Y-axis as either 'All CCP' or 'Bonafide CCP'.
      4. The legends of panels 1M and N do not correlate with the corresponding figures. Need corrections.
      5. Fig 4D- Is the technique employed for electron immunogold staining utilizing a lysosome-specific antibody? How do the authors substantiate their assertion that the darkly stained structures are lysosomes and not other cellular compartments?
      6. Electron micrographs of siVAMP8 cells revealed the presence of dark-stained bodies near the plasma membrane. The implications of this observation should be explained in the discussion section.
      7. Fig5A- Provide the color code for the merged images.
      8. Fig5G- schematic needs to be improved to demonstrate the contribution of increased lysosomal content.

      Significance

      VAMP8 is an R-SNARE critical for late endosome/lysosome fusion and regulates exocytosis, especially in immune and secretory cells. It pairs with Q-SNAREs to mediate vesicle fusion, and its dysfunction alters immunity, inflammation, and secretory processes. This study revealed that the SNARE protein VAMP8 influences clathrin-mediated endocytosis (CME) by managing the recycling of endocytic cargo rather than being directly recruited to clathrin-coated vesicles. This study advances our understanding of cellular trafficking mechanisms and underscores the essential role of recycling pathways in maintaining membrane dynamics. This is an excellent piece of work, and the experiments were designed meticulously; however, the mechanism is not convincing enough at this point. This MS will surely benefit the general audience, specifically the membrane and protein trafficking and cell biology community.

    1. yielding predictable outputs for given inputs. In contrast, machine learning systems are probabilistic architectures

      This is similar to coding specific rules: if parameter = x and value > y, then z.
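
      A minimal sketch of the contrast (the names, threshold, and probabilities are purely illustrative, not anything taken from the source):

        import random

        # Rule-based: the same input always yields the same output.
        def rule_based(parameter, value):
            return "z" if parameter == "x" and value > 10 else "other"

        # Probabilistic: the system outputs a distribution over answers, so the same input
        # can yield different results across samples (or across retrainings).
        def probabilistic(parameter, value):
            p_z = 0.9 if parameter == "x" and value > 10 else 0.2  # stand-in for learned probabilities
            return "z" if random.random() < p_z else "other"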

    1. A reformulation of the banality of evil: thinking is dialogue, including within oneself, as confrontation, as two-in-one. When it goes out, only obedience and banality remain.

      That is what became of the intellectuals who watched the rise of Nazism, and ... Eichmann.

      Is it the same today, where intellectuals bear responsibility for the rise of populisms?

    1. Practical guide: Getting involved without burning out, cultivating sustainable activism

      Introduction: The two faces of modern engagement

      Faced with an increasingly palpable ecological and social emergency, we are witnessing a multiplication of forms of citizen engagement.

      From acts of civil disobedience to awareness-raising initiatives and the creation of independent media, this collective momentum is vital for confronting the challenges of our time.

      However, this intense mobilization exposes individuals and organizations to a high risk of physical and psychological exhaustion, a phenomenon often referred to as "activist burnout".

      Far from being a sign of weakness, this exhaustion is a logical consequence of a demanding struggle against deeply entrenched systems.

      This guide is meant to be a pragmatic and encouraging resource, synthesizing the strategies, shifts in perspective, and lessons shared by experienced activists for preserving one's energy and sustaining one's motivation over the long term.

      As a psychologist observing these dynamics, this guide aims to equip change-makers so they can align their outward action with their inner resilience.

      --------------------------------------------------------------------------------

      1. Understanding the flame of engagement: The roots of action

      Before seeking to protect the flame of engagement, it is essential to understand what lit it in the first place.

      Identifying one's deep motivations, the initial "spark" that drives one to act, is the first step toward building resilient and authentic engagement.

      It is by reconnecting with this visceral "why" that one can find the strength to get through moments of doubt and fatigue.

      This section explores the various triggers of action, as lived and shared by engaged people from diverse backgrounds.

      1.1. The initial spark: Identifying your "why"

      The paths that lead to engagement are many, often personal and deeply transformative. They arise from an encounter between an individual sensibility and a reality that becomes intolerable.

      The sudden realization: For some, engagement is born of a shock, of a piece of information that shatters certainties.

      This was the case for the arborist and tree climber Thomas Braille, who was "cut down at the knees" upon realizing that the deadline of the climate emergency was no longer a distant projection but an imminent reality:

      "20 years, that's tomorrow".

      This realization was catalyzed by a visceral fear for his son's future.

      The personal sense of injustice: The lived experience of injustice is a powerful and lasting driver.

      For the filmmaker Flore Vasseur, the "hearth of the flame" lies in a personal injustice experienced during childhood.

      This initial wound, though long buried, became the source of a quest for repair and of an acute sensitivity to the world's injustices.

      Passion confronted with reality: Engagement can also emerge when a lifelong passion collides with the inaction and absurdity of the system.

      The agroclimatologist Serge Zaka, passionate about meteorology since childhood, shifted into public engagement upon observing the concrete impacts of climate change (plants scorched at 46°C) and the ignorance of political decision-makers regarding studies they themselves had commissioned.

      The quest for coherence and the end of solitude: Sometimes engagement is a flame that has long been smoldering but struggled to find an outlet.

      For Anaïs Terrien, president of La Fresque du Climat, an early but solitary engagement found new momentum thanks to a tool that finally allowed her to structure dialogue, break her isolation, and be understood in her concerns.

      1.2. The psychological engine of action

      According to the analysis of the ecopsychologist Emmanuel Delrieu, engagement is not a mere intellectual choice but a profound transformation that answers to specific psychological mechanisms.

      1. The interplay of forces: To persevere, an engagement must mobilize a synergy of three types of forces.

      Affective forces (what touches us, sensitivity to the world's suffering), behavioral forces (the capacity to act and to persevere over time), and cognitive forces (the capacity to analyze and reconcile the positive and negative aspects of the struggle).

      2. Resolving cognitive dissonance: Getting involved is often a way of reducing the internal tension between one's values and the dominant paradigms of society (capitalism, patriarchy, colonialism).

      Faced with this dissonance, action makes it possible to "put one's life back in order" by aligning one's behavior with one's deepest convictions.

      3. Transformation through rootedness: The deeper the engagement, the more the individual is transformed and "radicalized", in the etymological sense of the term:

      they take root in their convictions. This rootedness creates bonds, a "mycelium" with other struggles, reinforcing solidarity and each person's position.

      However, this same power that anchors individuals in their convictions also makes them more vulnerable.

      By aligning so deeply with their cause, they are exposed head-on to the resistance, inertia, and violence of the system they are fighting, creating fertile ground for wear and tear.

      --------------------------------------------------------------------------------

      2. Navigating the storms: Recognizing and managing the risk of burnout

      Far from being a personal failure or a sign of weakness, moments of fatigue, doubt, and even collapse are near-inevitable stages of the activist journey.

      They reflect the intensity of the struggle and the violence of what is being fought.

      The strategic challenge is therefore not to avoid these moments at all costs, but to learn to recognize their warning signs and to respond to them in a constructive and caring way.

      2.1. The early warning signs of activist burnout

      Listening to oneself is the first line of defense. Here are some warning signals, based on the analyses and testimonies, that should prompt caution:

      Physical and mental fatigue: Growing irritability and persistent fatigue that does not go away with rest are clear first signs that energy reserves are running out (Emmanuel Delrieu).

      Loss of meaning and the urge to withdraw: After an extreme action, 40 days of hunger strike followed by a thirst strike, Thomas Braille felt the need to isolate himself: "I no longer wanted to see any human beings".

      The feeling that the sacrifice is in vain and that "nobody cares" is a critical symptom.

      Feeling overwhelmed: The impression that "the vase was almost full and threatened to break" pushed Anaïs Terrien to cancel her commitments.

      This sense of being overwhelmed by responsibilities and emergencies is a major indicator.

      Confronting indifference and cynicism: Frustration with general inaction, as Flore Vasseur experienced after Edward Snowden's revelations, can wear down motivation and lead to a destructive feeling of powerlessness.

      2.2. Burnout as a cycle, not an end

      It is crucial to deconstruct the idea that burnout is an endpoint. It is above all a signal and a stage of transformation.

      Collapse is a "necessary transformative moment".

      The ecopsychologist Emmanuel Delrieu insists: the more one resists fatigue and the need for change, the more painful the collapse.

      Accepting it as a necessary stage makes it possible to get through it more serenely.

      Engagement is not linear but cyclical. It resembles a spiral.

      The "down" phases are not regressions, but moments when one dives deep to "seek even greater anchoring forces".

      Each cycle makes it possible to transform oneself and to start again on a more solid footing.

      The mistake is "always wanting to be perfect and to be doing well". As Flore Vasseur points out, society pushes us to hide our vulnerabilities.

      Yet the liberation that emotions, tears, and the acceptance of one's flaws represent is an immense source of resilience.

      The strategic challenge is therefore to cultivate a solid support network, capable of catching you during these phases of collapse so that they become sources of transformation rather than destruction.

      2.3. The aggravating factors specific to activism

      Beyond classic overwork, activism exposes people to unique sources of stress that accelerate the risk of burnout.

      1. The violence of personal attacks: Public exposure is often accompanied by uninhibited violence.

      The constant insults Serge Zaka receives about his physique (going as far as the coining of the nickname "Grosaka") or his credibility (his hat) are a form of harassment aimed at destabilizing him and wearing him down psychologically.

      2. Institutional invisibilization: As Emmanuel Delrieu analyzes, political and social structures systematically deny or minimize struggles.

      This non-recognition is a source of deep injustice and exhaustion, because it forces one to fight not only for one's cause but also for the very legitimacy of one's struggle.

      3. Confronting the strength of the system: Activists run up against the system's capacity to absorb and neutralize criticism.

      Flore Vasseur observed that "the more you hit it, the stronger it gets".

      The system can turn denunciation into spectacle, emptying it of its substance and leaving the activist with a feeling of powerlessness.

      --------------------------------------------------------------------------------

      3. Keeping the flame alive: Strategies for sustainable engagement

      Sustainable engagement is not just about managing burnout crises.

      It rests on putting proactive strategies in place to nourish one's motivation, protect one's energy, and build one's own resilience.

      The following four pillars, complementary and interdependent, offer concrete avenues for getting there.

      3.1. Strategy 1: The strength of the collective and of support

      The first and most powerful bulwark against burnout is the quality of human bonds. Isolation is the breeding ground of burnout.

      Leaning on the collective: Anaïs Terrien states it plainly: she was saved from burnout by her board of directors.

      The group acts as a safety net, able to take over when one of its members falters.

      Knowing how to ask for help: Recognizing one's own limits and daring to ask for support is not a weakness, but an essential strategic skill for lasting.

      It is an act of trust in the collective.

      Cultivating "care for the bond": As Emmanuel Delrieu proposes, it is crucial to establish within groups an active practice of mutual support.

      This means creating spaces where vulnerability is accepted and where people take care of one another as much as of the cause being defended.

      3.2. Strategy 2: The right perspective

      The way one perceives one's action and one's objectives can radically reduce pressure and the risk of burnout.

      Adopting "the spirit of the cathedrals": Shared by Flore Vasseur via Edward Snowden, this metaphor is liberating.

      It invites us to accept not seeing the final result of our actions, and instead to focus on our contribution: laying one's "brick" with the confidence that others will build on top of it.

      Fighting "for" rather than "against": This paradigm shift, also proposed by Flore Vasseur, makes engagement more positive and less self-destructive.

      It means fighting for a desirable world, for life, for one's children's future, drivers that generate positive and renewable energy, unlike the fight against a system, which can prove corrosive.

      Letting go of the expectation of an immediate result:

      The expectation of a quick victory is one of the main sources of depression and disillusionment for activists.

      The spirit of the cathedrals helps free oneself from this tyranny of results.

      3.3. Strategy 3: Alignment and authentic action

      An engagement that lasts is an engagement that comes from the heart, not from the ego.

      Connecting with one's deep injustice: As Flore Vasseur advises, personal wounds, humiliations, and betrayals experienced are the most durable "fuel".

      It is by reaching for what touches us viscerally that we find an inexhaustible energy.

      Getting involved to repair oneself: Rather than engaging for social recognition or image, which inevitably leads to exhaustion, the most durable engagement is one that is also an intimate undertaking.

      As Flore Vasseur explains, "you go there to repair yourself. What you come to repair is yourself, and by repairing yourself you repair the world".

      Diversifying one's projects and sources of energy: So as not to depend on a single source of gratification, it is healthy "not to put all your eggs in one basket", as Anaïs Terrien practices.

      Having other projects (a housing collective, gardening, art) makes it possible to recharge and maintain balance.

      3.4. Strategy 4: The culture of self-care

      Taking care of oneself is not a luxury or a selfish act; it is an indispensable condition for being able to keep taking care of the world.

      "Going silent from the human world": This advice from Emmanuel Delrieu invites us to reconnect regularly and deeply with nature, far from human noise and bustle, in order to recharge and regain a broader perspective.

      Detaching from the fear of judgment: Thomas Braille illustrates an immense source of strength:

      "I am not afraid of the judgment of men; I am only afraid of my son's judgment".

      Freeing oneself from the fear of the social gaze makes it possible to act with greater freedom and greater strength.

      The greatest renunciation: giving up on pleasing. This powerful phrase from Flore Vasseur sums up an essential act of liberation.

      An activist cannot please everyone. Accepting this means freeing oneself from an immense weight.

      Drawing nourishment from joy: Despite the difficulties, engagement is also a source of intense joys.

      Flore Vasseur recalls experiencing "moments of almost ecstatic joy far more often than moments of burnout".

      Bonds, solidarity, and small victories are essential nourishment.

      --------------------------------------------------------------------------------

      4. Conclusion: Engagement, a marathon for life

      Ultimately, long-term engagement is far more like a marathon than a sprint. The strategies for lasting are not distractions or luxuries, but essential components of the struggle itself.

      Taking care of oneself, cultivating the strength of the collective, adjusting one's perspective, and acting from a place of authenticity are the conditions of victory.

      By accepting the cyclical nature of energy and constantly reminding oneself of one's "why", it becomes possible not only to hold on, but also to flourish in action.

      As Baden-Powell said, quoted by Anaïs Terrien, the goal is perhaps not to save the world alone and right away, but more humbly and more sustainably to "try to leave the world a little better than you found it".

    1. for (i = 0; i < DIM; i++) { z[i] = x[i] - y[i]; }

      Hint

      How the input arrays move through the for loop

      Iteration 1: z[0] = x[0] - y[0]    z[0] = 1 - 2

      Iteration 2: z[1] = x[1] - y[1]    z[1] = (-2) - 0

      Iteration 3: z[2] = x[2] - y[2]    z[2] = 1 - (-2)

      In this way, the result of each subtraction is stored in the array z[ ].

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1* (Evidence, reproducibility and clarity (Required)):

      Summary: In this study, the authors used proximity proteomics in U2OS cells to identify several E3 ubiquitin ligases recruited to stress granules (SGs), and they focused on MKRN2 as a novel regulator. They show that MKRN2 localization to SGs requires active ubiquitination via UBA1. Functional experiments demonstrated that MKRN2 knockdown increases the number of SG condensates, reduces their size, slightly raises SG liquidity during assembly, and slows disassembly after heat shock. Overexpression of MKRN2-GFP combined with confocal imaging revealed co-localization of MKRN2 and ubiquitin in SGs. By perturbing ubiquitination (using a UBA1 inhibitor) and inducing defective ribosomal products (DRiPs) with O-propargyl puromycin, they found that both ubiquitination inhibition and MKRN2 depletion lead to increased accumulation of DRiPs in SGs. The authors conclude that MKRN2 supports granulostasis, the maintenance of SG homeostasis , through its ubiquitin ligase activity, preventing pathological DRiP accumulation within SGs.

      Major comments: - Are the key conclusions convincing? The key conclusions are partially convincing. The data supporting the role of ubiquitination and MKRN2 in regulating SG condensate dynamics are coherent, well controlled, and consistent with previous literature, making this part of the study solid and credible. However, the conclusions regarding the ubiquitin-dependent recruitment of MKRN2 to SGs, its relationship with UBA1 activity, the functional impact of the MKRN2 knockdown for DRiP accumulation are less thoroughly supported. These aspects would benefit from additional mechanistic evidence, validation in complementary model systems, or the use of alternative methodological approaches to strengthen the causal connections drawn by the authors. - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? The authors should qualify some of their claims as preliminary. 1) MKRN2 recruitment to SGs (ubiquitin-dependent): The proteomics and IF data are a reasonable starting point, but they do not yet establish that MKRN2 is recruited from its physiological localization to SGs in a ubiquitin-dependent manner. To avoid overstating this point the authors should qualify the claim and/or provide additional controls: show baseline localization of endogenous MKRN2 under non-stress conditions (which is reported in literature to be nuclear and cytoplasmatic), include quantification of nuclear/cytoplasmic distribution, and demonstrate a shift into bona fide SG compartments after heat shock. Moreover, co-localization of overexpressed GFP-MKRN2 with poly-Ub (FK2) should be compared to a non-stress control and to UBA1-inhibition conditions to support claims of stress- and ubiquitination-dependent recruitment. *

      Authors: We will stain cells for endogenous MKRN2 and quantify nuc/cyto ratio of MKRN2 without heat stress, without heat stress + TAK243, with HS and with HS + TAK243. We will do the same in the MKRN2-GFP overexpressing line while also staining for FK2.

      *2) Use and interpretation of UBA1 inhibition: UBA1 inhibition effectively blocks ubiquitination globally, but it is non-selective. The manuscript should explicitly acknowledge this limitation when interpreting results from both proteomics and functional assays. Proteomics hits identified under UBA1 inhibition should be discussed as UBA1-dependent associations rather than as evidence for specific E3 ligase recruitment. The authors should consider orthogonal approaches before concluding specificity. *

      Authors: We have acknowledged the limitation of using only TAK243 in our study by rephrasing statements about dependency on “ubiquitination” to “UBA1-dependent associations”.

      * 3) DRiP accumulation and imaging quality: The evidence presented in Figure 5 is sufficient to substantiate the claim that DRiPs accumulate in SGs upon ubiquitination inhibition or MKRN2 depletion, but to show that their localization to SGs and their clearance from SGs during stress are promoted by MKRN2 ubiquitin ligase activity, more experiments would be needed. *

      Authors: We have acknowledged the fact that our experiments do not include DRiP and SG dynamics assays using ligase-dead mutants of MKRN2 by altering our statement regarding MKRN2-mediated ubiquitination of DRiPs in the text (as proposed by reviewer 1).

      *- Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Yes, a few targeted experiments would strengthen the conclusions without requiring the authors to open new lines of investigation. 1) Baseline localization of MKRN2: It would be important to show the baseline localization of endogenous and over-expressed MKRN2 (nuclear and cytoplasmic) under non-stress conditions and prior to ubiquitination inhibition. This would provide a reference to quantify redistribution into SGs and demonstrate recruitment in response to heat stress or ubiquitination-dependent mechanisms. *

      Authors: We thank the reviewer for bringing up this important control. We will address it in the revisions.

      We will quantify the nuclear/cytoplasmic distribution of endogenous and GFP-MKRN2 under control, TAK243, heat shock, and combined conditions, and assess MKRN2–ubiquitin colocalization by FK2 staining in unstressed cells.

      * 2) Specificity of MKRN2 ubiquitin ligase activity: to address the non-specific effects of UBA1 inhibition and validate that observed phenotypes depend on MKRN2's ligase activity, the authors could employ a catalytically inactive MKRN2 mutant in rescue experiments. Comparing wild-type and catalytic-dead MKRN2 in the knockdown background would clarify the causal role of MKRN2 activity in SG dynamics and DRiP clearance. *

      Authors: We thank the reviewer for this suggestion and have altered the phrasing of some of our statements in the text accordingly.


      * 3) Ubiquitination linkage and SG marker levels: While the specific ubiquitin linkage type remains unknown, examining whether MKRN2 knockdown or overexpression affects total levels of key SG marker proteins would be informative. This could be done via Western blotting of SG markers along with ubiquitin staining, to assess whether MKRN2 influences protein stability or turnover through degradative or non-degradative ubiquitination. Such data would strengthen the mechanistic interpretation while remaining within the current study's scope. *

      Authors: We thank the reviewer for this request and will address it by performing MKRN2 KD followed by Western blot for G3BP1.

      *

      • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. The experiments suggested in points 1 and 3 are realistic and should not require substantial additional resources beyond those already used in the study. • Point 1 (baseline localization of MKRN2): This involves adding two control conditions (no stress and no ubiquitination inhibition) for microscopy imaging. The setup is essentially the same as in the current experiments, with time requirements mainly dependent on cell culture growth and imaging. Overall, this could be completed within a few weeks. • Point 3 (SG marker levels and ubiquitination): This entails repeating the existing experiment and adding a Western blot for SG markers and ubiquitin. The lab should already have the necessary antibodies, and the experiment could reasonably be performed within a couple of weeks. • Point 2 (catalytically inactive MKRN2 mutant and rescue experiments): This is likely more time-consuming. Designing an effective catalytic-dead mutant depends on structural knowledge of MKRN2 and may require additional validation to confirm loss of catalytic activity. If this expertise is not already present in the lab, it could significantly extend the timeline. Therefore, this experiment should be considered only if similarly recommended by other reviewers, as it represents a higher resource and time investment.

      Overall, points 1 and 3 are highly feasible, while point 2 is more substantial and may require careful planning.

      • Are the data and the methods presented in such a way that they can be reproduced? Yes. The methodologies used in this study to analyze SG dynamics and DRiP accumulation are well-established in the field and should be reproducible, particularly by researchers experienced in stress granule biology. Techniques such as SG assembly and disassembly assays, use of G3BP1 markers, and UBA1 inhibition are standard and clearly described. The data are generally presented in a reproducible manner; however, as noted above, some results would benefit from additional controls or complementary experiments to fully support specific conclusions.

      • Are the experiments adequately replicated and statistical analysis adequate? Overall, the experiments in the manuscript appear to be adequately replicated, with most assays repeated between three and five times, as indicated in the supplementary materials. The statistical analyses used are appropriate and correctly applied to the datasets presented. However, for Figure 5 the number of experimental replicates is not reported. This should be clarified, and if the experiment was not repeated sufficiently, additional biological replicates should be performed. Given that this figure provides central evidence supporting the conclusion that DRiP accumulation depends on ubiquitination-and partly on MKRN2's ubiquitin ligase activity-adequate replication is essential. *

      Authors: We thank the reviewer for noting this accidental omission. We now clarify in the legend of Figure 5 that the experiments with DRiPs were replicated three times.

      Minor comments: - Specific experimental issues that are easily addressable. • For the generation and the validation of the MKRN2 knockdown in U2OS cells, data are not presented in the Results or in the Methods sections to demonstrate the effective knockdown of the protein of interest. This point is quite essential to demonstrate the validity of the system used.

      Authors: We thank the reviewer for this request and will address it by performing MKRN2 KD followed by Western blot and RT-qPCR.

      • * In the supplementary figure 2 it would be useful to mention whether the Western blot represents the input (total cell lysates) before the APEX-pulldown or the APEX-pulldown itself loaded for WB. There is no consistency in the difference in biotinylation between the replicates shown in the 2 blots. For example, in R1 and R2 G3BP1-APX TAK243 the biotinylation is one of the strongest conditions, while on the left blot, in the same condition comparison, samples R3 and R4 are less biotinylated compared to others. It would be useful to provide an explanation for that to avoid any confusion for the readers. * Authors: We have added a mention in the legend of Figure S2 that these are total cell lysates before pulldown. The apparent differences in biotin staining are small and not sufficient to question the results of our APEX-proteomics.

      • * In Figure 2D, endogenous MKRN2 localization to SGs appears reduced following UBA1 inhibition. However, it is not clear whether this reduction reflects a true relocalization or a decrease in total MKRN2 protein levels. To support the interpretation that UBA1 inhibition specifically affects MKRN2 recruitment to SGs rather than its overall expression, the authors should provide data showing total MKRN2 levels remain unchanged under UBA1 inhibition, for example via Western blot of total cell lysates. * Authors: Based on first principles in regulation of gene expression, it is unlikely that total MKRN2 expression levels would decrease appreciably through transcriptional or translational regulation within the short timescale of these experiments (1 h TAK243 pretreatment followed by 90 min of heat stress).

      • * DRiP accumulation is followed during assembly, but the introduction highlights that ubiquitination events, other reported E3 ligases, and the MKRN2 data in this study play a crucial role in the disassembly of SGs, which is also related to the clearance of DRiPs. The authors could add tracking of DRiP accumulation during disassembly to Figure 5. I am not sure about the timeline required for this, but I am just adding it as optional if it could be addressed easily. * Authors: We thank the reviewer for proposing this experimental direction. However, in a previous study (Ganassi et al., 2016; 10.1016/j.molcel.2016.07.021), we demonstrated that DRiP accumulation during the stress granule assembly phase drives conversion to a solid-like state and delays stress granule disassembly. It is therefore critical to assess DRiP enrichment within stress granules immediately after their formation, rather than during the stress recovery phase, as done here.

      • * The authors should clarify in the text why the cutoff used for the quantification in Figure 5D (PC > 3) differs from the cutoff used elsewhere in the paper (PC > 1.5). Providing a rationale for this choice will help the reader understand the methodological consistency and ensure that differences in thresholds do not confound interpretation of the results. * Authors: We thank the reviewer for this question. The population of SGs with a DRiP enrichment > 1.5 represents SGs with a significant DRiP enrichment compared to the surrounding (background) signal. As explained in the methods, the intensity of DRiPs inside each SG is corrected by the intensity of DRiPs two pixels outside of each SG. Thus, differences in thresholds between independent experimental conditions (5B versus 5D) do not confound interpretation of the results but depend on overall staining intensity, which can differ between experimental conditions. Choosing the cut-off > 3 allows us to specifically highlight the population of SGs that are strongly enriched with DRiPs. MKRN2 silencing caused a strong DRiP enrichment in the majority of the SGs analyzed, and therefore we chose this way of representing the data. Note that the results represent the average of the analysis of 3 independent experiments with high numbers of SGs automatically segmented and analyzed per experiment. Figure 5A, B: n = 3 independent experiments; number of SGs analyzed per experiment: HS + OP-puro (695; 1216; 952); TAK-243 + HS + OP-puro (1852; 2214; 1774). Figure 5C, D: n = 3 independent experiments; number of SGs analyzed per experiment: siRNA control, HS + OP-puro (1984; 1400; 1708); siRNA MKRN2, HS + OP-puro (912; 1074; 1532).
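
      A minimal sketch of the background-corrected enrichment described above (assuming a 2D DRiP intensity image and a boolean SG mask; the two-pixel ring and the 1.5/3 cutoffs follow the reply, while the function itself is only a hypothetical stand-in for the actual analysis pipeline):

        import numpy as np
        from scipy.ndimage import binary_dilation, label

        def drip_enrichment(drip_img, sg_mask):
            # Per-SG DRiP enrichment: mean intensity inside the SG divided by the mean
            # intensity in a ring two pixels outside it (local background correction).
            labeled, n = label(sg_mask)
            scores = []
            for i in range(1, n + 1):
                sg = labeled == i
                ring = binary_dilation(sg, iterations=2) & ~sg  # two-pixel ring around the SG
                scores.append(drip_img[sg].mean() / drip_img[ring].mean())
            return np.array(scores)

        # Example cutoffs from the reply: > 1.5 = enriched over local background, > 3 = strongly enriched.
        # enriched = drip_enrichment(img, mask) > 1.5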

      • * For Figure 3G, the authors use over-expressed MKRN2-GFP to assess co-localization with ubiquitin in SGs. Given that a reliable antibody for endogenous MKRN2 is available and that a validated MKRN2 knockdown line exists as an appropriate control, this experiment would gain significantly in robustness and interpretability if co-localization were demonstrated using endogenous MKRN2. In the current over-expression system, MKRN2-GFP is also present in the nucleus, whereas the endogenous protein does not appear nuclear under the conditions shown. This discrepancy raises concerns about potential over-expression artifacts or mislocalization. Demonstrating co-localization using endogenous MKRN2 would avoid confounding effects associated with over-expression. If feasible, this would be a relatively straightforward experiment to implement, as it relies on tools (antibody and knockdown line) already described in the manuscript.

      * Authors: We thank the reviewer for this request and will address it by performing MKRN2 KD, FK2 immunofluorescence microscopy, and SG partition coefficient analysis.

      *- Are prior studies referenced appropriately? • From line 54 to line 67, the manuscript cites eight papers in total regarding the role of ubiquitination in SG disassembly. However, given the use of UBA1 inhibition in the initial MS-APEX experiment and the extensive prior literature on ubiquitination in SG assembly and disassembly under various stress conditions, the manuscript would benefit from citing additional relevant studies to provide more specific examples. Expanding the references would provide stronger context, better connect the current findings to prior work, and emphasize the significance of the study in relation to the established literature *

      Authors: We have added citations for the relevant studies.

      • *

      At line 59, it would be helpful to note that G3BP1 is ubiquitinated by TRIM21 through a Lys63-linked ubiquitin chain. This information provides important mechanistic context, suggesting that ubiquitination of SG proteins in these pathways is likely non-degradative and related to functional regulation of SG dynamics rather than protein turnover. * Authors: The reviewer is correct. We have added to the text that G3BP1 is ubiquitinated through a Lys63-linked ubiquitin chain.

      • *

      When citing references 16 and 17, which report that the E3 ligases TRIM21 and HECT regulate SG formation, the authors should provide a plausible explanation for why these specific E3 ligases were not detected in their proteomics experiments. Differences could arise from the stress stimulus used, cell type, or experimental conditions. Similarly, since MKRN2 and other E3 ligases identified in this study have not been reported in previous works, discussing these methodological or biological differences would help prevent readers from questioning the credibility of the findings. It would also be valuable to clarify in the Conclusion that different types of stress may activate distinct ubiquitination pathways, highlighting context-dependent regulation of SG assembly and disassembly. * Authors: We thank the reviewer for this suggestion. We added to the discussion plausible explanations for why our study identified new E3 ligases.

      • *

      Line 59-60: when referring to the HECT family of E3 ligases involved in ubiquitination and SG disassembly, it would be more precise to report the specific E3 ligase identified in the cited studies rather than only the class of ligase. This would provide clearer mechanistic context and improve accuracy for readers. * Authors: We have added this detail to the discussion.

      • *

      The specific statement on line 182 "SG E3 ligases that depend on UBA1 activity are RBULs" should be supported by reference. * Authors: We have added citations to back up our claim that ZNF598, CNOT4, MKRN2, TRIM25 and TRIM26 exhibit RNA-binding activity.

      *- Are the text and figures clear and accurate?

      • In Supplementary Figure 1, DMSO is shown in green and the treatment in red, whereas in the main figures (Figure 1B and 1F) the colours in the legend are inverted. To avoid confusion, the colour coding in figure legends should be consistent across all figures throughout the manuscript. *

      Authors: We have made the colors consistent across the main and supplementary figures.

      • *

      At line 79, the manuscript states that "inhibition of ubiquitination delayed fluorescence recovery dynamics of G3BP1-mCherry, relative to HS-treated cells (Figure 1F, Supplementary Fig. 6A)." However, the data shown in Figure 1F appear to indicate the opposite effect: the TAK243-treated condition (green curve) shows a faster fluorescence recovery compared to the control (red curve). This discrepancy between the text and the figure should be corrected or clarified, as it may affect the interpretation of the role of ubiquitination in SG dynamics. * Authors: Good catch. We have now fixed the graphical mistake (Figure 1F and S6).

      • * Line 86: adjust a missing bracket * Authors: Thank you, we fixed it.

      • *

      There appears to be an error in the legend of Supplementary Figure 3: the legend states that the red condition (MKRN2) forms larger aggregates, but both the main Figure 3C of the confocal images and the text indicate that MKRN2 (red) forms smaller aggregates. Please correct the legend and any corresponding labels so they are consistent with the main figure and the text. The authors should also double-check that the figure panel order, color coding, and statistical annotations match the legend and the descriptions in the Results section to avoid reader confusion.

      * Authors: This unfortunate graphical mistake has been corrected.

      • * At lines 129-130, the manuscript states that "FRAP analysis demonstrated that MKRN2 KD resulted in a slight increase in SG liquidity (Fig. 3F, Supplementary Fig. 6B)." However, the data shown in Figure 3F appear to indicate the opposite trend: the MKRN2 KD condition (red curve) exhibits a faster fluorescence recovery compared to the control (green curve). This discrepancy between the text and the figure should be corrected or clarified, as it directly affects the interpretation of MKRN2's role in SG disassembly. Ensuring consistency between the written description and the plotted FRAP data is essential for accurate interpretation. * Authors: We thank the reviewer and have clarified the correct labels in the legend of Figure 3F and in the Results: the faster fluorescence recovery seen upon MKRN2 KD is indeed interpreted in the text as increased liquidity.
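      As background for the FRAP comparisons discussed above, recovery curves are commonly summarized by fitting a single-exponential model, yielding a mobile fraction and a recovery half-time; faster recovery (shorter half-time, larger mobile fraction) is typically read as higher condensate liquidity. The sketch below is a generic, hypothetical illustration of that fit (the function names, SciPy-based fitting, and synthetic data are assumptions), not the authors' analysis code.

```python
import numpy as np
from scipy.optimize import curve_fit

def frap_model(t, mobile_fraction, tau):
    """Single-exponential FRAP recovery: F(t) = A * (1 - exp(-t / tau))."""
    return mobile_fraction * (1.0 - np.exp(-t / tau))

def fit_frap(time_s, normalized_recovery):
    """Fit a normalized post-bleach recovery curve; return the mobile
    fraction, the time constant tau, and the half-time of recovery."""
    (a, tau), _ = curve_fit(frap_model, time_s, normalized_recovery,
                            p0=[0.8, 10.0], bounds=([0, 0], [1.0, np.inf]))
    half_time = tau * np.log(2)
    return a, tau, half_time

# Synthetic example: a curve recovering to ~70% with tau = 8 s.
t = np.linspace(0, 60, 61)
y = frap_model(t, 0.7, 8.0) + np.random.default_rng(1).normal(0, 0.01, t.size)
print(fit_frap(t, y))
```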

      • *

      At lines 132-133, the manuscript states: "Then, to further test the impact of MKRN2 on SG dynamics, we overexpressed MKRN2-GFP and observed that it was recruited to SG (Fig. 3G)." This description should be corrected or clarified, as the over-expressed MKRN2-GFP also appears to localize to the nucleus. * Authors: The text has been modified to reflect both the study of MKRN2 localization to SGs and of nuclear localization.

      • *

      At lines 134-135, the manuscript states that the FK2 antibody detects "free ubiquitin." This is incorrect. FK2 does not detect free ubiquitin; it recognizes only ubiquitin conjugates, including mono-ubiquitinated and poly-ubiquitinated proteins. The text should be corrected accordingly to avoid misinterpretation of the immunostaining data. * Authors: Thank you for pointing out this error. We have corrected it.

      • * Figure 5A suffers from poor resolution, and no scale bar is provided, which limits interpretability. Additionally, the ROI selected for the green channel (DRIPs) appears to capture unspecific background staining, while the most obvious DRIP spots are localized in the nucleus. The authors should clarify this in the text, improve the image quality if possible, and ensure that the ROI accurately represents DRIP accumulation - in SGs rather than background signal. * Authors: We thank the reviewer for pointing out the sub-optimal presentation of this figure. We modified Figure 5A to improve image quality and interpretation. Concerning the comment that “the most obvious DRIP spots are localized in the nucleus”, this is in line with our previous findings demonstrating that a fraction of DRiPs accumulates in nucleoli (Mediani et al. 2019 10.15252/embj.2018101341). To avoid misinterpretation, we modified Figure 5A as follows: (i) we provide a different image for control cells, exposed to heat shock and OP-puro; (ii) we select a ROI that only shows a few stress granules; (iii) we added arrowheads to indicate the nucleoli that are strongly enriched for DRiPs; (iv) we include a dotted line to show the nuclear membrane, helping to distinguish cytoplasm and nucleus in the red and green channel. We also include the scale bars (5 µm) in the image.

      * Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • In the first paragraph following the APEX proteomics results, the authors present validation data exclusively for MKRN2, justifying this early focus by stating that MKRN2 is the most SG-depleted E3 ligase. However, in the subsequent paragraph they introduce the RBULs and present knockdown data for MKRN2 along with two additional E3 ligases identified in the screen, before once again emphasizing that MKRN2 is the most SG-depleted ligase and therefore the main focus of the study. For clarity and logical flow, the manuscript would benefit from reordering the narrative. Specifically, the authors should first present the validation data for all three selected E3 ligases, and only then justify the decision to focus on MKRN2 for in-depth characterization. In addition to the extent of its SG depletion, the authors may also consider providing biologically relevant reasons for prioritizing MKRN2 (e.g., domain architecture, known roles in stress responses, or prior evidence of ubiquitination-related functions). Reorganizing this section would improve readability and better guide the reader through the rationale for the study's focus.*

      Authors: We thank the reviewer for this suggested improvement to our “storyline”. As suggested by the reviewer, we have moved the IF validation of MKRN2 to the following paragraph in order to improve the flow of the manuscript. We added additional justification for prioritizing MKRN2, citing Youn et al. 2018 and Markmiller et al. 2018.

      • *

      At lines 137-138, the manuscript states: "Together these data indicate that MKRN2 regulates the assembly dynamics of SGs by promoting their coalescence during HS and can increase SG ubiquitin content." While Figure 3G shows some co-localization of MKRN2 with ubiquitin, immunofluorescence alone is insufficient to claim an increase in SG ubiquitin content. This conclusion should be supported by orthogonal experiments, such as Western blotting, in vitro ubiquitination assays, or immunoprecipitation of SG components. Including a control under no-stress conditions would also help demonstrate that ubiquitination increases specifically in response to stress. The second part of the statement should therefore be rephrased to avoid overinterpretation, for example:"...and may be associated with increased ubiquitination within SGs, as suggested by co-localization, pending further validation by complementary assays." * Authors: The statement has been rephrased in a softer way as suggested by the reviewer.

      • At line 157, the statement: "Therefore, we conclude that MKRN2 ubiquitinates a subset of DRiPs, avoiding their accumulation inside SGs" should be rephrased as a preliminary observation. While the data support a role for MKRN2 in SG disassembly and a reduction of DRIPs, direct ubiquitination of DRIPs by MKRN2 has not been demonstrated. A more cautious phrasing would better reflect the current evidence and avoid overinterpretation. * Authors: We thank the reviewer for this suggestion and have altered the phrasing of this statement accordingly.

      *Reviewer #1 (Significance (Required)):

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

      • This study provides a valuable advancement in understanding the role of ubiquitination in stress granule (SG) dynamics and the clearance of SGs formed under heat stress. A major strength is the demonstration of how E3 ligases identified through proteomic screening, particularly MKRN2, influence SG assembly and disassembly in a ubiquitination- and heat stress-dependent manner. The combination of proteomics, imaging, and functional assays provides a coherent mechanistic framework linking ubiquitination to SG homeostasis. Limitations of the study include the exclusive use of a single model system (U2OS cells), which may limit generalizability. Additionally, some observations-such as MKRN2-dependent ubiquitination within SGs and changes in DRIP accumulation under different conditions-would benefit from orthogonal validation experiments (e.g., Western blotting, immunoprecipitation, or in vitro assays) to confirm and strengthen these findings. Addressing these points would enhance the robustness and broader applicability of the conclusions.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      • The closest related result in literature is - Yang, Cuiwei et al. "Stress granule homeostasis is modulated by TRIM21-mediated ubiquitination of G3BP1 and autophagy-dependent elimination of stress granules." Autophagy vol. 19,7 (2023): 1934-1951. doi:10.1080/15548627.2022.2164427 - demonstrating that TRIM21, an E3 ubiquitin ligase, catalyzes K63-linked ubiquitination of G3BP1, a core SG nucleator, under oxidative stress. This ubiquitination by TRIM21 inhibits SG formation, likely by altering G3BP1's propensity for phase separation. In contrast, the MKRN2 study identifies a different E3 (MKRN2) that regulates SG dynamics under heat stress and appears to influence both assembly and disassembly. This expands the role of ubiquitin ligases in SG regulation beyond those previously studied (like TRIM21).

      • Gwon and colleagues (Gwon Y, Maxwell BA, Kolaitis RM, Zhang P, Kim HJ, Taylor JP. Ubiquitination of G3BP1 mediates stress granule disassembly in a context-specific manner. Science. 2021;372(6549):eabf6548. doi:10.1126/science.abf6548) have shown that K63-linked ubiquitination of G3BP1 is required for SG disassembly after heat stress. This ubiquitinated G3BP1 recruits the segregase VCP/p97, which helps extract G3BP1 from SGs for disassembly. The MKRN2 paper builds on this by linking UBA1-dependent ubiquitination and MKRN2's activity to SG disassembly. Specifically, they show MKRN2 knockdown affects disassembly, and suggest MKRN2 helps prevent accumulation of defective ribosomal products (DRiPs) in SGs, adding a new layer to the ubiquitin-VCP model.

      • Ubiquitination's impact is highly stress- and context-dependent (different chain types, ubiquitin linkages, and recruitment of E3s). The MKRN2 work conceptually strengthens this idea: by showing that MKRN2's engagement with SGs depends on active ubiquitination via UBA1, and by demonstrating functional consequences (SG dynamics + DRIP accumulation), the study highlights how cellular context (e.g., heat stress) can recruit specific ubiquitin ligases to SGs and modulate their behavior.

      • There is a gap in the literature: very few (if any) studies explicitly combine the biology of DRIPs, stress granules, and E3 ligase mediated ubiquitination, especially in mammalian cells. There are relevant works about DRIP biology in stress granules, but those studies focus on chaperone-based quality control, not ubiquitin ligase-mediated ubiquitination of DRIPs. This study seems to be one of the first to make that connection in mammalian (or human-like) SG biology. A work on the plant DRIP-E3 ligase TaSAP5 (Zhang N, Yin Y, Liu X, et al. The E3 Ligase TaSAP5 Alters Drought Stress Responses by Promoting the Degradation of DRIP Proteins. Plant Physiol. 2017;175(4):1878-1892. doi:10.1104/pp.17.01319 ) shows that DRIPs can be directly ubiquitinated by E3s in other biological systems - which supports the plausibility of the MKRN2 mechanism, but it's not the same context.

      • A very recent review (Yuan, Lin et al. "Stress granules: emerging players in neurodegenerative diseases." Translational neurodegeneration vol. 14,1 22. 12 May. 2025, doi:10.1186/s40035-025-00482-9) summarizes and reinforces the relationship among SGs and the pathogenesis of different neurodegenerative diseases (NDDs). By identifying MKRN2 as a new ubiquitin regulator in SGs, the current study could have relevance for neurodegeneration and proteotoxic diseases, providing a new candidate to explore in disease models.

      Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

      The audience for this paper is primarily specialized, including researchers in stress granule biology, ubiquitin signaling, protein quality control, ribosome biology, and cellular stress responses. The findings will also be of interest to scientists working on granulostasis, nascent protein surveillance, and proteostasis mechanisms. Beyond these specific fields, the study provides preliminary evidence linking ubiquitination to DRIP handling and SG dynamics, which may stimulate new research directions and collaborative efforts across complementary areas of cell biology and molecular biology.

      • Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I work in ubiquitin biology, focusing on ubiquitination signaling in physiological and disease contexts, with particular expertise in the identification of E3 ligases and their substrates across different cellular systems and in vivo models. I have less expertise in stress granule dynamics and DRiP biology, so my evaluation of those aspects is more limited and relies on interpretation of the data presented in the manuscript.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      This study identifies the E3 ubiquitin ligase Makorin 2 (MKRN2) as a novel regulator of stress granule (SG) dynamics and proteostasis. Using APEX proximity proteomics, the authors demonstrate that inhibition of the ubiquitin-activating enzyme UBA1 with TAK243 alters the SG proteome, leading to depletion of several E3 ligases, chaperones, and VCP cofactors. Detailed characterization of MKRN2 reveals that it localizes to SGs in a ubiquitination-dependent manner and is required for proper SG assembly, coalescence, and disassembly. Functionally, MKRN2 prevents the accumulation of defective ribosomal products (DRiPs) within SGs, thereby maintaining granulostasis. The study provides compelling evidence that ubiquitination, mediated specifically by MKRN2, plays a critical role in surveilling stress-damaged proteins within SGs and maintaining their dynamic liquid-like properties. Major issues: 1. Figures 1-2: Temporal dynamics of ubiquitination in SGs. The APEX proteomics was performed at a single timepoint (90 min heat stress), yet the live imaging data show that SG dynamics and TAK243 effects vary considerably over time: • The peak of SG nucleation was actually at 10-30 min (Figure 1B). • TAK243 treatment causes earlier SG nucleation (Figure 1B) but delayed disassembly (Figure 1A-B, D). A temporal proteomic analysis at multiple timepoints (e.g., 30 min, 60 min, 90 min of heat stress, and during recovery) would reveal whether MKRN2 and other ubiquitination-dependent proteins are recruited to SGs dynamically during the stress response. It would also delineate whether different E3 ligases predominate at different stages of the SG lifecycle. While such experiments may be beyond the scope of the current study, the authors should at minimum discuss this limitation and acknowledge that the single-timepoint analysis may miss dynamic changes in SG composition. *

      Authors: We thank the reviewer for identifying this caveat in our methodology. We now discuss this limitation and acknowledge that the single-timepoint analysis may miss dynamic changes in SG composition.

      * Figures 2D-E, 3G: MKRN2 localization mechanism requires clarification. The authors demonstrate that MKRN2 localization to SGs is dependent on active ubiquitination, as TAK243 treatment significantly reduces MKRN2 partitioning into SGs (Figure 2D-E). However, several mechanistic questions remain: • Does MKRN2 localize to SGs through binding to ubiquitinated substrates within SGs, or does MKRN2 require its own ubiquitination activity to enter SGs? • The observation that MKRN2 overexpression increases SG ubiquitin content (Figure 3G-H) could indicate either: (a) MKRN2 actively ubiquitinates substrates within SGs, or (b) MKRN2 recruitment brings along pre-ubiquitinated substrates from the cytoplasm. • Is MKRN2 localization to SGs dependent on its E3 ligase activity? A catalytically inactive mutant of MKRN2 would help distinguish whether MKRN2 must actively ubiquitinate proteins to remain in SGs or whether it binds to ubiquitinated proteins independently of its catalytic activity. The authors should clarify whether MKRN2's SG localization depends on its catalytic activity or on binding to ubiquitinated proteins, as this would fundamentally affect the interpretation of its role in SG dynamics. *

      Authors: We thank the reviewer for this experimental suggestion. We will perform an analysis of the SG partitioning coefficient between WT-MKRN2 and a RING mutant of MKRN2.

      * Figures 3-4: Discrepancy between assembly and disassembly phenotypes. MKRN2 knockdown produces distinct phenotypes during SG assembly versus disassembly. During assembly: smaller, more numerous SGs that fail to coalesce (Figure 3A-E), while during disassembly: delayed SG clearance (Figure 4A-D). These phenotypes may reflect different roles for MKRN2 at different stages, but the mechanism underlying this stage-specificity is unclear: • Does MKRN2 have different substrates or utilize different ubiquitin chain types during assembly versus disassembly? • The increased SG liquidity upon MKRN2 depletion (Figure 3F) seems paradoxical with delayed disassembly- typically more liquid condensates disassemble faster. The authors interpret this as decreased coalescence into "dense and mature SGs," but this requires clarification. • How does prevention of DRiP accumulation relate to the assembly defect? One would predict that DRiP accumulation would primarily affect disassembly (by reducing liquidity), yet MKRN2 depletion impacts both assembly dynamics and DRiP accumulation. The authors should discuss how MKRN2's role in preventing DRiP accumulation mechanistically connects to both the assembly and disassembly phenotypes. *

      Authors: We thank the reviewer and will add to the Discussion a mention of a precedent for this precise phenotype from our previous work (Seguin et al., 2014).

      * Figure 5: Incomplete characterization of MKRN2 substrates. While the authors convincingly demonstrate that MKRN2 prevents DRiP accumulation in SGs (Figure 5C-D), the direct substrates of MKRN2 remain unknown. The authors acknowledge in the limitations that "the direct MKRN2 substrates and ubiquitin-chain types (K63/K48) are currently unknown." However, several approaches could strengthen the mechanistic understanding: • Do DRiPs represent direct MKRN2 substrates? Co-immunoprecipitation of MKRN2 followed by ubiquitin-chain specific antibodies (K48 vs K63) could reveal whether MKRN2 mediates degradative (K48) or non-degradative (K63) ubiquitination. *

      Authors: The DRiPs generated in the study represent truncated versions of all the proteins that were in the process of being synthesized by the cell at the moment of the stress, and therefore include both MKRN2 specific substrates and MKRN2 independent substrates. Identifying specific MKRN2 substrates, while interesting as a new research avenue, is not within the scope of the present study.

      • * Given that VCP cofactors (such as UFD1L, PLAA) are depleted from SGs upon UBA1 inhibition (Figure 2C) and these cofactors recognize ubiquitinated substrates, does MKRN2 function upstream of VCP recruitment? Testing whether MKRN2 depletion affects VCP cofactor localization to SGs would clarify this pathway. * Authors: We thank the reviewer for this request and will address it by performing MKRN2 KD, VCP immunofluorescence microscopy, and SG partition coefficient analysis.

      • * The authors note that MKRN2 knockdown produces a phenotype reminiscent of VCP inhibition-smaller, more numerous SGs with increased DRiP partitioning. This similarity suggests MKRN2 may function in the same pathway as VCP. Direct epistasis experiments would strengthen this connection. * Authors: This experiment is conditional on the results of the study above. If VCP partitioning to SGs is reduced upon MKRN2 KD, which we do not know at this point, then a MKRN2/VCP double KD experiment will be performed to strengthen this connection.

      * Alternative explanations for the phenotype of delayed disassembly with TAK243 or MKRN2 depletion- the authors attribute this to DRiP accumulation, but TAK243 affects global ubiquitination. Could impaired degradation of other SG proteins (not just DRiPs) contribute to delayed disassembly? Does proteasome inhibition (MG-132 treatment) phenocopy the MKRN2 depletion phenotype? This would support that MKRN2-mediated proteasomal degradation (via K48 ubiquitin chains) is key to the phenotype. *

      Authors: We are happy to provide alternative explanations in the Discussion, in line with Reviewer #2's suggestion. The role of the proteasome is outside the scope of our study.

      • Comparison with other E3 ligases (Supplementary Figure 5): The authors show that CNOT4 and ZNF598 depletion also affect SG dynamics, though to lesser extents than MKRN2. However: • Do these E3 ligases also prevent DRiP accumulation in SGs? Testing OP-puro partitioning in CNOT4- or ZNF598-depleted cells would reveal whether DRiP clearance is a general feature of SG-localized E3 ligases or specific to MKRN2. *

      • * Are there redundant or compensatory relationships between these E3 ligases? Do double knockdowns have additive effects? * Authors: Our paper presents a study of the E3 ligase MKRN2. Generalizing these observations to ZNF598, CNOT4, and perhaps an even longer list of E3s may be an interesting question, but it is outside the scope of the present study.

      • * The authors note that MKRN2 is "the most highly SG-depleted E3 upon TAK243 treatment"-does this mean MKRN2 has the strongest dependence on active ubiquitination for its SG localization, or simply that it has the highest basal level of SG partitioning? * Authors: We thank the reviewer for this smart question. MKRN2 has the strongest dependence on active ubiquitination, as we now clarify in the Results.

      *Reviewer #2 (Significance (Required)):

      This is a well-executed study that identifies MKRN2 as an important regulator of stress granule dynamics and proteostasis. The combination of proximity proteomics, live imaging, and functional assays provides strong evidence for MKRN2's role in preventing DRiP accumulation and maintaining granulostasis. However, key mechanistic questions remain, particularly regarding MKRN2's direct substrates, the ubiquitin chain types it generates, and how its enzymatic activity specifically prevents DRiP accumulation while promoting both SG coalescence and disassembly. Addressing the suggested revisions, particularly those related to MKRN2's mechanism of SG localization and substrate specificity, would significantly strengthen the manuscript and provide clearer insights into how ubiquitination maintains the dynamic properties of stress granules under proteotoxic stress.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      In this paper, Amzallag et al. investigate the relationship between ubiquitination and the dynamics of stress granules (SGs). They utilize proximity labeling coupled to mass spectrometry to identify SG components under conditions where ubiquitination is inhibited by a small-molecule drug that targets UBiquitin-like modifier Activating enzyme 1 (UBA1), which is crucial for the initial step in the ubiquitination of misfolded proteins. Their findings reveal that the E3 ligase Makorin2 (MKRN2) is a novel component of SGs. Additionally, their data suggest that MKRN2 is necessary for processing defective ribosomal products (DRiPs) during heat shock (HS). In the absence of MKRN2, DRiPs accumulate in SGs, which affects their dynamics. Major comments: Assess the knockdown efficiency (KD) for CNOT4, ZNF598, and MKRN2 to determine if the significant effect observed on SG dynamics upon MKRN2 depletion is due to the protein's function rather than any possible differences in KD efficiency. *

      Authors: To address potential variability in knockdown efficiency, we will quantify CNOT4, ZNF598, and MKRN2 mRNA levels by RT-qPCR following siRNA knockdown.
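      As a side note on how such knockdown efficiencies are commonly quantified, the sketch below computes relative mRNA levels with the 2^-ΔΔCt method from Ct values, normalizing each target to a reference gene and to the control siRNA condition. This is a generic, hypothetical illustration of the calculation (the gene names and Ct values are made up), not the authors' protocol.

```python
def relative_expression(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Relative mRNA level of a target gene in knockdown vs. control siRNA
    cells using the 2^-ddCt method, normalized to a reference gene."""
    d_ct_kd = ct_target_kd - ct_ref_kd        # normalize KD sample to reference gene
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # normalize control sample to reference gene
    dd_ct = d_ct_kd - d_ct_ctrl               # compare KD to control
    return 2 ** (-dd_ct)

# Hypothetical Ct values for an MKRN2 siRNA experiment (GAPDH as reference):
remaining = relative_expression(ct_target_kd=27.5, ct_ref_kd=18.0,
                                ct_target_ctrl=25.0, ct_ref_ctrl=18.1)
print(f"Remaining MKRN2 mRNA: {remaining:.0%}")  # knockdown efficiency = 1 - remaining
```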

      * Since HS-induced stress granules (SGs) are influenced by the presence of TAK-243 or MKRN2 depletion, could it be that these granules become more mature and thus acquire more defective ribosomal products (DRIPs)? Do HS cells reach the same level of DRIPs, as assessed by OP-Puro staining, at a later time point? *

      Authors: An interesting question. Mateju et al. carefully characterized the time course of DRiP accumulation in stress granules during heat shock, which decreases after the 90-minute point (Appendix Figure S7; 10.15252/embj.201695957). We therefore interpret DRiP accumulation in stress granules following TAK243 treatment as a pathological state, reflecting impaired removal and degradation of DRiPs, rather than a normal, more “mature” stress granule state.

      * Incorporating OP-Puro can lead to premature translation termination, potentially confounding results. Consider treating cells with a short pulse (i.e., 5 minutes) of OP-Puro just before fixation. *

      Authors: Thank you for this suggestion. Treating the cells with a short pulse of OP-Puro just before fixation would lead to the labeling of only a small amount of protein, likely undetectable using conventional microscopy or Western blotting. Furthermore, it would lead to the unwanted labeling of stress-responsive proteins that are translated through non-canonical, cap-independent mechanisms upon stress.

      * Is MKRN2's dependence limited to HS-induced SGs? *

      Authors: We will test sodium arsenite–induced stress and use immunofluorescence at discrete time points to assess whether the heat shock–related observations generalize to other stress types.

      *

      Minor comments: Abstract: Introduce UBA1. Introduction: The reference [2] should be replaced with 25719440. Results: Line 70, 'G3BP1 and 2 genes,' is somewhat misleading. Consider rephrasing into 'G3BP1 and G3BP2 genes'. Line 103: consider rephrasing 'we orthogonally validated the ubiquitin-dependent interaction' to 'we orthogonally validated the ubiquitin-dependent stress granule localization'. Line 125: '(fig.3C, EI Supplementary fig. 3)' Remove 'I'. Methods: line 260: the reference is not linked (it should be ref. [26]). Line 225: Are all the KDs being performed using the same method? Please specify. *

      Authors: The text has been altered to reflect the reviewer’s suggestions.

      *Fig.2C: Consider adding 'DEPLETED' on top of the scheme.

      Reviewer #3 (Significance (Required)):

      The study offers valuable insights into the degradative processes associated with SGs. The figures are clear, and the experimental quality is high. The authors do not overstate or overinterpret their findings, and the results effectively support their claims. However, the study lacks orthogonal methods to validate the findings and enhance the results. For instance, incorporating biochemical and reporter-based methods to measure defective ribosomal products (DRiPs) would be beneficial. Additionally, utilizing multiple methods to block ubiquitination, studying the dynamics of MKRN2 on SGs, and examining the consequences of excessive DRIPs on the cell fitness of SGs would further strengthen the research. *

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      In this study, the authors used proximity proteomics in U2OS cells to identify several E3 ubiquitin ligases recruited to stress granules (SGs), and they focused on MKRN2 as a novel regulator. They show that MKRN2 localization to SGs requires active ubiquitination via UBA1. Functional experiments demonstrated that MKRN2 knockdown increases the number of SG condensates, reduces their size, slightly raises SG liquidity during assembly, and slows disassembly after heat shock. Overexpression of MKRN2-GFP combined with confocal imaging revealed co-localization of MKRN2 and ubiquitin in SGs. By perturbing ubiquitination (using a UBA1 inhibitor) and inducing defective ribosomal products (DRiPs) with O-propargyl puromycin, they found that both ubiquitination inhibition and MKRN2 depletion lead to increased accumulation of DRiPs in SGs. The authors conclude that MKRN2 supports granulostasis, the maintenance of SG homeostasis, through its ubiquitin ligase activity, preventing pathological DRiP accumulation within SGs.

      Major comments:

      • Are the key conclusions convincing?

      The key conclusions are partially convincing. The data supporting the role of ubiquitination and MKRN2 in regulating SG condensate dynamics are coherent, well controlled, and consistent with previous literature, making this part of the study solid and credible. However, the conclusions regarding the ubiquitin-dependent recruitment of MKRN2 to SGs, its relationship with UBA1 activity, and the functional impact of the MKRN2 knockdown on DRiP accumulation are less thoroughly supported. These aspects would benefit from additional mechanistic evidence, validation in complementary model systems, or the use of alternative methodological approaches to strengthen the causal connections drawn by the authors. - Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? The authors should qualify some of their claims as preliminary.

      1) MKRN2 recruitment to SGs (ubiquitin-dependent): The proteomics and IF data are a reasonable starting point, but they do not yet establish that MKRN2 is recruited from its physiological localization to SGs in a ubiquitin-dependent manner. To avoid overstating this point the authors should qualify the claim and/or provide additional controls: show baseline localization of endogenous MKRN2 under non-stress conditions (which is reported in literature to be nuclear and cytoplasmatic), include quantification of nuclear/cytoplasmic distribution, and demonstrate a shift into bona fide SG compartments after heat shock. Moreover, co-localization of overexpressed GFP-MKRN2 with poly-Ub (FK2) should be compared to a non-stress control and to UBA1-inhibition conditions to support claims of stress- and ubiquitination-dependent recruitment.

      2) Use and interpretation of UBA1 inhibition: UBA1 inhibition effectively blocks ubiquitination globally, but it is non-selective. The manuscript should explicitly acknowledge this limitation when interpreting results from both proteomics and functional assays. Proteomics hits identified under UBA1 inhibition should be discussed as UBA1-dependent associations rather than as evidence for specific E3 ligase recruitment. The authors should consider orthogonal approaches before concluding specificity.

      3) DRiP accumulation and imaging quality: The evidence presented in Figure 5 is sufficient to substantiate the claim that DRiPs accumulate in SGs upon ubiquitination inhibition or MKRN2 depletion, but to show that their localization to SGs and their clearance from SGs during stress are promoted by MKRN2 ubiquitin ligase activity, more experiments would be needed. - Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation. Yes, a few targeted experiments would strengthen the conclusions without requiring the authors to open new lines of investigation.

      1) Baseline localization of MKRN2: It would be important to show the baseline localization of endogenous and over-expressed MKRN2 (nuclear and cytoplasmic) under non-stress conditions and prior to ubiquitination inhibition. This would provide a reference to quantify redistribution into SGs and demonstrate recruitment in response to heat stress or ubiquitination-dependent mechanisms.

      2) Specificity of MKRN2 ubiquitin ligase activity: to address the non-specific effects of UBA1 inhibition and validate that observed phenotypes depend on MKRN2's ligase activity, the authors could employ a catalytically inactive MKRN2 mutant in rescue experiments. Comparing wild-type and catalytic-dead MKRN2 in the knockdown background would clarify the causal role of MKRN2 activity in SG dynamics and DRiP clearance.

      3) Ubiquitination linkage and SG marker levels: While the specific ubiquitin linkage type remains unknown, examining whether MKRN2 knockdown or overexpression affects total levels of key SG marker proteins would be informative. This could be done via Western blotting of SG markers along with ubiquitin staining, to assess whether MKRN2 influences protein stability or turnover through degradative or non-degradative ubiquitination. Such data would strengthen the mechanistic interpretation while remaining within the current study's scope. - Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. The experiments suggested in points 1 and 3 are realistic and should not require substantial additional resources beyond those already used in the study. - Point 1 (baseline localization of MKRN2): This involves adding two control conditions (no stress and no ubiquitination inhibition) for microscopy imaging. The setup is essentially the same as in the current experiments, with time requirements mainly dependent on cell culture growth and imaging. Overall, this could be completed within a few weeks. - Point 3 (SG marker levels and ubiquitination): This entails repeating the existing experiment and adding a Western blot for SG markers and ubiquitin. The lab should already have the necessary antibodies, and the experiment could reasonably be performed within a couple of weeks. - Point 2 (catalytically inactive MKRN2 mutant and rescue experiments): This is likely more time-consuming. Designing an effective catalytic-dead mutant depends on structural knowledge of MKRN2 and may require additional validation to confirm loss of catalytic activity. If this expertise is not already present in the lab, it could significantly extend the timeline. Therefore, this experiment should be considered only if similarly recommended by other reviewers, as it represents a higher resource and time investment.

      Overall, points 1 and 3 are highly feasible, while point 2 is more substantial and may require careful planning. - Are the data and the methods presented in such a way that they can be reproduced?

      Yes. The methodologies used in this study to analyze SG dynamics and DRiP accumulation are well-established in the field and should be reproducible, particularly by researchers experienced in stress granule biology. Techniques such as SG assembly and disassembly assays, use of G3BP1 markers, and UBA1 inhibition are standard and clearly described. The data are generally presented in a reproducible manner; however, as noted above, some results would benefit from additional controls or complementary experiments to fully support specific conclusions. - Are the experiments adequately replicated and statistical analysis adequate?

      Overall, the experiments in the manuscript appear to be adequately replicated, with most assays repeated between three and five times, as indicated in the supplementary materials. The statistical analyses used are appropriate and correctly applied to the datasets presented. However, for Figure 5 the number of experimental replicates is not reported. This should be clarified, and if the experiment was not repeated sufficiently, additional biological replicates should be performed. Given that this figure provides central evidence supporting the conclusion that DRiP accumulation depends on ubiquitination-and partly on MKRN2's ubiquitin ligase activity-adequate replication is essential.

      Minor comments:

      • Specific experimental issues that are easily addressable.
        • For the generation and the validation of MKRN2 knockdown in U2OS cells, data are not presented in the results or in the methods sections to demonstrate the effective knockdown of the protein of interest. This point is quite essential to demonstrate the validity of the system used.
        • In Supplementary Figure 2 it would be useful to mention whether the Western Blot represents the input (total cell lysates) before the APEX pulldown or the APEX pulldown loaded for WB. There is no consistency in the degree of biotinylation between different replicates shown in the two blots. For example, in R1 and R2 G3BP1-APX TAK243 the biotinylation is one of the strongest conditions, while on the left blot, in the same condition comparison, samples R3 and R4 are less biotinylated compared to others. It would be useful to provide an explanation for that to avoid any confusion for the readers.
        • In Figure 2D, endogenous MKRN2 localization to SGs appears reduced following UBA1 inhibition. However, it is not clear whether this reduction reflects a true relocalization or a decrease in total MKRN2 protein levels. To support the interpretation that UBA1 inhibition specifically affects MKRN2 recruitment to SGs rather than its overall expression, the authors should provide data showing total MKRN2 levels remain unchanged under UBA1 inhibition, for example via Western blot of total cell lysates.
        • DRIP accumulation is followed during assembly, but the introduction highlights the fact that ubiquitination events, other reported E3 ligases, and in this study the data on MKRN2, play a crucial role in the disassembly of SGs, which is also related to the clearance of DRIPs. The authors could add tracking of DRIP accumulation during disassembly to Figure 5. I am not sure about the timeline required for this, so I am just adding it as optional, if it could be addressed easily.
        • The authors should clarify in the text why the cutoff used for the quantification in Figure 5D (PC > 3) differs from the cutoff used elsewhere in the paper (PC > 1.5). Providing a rationale for this choice will help the reader understand the methodological consistency and ensure that differences in thresholds do not confound interpretation of the results.
        • For Figure 3G, the authors use over-expressed MKRN2-GFP to assess co-localization with ubiquitin in SGs. Given that a reliable antibody for endogenous MKRN2 is available and that a validated MKRN2 knockdown line exists as an appropriate control, this experiment would gain significantly in robustness and interpretability if co-localization were demonstrated using endogenous MKRN2. In the current over-expression system, MKRN2-GFP is also present in the nucleus, whereas the endogenous protein does not appear nuclear under the conditions shown. This discrepancy raises concerns about potential over-expression artifacts or mislocalization. Demonstrating co-localization using endogenous MKRN2 would avoid confounding effects associated with over-expression. If feasible, this would be a relatively straightforward experiment to implement, as it relies on tools (antibody and knockdown line) already described in the manuscript.
      • Are prior studies referenced appropriately?

        • From line 54 to line 67, the manuscript in total cites eight papers regarding the role of ubiquitination in SG disassembly. However, given the use of UBA1 inhibition in the initial MS-APEX experiment and the extensive prior literature on ubiquitination in SG assembly and disassembly under various stress conditions, the manuscript would benefit from citing additional relevant studies to provide more specific examples. Expanding the references would provide stronger context, better connect the current findings to prior work, and emphasize the significance of the study in relation to established literature
        • At line 59, it would be helpful to note that G3BP1 is ubiquitinated by TRIM21 through a Lys63-linked ubiquitin chain. This information provides important mechanistic context, suggesting that ubiquitination of SG proteins in these pathways is likely non-degradative and related to functional regulation of SG dynamics rather than protein turnover.
        • When citing references 16 and 17, which report that the E3 ligases TRIM21 and HECT regulate SG formation, the authors should provide a plausible explanation for why these specific E3 ligases were not detected in their proteomics experiments. Differences could arise from the stress stimulus used, cell type, or experimental conditions. Similarly, since MKRN2 and other E3 ligases identified in this study have not been reported in previous works, discussing these methodological or biological differences would help prevent readers from questioning the credibility of the findings. It would also be valuable to clarify in the Conclusion that different types of stress may activate distinct ubiquitination pathways, highlighting context-dependent regulation of SG assembly and disassembly.
        • Line 59-60: when referring to the HECT family of E3 ligases involved in ubiquitination and SG disassembly, it would be more precise to report the specific E3 ligase identified in the cited studies rather than only the class of ligase. This would provide clearer mechanistic context and improve accuracy for readers.
        • The specific statement on line 182 "SG E3 ligases that depend on UBA1 activity are RBULs" should be supported by reference.
        • Are the text and figures clear and accurate?
        • In Supplementary Figure 1, DMSO is shown in green and the treatment in red, whereas in the main figures (Figure 1B and 1F) the colours in the legend are inverted. To avoid confusion, the colour coding in figure legends should be consistent across all figures throughout the manuscript.
        • At line 79, the manuscript states that "inhibition of ubiquitination delayed fluorescence recovery dynamics of G3BP1-mCherry, relative to HS-treated cells (Figure 1F, Supplementary Fig. 6A)." However, the data shown in Figure 1F appear to indicate the opposite effect: the TAK243-treated condition (green curve) shows a faster fluorescence recovery compared to the control (red curve). This discrepancy between the text and the figure should be corrected or clarified, as it may affect the interpretation of the role of ubiquitination in SG dynamics.
        • Line 86: adjust a missing bracket
        • There appears to be an error in the legend of Supplementary Figure 3: the legend states that the red condition (MKRN2) forms larger aggregates, but both the main Figure 3C of the confocal images and the text indicate that MKRN2 (red) forms smaller aggregates. Please correct the legend and any corresponding labels so they are consistent with the main figure and the text. The authors should also double-check that the figure panel order, color coding, and statistical annotations match the legend and the descriptions in the Results section to avoid reader confusion.
        • At lines 129-130, the manuscript states that "FRAP analysis demonstrated that MKRN2 KD resulted in a slight increase in SG liquidity (Fig. 3F, Supplementary Fig. 6B)." However, the data shown in Figure 3F appear to indicate the opposite trend: the MKRN2 KD condition (red curve) exhibits a faster fluorescence recovery compared to the control (green curve). This discrepancy between the text and the figure should be corrected or clarified, as it directly affects the interpretation of MKRN2's role in SG disassembly. Ensuring consistency between the written description and the plotted FRAP data is essential for accurate interpretation.
        • At lines 132-133, the manuscript states: "Then, to further test the impact of MKRN2 on SG dynamics, we overexpressed MKRN2-GFP and observed that it was recruited to SG (Fig. 3G)." This description should be corrected or clarified, as the over-expressed MKRN2-GFP also appears to localize to the nucleus.
        • At lines 134-135, the manuscript states that the FK2 antibody detects "free ubiquitin." This is incorrect. FK2 does not detect free ubiquitin; it recognizes only ubiquitin conjugates, including mono-ubiquitinated and poly-ubiquitinated proteins. The text should be corrected accordingly to avoid misinterpretation of the immunostaining data.
        • Figure 5A suffers from poor resolution, and no scale bar is provided, which limits interpretability. Additionally, the ROI selected for the green channel (DRIPs) appears to capture unspecific background staining, while the most obvious DRIP spots are localized in the nucleus. The authors should clarify this in the text, improve the image quality if possible, and ensure that the ROI accurately represents DRIP accumulation - in SGs rather than background signal.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • In the first paragraph following the APEX proteomics results, the authors present validation data exclusively for MKRN2, justifying this early focus by stating that MKRN2 is the most SG-depleted E3 ligase. However, in the subsequent paragraph they introduce the RBULs and present knockdown data for MKRN2 along with two additional E3 ligases identified in the screen, before once again emphasizing that MKRN2 is the most SG-depleted ligase and therefore the main focus of the study. For clarity and logical flow, the manuscript would benefit from reordering the narrative. Specifically, the authors should first present the validation data for all three selected E3 ligases, and only then justify the decision to focus on MKRN2 for in-depth characterization. In addition to the extent of its SG depletion, the authors may also consider providing biologically relevant reasons for prioritizing MKRN2 (e.g., domain architecture, known roles in stress responses, or prior evidence of ubiquitination-related functions). Reorganizing this section would improve readability and better guide the reader through the rationale for the study's focus.
      • At lines 137-138, the manuscript states: "Together these data indicate that MKRN2 regulates the assembly dynamics of SGs by promoting their coalescence during HS and can increase SG ubiquitin content." While Figure 3G shows some co-localization of MKRN2 with ubiquitin, immunofluorescence alone is insufficient to claim an increase in SG ubiquitin content. This conclusion should be supported by orthogonal experiments, such as Western blotting, in vitro ubiquitination assays, or immunoprecipitation of SG components. Including a control under no-stress conditions would also help demonstrate that ubiquitination increases specifically in response to stress. The second part of the statement should therefore be rephrased to avoid overinterpretation, for example:"...and may be associated with increased ubiquitination within SGs, as suggested by co-localization, pending further validation by complementary assays."
      • At line 157, the statement: "Therefore, we conclude that MKRN2 ubiquitinates a subset of DRiPs, avoiding their accumulation inside SGs" should be rephrased as a preliminary observation. While the data support a role for MKRN2 in SG disassembly and a reduction of DRIPs, direct ubiquitination of DRIPs by MKRN2 has not been demonstrated. A more cautious phrasing would better reflect the current evidence and avoid overinterpretation.

      Significance

      General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

      • This study provides a valuable advancement in understanding the role of ubiquitination in stress granule (SG) dynamics and the clearance of SGs formed under heat stress. A major strength is the demonstration of how E3 ligases identified through proteomic screening, particularly MKRN2, influence SG assembly and disassembly in a ubiquitination- and heat stress-dependent manner. The combination of proteomics, imaging, and functional assays provides a coherent mechanistic framework linking ubiquitination to SG homeostasis. Limitations of the study include the exclusive use of a single model system (U2OS cells), which may limit generalizability. Additionally, some observations-such as MKRN2-dependent ubiquitination within SGs and changes in DRIP accumulation under different conditions-would benefit from orthogonal validation experiments (e.g., Western blotting, immunoprecipitation, or in vitro assays) to confirm and strengthen these findings. Addressing these points would enhance the robustness and broader applicability of the conclusions.

      Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

      • The closest related result in literature is - Yang, Cuiwei et al. "Stress granule homeostasis is modulated by TRIM21-mediated ubiquitination of G3BP1 and autophagy-dependent elimination of stress granules." Autophagy vol. 19,7 (2023): 1934-1951. doi:10.1080/15548627.2022.2164427 - demonstrating that TRIM21, an E3 ubiquitin ligase, catalyzes K63-linked ubiquitination of G3BP1, a core SG nucleator, under oxidative stress. This ubiquitination by TRIM21 inhibits SG formation, likely by altering G3BP1's propensity for phase separation. In contrast, the MKRN2 study identifies a different E3 (MKRN2) that regulates SG dynamics under heat stress and appears to influence both assembly and disassembly. This expands the role of ubiquitin ligases in SG regulation beyond those previously studied (like TRIM21).
      • Gwon and colleagues (Gwon Y, Maxwell BA, Kolaitis RM, Zhang P, Kim HJ, Taylor JP. Ubiquitination of G3BP1 mediates stress granule disassembly in a context-specific manner. Science. 2021;372(6549):eabf6548. doi:10.1126/science.abf6548) have shown that K63-linked ubiquitination of G3BP1 is required for SG disassembly after heat stress. This ubiquitinated G3BP1 recruits the segregase VCP/p97, which helps extract G3BP1 from SGs for disassembly. The MKRN2 paper builds on this by linking UBA1-dependent ubiquitination and MKRN2's activity to SG disassembly. Specifically, they show MKRN2 knockdown affects disassembly, and suggest MKRN2 helps prevent accumulation of defective ribosomal products (DRiPs) in SGs, adding a new layer to the ubiquitin-VCP model.
      • Ubiquitination's impact is highly stress- and context-dependent (different chain types, ubiquitin linkages, and recruitment of E3s). The MKRN2 work conceptually strengthens this idea: by showing that MKRN2's engagement with SGs depends on active ubiquitination via UBA1, and by demonstrating functional consequences (SG dynamics + DRIP accumulation), the study highlights how cellular context (e.g., heat stress) can recruit specific ubiquitin ligases to SGs and modulate their behavior.
      • There is a gap in the literature: very few (if any) studies explicitly combine the biology of DRIPs, stress granules, and E3 ligase mediated ubiquitination, especially in mammalian cells. There are relevant works about DRIP biology in stress granules, but those studies focus on chaperone-based quality control, not ubiquitin ligase-mediated ubiquitination of DRIPs. This study seems to be one of the first to make that connection in mammalian (or human-like) SG biology. A work on the plant DRIP-E3 ligase TaSAP5 (Zhang N, Yin Y, Liu X, et al. The E3 Ligase TaSAP5 Alters Drought Stress Responses by Promoting the Degradation of DRIP Proteins. Plant Physiol. 2017;175(4):1878-1892. doi:10.1104/pp.17.01319 ) shows that DRIPs can be directly ubiquitinated by E3s in other biological systems - which supports the plausibility of the MKRN2 mechanism, but it's not the same context.
      • A very recent review (Yuan, Lin et al. "Stress granules: emerging players in neurodegenerative diseases." Translational neurodegeneration vol. 14,1 22. 12 May. 2025, doi:10.1186/s40035-025-00482-9) summarizes and reinforces the relationship among SGs and the pathogenesis of different neurodegenerative diseases (NDDs). By identifying MKRN2 as a new ubiquitin regulator in SGs, the current study could have relevance for neurodegeneration and proteotoxic diseases, providing a new candidate to explore in disease models.

      Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

      The audience for this paper is primarily specialized, including researchers in stress granule biology, ubiquitin signaling, protein quality control, ribosome biology, and cellular stress responses. The findings will also be of interest to scientists working on granulostasis, nascent protein surveillance, and proteostasis mechanisms. Beyond these specific fields, the study provides preliminary evidence linking ubiquitination to DRIP handling and SG dynamics, which may stimulate new research directions and collaborative efforts across complementary areas of cell biology and molecular biology.

      Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      I work in ubiquitin biology, focusing on ubiquitination signaling in physiological and disease contexts, with particular expertise in the identification of E3 ligases and their substrates across different cellular systems and in vivo models. I have less expertise in stress granule dynamics and DRiP biology, so my evaluation of those aspects is more limited and relies on interpretation of the data presented in the manuscript.

    1. Segregation

      Suggested structure: - Chilean macroeconomic context, low mobility, and the privatization/commodification of welfare, closing with education - Description of the educational system - Consequences of the educational system (education market, research on school justice and inequalities) - Close with why the Chilean case is relevant

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      Summary:

      Zhou and colleagues developed a computational model of replay that heavily builds on cognitive models of memory in context (e.g., the context-maintenance and retrieval model), which have been successfully used to explain memory phenomena in the past. Their model produces results that mirror previous empirical findings in rodents and offers a new computational framework for thinking about replay.

      Strengths:

      The model is compelling and seems to explain a number of findings from the rodent literature. It is commendable that the authors implement commonly used algorithms from wakefulness to model sleep/rest, thereby linking wake and sleep phenomena in a parsimonious way. Additionally, the manuscript's comprehensive perspective on replay, bridging humans and non-human animals, enhanced its theoretical contribution.

      Weaknesses:

      This reviewer is not a computational neuroscientist by training, so some comments may stem from misunderstandings. I hope the authors would see those instances as opportunities to clarify their findings for broader audiences.

      (1) The model predicts that temporally close items will be co-reactivated, yet evidence from humans suggests that temporal context doesn't guide sleep benefits (instead, semantic connections seem to be of more importance; Liu and Ranganath 2021, Schechtman et al 2023). Could these findings be reconciled with the model or is this a limitation of the current framework?

      We appreciate the encouragement to discuss this connection. Our framework can accommodate semantic associations as determinants of sleep-dependent consolidation, which can in principle outweigh temporal associations. Indeed, prior models in this lineage have extensively simulated how semantic associations support encoding and retrieval alongside temporal associations. It would therefore be straightforward to extend our model to simulate how semantic associations guide sleep benefits, and to compare their contribution against that conferred by temporal associations across different experimental paradigms. In the revised manuscript, we have added a discussion of how our framework may simulate the role of semantic associations in sleep-dependent consolidation.

      “Several recent studies have argued for dominance of semantic associations over temporal associations in the process of human sleep-dependent consolidation (Schechtman et al., 2023; Liu and Ranganath 2021; Sherman et al., 2025), with one study observing no role at all for temporal associations (Schechtman et al., 2023). At first glance, these findings appear in tension with our model, where temporal associations drive offline consolidation. Indeed, prior models have accounted for these findings by suppressing temporal context during sleep (Liu and Ranganath 2024; Sherman et al., 2025). However, earlier models in the CMR lineage have successfully captured the joint contributions of semantic and temporal associations to encoding and retrieval (Polyn et al., 2009), and these processes could extend naturally to offline replay. In a paradigm where semantic associations are especially salient during awake learning, the model could weight these associations more and account for greater co-reactivation and sleep-dependent memory benefits for semantically related than temporally related items. Consistent with this idea, Schechtman et al. (2023) speculated that their null temporal effects likely reflected the task’s emphasis on semantic associations. When temporal associations are more salient and task-relevant, sleep-related benefits for temporally contiguous items are more likely to emerge (e.g., Drosopoulos et al., 2007; King et al., 2017).”

      The reviewer’s comment points to fruitful directions for future work that could employ our framework to dissect the relative contributions of semantic and temporal associations to memory consolidation.

      (2) During replay, the model is set so that the next reactivated item is sampled without replacement (i.e., the model cannot get "stuck" on a single item). I'm not sure what the biological backing behind this is and why the brain can't reactivate the same item consistently.

      Furthermore, I'm afraid that such a rule may artificially generate sequential reactivation of items regardless of wake training. Could the authors explain this better or show that this isn't the case?

      We appreciate the opportunity to clarify this aspect of the model. We first note that this mechanism has long been a fundamental component of this class of models (Howard & Kahana 2002). Many classic memory models (Brown et al., 2000; Burgess & Hitch, 1991; Lewandowsky & Murdock 1989) incorporate response suppression, in which activated items are temporarily inhibited. The simplest implementation, which we use here, removes activated items from the pool of candidate items. Alternative implementations achieve this through transient inhibition, often conceptualized as neuronal fatigue (Burgess & Hitch, 1991; Grossberg 1978). Our model adopts a similar perspective, interpreting this mechanism as mimicking a brief refractory period that renders reactivated neurons unlikely to fire again within a short physiological event such as a sharp-wave ripple. Importantly, this approach does not generate spurious sequences. Instead, the model’s ability to preserve the structure of wake experience during replay depends entirely on the learned associations between items (without these associations, item order would be random). Similar assumptions are also common in models of replay. For example, reinforcement learning models of replay incorporate mechanisms such as inhibition to prevent repeated reactivations (e.g., Diekmann & Cheng, 2023) or prioritize reactivation based on ranking to limit items to a single replay (e.g., Mattar & Daw, 2018). We now discuss these points in the section titled “A context model of memory replay”:

      “This mechanism of sampling without replacement, akin to response suppression in established context memory models (Howard & Kahana 2002), could be implemented by neuronal fatigue or refractory dynamics (Burgess & Hitch, 1991; Grossberg 1978). Non-repetition during reactivation is also a common assumption in replay models that regulate reactivation through inhibition or prioritization (Diekmann & Cheng 2023; Mattar & Daw 2018; Singh et al., 2022).”
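      As a rough computational sketch of this mechanism (a minimal illustration under assumed names such as `support` and `already_reactivated`, not the released CMR-replay code), sampling without replacement simply removes items reactivated earlier in the current event from the candidate pool before applying the softmax choice rule:

      ```python
      import numpy as np

      def reactivate_next(support, already_reactivated, temperature=0.2, rng=None):
          """Choose the next item to reactivate from its context-driven support,
          excluding items already reactivated in the current replay event
          (a simple form of response suppression)."""
          rng = np.random.default_rng() if rng is None else rng
          candidates = [i for i in range(len(support)) if i not in already_reactivated]
          logits = np.array([support[i] for i in candidates]) / temperature
          probs = np.exp(logits - logits.max())
          probs /= probs.sum()
          return candidates[rng.choice(len(candidates), p=probs)]

      # Item 2 has the strongest support but was already reactivated, so the choice
      # falls on the remaining items; order still tracks the learned associations.
      support = np.array([0.2, 0.5, 0.9, 0.4])
      print(reactivate_next(support, already_reactivated={2}))
      ```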

      (3) If I understand correctly, there are two ways in which novelty (i.e., less exposure) is accounted for in the model. The first and more talked about is the suppression mechanism (lines 639-646). The second is a change in learning rates (lines 593-595). It's unclear to me why both procedures are needed, how they differ, and whether these are two different mechanisms that the model implements. Also, since the authors controlled the extent to which each item was experienced during wakefulness, it's not entirely clear to me which of the simulations manipulated novelty on an individual item level, as described in lines 593-595 (if any).

      We agree that these mechanisms and their relationships would benefit from clarification. As noted, novelty influences learning through two distinct mechanisms. First, the suppression mechanism is essential for capturing the inverse relationship between the amount of wake experience and the frequency of replay, as observed in several studies. This mechanism ensures that items with high wake activity are less likely to dominate replay. Second, the decrease in learning rates with repetition is crucial for preserving the stochasticity of replay. Without this mechanism, the model would increase weights linearly, leading to an exponential increase in the probability of successive wake items being reactivated back-to-back due to the use of a softmax choice rule. This would result in deterministic replay patterns, which are inconsistent with experimental observations.

      We have revised the Methods section to explicitly distinguish these two mechanisms:

      “This experience-dependent suppression mechanism is distinct from the reduction of learning rates through repetition; it does not modulate the update of memory associations but exclusively governs which items are most likely to initiate replay.”

      We have also clarified our rationale for including a learning rate reduction mechanism:

      “The reduction in learning rates with repetition is important for maintaining a degree of stochasticity in the model’s replay during task repetition, since linearly increasing weights would, through the softmax choice rule, exponentially amplify differences in item reactivation probabilities, sharply reducing variability in replay.”

      Finally, we now specify exactly where the learning-rate reduction applied, namely in simulations where sequences are repeated across multiple sessions:

      “In this simulation, the learning rates progressively decrease across sessions, as described above.”
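      To make the softmax argument above concrete, here is a toy numerical illustration (the weight values and the 1/s decay schedule are hypothetical, chosen only to show the qualitative effect): with a constant learning rate, the repeated item's weight grows linearly and the softmax probability of reactivating it approaches 1, whereas a learning rate that shrinks with repetition keeps the choice probabilities from collapsing.

      ```python
      import numpy as np

      def softmax(x):
          e = np.exp(np.asarray(x, dtype=float))
          return e / e.sum()

      # Item 0 belongs to the repeated sequence; item 1 is a competitor with a fixed weight.
      for session in (1, 3, 5, 10):
          w_linear = 1.0 * session                                  # constant learning rate
          w_decayed = sum(1.0 / s for s in range(1, session + 1))   # rate shrinking as 1/s
          print(f"session {session:2d}: "
                f"p(item 0) = {softmax([w_linear, 1.0])[0]:.3f} (linear growth) "
                f"vs {softmax([w_decayed, 1.0])[0]:.3f} (decaying rate)")
      ```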

      As to the first mechanism - experience-based suppression - I find it challenging to think of a biological mechanism that would achieve this and is selectively activated immediately before sleep (somehow anticipating its onset). In fact, the prominent synaptic homeostasis hypothesis suggests that such suppression, at least on a synaptic level, is exactly what sleep itself does (i.e., prune or weaken synapses that were enhanced due to learning during the day). This begs the question of whether certain sleep stages (or ultradian cycles) may be involved in pruning, whereas others leverage its results for reactivation (e.g., a sequential hypothesis; Rasch & Born, 2013). That could be a compelling synthesis of this literature. Regardless of whether the authors agree, I believe that this point is a major caveat to the current model. It is addressed in the discussion, but perhaps it would be beneficial to explicitly state to what extent the results rely on the assumption of a pre-sleep suppression mechanism.

      We appreciate the reviewer raising this important point. Unlike the mechanism proposed by the synaptic homeostasis hypothesis, the suppression mechanism in our model does not suppress items based on synapse strength, nor does it modify synaptic weights. Instead, it determines the level of suppression for each item based on activity during awake experience. The brain could implement such a mechanism by tagging each item according to its activity level during wakefulness. During subsequent consolidation, the initial reactivation of an item during replay would reflect this tag, influencing how easily it can be reactivated.

      A related hypothesis has been proposed in recent work, suggesting that replay avoids recently active trajectories due to spike frequency adaptation in neurons (Mallory et al., 2024). Similarly, the suppression mechanism in our model is critical for explaining the observed negative relationship between the amount of recent wake experience and the degree of replay.

      We discuss the biological plausibility of this mechanism and its relationship with existing models in the Introduction. In the section titled “The influence of experience”, we have added the following:

      “Our model implements an activity‑dependent suppression mechanism that, at the onset of each offline replay event, assigns each item a selection probability inversely proportional to its activation during preceding wakefulness. The brain could implement this by tagging each memory trace in proportion to its recent activation; during consolidation, that tag would then regulate starting replay probability, making highly active items less likely to be reactivated. A recent paper found that replay avoids recently traversed trajectories through awake spike‑frequency adaptation (Mallory et al., 2025), which could implement this kind of mechanism. In our simulations, this suppression is essential for capturing the inverse relationship between replay frequency and prior experience. Note that, unlike the synaptic homeostasis hypothesis (Tononi & Cirelli 2006), which proposes that the brain globally downscales synaptic weights during sleep, this mechanism leaves synaptic weights unchanged and instead biases the selection process during replay.”
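      A minimal sketch of how such tagging could work (the exponential form of the suppression function and the variable names are assumptions for illustration, not the model's exact parameterization): record each item's cumulative activation during wake and turn it into a start-of-replay selection distribution that favors less-active items.

      ```python
      import numpy as np

      def initial_replay_probs(wake_activity, suppression_strength=1.0):
          """Probability that each item initiates a replay event, inversely related
          to its activation during the preceding wake period (its 'tag')."""
          wake_activity = np.asarray(wake_activity, dtype=float)
          tags = np.exp(-suppression_strength * wake_activity)  # more wake activity -> more suppression
          return tags / tags.sum()

      # Items traversed many times during wake are least likely to initiate replay.
      print(initial_replay_probs([5.0, 2.0, 0.5]).round(3))   # most mass on the third item
      ```

      Because memory strength in the model shows up as larger weights, and hence higher retrieved-context activity, the same computation also yields the bias toward replaying weakly encoded memories discussed below.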

      (4) As the manuscript mentions, the only difference between sleep and wake in the model is the initial conditions (a0). This is an obvious simplification, especially given the last author's recent models discussing the very different roles of REM vs NREM. Could the authors suggest how different sleep stages may relate to the model or how it could be developed to interact with other successful models such as the ones the last author has developed (e.g., C-HORSE)? 

      We appreciate the encouragement to comment on the roles of different sleep stages in the manuscript, especially since, as noted, the lab is very interested in this and has explored it in other work. We chose to focus on NREM in this work because the vast majority of electrophysiological studies of sleep replay have identified these events during NREM. In addition, our lab’s theory of the role of REM (Singh et al., 2022, PNAS) is that it is a time for the neocortex to replay remote memories, in complement to the more recent memories replayed during NREM. The experiments we simulate all involve recent memories. Indeed, our view is that part of the reason that there is so little data on REM replay may be that experimenters are almost always looking for traces of recent memories (for good practical and technical reasons).

      Regarding the simplicity of the distinction between simulated wake and sleep replay, we view it as an asset of the model that it can account for many of the different characteristics of awake and NREM replay with very simple assumptions about differences in the initial conditions. There are of course many other differences between the states that could be relevant to the impact of replay, but the current target empirical data did not necessitate us taking those into account. This allows us to argue that differences in initial conditions should play a substantial role in an account of the differences between wake and sleep replay.

      We have added discussion of these ideas and how they might be incorporated into future versions of the model in the Discussion section:

      “Our current simulations have focused on NREM, since the vast majority of electrophysiological studies of sleep replay have identified replay events in this stage. We have proposed in other work that replay during REM sleep may provide a complementary role to NREM sleep, allowing neocortical areas to reinstate remote, already-consolidated memories that need to be integrated with the memories that were recently encoded in the hippocampus and replayed during NREM (Singh et al., 2022). An extension of our model could undertake this kind of continual learning setup, where the student but not teacher network retains remote memories, and the driver of replay alternates between hippocampus (NREM) and cortex (REM) over the course of a night of simulated sleep. Other differences between stages of sleep and between sleep and wake states are likely to become important for a full account of how replay impacts memory. Our current model parsimoniously explains a range of differences between awake and sleep replay by assuming simple differences in initial conditions, but we expect many more characteristics of these states (e.g., neural activity levels, oscillatory profiles, neurotransmitter levels, etc.) will be useful to incorporate in the future.”

      Finally, I wonder how the model would explain findings (including the authors') showing a preference for reactivation of weaker memories. The literature seems to suggest that it isn't just a matter of novelty or exposure, but encoding strength. Can the model explain this? Or would it require additional assumptions or some mechanism for selective endogenous reactivation during sleep and rest?

      We appreciate the encouragement to discuss this, as we do think the model could explain findings showing a preference for reactivation of weaker memories, as in Schapiro et al. (2018). In our framework, memory strength is reflected in the magnitude of each memory’s associated synaptic weights, so that stronger memories yield higher retrieved-context activity during wake encoding than weaker ones. Because the model’s suppression mechanism reduces an item’s replay probability in proportion to its retrieved-context activity, items with larger weights (strong memories) are more heavily suppressed at the onset of replay, while those with smaller weights (weaker memories) receive less suppression. When items have matched reward exposure, this dynamic would bias offline replay toward preferentially reactivating weaker memories.

      In the section titled “The influence of experience”, we updated a sentence to discuss this idea more explicitly: 

      “Such a suppression mechanism may be adaptive, allowing replay to benefit not only the most recently or strongly encoded items but also to provide opportunities for the consolidation of weaker or older memories, consistent with empirical evidence (e.g., Schapiro et al. 2018; Yu et al., 2024).”

      (5) Lines 186-200 - Perhaps I'm misunderstanding, but wouldn't it be trivial that an external cue at the end-item of Figure 7a would result in backward replay, simply because there is no potential for forward replay for sequences starting at the last item (there simply aren't any subsequent items)? The opposite is true, of course, for the first-item replay, which can't go backward. More generally, my understanding of the literature on forward vs backward replay is that neither is linked to the rodent's location. Both commonly happen at a resting station that is further away from the track. It seems as though the model's result may not hold if replay occurs away from the track (i.e. if a0 would be equal for both pre- and post-run).

      In studies where animals run back and forth on a linear track, replay events are decoded separately for left and right runs, identifying both forward and reverse sequences for each direction, for example using direction-specific place cell sequence templates. Accordingly, in our simulation of, e.g., Ambrose et al. (2016), we use two independent sequences, one for left runs and one for right runs (an approach that has been taken in prior replay modeling work). Crucially, our model assumes a context reset between running episodes, preventing the final item of one traversal from acquiring contextual associations with the first item of the next. As a result, learning in the two sequences remains independent, and when an external cue is presented at the track’s end, replay predominantly unfolds in the backward direction, only occasionally producing forward segments when the cue briefly reactivates an earlier sequence item before proceeding forward.

      We added a note to the section titled “The context-dependency of memory replay” to clarify this:

      “In our model, these patterns are identical to those in our simulation of Ambrose et al. (2016), which uses two independent sequences to mimic the two run directions. This is because the drifting context resets before each run sequence is encoded, with the pause between runs acting as an event boundary that prevents the final item of one traversal from associating with the first item of the next, thereby keeping learning in each direction independent.”
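      As an illustrative sketch of this event-boundary reset (a toy implementation with assumed variable names and a simplified Hebbian update, not the model's exact learning rule): resetting the drifting context between runs means no context-to-item associations ever form across the two directions.

      ```python
      import numpy as np

      def encode_runs(sequences, n_features, beta=0.4):
          """Outer-product context-to-item learning with a context reset at each
          sequence boundary, so runs in different directions stay independent."""
          rho = np.sqrt(1 - beta**2)                 # down-weights older items in the context
          M_cf = np.zeros((n_features, n_features))  # context -> item associations
          for seq in sequences:
              c = np.zeros(n_features)               # reset: event boundary between runs
              for item in seq:
                  f = np.eye(n_features)[item]
                  M_cf += np.outer(f, c)             # bind the item to the context preceding it
                  c = rho * c + beta * f             # drift: incorporate the item's features
          return M_cf

      # Left runs use items 0-3, right runs items 4-7; the reset keeps them independent.
      M = encode_runs([[0, 1, 2, 3], [4, 5, 6, 7]], n_features=8)
      print(np.allclose(M[4:, :4], 0) and np.allclose(M[:4, 4:], 0))   # True: no cross-run associations
      ```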

      To our knowledge, no study has observed a similar asymmetry when animals are fully removed from the track, although both types of replay can be observed when animals are away from the track. For example, Gupta et al. (2010) demonstrated that when animals replay trajectories far from their current location, the ratio of forward vs. backward replay appears more balanced. We now highlight this result in the manuscript and explain how it aligns with the predictions of our model:

      “For example, in tasks where the goal is positioned in the middle of an arm rather than at its end, CMR-replay predicts a more balanced ratio of forward and reverse replay, whereas the EVB model still predicts a dominance of reverse replay due to backward gain propagation from the reward. This contrast aligns with empirical findings showing that when the goal is located in the middle of an arm, replay events are more evenly split between forward and reverse directions (Gupta et al., 2010), whereas placing the goal at the end of a track produces a stronger bias toward reverse replay (Diba & Buzsaki 2007).” 

      Although no studies, to our knowledge, have observed a context-dependent asymmetry between forward and backward replay when the animal is away from the track, our model does posit conditions under which it could. Specifically, it predicts that deliberation on a specific memory, such as during planning, could generate an internal context input that biases replay: actively recalling the first item of a sequence may favor forward replay, while thinking about the last item may promote backward replay, even when the individual is physically distant from the track.

      We now discuss this prediction in the section titled “The context-dependency of memory replay”:

      “Our model also predicts that deliberation on a specific memory, such as during planning, could serve to elicit an internal context cue that biases replay: actively recalling the first item of a sequence may favor forward replay, while thinking about the last item may promote backward replay, even when the individual is physically distant from the track. While not explored here, this mechanism presents a potential avenue for future modeling and empirical work.”

      (6) The manuscript describes a study by Bendor & Wilson (2012) and tightly mimics their results. However, notably, that study did not find triggered replay immediately following sound presentation, but rather a general bias toward reactivation of the cued sequence over longer stretches of time. In other words, it seems that the model's results don't fully mirror the empirical results. One idea that came to mind is that perhaps it is the R/L context - not the first R/L item - that is cued in this study. This is in line with other TMR studies showing what may be seen as contextual reactivation. If the authors think that such a simulation may better mirror the empirical results, I encourage them to try. If not, however, this limitation should be discussed.

      Although our model predicts that replay is triggered immediately by the sound cue, it also predicts a sustained bias toward the cued sequence. Replay in our model unfolds across the rest phase as multiple successive events, so the bias observed in our sleep simulations indeed reflects a prolonged preference for the cued sequence.

      We now discuss this issue, acknowledging the discrepancy:

      “Bendor and Wilson (2012) found that sound cues during sleep did not trigger immediate replay, but instead biased reactivation toward the cued sequence over an extended period of time. While the model does exhibit some replay triggered immediately by the cue, it also captures the sustained bias toward the cued sequence over an extended period.”

      Second, within this framework, context is modeled as a weighted average of the features associated with items. As a result, cueing the model with the first R/L item produces qualitatively similar outcomes as cueing it with a more extended R/L cue that incorporates features of additional items. This is because both approaches ultimately use context features unique to the two sides.

      (7) There is some discussion about replay's benefit to memory. One point of interest could be whether this benefit changes between wake and sleep. Relatedly, it would be interesting to see whether the proportion of forward replay, backward replay, or both correlated with memory benefits. I encourage the authors to extend the section on the function of replay and explore these questions.

      We thank the reviewer for this suggestion. Regarding differences in the contribution of wake and sleep to memory, our current simulations predict that compared to rest in the task environment, sleep is less biased toward initiating replay at specific items, leading to a more uniform benefit across all memories. Regarding the contributions of forward and backward replay, our model predicts that both strengthen bidirectional associations between items and contexts, benefiting memory in qualitatively similar ways. Furthermore, we suggest that the offline learning captured by our teacher-student simulations reflects consolidation processes that are specific to sleep.

      We have expanded the section titled The influence of experience to discuss these predictions of the model: 

      “The results outlined above arise from the model's assumption that replay strengthens bidirectional associations between items and contexts to benefit memory. This assumption leads to several predictions about differences across replay types. First, the model predicts that sleep yields different memory benefits compared to rest in the task environment: Sleep is less biased toward initiating replay at specific items, resulting in a more uniform benefit across all memories. Second, the model predicts that forward and backward replay contribute to memory in qualitatively similar ways but tend to benefit different memories. This divergence arises because forward and backward replay exhibit distinct item preferences, with backward replay being more likely to include rewarded items, thereby preferentially benefiting those memories.”

      We also updated the “The function of replay” section to include our teacher-student speculation:

      “We speculate that the offline learning observed in these simulations corresponds to consolidation processes that operate specifically during sleep, when hippocampal-neocortical dynamics are especially tightly coupled (Klinzing et al., 2019).”

      (8) Replay has been mostly studied in rodents, with few exceptions, whereas CMR and similar models have mostly been used in humans. Although replay is considered a good model of episodic memory, it is still limited due to limited findings of sequential replay in humans and its reliance on very structured and inherently autocorrelated items (i.e., place fields). I'm wondering if the authors could speak to the implications of those limitations on the generalizability of their model. Relatedly, I wonder if the model could or does lead to generalization to some extent in a way that would align with the complementary learning systems framework.

      We appreciate these insightful comments. Traditionally, replay studies have focused on spatial tasks with autocorrelated item representations (e.g., place fields). However, an increasing number of human studies have demonstrated sequential replay using stimuli with distinct, unrelated representations. Our model is designed to accommodate both scenarios. In our current simulations, we employ orthogonal item representations while leveraging a shared, temporally autocorrelated context to link successive items. We anticipate that incorporating autocorrelated item representations would further enhance sequence memory by increasing the similarity between successive contexts. Overall, we believe that the model generalizes across a broad range of experimental settings, regardless of the degree of autocorrelation between items. Moreover, the underlying framework has been successfully applied to explain sequential memory in both spatial domains, explaining place cell firing properties (e.g., Howard et al., 2004), and in non-spatial domains, such as free recall experiments where items are arbitrarily related. 

      In the section titled “A context model of memory replay”, we added this comment to address this point:

      “Its contiguity bias stems from its use of shared, temporally autocorrelated context to link successive items, despite the orthogonal nature of individual item representations. This bias would be even stronger if items had overlapping representations, as observed in place fields.”
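      A small numerical illustration of this point (toy values; the β used here is an assumption, not a fitted parameter): even with strictly one-hot item representations, the drifting context gives successive items overlapping encoding contexts, and that overlap falls off with lag.

      ```python
      import numpy as np

      n_items, beta = 8, 0.4
      rho = np.sqrt(1 - beta**2)          # keeps older contributions from growing without bound
      items = np.eye(n_items)             # orthogonal (one-hot) item representations
      contexts, c = [], np.zeros(n_items)
      for f in items:
          c = rho * c + beta * f          # drifting context incorporates each new item
          contexts.append(c.copy())

      # Items never overlap, but their encoding contexts do, and the overlap decays with lag.
      print(np.round([contexts[3] @ contexts[3 + lag] for lag in range(4)], 3))
      ```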

      Since CMR-replay learns distributed context representations where overlap across context vectors captures associative structure, and replay helps strengthen that overlap, this could indeed be viewed as consonant with complementary learning systems integration processes. 

      Reviewer #2 (Public Review):

      This manuscript proposes a model of replay that focuses on the relation between an item and its context, without considering the value of the item. The model simulates awake learning, awake replay, and sleep replay, and demonstrates parallels between memory phenomena driven by encoding strength, replay of sequence learning, and activation of nearest neighbor to infer causality. There is some discussion of the importance of suppression/inhibition to reduce activation of only dominant memories to be replayed, potentially boosting memories that are weakly encoded. Very nice replications of several key replay findings including the effect of reward and remote replay, demonstrating the equally salient cue of context for offline memory consolidation.

      I have no suggestions for the main body of the study, including methods and simulations, as the work is comprehensive, transparent, and well-described. However, I would like to understand how the CMR-replay model fits with the current understanding of the importance of excitation vs inhibition, remembering vs forgetting, activation vs deactivation, strengthening vs elimination of synapses, and even NREM vs REM as Schapiro has modeled. There seems to be a strong association with the efforts of the model to instantiate a memory as well as how that reinstantiation changes across time. But that is not all there is to consolidation. The specific roles of different brain states and how they might change replay are also important considerations.

      We are gratified that the reviewer appreciated the work, and we agree that the paper would benefit from comment on the connections to these other features of consolidation.

      Excitation vs. inhibition: CMR-replay does not model variations in the excitation-inhibition balance across brain states (as in other models, e.g., Chenkov et al., 2017), since it does not include inhibitory connections. However, we posit that the experience-dependent suppression mechanism in the model might, in the brain, involve inhibitory processes. Supporting this idea, studies have observed increased inhibition with task repetition (Berners-Lee et al., 2022). We hypothesize that such mechanisms may underlie the observed inverse relationship between task experience and replay frequency in many studies. We discuss this in the section titled “A context model of memory replay”:

      “The proposal that a suppression mechanism plays a role in replay aligns with models that regulate place cell reactivation via inhibition (Malerba et al., 2016) and with empirical observations of increased hippocampal inhibitory interneuron activity with experience (Berners-Lee et al., 2022). Our model assumes the presence of such inhibitory mechanisms but does not explicitly model them.”

      Remembering/forgetting, activation/deactivation, and strengthening/elimination of synapses: The model does not simulate synaptic weight reduction or pruning, so it does not forget memories through the weakening of associated weights. However, forgetting can occur when a memory is replayed less frequently than others, leading to reduced activation of that memory compared to its competitors during context-driven retrieval. In the Discussion section, we acknowledge that a biologically implausible aspect of our model is that it implements only synaptic strengthening: 

      “Aspects of the model, such as its lack of regulation of the cumulative positive weight changes that can accrue through repeated replay, are biologically implausible (as biological learning results in both increases and decreases in synaptic weights) and limit the ability to engage with certain forms of low level neural data (e.g., changes in spine density over sleep periods; de Vivo et al., 2017; Maret et al., 2011). It will be useful for future work to explore model variants with more elements of biological plausibility.”

      Different brain states and NREM vs REM: Reviewer 1 also raised this important issue (see above). We have added the following thoughts on differences between these states and the relationship to our prior work to the Discussion section:

      “Our current simulations have focused on NREM, since the vast majority of electrophysiological studies of sleep replay have identified replay events in this stage. We have proposed in other work that replay during REM sleep may provide a complementary role to NREM sleep, allowing neocortical areas to reinstate remote, already-consolidated memories that need to be integrated with the memories that were recently encoded in the hippocampus and replayed during NREM (Singh et al., 2022). An extension of our model could undertake this kind of continual learning setup, where the student but not teacher network retains remote memories, and the driver of replay alternates between hippocampus (NREM) and cortex (REM) over the course of a night of simulated sleep. Other differences between stages of sleep and between sleep and wake states are likely to become important for a full account of how replay impacts memory. Our current model parsimoniously explains a range of differences between awake and sleep replay by assuming simple differences in initial conditions, but we expect many more characteristics of these states (e.g., neural activity levels, oscillatory profiles, neurotransmitter levels, etc.) will be useful to incorporate in the future.”

      We hope these points clarify the model’s scope and its potential for future extensions.

      Do the authors suggest that these replay systems are more universal to offline processes beyond episodic memory? What about procedural memories and working memory?

      We thank the reviewer for raising this important question. We have clarified in the manuscript:

      “We focus on the model as a formulation of hippocampal replay, capturing how the hippocampus may replay past experiences through simple and interpretable mechanisms.”

      With respect to other forms of memory, we now note that:

      “This motor memory simulation using a model of hippocampal replay is consistent with evidence that hippocampal replay can contribute to consolidating memories that are not hippocampally dependent at encoding (Schapiro et al., 2019; Sawangjit et al., 2018). It is possible that replay in other, more domain-specific areas could also contribute (Eichenlaub et al., 2020).”

      Though this is not a biophysical model per se, can the authors speak to the neuromodulatory milieus that give rise to the different types of replay?

      Our work aligns with the perspective proposed by Hasselmo (1999), which suggests that waking and sleep states differ in the degree to which hippocampal activity is driven by external inputs. Specifically, high acetylcholine levels during waking bias activity to flow into the hippocampus, while low acetylcholine levels during sleep allow hippocampal activity to influence other brain regions. Consistent with this view, our model posits that wake replay is more biased toward items associated with the current resting location due to the presence of external input during waking states. In the Discussion section, we have added a comment on this point:

      “Our view aligns with the theory proposed by Hasselmo (1999), which suggests that the degree of hippocampal activity driven by external inputs differs between waking and sleep states: High acetylcholine levels during wakefulness bias activity into the hippocampus, while low acetylcholine levels during slow-wave sleep allow hippocampal activity to influence other brain regions.”

      Reviewer #3 (Public Review):

      In this manuscript, Zhou et al. present a computational model of memory replay. Their model (CMR-replay) draws from temporal context models of human memory (e.g., TCM, CMR) and claims replay may be another instance of a context-guided memory process. During awake learning, CMR replay (like its predecessors) encodes items alongside a drifting mental context that maintains a recency-weighted history of recently encoded contexts/items. In this way, the presently encoded item becomes associated with other recently learned items via their shared context representation - giving rise to typical effects in recall such as primacy, recency, and contiguity. Unlike its predecessors, CMR-replay has built-in replay periods. These replay periods are designed to approximate sleep or wakeful quiescence, in which an item is spontaneously reactivated, causing a subsequent cascade of item-context reactivations that further update the model's item-context associations.

      Using this model of replay, Zhou et al. were able to reproduce a variety of empirical findings in the replay literature: e.g., greater forward replay at the beginning of a track and more backward replay at the end; more replay for rewarded events; the occurrence of remote replay; reduced replay for repeated items, etc. Furthermore, the model diverges considerably (in implementation and predictions) from other prominent models of replay that, instead, emphasize replay as a way of predicting value from a reinforcement learning framing (i.e., EVB, expected value backup).

      Overall, I found the manuscript clear and easy to follow, despite not being a computational modeller myself. (Which is pretty commendable, I'd say). The model also was effective at capturing several important empirical results from the replay literature while relying on a concise set of mechanisms - which will have implications for subsequent theory-building in the field.

      With respect to weaknesses, additional details for some of the methods and results would help the readers better evaluate the data presented here (e.g., explicitly defining how the various 'proportion of replay' DVs were calculated).

      For example, for many of the simulations, the y-axis scale differs from the empirical data despite using comparable units, like the proportion of replay events (e.g., Figures 1B and C). Presumably, this was done to emphasize the similarity between the empirical and model data. But, as a reader, I often found myself doing the mental manipulation myself anyway to better evaluate how the model compared to the empirical data. Please consider using comparable y-axis ranges across empirical and simulated data wherever possible.

      We appreciate this point. As in many replay modeling studies, our primary goal is to provide a qualitative fit that demonstrates the general direction of differences between our model and empirical data, without engaging in detailed parameter fitting for a precise quantitative fit. Still, we agree that where possible, it is useful to better match the axes. We have updated figures 2B and 2C so that the y-axis scales are more directly comparable between the empirical and simulated data. 

      In a similar vein to the above point, while the DVs in the simulations/empirical data made intuitive sense, I wasn't always sure precisely how they were calculated. Consider the "proportion of replay" in Figure 1A. In the Methods (perhaps under Task Simulations), it should specify exactly how this proportion was calculated (e.g., proportions of all replay events, both forwards and backwards, combining across all simulations from Pre- and Post-run rest periods). In many of the examples, the proportions seem to possibly sum to 1 (e.g., Figure 1A), but in other cases, this doesn't seem to be true (e.g., Figure 3A). More clarity here is critical to help readers evaluate these data. Furthermore, sometimes the labels themselves are not the most informative. For example, in Figure 1A, the y-axis is "Proportion of replay" and in 1C it is the "Proportion of events". I presumed those were the same thing - the proportion of replay events - but it would be best if the axis labels were consistent across figures in this manuscript when they reflect the same DV.

      We appreciate these useful suggestions. We have revised the Methods section to explain in detail how DVs are calculated for each simulation. The revisions clarify the differences between related measures, such as those shown in Figures 1A and 1C, so that readers can more easily see how the DVs are defined and interpreted in each case. 

      Reviewer #4/Reviewing Editor (Public Review):

      Summary:

      With their 'CMR-replay' model, Zhou et al. demonstrate that the use of spontaneous neural cascades in a context-maintenance and retrieval (CMR) model significantly expands the range of captured memory phenomena.

      Strengths:

      The proposed model compellingly outperforms its CMR predecessor and, thus, makes important strides towards understanding the empirical memory literature, as well as highlighting a cognitive function of replay.

      Weaknesses:

      Competing accounts of replay are acknowledged but there are no formal comparisons and only CMR-replay predictions are visualized. Indeed, other than the CMR model, only one alternative account is given serious consideration: A variant of the 'Dyna-replay' architecture, originally developed in the machine learning literature (Sutton, 1990; Moore & Atkeson, 1993) and modified by Mattar et al (2018) such that previously experienced event-sequences get replayed based on their relevance to future gain. Mattar et al acknowledged that a realistic Dyna-replay mechanism would require a learned representation of transitions between perceptual and motor events, i.e., a 'cognitive map'. While Zhou et al. note that the CMR-replay model might provide such a complementary mechanism, they emphasize that their account captures replay characteristics that Dyna-replay does not (though it is unclear to what extent the reverse is also true).

      We thank the reviewer for these thoughtful comments and appreciate the opportunity to clarify our approach. Our goal in this work is to contrast two dominant perspectives in replay research: replay as a mechanism for learning reward predictions and replay as a process for memory consolidation. These models were chosen as representatives of their classes of models because they use simple and interpretable mechanisms that can simulate a wide range of replay phenomena, making them ideal for contrasting these two perspectives.

      Although we implemented CMR-replay as a straightforward example of the memory-focused view, we believe the proposed mechanisms could be extended to other architectures, such as recurrent neural networks, to produce similar results. We now discuss this possibility in the revised manuscript (see below). However, given our primary goal of providing a broad and qualitative contrast of these two broad perspectives, we decided not to undertake simulations with additional individual models for this paper.

      Regarding the Mattar & Daw model, it is true that a mechanistic implementation would require a mechanism that avoids precomputing priorities before replay. However, the "need" component of their model already incorporates learned expectations of transitions between actions and events. Thus, the model's limitations are not due to the absence of a cognitive map.

      In contrast, while CMR-replay also accumulates memory associations that reflect experienced transitions among events, it generates several qualitatively distinct predictions compared to the Mattar & Daw model. As we note in the manuscript, these distinctions make CMR-replay a contrasting rather than complementary perspective.

      Another important consideration, however, is how CMR replay compares to alternative mechanistic accounts of cognitive maps. For example, Recurrent Neural Networks are adept at detecting spatial and temporal dependencies in sequential input; these networks are being increasingly used to capture psychological and neuroscientific data (e.g., Zhang et al, 2020; Spoerer et al, 2020), including hippocampal replay specifically (Haga & Fukai, 2018). Another relevant framework is provided by Associative Learning Theory, in which bidirectional associations between static and transient stimulus elements are commonly used to explain contextual and cue-based phenomena, including associative retrieval of absent events (McLaren et al, 1989; Harris, 2006; Kokkola et al, 2019). Without proper integration with these modeling approaches, it is difficult to gauge the innovation and significance of CMR-replay, particularly since the model is applied post hoc to the relatively narrow domain of rodent maze navigation.

      First, we would like to clarify that our principal aim in this work is to characterize the nature of replay, rather than to model cognitive maps per se. Accordingly, CMR-replay is not designed to simulate head-direction signals, perform path integration, or explain the spatial firing properties of neurons during navigation. Instead, it focuses squarely on sequential replay phenomena, simulating classic rodent maze reactivation studies and human sequence-learning tasks. These simulations span a broad array of replay experimental paradigms to ensure extensive coverage of the replay findings reported across the literature. As such, the contribution of this work is in explaining the mechanisms and functional roles of replay, and demonstrating that a model that employs simple and interpretable memory mechanisms not only explains replay phenomena traditionally interpreted through a value-based lens but also accounts for findings not addressed by other memory-focused models.

      As the reviewer notes, CMR-replay shares features with other memory-focused models. However, to our knowledge, none of these related approaches have yet captured the full suite of empirical replay phenomena, suggesting the combination of mechanisms employed in CMR-replay is essential for explaining these phenomena. In the Discussion section, we now discuss the similarities between CMR-replay and related memory models and the possibility of integrating these approaches:

      “Our theory builds on a lineage of memory-focused models, demonstrating the power of this perspective in explaining phenomena that have often been attributed to the optimization of value-based predictions. In this work, we focus on CMR-replay, which exemplifies the memory-centric approach through a set of simple and interpretable mechanisms that we believe are broadly applicable across memory domains. Elements of CMR-replay share similarities with other models that adopt a memory-focused perspective. The model learns distributed context representations whose overlaps encode associations among items, echoing associative learning theories in which overlapping patterns capture stimulus similarity and learned associations (McLaren & Mackintosh 2002). Context evolves through bidirectional interactions between items and their contextual representations, mirroring the dynamics found in recurrent neural networks (Haga & Fukai 2018; Levenstein et al., 2024). However, these related approaches have not been shown to account for the present set of replay findings and lack mechanisms—such as reward-modulated encoding and experience-dependent suppression—that our simulations suggest are essential for capturing these phenomena. While not explored here, we believe these mechanisms could be integrated into architectures like recurrent neural networks (Levenstein et al., 2024) to support a broader range of replay dynamics.”

      Recommendations For The Authors

      Reviewer #1 (Recommendations For The Authors):

      (1) Lines 94-96: These lines may be better positioned earlier in the paragraph.

      We now introduce these lines earlier in the paragraph.

      (2) Line 103 - It's unclear to me what is meant by the statement that "the current context contains contexts associated with previous items". I understand why a slowly drifting context will coincide and therefore link with multiple items that progress rapidly in time, so multiple items will be linked to the same context and each item will be linked to multiple contexts. Is that the idea conveyed here or am I missing something? I'm similarly confused by line 129, which mentions that a context is updated by incorporating other items' contexts. How could a context contain other contexts?

      In the model, each item has an associated context that can be retrieved via Mfc. This is true even before learning, since Mfc is initialized as an identity matrix. During learning and replay, we have a drifting context c that is updated each time an item is presented. At each timestep, the model first retrieves the current item’s associated context cf by Mfc, and incorporates it into c. Equation #2 in the Methods section illustrates this procedure in detail. Because of this procedure, the drifting context c is a weighted sum of past items’ associated contexts. 

      We recognize that these descriptions can be confusing. We have updated the Results section to better distinguish the drifting context from items’ associated context. For example, we note that:

      “We represent the drifting context during learning and replay with c and an item's associated context with cf.”

      We have also updated our description of the context drift procedure to distinguish these two quantities: 

      “During awake encoding of a sequence of items, for each item f, the model retrieves its associated context cf via Mfc. The drifting context c incorporates the item's associated context cf and downweights its representation of previous items' associated contexts (Figure 1c). Thus, the context layer maintains a recency weighted sum of past and present items' associated contexts.”
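      In symbols, and following the standard retrieved-context formulation this lineage of models uses (the exact normalization in Equation 2 of the Methods may differ), the drift can be written as:

      ```latex
      c_t = \rho_t \, c_{t-1} + \beta \, c^{f}_t, \qquad c^{f}_t = M^{fc} f_t
      ```

      where $\beta$ controls how strongly the current item's associated context is incorporated and $\rho_t$ is chosen so that $\lVert c_t \rVert = 1$. Unrolling the recursion (treating $\rho$ as roughly constant) makes the "recency-weighted sum" explicit: $c_t \approx \sum_{k \le t} \rho^{\,t-k} \beta \, c^{f}_k$, so the drifting context contains a decaying trace of every previously presented item's associated context.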

      (3) Figure 1b and 1d - please clarify which axis in the association matrices represents the item and the context.

      We have added labels to show what the axes represent in Figure 1.

      (4) The terms "experience" and "item" are used interchangeably and it may be best to stick to one term.

      We now use the term “item” wherever we describe the model results. 

      (5) The manuscript describes Figure 6 ahead of earlier figures - the authors may want to reorder their figures to improve readability.

      We appreciate this suggestion. We decided to keep the current figure organization since it allows us to group results into different themes and avoid redundancy. 

      (6) Lines 662-664 are repeated with a different ending, this is likely an error.

      We have fixed this error.

      Reviewer #3 (Recommendations For The Authors):

      Below, I have outlined some additional points that came to mind in reviewing the manuscript - in no particular order.

      (1) Figure 1: I found the ordering of panels a bit confusing in this figure, as the reading direction changes a couple of times in going from A to F. Would perhaps putting panel C in the bottom left corner and then D at the top right, with E and F below (also on the right) work?

      We agree that this improves the figure. We have restructured the ordering of panels in this figure. 

      (2) Simulation 1: When reading the intro/results for the first simulation (Figure 2a; Diba & Buzsaki, 2007; "When animals traverse a linear track...", page 6, line 186), it wasn't clear to me why pre-run rest would have any forward replay, particularly if pre-run implied that the animal had no experience with the track yet. But in the Methods this becomes clearer, as the model encodes the track eight times prior to the rest periods. Making this explicit in the text would make it easier to follow. Also, was there any reason why specifically eight sessions of awake learning, in particular, were used?

      We now make more explicit that the animals have experience with the track before pre-run rest recording:

      “Animals first acquire experience with a linear track by traversing it to collect a reward. Then, during the pre-run rest recording, forward replay predominates.”

      We included eight sessions of awake learning to match with the number of sessions in Shin et al. (2017), since this simulation attempts to explain data from that study. After each repetition, the model engages in rest. We have revised the Methods section to indicate the motivation for this choice: 

      “In the simulation that examines context-dependent forward and backward replay through experience (Figs. 2a and 5a), CMR-replay encodes an input sequence shown in Fig. 7a, which simulates a linear track run with no ambiguity in the direction of inputs, over eight awake learning sessions (as in Shin et al. 2019)”

      (3) Frequency of remote replay events: In the simulation based on Gupta et al, how frequently overall does remote replay occur? In the main text, the authors mention the mean frequency with which shortcut replay occurs (i.e., the mean proportion of replay events that contain a shortcut sequence = 0.0046), which was helpful. But, it also made me wonder about the likelihood of remote replay events. I would imagine that remote replay events are infrequent as well - given that it is considerably more likely to replay sequences from the local track, given the recency-weighted mental context. Reporting the above mean proportion for remote and local replay events would be helpful context for the reader.

      In Figure 4c, we report the proportion of remote replay in the two experimental conditions of Gupta et al. that we simulate. 

      (4) Point of clarification re: backwards replay: Is backwards replay less likely to occur than forward replay overall because of the forward asymmetry associated with these models? For example, for a backwards replay event to occur, the context would need to drift backwards at least five times in a row, in spite of a higher probability of moving one step forward at each of those steps. Am I getting that right?

      The reviewer’s interpretation is correct: CMR-replay is more likely to produce forward than backward replay in sleep because of its forward asymmetry. We note that this forward asymmetry leads to high likelihood of forward replay in the section titled “The context-dependency of memory replay”: 

      “As with prior retrieved context models (Howard & Kahana 2002; Polyn et al., 2009), CMR-replay encodes stronger forward than backward associations. This asymmetry exists because, during the first encoding of a sequence, an item's associated context contributes only to its ensuing items' encoding contexts. Therefore, after encoding, bringing back an item's associated context is more likely to reactivate its ensuing than preceding items, leading to forward asymmetric replay (Fig. 6d left).”

      (5) On terminating a replay period: "At any t, the replay period ends with a probability of 0.1 or if a task-irrelevant item is reactivated." (Figure 1 caption; see also pg 18, line 635). How was the 0.1 decided upon? Also, could you please add some detail as to what a 'task-irrelevant item' would be? From what I understood, the model only learns sequences that represent the points in a track - wouldn't all the points in the track be task-relevant?

      This value was arbitrarily chosen as a small value that allows probabilistic stopping. It was not motivated by prior modeling or a systematic search. We have added: “At each timestep, the replay period ends either with a stop probability of 0.1 or if a task-irrelevant item becomes reactivated. (The choice of the value 0.1 was arbitrary; future work could explore the implications of varying this parameter).” 

      In addition, we now explain in the paper that task irrelevant items “do not appear as inputs during awake encoding, but compete with task-relevant items for reactivation during replay, simulating the idea that other experiences likely compete with current experiences during periods of retrieval and reactivation.”

      (6) Minor typos:

      Turn all instances of "nonlocal" into "non-local", or vice versa

      "For rest at the end of a run, cexternal is the context associated with the final item in the sequence. For rest at the end of a run, cexternal is the context associated with the start item." (pg 20, line 663) - I believe this is a typo and that the second sentence should begin with "For rest at the START of a run".

      We have updated the manuscript to correct these typos. 

      (7) Code availability: I may have missed it, but it doesn't seem like the code is currently available for these simulations. Including the commented code in a public repository (Github, OSF) would be very useful in this case.

      We now include a Github link to our simulation code: https://github.com/schapirolab/CMR-replay.

    1. Author response:

      Reviewer #1

      (1) Mechanistic insight into how Hsp70, but not Hsc70, increases PL-SF FL tau aggregation/pathology is missing. This is despite both chaperones binding to PL-SF FL tau. What species of tau does Hsp70 bind, and what cofactors are important in this process?

      We agree that explaining why Hsp70, but not Hsc70, promotes tau aggregation would strengthen the study. Although both chaperones bind tau, they diverge slightly in 1) protein sequence, 2) biochemical activity, and 3) co-chaperone engagement.

      Sequence: Hsp70 has an extra cysteine residue (Cys306) that is highly reactive to oxidation and a glycine residue that is critical for cysteine oxidation (Gly557). Both residues are specific to Hsp70 (not present in Hsc70) and may alter Hsp70 conformation or client handling (Hong et al., 2022).

      Biochemical activity: Prior studies indicate that Hsp70’s ATPase domain (NBD) is critical for tau interactions (Jinwal et al., 2009; Fontaine et al., 2015; Young et al., 2016) and can be disrupted with point mutations including K71E and E175S for ATPase and A406G/V438G for substrate binding (Fontaine et al., 2015).

      Co-chaperone engagement: Hsp70 recruits the co-chaperone and E3 ubiquitin ligase CHIP/Stub1 more strongly than Hsc70, suggesting co-chaperone engagement could lead to differences in tau processing (Jinwal et al., 2013).

      To directly test how the two closely related chaperones could differentially impact tau, we plan to perform the following experiments:

      (a) We will mutate residues responsible for cysteine reactivity in Hsp70 including the cysteine itself (Cys306) and the critical glycine that facilitates cysteine reactivity (Gly557). These residues will be deleted from Hsp70 or alternatively inserted into Hsc70 to determine whether cysteine reactivity is the reason for Hsp70’s ability to drive tau aggregation.

      (b) We will generate Hsp70 mutants with disrupted ATPase activity or substrate binding to determine which Hsp70 domains are responsible for driving tau aggregation.

      (c) We will perform seeding assays in stable tau-expressing cell lines to determine whether Hsp70/Hsc70 overexpression or depletion alters seeded tau aggregation.

      (d) We will perform confocal microscopy to determine the extent of co-localization of Hsp70 or Hsc70 with phospho-tau, oligomeric tau, or Thioflavin-S (ThioS) to identify which tau species are engaged by Hsp70/Hsc70.

      (e) We will perform immunoprecipitation pull-downs followed by mass spectrometry to globally identify any relevant Hsp70/Hsc70 interacting factors that might account for the differences in tau aggregation.

      (2) The study relies heavily on densitometry of bands to draw conclusions; in several instances, the blots are overexposed to accurately quantify the signal.

      All immunoblots were acquired as 16-bit TIFFs with exposure settings chosen to prevent pixel saturation, and quantification was performed on raw, unsaturated images. Brightness and contrast adjustments were applied only for visualization and did not alter pixel values used for analysis. All quantified bands fell within the linear range of the detector, with one exception in Figure 7B, which we removed from quantification. We will add both low- and high-exposure versions of immunoblots to the revised figures to demonstrate signal linearity and dynamic range.

      Reviewer #2

      (1) Although the PL-SF model can accelerate tau aggregation, it is crucial to determine whether this aligns with the temporal progression and spatial distribution of tau pathology in the brains of patients with tauopathies.

      No single tauopathy model fully recapitulates the temporal and spatial progression of human tauopathies. The PL-SF system is not intended to model the disease course. Rather, it is an excellent model for mechanistic studies of mature tau aggregation, which is otherwise challenging to study. We note that prior studies showed that PL-SF tau expression in transgenic mice (Xia et al., 2022 and Smith et al., 2025) and rhesus monkeys (Beckman et al., 2021) led to prion-like tau seeding and aggregation in hippocampal and cortical regions. Indeed, the spatial and temporal tau aggregation patterns aligned with features of human tauopathies. So far, these findings all support PL-SF as a valid accelerated model of tauopathy that can be used to interrogate pathogenic mechanisms that impact tau processing, degradation, and/or aggregation.

      (2) The authors did not elucidate the specific molecular mechanism by which Hsp70 promotes tau aggregation.

      We agree that a deeper understanding of the molecular mechanism is needed. The revision experiments outlined above (Reviewer #1, point #1) will define how Hsp70 promotes tau aggregation by testing sequence contributions, dissecting ATPase and substrate-binding domain requirements, and mapping Hsp70/Hsc70 interactors to directly address this mechanistic question.

      (3) Some figures in this study show large error bars in the quantitative data (some statistical analysis figures, MEA recordings, etc.), indicating significant inter-sample variability. It is recommended to label individual data points in all quantitative figures and clearly indicate them in figure legends.

      We acknowledge the inter-sample variability in some of the quantitative datasets. This level of variability can occur in primary neuronal cultures (e.g., MEA recordings) that are sensitive to growth and surface adhesion conditions, leading to many technical considerations. To improve transparency and interpretation, we will revise all quantitative figures to display individual data points overlaid on summary statistics and will update figure legends to clearly indicate sample sizes and statistical tests used.

      References

      Hong Z, Gong W, Yang J, Li S, Liu Z, Perrett S, Zhang H. Exploration of the cysteine reactivity of human inducible Hsp70 and cognate Hsc70. J Biol Chem. 2023 Jan;299(1):102723. doi: 10.1016/j.jbc.2022.102723. Epub 2022 Nov 19. PMID: 36410435; PMCID: PMC9800336.

      Jinwal UK, Miyata Y, Koren J 3rd, Jones JR, Trotter JH, Chang L, O'Leary J, Morgan D, Lee DC, Shults CL, Rousaki A, Weeber EJ, Zuiderweg ER, Gestwicki JE, Dickey CA. Chemical manipulation of hsp70 ATPase activity regulates tau stability. J Neurosci. 2009 Sep 30;29(39):12079-88. doi: 10.1523/JNEUROSCI.3345-09.2009. PMID: 19793966; PMCID: PMC2775811.

      Fontaine SN, Rauch JN, Nordhues BA, Assimon VA, Stothert AR, Jinwal UK, Sabbagh JJ, Chang L, Stevens SM Jr, Zuiderweg ER, Gestwicki JE, Dickey CA. Isoform-selective Genetic Inhibition of Constitutive Cytosolic Hsp70 Activity Promotes Client Tau Degradation Using an Altered Co-chaperone Complement. J Biol Chem. 2015 May 22;290(21):13115-27. doi: 10.1074/jbc.M115.637595. Epub 2015 Apr 11. PMID: 25864199; PMCID: PMC4505567

      Young ZT, Rauch JN, Assimon VA, Jinwal UK, Ahn M, Li X, Dunyak BM, Ahmad A, Carlson G, Srinivasan SR, Zuiderweg ERP, Dickey CA, Gestwicki JE. Stabilizing the Hsp70‑Tau Complex Promotes Turnover in Models of Tauopathy. Cell Chem Biol. 2016 Aug 4;23(8):992–1001. doi:10.1016/j.chembiol.2016.04.014.

      Jinwal UK, Akoury E, Abisambra JF, O'Leary JC 3rd, Thompson AD, Blair LJ, Jin Y, Bacon J, Nordhues BA, Cockman M, Zhang J, Li P, Zhang B, Borysov S, Uversky VN, Biernat J, Mandelkow E, Gestwicki JE, Zweckstetter M, Dickey CA. Imbalance of Hsp70 family variants fosters tau accumulation. FASEB J. 2013 Apr;27(4):1450-9. doi: 10.1096/fj.12-220889. Epub 2012 Dec 27. PMID: 23271055; PMCID: PMC3606536.

      Xia, Y., Prokop, S., Bell, B.M. et al. Pathogenic tau recruits wild-type tau into brain inclusions and induces gut degeneration in transgenic SPAM mice. Commun Biol 5, 446 (2022). https://doi.org/10.1038/s42003-022-03373-1.

      Smith ED, Paterno G, Bell BM, Gorion KM, Prokop S, Giasson BI. Tau from SPAM Transgenic Mice Exhibit Potent Strain-Specific Prion-Like Seeding Properties Characteristic of Human Neurodegenerative Diseases. Neuromolecular Med. 2025 May 30;27(1):44. doi: 10.1007/s12017-025-08850-4. PMID: 40447946; PMCID: PMC12125038.

      Beckman D, Chakrabarty P, Ott S, Dao A, Zhou E, Janssen WG, Donis-Cox K, Muller S, Kordower JH, Morrison JH. A novel tau-based rhesus monkey model of Alzheimer's pathogenesis. Alzheimers Dement. 2021 Jun;17(6):933-945. doi: 10.1002/alz.12318. Epub 2021 Mar 18. PMID: 33734581; PMCID: PMC8252011.

    1. Author response:

      Reviewer #1

      (1) The main weakness is that the study is wholly in vitro, using cultured hippocampal neurons.

      We appreciate this reviewer's concern about the limitation of cultured hippocampal neurons in extracting disease-related spine phenotypes. While we fully recognize this limitation, we consider that this in vitro system has several advantages that contribute to translational research on mental disorders.

      First, our culture system has been shown to support the development of spine morphology similar to that of the hippocampal CA1 excitatory synapse in vivo. High-resolution imaging techniques confirmed that the in vitro spine structure was highly preserved compared with in vivo preparations (Kashiwagi et al., Nature Communications, 2019). The present study used the same culture system and SIM imaging. Therefore, the difference we detected in samples derived from disease models is likely to reflect impairment of molecular mechanisms underlying native structural development in vivo.

      Second, super-resolution imaging of thousands of spines in tissue preparations under precisely controlled conditions cannot be practically applied using currently available techniques. The advantage of our imaging and analytical pipeline is its reproducibility, which enabled us to compare the spine population data from eight different mouse models without normalization.

      Third, a reduced culture system can demonstrate the direct effects of gene mutations on synapse phenotypes, independent of environmental influences. This property is highly advantageous for screening chemical compounds that rescue spine phenotypes. Neuronal firing patterns and receptor functions can also be easily controlled in a culture system. The difference in spine structure between ASD and schizophrenia mouse models is valuable information to establish a drug screening system.

      Fourth, establishing an in vitro system for evaluating synapse phenotypes could reduce the need for animal experiments. Researchers should be aware of the 3Rs principles. In the future, combined with differentiation techniques for human iPS cells, our in vitro approach will enable the evaluation of disease-related spine phenotypes without the need for animal experiments. The effort to establish a reliable culture system should not be abandoned.

      (2) Another weakness is that CaMKIIαK42R/K42R mutant mice are presented as a schizophrenia model.

      We agree with this reviewer that CAMK2A mutations in humans are linked to multiple mental disorders, including developmental disorders, ASD, and schizophrenia. Associating gene mutations with categories of mental disorders is not straightforward, as the symptoms of these disorders also overlap with each other. For the CaMKIIα K42R/K42R mutant, we considered the following points in its characterization as a model of mental disorder. Analysis of CaMKIIα +/- mice in Dr. Tsuyoshi Miyakawa's lab has provided evidence for the involvement of reduced CaMKIIα in schizophrenia-related phenotypes (Yamasaki et al., Mol Brain 2008; Frankland et al., Mol Brain Editorial 2008). It is also known that the CaMKIIα R8H mutation in the kinase domain is linked to schizophrenia (Brown et al., 2021). Both the CaMKIIα R8H and CaMKIIα K42R mutations are located in the N-terminal domain and eliminate kinase activity. On the other hand, the representative CaMKIIα E183V mutation identified in ASD patients exhibits unique characteristics, including reduced kinase activity, decreased protein stability and expression levels, and disrupted interactions with ASD-associated proteins such as Shank3 (Stephenson et al., 2017). Importantly, the reduced dendritic spine density in neurons expressing CaMKIIα E183V is a property opposite to that of the CaMKIIα K42R/K42R mutant, which showed increased spine density (Koeberle et al., 2017).

      Different CAMK2A mutations likely cause distinct phenotypes observed in the broad spectrum of mental disorders. In the revised manuscript, we will include a discussion of the relevant literature to categorize this mouse model appropriately.

      References related to this discussion.

      (1) Yamasaki et al., Mol Brain. 2008 DOI: 10.1186/1756-6606-1-6

      (2) Frankland et al. Mol Brain. 2008 DOI: 10.1186/1756-6606-1-5

      (3) Stephenson et al., J Neurosci. 2017 DOI: 10.1523/JNEUROSCI.2068-16.2017

      (4) Koeberle et al. Sci Rep. 2017 DOI: 10.1038/s41598-017-13728-y

      (5) Brown et al., iScience. 2021 DOI: 10.1016/j.isci.2021.103184

      Reviewer #2

      We recognize the reviewer's comments as important for improving our manuscript. We outline our general approach to addressing major concerns. Detailed responses to each point, along with additional data, will be provided in a formal revised manuscript.

      (1) Demonstrating the robustness of statistical analyses

      We appreciate this reviewer's concern about our strategies for the quantitative analysis of the large spine population.

      For the PCA analysis (Point 2), our preliminary results indicated that including all parameters or the selected five parameters did not make a significant difference in the relative placement of spines with specific morphologies in the feature space defined by the principal components. This point will be discussed in the revised manuscript.

      The potential problem of selecting a particular region within a feature space for spine shape analysis (Point 1) can be addressed by using alternative simulation-based approaches, such as bootstrap or permutation tests. These analyses will be included in the revised manuscript.

      The use of sample numbers in statistical analyses should align with the analysis's purpose (Point 3). When analyzing the distribution of samples in the feature space, it is necessary to use spine numbers for statistical assessment. We will recheck the statistical methods and apply the appropriate method for each analysis.

      The spine population data in Figures 2 and 8 cannot be directly compared, as the spine visualization methods differ (Figure 2 with membrane DiI labeling; Figure 8 with cytoplasmic GFP labeling) (Point 9). Spine populations of the same size are inevitably plotted in different feature spaces. This point will be discussed more clearly in the revised manuscript.
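
      As one concrete, schematic form such a permutation test could take (the feature values below are simulated placeholders and the test statistic is only an example, not the final analysis):

      ```python
      # Schematic permutation test comparing a per-spine summary statistic between two populations.
      import numpy as np

      rng = np.random.default_rng(1)
      group_a = rng.normal(0.50, 0.10, size=800)   # e.g., one spine-shape feature, model A
      group_b = rng.normal(0.53, 0.10, size=800)   # the same feature, model B

      observed = group_b.mean() - group_a.mean()
      pooled = np.concatenate([group_a, group_b])

      n_perm, count = 10000, 0
      for _ in range(n_perm):
          rng.shuffle(pooled)
          diff = pooled[len(group_a):].mean() - pooled[:len(group_a)].mean()
          count += abs(diff) >= abs(observed)

      print(f"permutation p ≈ {count / n_perm:.4f}")
      ```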

      (2) Clarification of experimental conditions and data reliability

      Per this reviewer's suggestion, we will provide more information on the genetic background of mice and the differences in spine structure from DIV 18-22 (Points 4 and 5). We will also provide additional validation data for the functional analyses using knockdown and overexpression methods, for which we already have preliminary data (Point 7). Concerns about the interpretation of data obtained from in vitro culture (Point 12), raised by this reviewer, are also noted by reviewer #1. As explained in the response to reviewer #1, we intentionally selected an in vitro culture system to analyze multiple samples derived from mouse models of mental disorders for several reasons. Nevertheless, we will revise the discussion and incorporate the points this reviewer raised regarding the disadvantages of in vitro systems.

      (3) Validation of biological mechanisms and interpretation

      In the computational modeling (Point 6), we started from the spine turnover data (excluding the data on spine volume increases/decreases), fitted the model to the data, and found that the best-fit model showed three features in schizophrenia mouse models: fast spine turnover, lower spine density, and smaller size of transient spines. As the reviewer noted, information about spine turnover is already present in the input data. However, the other two properties are generated independently of the input data, indicating the value of this model. We plan to add further confirmatory analyses of this model in the revised manuscript.

      In response to Point 8, we will provide supporting data on the functional role of Ecgr4 in synapse regulation. We will also refine our discussion on the ASD and Schizophrenia phenotypes based on the suggested literature (Points 10 and 11). Quantification of the initial growth of spines is technically demanding, as it requires higher imaging frequency and longer time-lapse recordings to capture rare events. It is difficult to conclude which of the two possibilities, slow spine growth or initial size differences, is correct, based on our available data. This point will be discussed in the revised manuscript (Point 13).

    1. Les algorithmes contre la société: A Synthesis of Hubert Guillaud's Analyses

      Executive Summary

      This synthesis lays out the main arguments developed by Hubert Guillaud, journalist and essayist, concerning the societal impact of algorithmic systems.

      The analysis shows that, far from being neutral tools, algorithms constitute a new systemic logic that profoundly transforms public services and social relations.

      Their primary function is to compute, sort, and match, translating social facts into a mere "combination of numbers".

      The key critical points are the following:

      Discrimination as a feature: By its very nature, computation is a machine for differentiating.

      Systems such as Parcoursup or the "risk score" of the Caisse d'Allocations Familiales (CAF) generate distinctions that are often aberrant and fictitious in order to rank individuals, institutionalizing discrimination under the guise of mathematical objectivity.

      Targeting of precarious populations: The automation of public services disproportionately targets and surveils the most vulnerable populations.

      The CAF, for example, pursues not so much fraud as "indus" (overpayments), mainly affecting people with fragmented, complex incomes such as single mothers.

      Threat to democratic principles:

      The growing interconnection of data across administrations (CAF, tax authorities, France Travail, police) threatens the separation of powers by creating a system of generalized surveillance in which an individual's weaknesses in one domain can have repercussions in all the others.

      Massification in disguise: Contrary to the idea of far-reaching personalization, algorithms carry out a massification of individuals.

      They do not target unique persons but constantly group them into broad, standardized categories for purposes of control or advertising.

      A risk of fascist drift: By systematizing discrimination and making it opaque and invisible, these technologies create fertile ground for authoritarian drift, a risk Hubert Guillaud describes as "fascist".

      In conclusion, although these technologies pose a serious threat, Hubert Guillaud places them in a broader context, arguing that the overriding issues remain climate change and the logics of financial capitalism, of which algorithms are merely an amplifying tool.

      --------------------------------------------------------------------------------

      1. Introduction: Algorithmic Logic and Its Societal Stakes

      The discussion, introduced by Marine Placa, a doctoral student in public law, revolves around Hubert Guillaud's book Les algorithmes contre la société.

      The central issue is "the intrusion of a new algorithmic logic, more insidious and more systemic, into the delivery of public services".

      This logic, which "translates social facts as a combination of numbers", increasingly governs individuals' environments, with tangible consequences.

      Several major criticisms are raised from the introduction onward:

      Opacity and injustice: AI systems are often too opaque and discriminatory, and it is impossible to explain the decisions they produce.

      Disconnection from reality: While massive investments continue (109 billion euros released by the French government), feedback from the field warns of "social, democratic, and ecological damage".

      Private technology: The technology is private, developed with private capital and dictated by the "economic behemoths of Silicon Valley".

      Its use is therefore largely shaped by profit interests rather than the common good.

      AI is not autonomous: AI "decides nothing. It does not reason." It is the result of human design, and its impact depends less on its essence than on its use.

      2. Definition and Workings of Algorithms

      According to Hubert Guillaud, algorithmic systems, from the simple algorithm to complex AI, should be understood as a "technological continuity" of computational systems applied to society. Their operation rests on three fundamental functions:

      | Function | Description | Example |
      | --- | --- | --- |
      | 1\. Produce scores | Transform qualitative information (words, behaviors) into quantitative data (numbers, grades). | A profile on a dating app is "scored"; an application for social benefits receives a risk score. |
      | 2\. Sort | Rank individuals or information according to the scores produced. | Applicants on Parcoursup are ranked from first to last. |
      | 3\. Match (the "pairing") | Match a demand to an offer on the basis of the sorting performed. | A student is matched to a program, a job seeker to a position, a claimant to the granting (or not) of a benefit. |

      This simple mechanism lies at the heart of all these systems, from social networks to public-service platforms, whose central purpose is to rank, sort, and match.
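
      As a purely illustrative sketch (not taken from Guillaud's book; the class, field names, and scoring rule are invented for the example), the score → sort → match pipeline described above fits in a few lines of code:

      ```python
      # Minimal, illustrative score -> sort -> match pipeline (hypothetical data).
      from dataclasses import dataclass

      @dataclass
      class Applicant:
          name: str
          grade_average: float  # a whole file reduced to a single number

      def score(a: Applicant) -> float:
          # 1. Produce a score: here, trivially, the grade average.
          return a.grade_average

      def rank(applicants: list[Applicant]) -> list[Applicant]:
          # 2. Sort: rank from first to last; the system allows no ties.
          return sorted(applicants, key=score, reverse=True)

      def match(applicants: list[Applicant], seats: int) -> list[Applicant]:
          # 3. Match: pair the top-ranked applicants with the available seats.
          return rank(applicants)[:seats]

      pool = [Applicant("A", 14.001), Applicant("B", 14.003), Applicant("C", 13.2)]
      print([a.name for a in match(pool, seats=2)])  # ['B', 'A']
      ```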

      3. The Modification of Societal Power Relations

      3.1. Computation as a Discrimination Machine: The Example of Parcoursup

      Hubert Guillaud uses the example of Parcoursup to illustrate how computation generates systemic discrimination.

      Context: A national platform steering 900,000 final-year secondary-school students toward more than 25,000 programs.

      Mechanism: Each program must rank all of its applicants from first to last, with no ties whatsoever.

      The main criterion: grades. The system relies almost exclusively on school report cards, ignoring essential criteria such as motivation, even though motivation is a key factor of success in higher education.

      The creation of aberrant distinctions: To separate the mass of students with homogeneous files (for example, an average of 14/20), the system generates complex calculations to create micro-differences.

      Final scores are computed to three decimal places (e.g., 14.001 versus 14.003). Guillaud underlines the absurdity of this distinction:

      "I cannot make any academic difference between them. [...] Yet through computation, for computation's sake, we end up generating differences between these students."

      Equivalence to a lottery: For 80% of applicants, this allocation system based on insignificant differences is "fully equivalent to a random draw", but it is camouflaged by the scientific appearance of numbers.
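
      A tiny, purely illustrative simulation (not from the book; the numbers are invented) of why ranking near-identical files on third-decimal differences behaves like a lottery: regenerating the noise regenerates the ranking.

      ```python
      # Illustrative only: students with identical averages separated by tiny computed noise.
      import random

      def rank_with_micro_differences(n_students: int, seed: int) -> list[int]:
          rng = random.Random(seed)
          # Everyone has the same academic average; the third decimal is arbitrary noise.
          scores = {i: 14.0 + rng.uniform(0, 0.009) for i in range(n_students)}
          return sorted(scores, key=scores.get, reverse=True)

      # Two runs with different noise give unrelated orderings: the ranking is effectively a draw.
      print(rank_with_micro_differences(5, seed=1))
      print(rank_with_micro_differences(5, seed=2))
      ```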

      3.2. The Normalization of Elite-Style Selection

      Unlike a simple lottery, Parcoursup introduces no randomness.

      On the contrary, it spreads and normalizes the selection methods of elite institutions (grandes écoles, Sciences Po) across the entire education system, including technical programs (BTS) where this type of selection is ill-suited.

      This standardization rules out alternative assessment methods (interviews, projects) and reinforces social biases.

      The result is a high rate of dissatisfaction:

      2% of applicants receive no offer at all.

      20% receive a single offer, which they decline.

      20% try again the following year.

      In total, roughly 45-46% of students are left dissatisfied by the platform each year.

      4. The Automation of Life and the Illusory Neutrality of Technology

      4.1. The CAF "Risk Score": Surveillance of the Most Precarious

      Hubert Guillaud rejects the idea that technology is neutral. The example of the Caisse d'Allocations Familiales (CAF) is emblematic of this non-neutrality.

      Stated objective: Use AI to detect the risk of fraud among claimants.

      Reality: The system does not measure fraud (often tied to employers' declarations) but what is called the "indu", that is, the overpayment of one month that must be repaid the next.

      Targeting: This system penalizes people with complex situations and non-linear incomes: single mothers, widows, precarious workers.

      Calculating their entitlements is difficult, which mechanically generates overpayments.

      Absurd scoring criteria: Behavioral data are used.

      For example, logging into one's CAF account more than a certain number of times per month raises the risk score, even though this behavior simply reflects the anxiety of people in need.

      Consequences: Already precarious populations, representing less than 20% of beneficiaries, undergo the majority of the checks.

      Some single mothers are audited "four to five times in the same year".

      4.2. Threat to the Separation of Powers

      The interconnection of data across administrations, under the guise of "making information flow", constitutes a threat to the democratic principle of the separation of powers.

      • The CAF has access to data from the tax authorities, France Travail, and the national register of bank accounts (FICOBA).

      • The level of access is opaque: some agents can see account balances, or even the detail of expenditures over six months.

      • This collusion creates extensive and problematic forms of surveillance.

      Example: the police reporting individuals to the CAF (around 3,000 cases per year), establishing an "exchange of favors" outside any clear legal framework.

      • This produces what one sociologist calls a "lumpen scorariat": individuals who are constantly scored poorly and penalized by the cross-referencing of systems.

      4.3. The Risk of a Fascist Drift

      The discussion highlights a striking sentence from Guillaud's book: "Denial of democracy as a principle, discrimination as a feature, fascism as a possibility."

      The fascist risk lies in the fact that these systems make it possible to implement massive discrimination, objective in appearance but based on political choices and invisible biases.

      The recruitment example: CV-screening software analyzes words to produce scores.

      It favors profiles that are "average everywhere" over profiles with both weaknesses and strong points.

      Geographic and ethnic discrimination:

      These systems make it very easy for employers to exclude candidates on the basis of unstated criteria, such as their geographic location (via the IP address) or their origin (via terms associated with certain countries).

      5. Psychosocial Implications: Massification Disguised as Personalization

      The idea that algorithms offer us a "personalized" experience (the "filter bubbles") is an illusion. In reality, they carry out a massification.

      Advertising logic: The goal is not to understand an individual but to fit them into pre-existing categories in order to sell them mass advertising.

      Concrete example: If a user "likes" a post criticizing football in which the word "PSG" appears, the algorithm retains only the keyword "PSG".

      The user is then lumped together with the mass of all other profiles linked to "PSG" and will receive advertising targeted at football fans, even if their original intent was the opposite.

      • Individuals are thus constantly regrouped "from one mass to another", caught up in data profiles that exceed them.

      6. Conclusion: Putting Technological Threats in Perspective

      Asked about a quotation from the newspaper Le Postillon claiming that the "great technological cooling" is the greatest threat of our time, Hubert Guillaud disagrees.

      • He considers this view too "techno-centered".

      • In his view, more fundamental and urgent issues take precedence:

      1. Climate change.    2. Financial concentration and the logics of capitalism.

      • Technology and its excesses are not the root cause of social problems (isolation, withdrawal into oneself), but rather an amplifier of dynamics already at work, such as the "dissolution of social ties brought about by capitalism".

      • He concludes by insisting that we must "keep a sense of proportion".

      The issue is not merely to reform a system like Parcoursup, but to tackle the underlying problem: "how do we create places in public higher education".

      Technology is not an inevitability, but a prism through which broader social, political, and economic forces express themselves.

    1. Applying it to the design of the web we aim to create a system where we can do everything offline and in local networks and the connection to the internet is optional. This will help the neuronal groups be more resilient and fast. We invite others to join as co-creators to build a local first version of the Internet together.

      For Cardumem the approach, as I have said elsewhere, is different: it opts for a federated architecture, which includes servers running locally and with lower architectural complexity.

    2. This separateness is not the biggest problem; what is more dangerous is that in each of these versions of the Internet, the neurons can’t talk and express themselves directly to each other. Servers control our communication with those closest to us: family members, neighbors and local communities. The problems with cloud-based architecture don't stop there. Not only do central servers control who can do what, but their control is ubiquitous. Even when texting your family member on the couch next to you, the signal from your device to theirs needs to go to the application server first, and only after that, return to your own living room.

      An architecture in which anyone can easily download and run a full server and connect it with others is, for practical purposes, a federated architecture, with the possibility of becoming P2P.

      A federated/P2P architecture is no guarantee of decentralization, as we saw happen with the web, I would say because of the difficulty of setting up and deploying servers. And although extreme centralizing forces are exerted on systems such as email and podcasts, these remain federated. Moreover, the fediverse has gained new momentum since the purchase of Twitter, but it faces its own challenges.

      I would say that what is needed is not only a frugal way of getting the technology running, but also a way of making it available to others for their collective uses. Here the bottleneck seems to be hosting, and we should look at how to make it cheap and friendly.

    3. The problem is that the modern Internet relies strongly on cloud technologies, where client applications communicate with each other only via servers. It is akin to having a server between any two neurons in the nervous system, or each neuron being inside a box that decides if the signal from this neuron can go through.

      Unless everyone had their own server and the centralized servers were used only to coordinate communications, as happens with Fossil and will happen with Cardumem. In that arrangement, centralization offers convenience without resting on a fundamental asymmetry of capacity or power, as is currently the case, and server-discovery services/protocols could be implemented on top of today's client-server infrastructure in case a given server is taken down.

      What is the most important characteristic of life in this city? 5. An Overview of the History of the Old Testament

      To understand the Bible better, it is often helpful to know something about the original historical situation of the biblical text or book in question. It is even more important, however, to be able to relate the main events of the Bible to one another, that is, to know the sequence of events and where the key figures fit into the overall structure. Unit 1 briefly summarized the message of the whole Bible from Genesis to Revelation and highlighted important events. To conclude this unit, some of these events are now presented again graphically, together with a few dates and the names of important figures. As the course proceeds, it may be helpful to return to this overview repeatedly and add further details.

      The History of the Old Testament

      The figure is taken, with minor changes, from Graeme Goldsworthy's The Goldsworthy Trilogy (Cumbria: Paternoster Press, 2000, p. 36) and reproduced by kind permission. Further details on biblical history can be looked up in a Bible dictionary or a corresponding treatment of Old Testament history. The dates for Abraham and Moses depend on the dating of the Exodus. The archaeological evidence concerning the Exodus is unfortunately not conclusive. Most scholars today prefer to date the Exodus to the 13th century (ca. 1280 to 1240 BC), but the chronological statements within the Old Testament suggest a 15th-century dating (ca. 1450 BC; cf. 1 Kings 6:1; Judges 11:26; Exodus 12:40). Given the ambiguity of the archaeological material, it seems wiser to trust the explicit statements of the biblical text and to take the earlier dating ("long chronology" in the following table) as correct.

      Key dates (there are two possible datings for this early period)1

      | | Long chronology | Short chronology |
      | --- | --- | --- |
      | Abraham | ca. 2165–1990 BC | ca. 2000–1825 BC |
      | Isaac | ca. 2065–1885 BC | ca. 1900–1720 BC |
      | Jacob | ca. 2000–1860 BC | ca. 1840–1700 BC |
      | Joseph | ca. 1910–1800 BC | ca. 1750–1640 BC |
      | Arrival in Egypt | ca. 1875 | ca. 1700 |
      | Exodus from Egypt | ca. 1450 | ca. 1260 |
      | Period of the Judges | ca. 1380–1050 BC | ca. 1200–1050 BC |

      Timeline

      It is sometimes difficult to relate different historical events to one another. A timeline can help provide a better overview. Look at the timeline below together with the table above to get an impression of the large span of time we are dealing with.

      Timeline – a human perspective on history

      Exercises

      What did the Flood accomplish, if humanity's situation deteriorated again so quickly after Noah? What purpose was connected with it? Take the time to read Isaiah 65:17–25 carefully. Try to express in your own words what the imagery means. What message is the prophet trying to convey? Further reading: Look up "Adam", "Eve", and "Fall" in a Bible dictionary.

      Reflection

      How many of the problems in our world can be connected with the events of Genesis 3–11? In this unit we have thought a great deal about what is wrong with the world. What hopes were awakened in you at the same time?

      1 Cf. the articles "Archaeological sites: Late Bronze Age" and "Time Charts: Biblical History from Abraham to Saul" in New Bible Atlas, Leicester: IVP 1985.

      God the Lord himself will be present and will fill everything with his glorious presence.

    1. contribute to the over-representation of the intellectual component relative to the emotional component of ethical competence, and to the maintenance of a dualist conception in which emotion is still opposed to cognition and reason

      It is hard to see how this would be the case, since the passage from Bidet refers to a threefold composition of immediate appreciation (and hence an interweaving) of the emotional, the intellectual, and motor activity (volition). Moreover, the rest of this article clearly reiterates that immediate appreciation is not merely a matter of emotion, that there is a share of cognitive activity within it, in the form of an intertwining of the one with the other. It is hard to see what justifies the charge of a dualist conception in which emotion and cognition would be opposed.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #2 (Public review):

      (1) Vglut2 isn't a very selective promoter for the STN. Did the authors verify every injection across brain slices to ensure the para-subthalamic nucleus, thalamus, lateral hypothalamus, and other Vglut2-positive structures were never infected?

      The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible lesion approaches—three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      (2) The authors say in the methods that the high vs low power laser activation for optogenetic experiments was defined by the behavioral output. This is misleading, and the high vs low power should be objectively stated and the behavioral results divided according to the power used, not according to the behavioral outcome.

      Optogenetic excitation is no longer part of the study.

      (3) In the fiber photometry experiments exposing mice to the range of tones, it is impossible to separate the STN response to the tone from the STN response to the movement evoked by the tone. The authors should expose the mouse to the tones in a condition that prevents movement, such as anesthetized or restrained, to separate out the two components.

      The new mixed-effects modeling approach clearly differentiates sensory (auditory) from motor contributions during tone-evoked STN activation. In prior work (see Hormigo et al, 2023, eLife), we explored experimental methods such as head restraint or anesthesia to reduce movement, but we concluded that these approaches are unsuitable for addressing this question. Mice exhibit substantial residual movement even when head-fixed, and anesthesia profoundly alters neural excitability and behavioral state, introducing major confounds. To fully eliminate movement would require paralysis and artificial ventilation, which would again disrupt physiological network dynamics and raise ethical concerns. Therefore, the current modeling approach—incorporating window-specific covariates for movement—is the most appropriate and rigorous way to dissociate tone-evoked sensory activity from motor activity in behaving animals.
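
      As a minimal sketch of what such a model can look like (the column names dff, speed, window, and mouse are hypothetical placeholders, not the authors' actual variables), a linear mixed-effects model with a movement covariate can be fit with statsmodels:

      ```python
      # Sketch: dissociating tone-evoked activity from movement with a mixed-effects model.
      # Column names are hypothetical; one row per trial x analysis window is assumed.
      import pandas as pd
      import statsmodels.formula.api as smf

      df = pd.read_csv("photometry_trials.csv")

      # Fixed effects: analysis window (e.g., tone vs. baseline) and movement speed in that window.
      # Random intercepts per mouse absorb between-animal baseline differences.
      model = smf.mixedlm("dff ~ C(window) + speed", data=df, groups=df["mouse"])
      result = model.fit()
      print(result.summary())  # a window effect surviving the speed covariate points to a sensory component
      ```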

      (4) The claim 'STN activation is ideally suited to drive active avoids' needs more explanation. This claim comes after the fiber photometry experiments during active avoidance tasks, so there has been no causality established yet.

      Text adjusted. 

      (5) The statistical comparisons in Figure 7E need some justification and/or clarification. The 9 neuron types are originally categorized based on their response during avoids, then statistics are run showing that they respond differently during avoids. It is no surprise that they would have significantly different responses, since that is how they were classified in the first place. The authors must explain this further and show that this is not a case of circular reasoning.

      Statistically verifying the clustering is useful to ensure that the selected number of clusters reflects distinct classes. It is also necessary when different measurements are used for classification (the movement time series were used to classify the avoids) and when comparing neuronal types within each avoidance class (now called “mode”). Moreover, the new modeling approach goes beyond the prior statistical limitations of considering the movement and neuronal variables separately. 

      (6) The authors show that neurons that have strong responses to orientation show reduced activity during avoidance. What are the implications of this? The author should explain why this is interesting and important.

      The new modeling approach goes beyond the prior analysis limitations. For instance, it shows that most of the previously reported orienting-related activations closely reflect the orienting movement itself; only in a few cases (noted and discussed in the Results) are orienting activations related to the behavioral contingencies or outcomes of the task. 

      (7) It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing.

      Optogenetic excitation is no longer part of the study. 

      (8) The experiments in Figure 10 are used to say that STN stimulation is not aversive, but they only show that STN stimulation cannot be used as punishment in place of a shock. This doesn't mean that it is not aversive; it just means it is not as aversive as a shock. The authors should do a simpler aversion test, such as conditioned or real-time place preference, to claim that STN stimulation is not aversive. This is particularly surprising as previous work (Serra et al., 2023) does show that STN stimulation is aversive.

      Optogenetic excitation is no longer part of the study.

      (9) In the discussion, the idea that the STN encodes 'moving away' from contralateral space is pretty vague and unsupported. It is puzzling that the STN activates more strongly to contraversive turns, but when stimulated, it evokes ipsiversive turns; however, it seems a stretch to speculate that this is related to avoidance. In the last experiments of the paper, the axons from the STN to the GPe and to the midbrain are selectively stimulated. Do these evoke ipsiversive turns similarly?

      Optogenetic excitation is no longer part of the study. 

      (10) In the discussion, the authors claim that the STN is essential for modulating action timing in response to demands, but their data really only show this in one direction. The STN stimulation reliably increases the speed of response in all conditions (except maximum speed conditions such as escapes). It seems to be over-interpreting the data to say this is an inability to modulate the speed of the task, especially as clear learning and speed modulation do occur under STN lesion conditions, as shown in Figure 12B. The mice learn to avoid and increase their latency in AA2 vs AA1, though the overall avoids and latency are different from controls. The more parsimonious conclusion would be that STN stimulation biases movement speed (increasing it) and that this is true in many different conditions.

      Optogenetic excitation is no longer part of the study.

      (11)  In the discussion, the authors claim that the STN projections to the midbrain tegmentum directly affect the active avoidance behavior, while the STN projections to the SNr do not affect it. This seems counter to their results, which show STN projections to either area can alter active avoidance behavior. What is the laser power used in these terminal experiments? If it is high (3mW), the authors may be causing antidromic action potentials in the STN somas, resulting in glutamate release in many brain areas, even when terminals are only stimulated in one area. The authors could use low (0.25mW) laser power in the terminals to reduce the chance of antidromic activation and spatially restrict the optical stimulation.

      Optogenetic excitation is no longer part of the study. 

      (12) Was normality tested for data prior to statistical testing?

      Yes, although we now use mixed-effects models.

      (13) Why are there no error bars on Figure 5B, black circles and orange triangles?

      When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Reviewer #3 (Public review):

      (1) I really don't understand or accept this idea that delayed movement is necessarily indicative of cautious movements. Is the distribution of responses multi-modal in a way that might support this idea, or do the authors simply take a normal distribution and assert that the slower responses represent 'caution'? Even if responses are multi-modal and clearly distinguished by 'type', why should readers think this that delayed responses imply cautious responding instead of say: habituation or sensitization to cue/shock, variability in attention, motivation, or stress; or merely uncertainty which seems plausible given what I understand of the task design where the same mice are repeatedly tested in changing conditions. This relates to a major claim (i.e., in the work's title).

      In our study, “caution” is defined operationally as the tendency to delay initiation of an avoidance response in demanding situations (e.g., taking more time or care before crossing a busy street). The increase in avoidance latency with task difficulty is highly robust, as we have shown previously through detailed analyses of timing distributions and direct comparisons with appetitive behaviors (e.g., Zhou et al., 2022 JNeurosci). Moreover, we used the tracked movement time series to statistically classify responses into cautious modes, which is likely novel. This definition dissociates cautious responding from the broader constructs listed by the reviewer, such as attention, motivation, or stress, which would need to be explicitly defined to be rigorously considered in this context; these constructs likely covary with caution without being equivalent to it. 

      Cue-evoked orienting responses at CS onset are directly measured, and their habituation and sensitization have been characterized in our prior work (e.g., Zhou et al., 2023 JNeurosci). US-evoked escapes are also measured in the present study and directly compared with avoidance responses. Together, these analyses provide a rigorous and consistent framework for defining and quantifying caution within our behavioral procedures.

      Importantly, mice exhibit cautious responding as defined here across different tasks, making it more informative to classify avoidance responses by behavioral mode rather than by task alone. Accordingly, in the miniscope, single-neuron, and mixed-effects model analyses, we classified active avoids into distinct modes reflecting varying levels of caution. Although these modes covary with task contingencies, their explicit classification improves model predictability and interpretability with respect to cautious responding.
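
      A schematic illustration of classifying avoidance trials into modes from movement features (the features, their distributions, and the number of clusters here are invented for the example and are not the authors' pipeline):

      ```python
      # Schematic only: clustering avoidance trials into "modes" from per-trial movement features.
      import numpy as np
      from sklearn.cluster import KMeans
      from sklearn.preprocessing import StandardScaler

      rng = np.random.default_rng(0)
      # Hypothetical per-trial features: avoidance latency (s) and peak speed (cm/s).
      X = np.column_stack([rng.gamma(2.0, 1.5, size=200), rng.normal(25, 5, size=200)])

      Z = StandardScaler().fit_transform(X)
      modes = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Z)
      print(np.bincount(modes))  # trial counts per putative avoidance mode
      ```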

(2) Related to the last, I'm struggling to understand the rationale for dividing cells into 'types' based on their physiological responses in some experiments (e.g., Figure 7).

This section has now been expanded into 3 figures (Fig. 7-9) with new modeling approaches that should make the rationale more straightforward.

      By emphasizing the mixed-effects modeling results and integrating these analyses directly into the figures, the revised manuscript now more clearly delineates what is encoded at the population and single-neuron levels. Including movement and baseline covariates allowed us to dissociate motor-related modulation from other neural signals, substantially clarifying the distinction between movement encoding and other task-related variables, which we focus on in the paper. These analyses confirm the strong role of the STN in representing movement while revealing additional signals related to aversive stimulation and cautious responding that persist after accounting for motor effects. These signals arise from distinct neuronal populations that can be differentiated by their movement sensitivity and activation patterns across avoidance modes, reflecting varying levels of caution. At the same time, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.
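To make this modeling strategy concrete, the following is a minimal sketch of a trial-level mixed-effects model of the general kind described above, assuming a Python/statsmodels workflow; the file name, column names, and random-effects structure are illustrative assumptions, not the authors' actual pipeline.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical trial-level table: one row per neuron/trial with its ΔF/F response,
# a movement covariate (head speed), a pre-CS baseline, the avoidance mode,
# and identifiers for mouse and session (assumed column names).
df = pd.read_csv("stn_trials.csv")

# Fixed effects: movement and baseline covariates plus avoidance mode;
# random intercept per mouse, with session as a variance component nested in mouse.
model = smf.mixedlm(
    "dff ~ speed + baseline + C(mode)",
    data=df,
    groups="mouse",
    re_formula="1",
    vc_formula={"session": "0 + C(session)"},
)
result = model.fit()
print(result.summary())
```

In this framing, mode coefficients that remain significant after including the speed and baseline covariates would correspond to caution-related signals that are not explained by motor output.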

(3) The description and discussion of orienting head movements were not well supported, but were much discussed in the avoidance datasets. The initial speed peaks to cue seem to be the supporting data upon which these claims rest, but nothing here suggests head movement or orientation responses.

      As described in the methods (and noted above), we track the head and decompose the movement into rotational and translational components. With the new approach, several effects that initially reflected orienting-related activity at CS-onset (note that our movement tracking captures both head position and orientation as a directional vector) dissipated once movement and baseline covariates were included in the models, emphasizing the utility of the analytical improvements in the revision.

      (4) Similar to the last, the authors note in several places, including abstract, the importance of STN in response timing, i.e., particularly when there must be careful or precise timing, but I don't think their data or task design provides a strong basis for this claim.

The avoidance modes and the measured latencies directly support the relation to action timing. Moreover, the portion of the previous manuscript concerning optogenetic excitation, which appears to have been the main source of criticism, is no longer part of the present study.

      (5) I think that other reports show that STN calcium activity is recruited by inescapable foot shock as well. What do these authors see? Is shock, independent of movement, contributing to sharp signals during escapes?

      The question, “Is shock, independent of movement, contributing to sharp signals during escapes?” is now directly addressed in the revised analyses. By incorporating movement and baseline covariates into the mixed-effects models, we dissociate STN activity related to aversive stimulation from that associated with motor output. The results show that shock-evoked STN activation persists even after controlling for movement within defined neuronal populations, supporting a specific nociceptive contribution independent of motor dynamics—a dissociation that appears to be new in this field.

      (6) In particular, and related to the last point, the following work is very relevant and should be cited:  Note that the focus of this other paper is on a subset of VGLUT2+ Tac1 neurons in paraSTN, but using VGLUT2-Cre to target STN will target both STN and paraSTN.

      We appreciate the reviewer’s reference to the recent preprint highlighting the role of the para-subthalamic nucleus in avoidance learning. However, our study focused specifically on performance in well-trained mice rather than on learning processes. Behavioral learning is inherently more variable and can be disrupted by less specific manipulations, whereas our experiments targeted the stable execution of learned avoidance behaviors. Future work will extend these findings to the learning phase and examine potential contributions of subthalamic subdivisions, which our current Vglut2-based manipulations do not dissociate. We will consider this and related work more closely in those studies.

      (7) In multiple other instances, claims that were more tangential to the main claims were made without clearly supporting data or statistics. E.g., claim that STN activation is related to translational more than rotational movement; claim that GCaMP and movement responses to auditory cues were small; claims that 'some animals' responded differently without showing individual data.

      We have adjusted the text accordingly.

      (8) In several figures, the number of subjects used was not described. This is necessary. Also necessary is some assessment of the variability across subjects. The only measure of error shown in many figures relates to trial-to-trial or event variability, which is minimal because, in many cases, it appears that hundreds of trials may have been averaged per animal, but this doesn't provide a strong view of biological variability. When bar/line plots are used to display data, I recommend showing individual animals where feasible.

All experiments report the number of mice and sessions. Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (9) Can the authors consider the extent to which calcium imaging may be better suited to identify increases compared to decreases and how this may affect the results, particularly related to the GRIN data when similar numbers of cells show responses in both directions (e.g., Figure 3)?

This is an interesting issue related to a widely used technique, but it is beyond the scope of our study.

      (10) Raw example traces are not provided.

      We do not think raw traces are useful here. All figures contain average traces to reflect the activity of the estimated population.

      (11) The timeline of the spontaneous movement and avoidance sessions was not clear, nor was the number of events or sessions per animal nor how this was set. It is not clear if there was pre-training or habituation, if many or variable sessions were combined per animal, or what the time gaps between sessions were, or if or how any of these parameters might influence interpretation of the results.

We have enhanced the description of the sessions, including the numbers of animals and sessions; sessions were daily, and their number was always equal per animal within each group of experiments. As noted, session is included among the random effects in the model.

      (12) It is not clear if or how the spread of expression outside of the target STN was evaluated, and if or how many mice were excluded due to spread or fiber placements.

The STN is anatomically well-confined, with its borders and the overlying zona incerta (composed of GABAergic neurons) providing protection against off-target expression in most neighboring forebrain regions. All viral injections were histologically verified and did not extend into thalamic or hypothalamic areas. As described in the Methods, we employed an app we developed (Brain Atlas Analyzer, available on OriginLab) that aligns serial histological sections with the Allen Brain Atlas to precisely assess viral spread and confirm targeting accuracy. The experiments included in the revised manuscript now focus on optogenetic inhibition and irreversible lesion approaches—three complementary methods that consistently targeted the STN and yielded similar behavioral effects.

      Recommendations for the authors:

      Reviewing Editor Comments:

      The primary feedback agreed upon by all the reviewers was that the manuscript requires significant streamlining as it is currently overly long and convoluted.

      We thank the reviewers and editors for their thoughtful and constructive feedback. In response to the primary comment that “the manuscript requires significant streamlining as it is currently overly long and convoluted,” we have substantially revised and refocused the paper. Specifically, we streamlined the included data and enhanced the analyses to emphasize the central findings: the encoding of movement, cautious responding, and punishment in the STN during avoidance behavior. We also focused the causal component of the study by including only the loss-of-function experiments—both optogenetic inhibition and irreversible viral/electrolytic lesions—that establish the critical role of STN circuits in generating active avoidance. Together, these revisions enhance clarity, tighten the narrative focus, and align the manuscript more closely with the reviewers’ recommendations.

      Major revisions include the addition of mixed-effects modeling to dissociate the contributions of movement from other STN-encoded signals related to caution and punishment. This modeling approach allowed us to reveal that these components are statistically separable, demonstrating that movement, cautious responding, and aversive input are encoded by neuronal subsets. To streamline the manuscript and address reviewer concerns, we removed the optogenetic excitation experiments. As revised, the paper presents a more concise and cohesive narrative showing that STN neurons differentially encode movement, caution, and aversive stimuli, and that this circuitry is essential for generating active avoidance behavior.

      Many of the specific points raised by reviewers now fall outside the scope of the revised manuscript. This is primarily because the revised version omits data and analyses related to optogenetic excitation and associated control experiments. By removing these components, the paper now presents a streamlined and internally consistent dataset focused on how the STN encodes movement, cautious responding, and aversive outcomes during avoidance behavior, as well as on loss-of-function experiments demonstrating its necessity for generating active avoidance. Below, we address the points that remain relevant across reviews.

      Following extensive revisions, the current manuscript differs in several important ways from what the assessment describes:

The description that the study “uses fiber photometry, implantable lenses, and optogenetics” is more accurately represented as using both fiber photometry and single-neuron calcium imaging with miniscopes, combined with optogenetic and irreversible lesion approaches.

The phrase stating that “active but not passive avoidance depends in part on STN projections to substantia nigra” is better characterized as “STN projections to the midbrain,” since our data show that optogenetic inhibition of STN terminals in both the mesencephalic reticular tegmentum (MRT) and substantia nigra pars reticulata (SNr) produces equivalent effects, and thus these sites are combined in the study.

      Finally, the original concern that evidence for STN involvement in cautious responding or avoidance speed was incomplete no longer applies. The revised focus on encoding, through the inclusion of mixed-effects modeling, now dissociates movement-related, cautious, and aversive components of STN activity. By removing the optogenetic excitation data, we no longer claim that the STN controls caution but rather that it encodes cautious responding, alongside movement and punishment signals. Furthermore, loss-of-function experiments demonstrate that silencing STN output abolishes active avoidance entirely, supporting an essential role for the STN in generating goal-directed avoidance behavior—a behavioral domain that, unlike appetitive responding, is fundamentally defined by caution and the need to balance action timing under threat.

      Reviewer #2 (Recommendations for the authors):

      (1) Show individual data points on bar plots.

Wherever feasible, we display individual data points (e.g., Figures 1 and 2) to convey variability directly. However, in cases where figures depict hundreds of paired (repeated-measures) data points, showing all points without connecting them would not be appropriate, while linking them would make the figures visually cluttered and uninterpretable. All plots and traces include measures of variability (SEM), and the raw data will be shared on Dryad. When error bars are not visible, they are smaller than the trace thickness or bar line—for example, in Figure 5B, the black circles and orange triangles include error bars, but they are smaller than the symbol size.

      Also, to minimize visual clutter, only a subset of relevant comparisons is highlighted with asterisks, whereas all relevant statistical results, comparisons, and mouse/session numbers are fully reported in the Results section, with statistical analyses accounting for the clustering of data within subjects and sessions.

      (2) The active avoidance experiments are confusing when they are introduced in the results section. More explanation of what paradigms were used and what each CS means at the time these are introduced would add clarity. For example, AA1, AA2, etc, are explained only with references to other papers, but a brief description of each protocol and a schematic figure would really help.

The avoidance protocols (AA1–4) are now described briefly but clearly in the Results section (second paragraph of “STN neurons activate during goal-directed avoidance contingencies”) and in greater detail in the Methods section. These tasks were conducted sequentially, and mice underwent daily sessions, learning each task within 1–2 sessions and progressing across tasks with an equal number of sessions per task (7 each). All relevant procedural information has been included in these sections, and the resulting data were combined and clustered by mouse/session in the statistical models.

      (3) How do the Class 1, 2, 3 avoids relate to Class 1, 2, 3 neural types established in Figure 3? It seems like they are not related, and if that is the case, they should be named something different from each other to avoid confusion. (4) Similarly, having 3 different cell types (a,b,c) in the active avoidance seems unrelated to the original classification of cell types (1,2,3), and these are different for each class of avoid. This is very confusing, and it is unclear how any of these types relate to each other. Presumably, the same mouse has all three classes of avoids, so there are recordings from each cell during each type of avoid.

The terms class, mode, and type are now clearly distinguished throughout the manuscript. Modes refer to distinct patterns of avoidance behavior that differ in the level of cautious responding (Mode 3 is most cautious). Within each mode, types denote subgroups of neurons identified based on their ΔF/F activity profiles. In contrast, classes categorize neurons according to their relationship to movement, determined by cross-correlation analyses between ΔF/F and head speed (Classes 1–4; Fig. 7 is a new analysis) or head turns (Classes A–C, renamed from 1–3). This updated terminology clarifies the analytic structure, highlighting distinct neuronal populations within each analysis. For example, during avoidance behaviors, these classifications distinguish neurons encoding movement-, caution-, and outcome-related signals. Comparisons are conducted within each analytical set: within classes (A–C or 1–4 separately), within avoidance modes, or within mode-specific neuronal types.
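As a rough illustration of the kind of cross-correlation analysis mentioned above, the sketch below computes a lagged Pearson correlation between a neuron's ΔF/F trace and head speed and assigns a movement class from the peak value; the lag window, thresholds, and class names are assumptions for illustration, not the authors' parameters.

```python
import numpy as np

def lagged_corr(dff, speed, lag):
    """Pearson correlation between ΔF/F and head speed at a given lag (in samples)."""
    if lag > 0:
        a, b = dff[lag:], speed[:-lag]
    elif lag < 0:
        a, b = dff[:lag], speed[-lag:]
    else:
        a, b = dff, speed
    return np.corrcoef(a, b)[0, 1]

def movement_class(dff, speed, max_lag=50, thr=0.2):
    """Classify a neuron by the peak of its ΔF/F-speed cross-correlogram (illustrative thresholds)."""
    lags = range(-max_lag, max_lag + 1)
    cc = np.array([lagged_corr(dff, speed, lag) for lag in lags])
    peak = cc[np.argmax(np.abs(cc))]
    if peak >= thr:
        return "movement-activated"
    if peak <= -thr:
        return "movement-suppressed"
    return "movement-insensitive"
```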

      …So the authors could compare one cell during each avoid and determine whether it relates to movement or sound, or something else. It is interesting that types a,b, and c have the exact same proportions in each class of avoid, and makes it important to investigate if these are the exact same cells or not.

      That previous table with the a,b,c % in the three figure panels was a placeholder, which was not updated in the included figure. It has now been correctly updated. They do not have the same proportions as shown in Fig. 9, although they are similar.

      Also, these mice could be recorded during the open field, so the original neural classification (class 1, 2,3) could be applied to these same cells, and then the authors can see whether each cell type defined in the open field has a different response to the different avoid types. As it stands, the paper simply finds that during movement and during avoidance behaviors, different cells in the STN do different things.

      We included a new analysis in Fig. 7 that classifies neurons based on the cross-correlation with movement. The inclusion of the models now clearly assigns variance to movement versus the other factors, and this analysis leads to the classification based on avoid modes. 

      (5) The use of the same colors to mean two different things in Figure 9 is confusing. AA1 vs AA2 shouldn't be the same colors as light-naïve vs light signaling CS.

      Optogenetic excitation is no longer part of the study.

(6) The exact timeline of the optogenetics experiments should be presented as a schematic for understanding. It is not clear which conditions each mouse experienced in which order. This is critical to the interpretation of Figure 9 and the reduction of passive avoids during STN stimulation. Did these mice have the CS1+STN stimulation pairing or the STN+US pairing prior to this experiment? If they did, the stimulation of the STN could be strongly associated with either punishment or with the CS1 that predicts punishment. If that is the case, stimulating the STN during CS2 could be like presenting CS1+CS2 at the same time and could be confusing. The authors should make it clear whether the mice were naïve during this passive avoid experiment or whether they had experienced STN stimulation paired with anything prior to this experiment.

      Optogenetic excitation is no longer part of the study.

      (20) Similarly, the duration of the STN stimulation should be made clear on the plots that show behavior over time (e.g., Figure 9E).

      Optogenetic excitation is no longer part of the study.

      (21) There is just so much data and so many conditions for each experiment here. The paper is dense and difficult to read. It would really benefit readability if the authors put only the key experiments and key figure panels in the main text and moved much of the repetitive figure panels to supplemental figures. The addition of schematic drawings for behavioral experiment timing and for the different AA1, AA2, and AA3 conditions would also really improve clarity.

We believe that focusing the study has substantially improved its clarity and readability.

      Reviewer #3 (Recommendations for the authors):

(1) Minor error in results 'Cre-AAV in the STN of Vglut2-Cre'

Fixed.

      (2) In some Figure 2 panels, the peaks appear to be cut off, and blue traces are obscured by red.

      In Fig. 2, the peaks of movement (speed) traces are intentionally truncated to emphasize the rising phase of the turn, which would otherwise be obscured if the full y-axis range were displayed (peaks and other measures are statistically compared). This adjustment enhances clarity without omitting essential detail and is now noted in the legend.

    1. Author response:

      We thank the editors and reviewers for their generally positive and thoughtful feedback on this work. Below are provisional responses to some of the concerns raised:

      Reviewer 1:

At a total scan duration of 2 minutes, the ASL sequence utilized in this cohort is much shorter than that of a typical ASL sequence (closer to 5 minutes as mentioned by the authors). However, this implementation also included multiple (n=5) PLDs. As currently described, it is unclear how many repetitions were acquired at each PLD and whether these were acquired efficiently (i.e., with a Look-Locker readout) or whether individual repetitions within this acquisition were dedicated to a single PLD. If the latter, the number of repetitions per PLD (and consequently signal-to-noise-ratio, SNR) is likely to be very low. Have the authors performed any analyses to determine whether the signal in individual subjects generally lies above the noise threshold? This is particularly relevant for white matter, which is the focus of several findings discussed in the study.

      We agree that this was a short acquisition compared to most ASL protocols, necessitated by the strict time-keeping requirements for running such a large study. We apologise if this was not clear in the original manuscript, but due to this time constraint and the use of a segmented readout (which was not Look-Locker) there was only time available for a single average at each PLD. This does mean that the perfusion weighted images at each PLD are relatively noisy, although the image quality with this sequence was still reasonable, as demonstrated in Figure 1, with perfusion weighted images visibly above the noise floor. In addition, as has been demonstrated theoretically and experimentally in recent work (Woods et al., 2023, 2020), even though the SNR of each individual PLD image might be low in multi-PLD acquisitions, this is effectively recovered during the model fitting process, giving it comparable or greater accuracy than a protocol which collects many averages at a single (long) PLD. As also noted by the reviewers, this approach has the further benefit of allowing ATT estimation, which has proven to provide useful and complementary information to CBF. Finally, the fact that many of the findings in this study pass strict statistical thresholds for significance, despite the many multiple comparisons performed, and that the spatial patterns of these relationships are consistent with expectations, even in the white matter (e.g. Figure 6B), give us confidence that the perfusion estimation is robust. However, we will consider adding some additional metrics around SNR or fitting uncertainty in a revised manuscript, as well as clarifying details of the acquisition.
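To illustrate why multi-PLD data can recover perfusion parameters despite noisy individual images, the sketch below fits a simplified pCASL kinetic model (no dispersion, a single T1 shared by blood and tissue) to one noisy measurement per PLD. The PLDs, label duration, T1 value, and least-squares fit are assumptions for illustration only and do not reproduce the Bayesian model fitting used in the actual pipeline.

```python
import numpy as np
from scipy.optimize import curve_fit

T1B = 1.65   # s, assumed arterial blood T1 at 3T
TAU = 1.8    # s, assumed label duration
PLDS = np.array([0.25, 0.75, 1.25, 1.75, 2.25])  # s, illustrative post-labeling delays

def pcasl_model(pld, scale, att, tau=TAU, t1b=T1B):
    """Simplified pCASL kinetic model; 'scale' absorbs 2*M0b*alpha*T1b*CBF.
    t is measured from the start of labeling; the signal is zero before the bolus arrives."""
    t = tau + pld
    dm = np.zeros_like(t)
    arriving = (t >= att) & (t < att + tau)
    arrived = t >= att + tau
    dm[arriving] = scale * (np.exp(-att / t1b) - np.exp(-t[arriving] / t1b))
    dm[arrived] = scale * (np.exp(-(t[arrived] - tau) / t1b) - np.exp(-t[arrived] / t1b))
    return dm

# Simulate one noisy perfusion-weighted value per PLD, as in a single-average protocol
rng = np.random.default_rng(0)
signal = pcasl_model(PLDS, scale=1.0, att=1.2) + rng.normal(0, 0.05, PLDS.size)

# Joint fit of the CBF-proportional scale and the arterial transit time (ATT)
popt, _ = curve_fit(pcasl_model, PLDS, signal, p0=[1.0, 1.0], bounds=([0, 0.2], [10, 3.0]))
print(f"fitted scale (proportional to CBF): {popt[0]:.2f}, fitted ATT: {popt[1]:.2f} s")
```

Because all five PLDs constrain the same two parameters, noise in any single perfusion-weighted image is partly averaged out by the joint fit, which is the effect referred to above.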

      Hematocrit is one of the variables regressed out in order to reduce the effect of potential confounding factors on the image-derived phenotypes. The effect of this, however, may be more complex than accounting for other factors (such as age and sex). The authors acknowledge that hematocrit influences ASL signal through its effect on longitudinal blood relaxation rates. However, it is unclear how the authors handled the fact that the longitudinal relaxation of blood (T1Blood) is explicitly needed in the kinetic model for deriving CBF from the ASL data. In addition, while it may reduce false positives related to the relationships between dietary factors and hematocrit, it could also mask the effects of anemia present in the cohort. The concern, therefore, is two-fold: (1) Were individual hematocrit values used to compute T1Blood values? (2) What effect would the deconfounding process have on this?

We agree this is an important point to clarify. In this work we decided not to use the haematocrit to directly estimate the T1 of blood for each participant a) because this would result in slight differences in the model fitting for each subject, which could introduce bias (e.g. the kinetic model used assumes instantaneous exchange between blood water and tissue, so changing the T1 of blood for each subject could make us more sensitive to inaccuracies in this assumption); and b) because the haematocrit measurements were typically taken quite some time (often years) prior to the imaging session, leading to an imperfect correction. We therefore took the pragmatic approach of simply regressing each subject’s average haematocrit reading out of the IDP and voxelwise data to prevent it contributing to apparent correlations caused by indirect effects on blood T1. However, we agree with the reviewer that this would certainly mask the effects of anaemia in this cohort, so for researchers interested in this condition a different approach should be taken. We will update the revised manuscript to clarify these points.
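A minimal sketch of the kind of deconfounding described above, assuming a simple linear regress-out of haematocrit from an image-derived phenotype; the variable names and simulated values are illustrative only, not the UK Biobank processing code.

```python
import numpy as np

def regress_out(idp, confounds):
    """Return the IDP residuals after removing variance linearly explained by confounds."""
    X = np.column_stack([np.ones(len(idp)), confounds])   # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, idp, rcond=None)        # ordinary least squares
    return idp - X @ beta

# Illustrative usage: one haematocrit value and one CBF IDP per subject
rng = np.random.default_rng(1)
hct = rng.normal(0.42, 0.03, 500)
cbf_idp = 50 - 40 * (hct - 0.42) + rng.normal(0, 5, 500)  # toy IDP partly driven by haematocrit
cbf_deconf = regress_out(cbf_idp, hct[:, None])
```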

      The authors leverage an observed inverse association between white matter hyperintensity volume and CBF as evidence that white matter perfusion can be sensitively measured using the imaging protocol utilized in this cohort. The relationship between white matter hyperintensities and perfusion, however, is not yet fully understood, and there is disagreement regarding whether this structural imaging marker necessarily represents impaired perfusion. Therefore, it may not be appropriate to use this finding as support for validation of the methodology.

We appreciate the reviewer’s point that there is still debate about the relationship between white matter hyperintensities and perfusion. We therefore agree that this observed relationship does not validate the methodology in the sense of being an expected finding, but it does demonstrate that the data quality is sufficient to show significant correlations between white matter hyperintensity volume and perfusion, even in white matter regions, which would not be the case if the signal there were dominated by noise. Similarly, the clear spatial pattern of perfusion changes in the white matter that correlate with DTI measures in the same regions also suggests there is sensitivity to white matter perfusion. However, we will update the wording in the revised manuscript to clarify this point.

      Reviewer 2:

      This study primarily serves to illustrate the efficacy and potential of ASL MRI as an imaging parameter in the UK Biobank study, but some of the preliminary observations will be hypothesis-generating for future analyses in larger sample sizes. However, a weakness of the manuscript is that some of the reported observations are difficult to follow. In particular, the associations between ASL and resting fMRI illustrated in Figure 7 and described in the accompanying Results text are difficult to understand. It could also be clearer whether the spatial maps showing ASL correlates of other image-derived phenotypes in Figure 6B are global correlations or confined to specific regions of interest. Finally, while addressing partial volume effects in gray matter regions by covarying for cortical thickness is a reasonable approach, the Methods section seems to imply that a global mean cortical thickness is used, which could be problematic given that cortical thickness changes may be localized.

      We apologise if any of the presented information was unclear and will try to improve this in our revised manuscript. To clarify, the spatial maps associated with other (non-ASL) IDPs were generated by calculating the correlation between the ASL CBF or ATT in every voxel in standard space with the non-ASL IDP of interest, not the values of the other imaging modality in the same voxel. No region-based masking was used for this comparison. This allowed us to examine whether the correlation with this non-ASL IDP was only within the same brain region or if the correlations extended to other regions too.
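For clarity, the voxelwise correlation described above can be summarised in a short sketch: each subject contributes one standard-space CBF (or ATT) map and one value of the non-ASL IDP, and a Pearson correlation is computed independently at every voxel. The array names and shapes are assumptions for illustration.

```python
import numpy as np

def voxelwise_correlation(maps, idp):
    """Pearson r between a per-subject IDP and every voxel of subject-aligned maps.

    maps : (n_subjects, n_voxels) deconfounded CBF or ATT values in standard space
    idp  : (n_subjects,) deconfounded non-ASL IDP of interest
    """
    maps_z = (maps - maps.mean(axis=0)) / maps.std(axis=0)
    idp_z = (idp - idp.mean()) / idp.std()
    return maps_z.T @ idp_z / len(idp)   # one correlation value per voxel

# e.g. r_map = voxelwise_correlation(cbf_maps, wmh_volume)  # hypothetical arrays
```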

      We also agree that the associations between ASL and resting fMRI are not easy to interpret. We therefore tried to be clear in the manuscript that these were preliminary findings that may be of interest to others, but clearly further study is required to explore this complex relationship further. However, we will try to clarify how the results are presented in the revised manuscript.

      In relation to partial volume effects, we did indeed use only a global measure of cortical thickness in the deconfounding and we acknowledged that this could be improved in the discussion: [Partial volume effects were] “mitigated here by the inclusion of cortical thickness in the deconfounding process, although a region-specific correction approach that is aware of the through-slice blurring (Boscolo Galazzo et al., 2014) is desirable in future iterations of the ASL analysis pipeline.” As suggested here, although this is a coarse correction, we did not feel that a more comprehensive partial volume correction approach could be used without properly accounting for the through-slice blurring effects from the 3D-GRASE acquisition (that will vary across different brain regions), which is not currently available, although this is an area we are actively working on for future versions of the image analysis pipeline. We again will try to clarify this point further in the revised manuscript.

      References

      Woods JG, Achten E, Asllani I, Bolar DS, Dai W, Detre J, Fan AP, Fernández-Seara M, Golay X, Günther M, Guo J, Hernandez-Garcia L, Ho M-L, Juttukonda MR, Lu H, MacIntosh BJ, Madhuranthakam AJ, Mutsaerts HJ, Okell TW, Parkes LM, Pinter N, Pinto J, Qin Q, Smits M, Suzuki Y, Thomas DL, Van Osch MJP, Wang DJ, Warnert EAH, Zaharchuk G, Zelaya F, Zhao M, Chappell MA. 2023. Recommendations for Quantitative Cerebral Perfusion MRI using Multi-Timepoint Arterial Spin Labeling: Acquisition, Quantification, and Clinical Applications (preprint). Open Science Framework. doi:10.31219/osf.io/4tskr

      Woods JG, Chappell MA, Okell TW. 2020. Designing and comparing optimized pseudo-continuous Arterial Spin Labeling protocols for measurement of cerebral blood flow. NeuroImage 223:117246. doi:10.1016/j.neuroimage.2020.117246

To make clear what the plot is about, it is often wise to add a title. You can do this with ax[n].set_title(). You can also specify a position for your title, for example by passing loc="left" as an extra argument to the function. As an example, we have done this for the second plot.

I had noticed that many students struggle to understand arguments within functions. It therefore seems sensible to point this out here as an example. Hopefully this helps them learn to think more from the fundamentals of programming.

      "Axes.set_title(label, fontdict=None, loc=None, pad=None, , y=None, *kwargs) geeft de mogelijkheid om met "loc" de positie van de titel te bepalen ..."

    1. Conclusion

The conclusion is very good. Just two comments: 1.- It sits a bit uneasily with me to define as class neutralization the fact that these objective conditions fail to predict pension mjp. Following the idea, I would refer instead to a generalized pattern of beliefs across classes. 2.- In the paragraph on future work, I would mention integrating new measurements of meritocracy (Castillo et al., 2023) into the studies, since they could reveal other configurations of beliefs between mjp and merit that the current scale does not allow us to observe.

1. political body,

Add that, within this classification, we also consider the contributions of Butler and other contemporary authors. Add a paragraph explaining that the thesis seeks to understand the connected body across these different dimensions.

2. The social body, s

Add that, within this classification, I am interested in approaching an understanding of aspects of the intentional uses of the body, such as those offered by Mauss and later by Muñiz.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary: 

The authors analyzed the expression of ATAD2 protein in post-meiotic stages and characterized the localization of various testis-specific proteins in the testis of the Atad2 knockout (KO). Through cytological analysis as well as ATAC sequencing, the study showed increased levels of the HIRA histone chaperone, accumulation of histone H3.3 in post-meiotic nuclei, defective chromatin accessibility, and delayed deposition of protamines. Sperm from Atad2 KO mice show reduced success in in vitro fertilization. The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin.

      We would like to take this opportunity to highlight that the present study builds on our previously published work, which examined the function of ATAD2 in both yeast S. pombe and mouse embryonic stem (ES) cells (Wang et al., 2021). In yeast, using genetic analysis we showed that inactivation of HIRA rescues defective cell growth caused by the absence of ATAD2. This rescue could also be achieved by reducing histone dosage, indicating that the toxicity depends on histone over-dosage, and that HIRA toxicity, in the absence of ATAD2, is linked to this imbalance.

      Furthermore, HIRA ChIP-seq performed in mouse ES cells revealed increased nucleosome-bound HIRA, particularly around transcription start sites (TSS) of active genes, along with the appearance of HIRA-bound nucleosomes within normally nucleosome-free regions (NFRs). These findings pointed to ATAD2 as a major factor responsible for unloading HIRA from nucleosomes. This unloading function may also apply to other histone chaperones, such as FACT (see Wang et al., 2021, Fig. 4C).

In the present study, our investigations converge on the same ATAD2 function in the context of a physiologically integrated mammalian system—spermatogenesis. Indeed, in the absence of ATAD2, we observed H3.3 accumulation and enhanced H3.3-mediated gene expression. Consistent with this functional model of ATAD2— unloading chaperones from histone- and non-histone-bound chromatin—we also observed defects in histone-to-protamine replacement.

      Together, the results presented here and in Wang et al. (2021) reveal an underappreciated regulatory layer of histone chaperone activity. Previously, histone chaperones were primarily understood as factors that load histones. Our findings demonstrate that we must also consider a previously unrecognized regulatory mechanism that controls assembled histone-bound chaperones. This key point was clearly captured and emphasized by Reviewer #2 (see below).

      Strengths:

      The paper describes the role of ATAD2 AAA+ ATPase in the proper localization of sperm-specific chromatin proteins such as protamine, suggesting the importance of the DNA replication-independent histone exchanges with the HIRA-histone H3.3 axis. 

      Weaknesses: 

      (1) Some results lack quantification. 

      We will consider all the data and add appropriate quantifications where necessary.

      (2) The work was performed well, and most of the results are convincing. However, this manuscript does not suggest a molecular mechanism for how ATAD2 promotes the formation of testis-specific chromatin. 

      Please see our comments above.

      Reviewer #2 (Public review): 

      Summary:

This manuscript by Liakopoulou et al. presents a comprehensive investigation into the role of ATAD2 in regulating chromatin dynamics during spermatogenesis. The authors elegantly demonstrate that ATAD2, via its control of histone chaperone HIRA turnover, ensures proper H3.3 localization, chromatin accessibility, and histone-to-protamine transition in post-meiotic male germ cells. Using a new well-characterized Atad2 KO mouse model, they show that ATAD2 deficiency disrupts HIRA dynamics, leading to aberrant H3.3 deposition, impaired transcriptional regulation, delayed protamine assembly, and defective sperm genome compaction. The study bridges ATAD2's conserved functions in embryonic stem cells and cancer to spermatogenesis, revealing a novel layer of epigenetic regulation critical for male fertility.

      Strengths:

The MS provides the first demonstration of ATAD2's essential role in spermatogenesis, linking its expression in haploid spermatids to histone chaperone regulation and connecting ATAD2-dependent chromatin dynamics to gene accessibility (ATAC-seq), H3.3-mediated transcription, and histone eviction. Interestingly and surprisingly, sperm chromatin defects in Atad2 KO mice impair only in vitro fertilization but not natural fertility, suggesting unknown compensatory mechanisms in vivo.

      Weaknesses:

The MS is robust and there are no major weaknesses.

      Reviewer #3 (Public review): 

      Summary: 

      The authors generated knockout mice for Atad2, a conserved bromodomain-containing factor expressed during spermatogenesis. In Atad2 KO mice, HIRA, a chaperone for histone variant H3.3, was upregulated in round spermatids, accompanied by an apparent increase in H3.3 levels. Furthermore, the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis were partially disrupted in the absence of ATAD2, possibly due to delayed histone removal. Despite these abnormalities, Atad2 KO male mice were able to produce offspring normally. 

      Strengths:

      The manuscript addresses the biological role of ATAD2 in spermatogenesis using a knockout mouse model, providing a valuable in vivo framework to study chromatin regulation during male germ cell development. The observed redistribution of H3.3 in round spermatids is clearly presented and suggests a previously unappreciated role of ATAD2 in histone variant dynamics. The authors also document defects in the sequential incorporation and removal of TH2B and PRM1 during spermiogenesis, providing phenotypic insight into chromatin transitions in late spermatogenic stages. Overall, the study presents a solid foundation for further mechanistic investigation into ATAD2 function. 

      Weaknesses:

      While the manuscript reports the gross phenotype of Atad2 KO mice, the findings remain largely superficial and do not convincingly demonstrate how ATAD2 deficiency affects chromatin dynamics. Moreover, the phenotype appears too mild to elucidate the functional significance of ATAD2 during spermatogenesis. 

      We respectfully disagree with the statement that our findings are largely superficial. Based on our investigations of this factor over the years, it has become evident that ATAD2 functions as an auxiliary factor that facilitates mechanisms controlling chromatin dynamics (see, for example, Morozumi et al., 2015). These mechanisms can still occur in the absence of ATAD2, but with reduced efficiency, which explains the mild phenotype we observed.

      This function, while not essential, is nonetheless an integral part of the cell’s molecular biology and should be studied and brought to the attention of the broader biological community, just as we study essential factors. Unfortunately, the field has tended to focus primarily on core functional actors, often overlooking auxiliary factors. As a result, our decade-long investigations into the subtle yet important roles of ATAD2 have repeatedly been met with skepticism regarding its functional significance, which has in turn influenced editorial decisions.

      We chose eLife as the venue for this work specifically to avoid such editorial barriers and to emphasize that facilitators of essential functions do exist. They deserve to be investigated, and the underlying molecular regulatory mechanisms must be understood.

      (1) Figures 4-5: The analyses of differential gene expression and chromatin organization should be more comprehensive. First, Venn diagrams comparing the sets of significantly differentially expressed genes between this study and previous work should be shown for each developmental stage. Second, given the established role of H3.3 in MSCI, the effect of Atad2 knockout on sex chromosome gene expression should be analyzed. Third, integrated analysis of RNA-seq and ATAC-seq data is needed to evaluate how ATAD2 loss affects gene expression. Finally, H3.3 ChIP-seq should be performed to directly assess changes in H3.3 distribution following Atad2 knockout.  

      (1) In the revised version, we will include Venn diagrams to illustrate the overlap in significantly differentially expressed genes between this study and previous work. However, we believe that the GSEAs presented here provide stronger evidence, as they indicate the statistical significance of this overlap (p-values). In our case, we observed p-value < 0.01 (**) and p < 0.001 (***).

      (2) Sex chromosome gene expression was analyzed and is presented in Fig. 5C.

      (3) The effect of ATAD2 loss on gene expression is shown in Fig. 4A, B, and C as histograms, with statistical significance indicated in the middle panels.

      (4) Although mapping H3.3 incorporation across the genome in wild-type and Atad2 KO cells would have been informative, the available anti-H3.3 antibody did not work for ChIP-seq, at least in our hands. The authors of Fontaine et al., 2022, who studied H3.3 during spermatogenesis in mice, must have encountered the same problem, since they tagged the endogenous H3.3 gene to perform their ChIP experiments.

      (2) Figure 3: The altered distribution of H3.3 is compelling. This raises the possibility that histone marks associated with H3.3 may also be affected, although this has not been investigated. It would therefore be important to examine the distribution of histone modifications typically associated with H3.3. If any alterations are observed, ChIP-seq analyses should be performed to explore them further.

      Based on our understanding of ATAD2’s function—specifically its role in releasing chromatin-bound HIRA—in the absence of ATAD2 the residence time of both HIRA and H3.3 on chromatin increases. This results in the detection of H3.3 not only on sex chromosomes but across the genome. Our data provide clear evidence of this phenomenon. The reviewer is correct in suggesting that the accumulated H3.3 would carry H3.3-associated histone PTMs; however, we are unsure what additional insights could be gained by further demonstrating this point.

      (3) Figure 7: While the authors suggest that pre-PRM2 processing is impaired in Atad2 KO, no direct evidence is provided. It is essential to conduct acid-urea polyacrylamide gel electrophoresis (AU-PAGE) followed by western blotting, or a comparable experiment, to substantiate this claim. 

      Figure 7 does not suggest that pre-PRM2 processing is affected in Atad2 KO; rather, this figure—particularly Fig. 7B—specifically demonstrates that pre-PRM2 processing is impaired, as shown using an antibody that recognizes the processed portion of pre-PRM2. ELISA was used to provide a more quantitative assessment; however, in the revised manuscript we will also include a western blot image.

      (4) HIRA and ATAD2: Does the upregulation of HIRA fully account for the phenotypes observed in Atad2 KO? If so, would overexpression of HIRA alone be sufficient to phenocopy the Atad2 KO phenotype? Alternatively, would partial reduction of HIRA (e.g., through heterozygous deletion) in the Atad2 KO background be sufficient to rescue the phenotype? 

      These are interesting experiments that require the creation of appropriate mouse models, which are not currently available.

      (5) The mechanism by which ATAD2 regulates HIRA turnover on chromatin and the deposition of H3.3 remains unclear from the manuscript and warrants further investigation. 

      The Reviewer is absolutely correct. In addition to the points addressed in response to Reviewer #1’s general comments (see above), it would indeed have been very interesting to test the segregase activity of ATAD2 (likely driven by its AAA ATPase activity) through in vitro experiments using the Xenopus egg extract system described by Tagami et al., 2004. This system can be applied both in the presence and absence (via immunodepletion) of ATAD2 and would also allow the use of ATAD2 mutants, particularly those with inactive AAA ATPase or bromodomains. However, such experiments go well beyond the scope of this study, which focuses on the role of ATAD2 in chromatin dynamics during spermatogenesis.

      References:

      (1) Wang T, Perazza D, Boussouar F, Cattaneo M, Bougdour A, Chuffart F, Barral S, Vargas A, Liakopoulou A, Puthier D, Bargier L, Morozumi Y, Jamshidikia M, Garcia-Saez I, Petosa C, Rousseaux S, Verdel A, Khochbin S. ATAD2 controls chromatin-bound HIRA turnover. Life Sci Alliance. 2021 Sep 27;4(12):e202101151. doi: 10.26508/lsa.202101151. PMID: 34580178; PMCID: PMC8500222.

      (2) Morozumi Y, Boussouar F, Tan M, Chaikuad A, Jamshidikia M, Colak G, He H, Nie L, Petosa C, de Dieuleveult M, Curtet S, Vitte AL, Rabatel C, Debernardi A, Cosset FL, Verhoeyen E, Emadali A, Schweifer N, Gianni D, Gut M, Guardiola P, Rousseaux S, Gérard M, Knapp S, Zhao Y, Khochbin S. Atad2 is a generalist facilitator of chromatin dynamics in embryonic stem cells. J Mol Cell Biol. 2016 Aug;8(4):349-62. doi: 10.1093/jmcb/mjv060. Epub 2015 Oct 12. PMID: 26459632; PMCID: PMC4991664.

      (3) Fontaine E, Papin C, Martinez G, Le Gras S, Nahed RA, Héry P, Buchou T, Ouararhni K, Favier B, Gautier T, Sabir JSM, Gerard M, Bednar J, Arnoult C, Dimitrov S, Hamiche A. Dual role of histone variant H3.3B in spermatogenesis: positive regulation of piRNA transcription and implication in X-chromosome inactivation. Nucleic Acids Res. 2022 Jul 22;50(13):7350-7366. doi: 10.1093/nar/gkac541. PMID: 35766398; PMCID: PMC9303386.

      (4) Tagami H, Ray-Gallet D, Almouzni G, Nakatani Y. Histone H3.1 and H3.3 complexes mediate nucleosome assembly pathways dependent or independent of DNA synthesis. Cell. 2004 Jan 9;116(1):51-61. doi: 10.1016/s0092-8674(03)01064-x. PMID: 14718166.

      Recommendations for the authors:

      Reviewing Editor Comments:

      I note that the reviewers had mixed opinions about the strength of the evidence in the manuscript. A revision that addresses these points would be welcome.

      Reviewer #1 (Recommendations for the authors):  

      Major points: 

      (1) No line numbers: It is hard to point out the issues.

      The revised version harbors line numbers.

      (2) Given the results shown in Figure 3 and Figure 4, it is nice to show the chromosomal localization of histone H3.3 in spermatocytes or post-meiotic cells by Chromatin-immunoprecipitation sequencing (ChIP-seq).

      Although mapping H3.3 incorporation across the genome in wild-type and Atad2 KO cells would have been informative, the available anti-H3.3 antibody did not work for ChIP-seq in our hands. In fact, this antibody is not well regarded for ChIP-seq. For example, Fontaine et al. (2022), who investigated H3.3 during spermatogenesis in mice, circumvented this issue by tagging the endogenous H3.3 genes for their ChIP experiments.

(3) Figure 7B and 8: Why did the authors use ELISA for protein quantification? At least, western blotting should be shown.

      ELISA is a more quantitative method than traditional immunoblotting. Nevertheless, as requested by the reviewer, we have now included a corresponding western blot in Fig. S3.

      (4) For readers, please add a schematic pathway of histone-protamine replacement in sperm formation in Fig.1 and it would be nice to have a model figure, which contains the authors' idea in the last figure.

      As requested by this reviewer, we have now included a schematic model in Figure 9 to summarize the main conclusions of our work.

      Minor points: 

(1) Page 2, the second paragraph, "pre-PRM2": Please explain more about pre-PRM2 and/or PRM2 as well as PRM1 (Figure 6).

      More detailed descriptions of PRM2 processing are now given in this paragraph. 

      (2) Page 3, bottom paragraph, line 1: "KO" should be "knockout (KO)".

      Done.

(3) Page 4, second paragraph bottom: Please explain more about the protein structure of germ-line-specific ATAD2S: how is it different from ATAD2L? Does germ-line specific mean that it is also expressed in the ovary?

As Atad2 is predominantly expressed in embryonic stem cells and in spermatogenic cells, we have replaced "germ-line specific" throughout the text with more appropriate terms.

(4) Figure 1C, western blotting: In wild-type testis extracts, both ATAD2L and -S are present. Does this mean that ATAD2L is expressed in germ-line as well as supporting cells? Please clarify this and, if possible, show western blotting of spermatids as well as spermatocytes.

      Figure 1D shows sections of seminiferous tubules from Atad2 KO mice, in which lacZ expression is driven by the endogenous Atad2 promoter. The results indicate that Atad2 is expressed mainly in post-meiotic cells. Most labeled cells are located near the lumen, whereas the supporting Sertoli cells remain unlabeled. Sertoli cells, which are anchored to the basal lamina, span the entire thickness of the germinal epithelium from the basal lamina to the lumen. Their nuclei, however, are usually positioned closer to the basal membrane. Thus, the observed lacZ expression pattern argues against substantial Atad2 expression in Sertoli cells. 

      (5) Figure 1C: Please explain a bit more about the reduction of ATAD2 proteins in heterozygous mice.

      Done

      (6) Figure 1C: Genotypes of the mice should be shown in the legend.

      Done 

      (7) Figure 1D: Please add a more magnified image of the sections to see the staining pattern in the seminiferous tubules.

The magnification does not bring more information, since we lose the structure of cells within tubules due to the nature of the treatment of the sections for X-gal staining. Please see our comments on Figure 1C in response to Reviewer #2.

      (8) Page 5, first paragraph, line 2, histone dosage: What do the authors meant by the histone dosage? Please explain more or use more appropriate word.

      "Histone dosage" refers to the amount or relative abundance of histone proteins in a cell.

(9) Figure 2A: Given the result in Figure 1C, it is interesting to check the amount of HIRA in Atad2 heterozygous mice.

      In Atad2 heterozygous mice, we would expect an increase in HIRA, but only to about half the level seen in the Atad2 homozygous knockout shown in Figure 2A, which is relatively modest. Therefore, we doubt that detecting such a small change—approximately half of that in Figure 2A—would yield clear or definitive results. 

      (10) Figure 2A, legend (n=5): What does this "n" mean? The extract of testes from "5" male mice like Figure 2B. Or 5 independent experiments. If the latter is true, it is important to share the other results in the Supplements.

      “n” refers to five WT and five Atad2 KO males. The legend has been clarified as suggested by the reviewer.

      (11) Figure 2A, legend, line 2, Atad2: This should be italicized.

      Done

      (12) Figure 2B: Please show the quantification of amounts of HIRA protein like Fig. 2A.

      As indicated in the legend, what is shown is a pool of testes from 3 individuals per genotype.

      (13) Figure 2B shows an increased level of HIRA in Atad2 KO testis. This suggests the role of ATAD2 in the protein degradation of HIRA. This possibility should be mentioned or tested since ATAD2 is an AAA+ ATPase. 

      The extensive literature on ATAD2 provides no indication that it is involved in protein degradation. In our early work on ATAD2 in the 2000s, we hypothesized that, as a member of the AAA ATPase family, ATAD2 might associate with the 19S proteasome subunit (through multimerization with the other AAA ATPase member of this regulatory subunit). However, both our published pilot studies (Caron et al., PMID: 20581866) and subsequent unpublished work ruled out this possibility. Instead, since the amount of nucleosome-bound HIRA increases in the absence of ATAD2, we propose that chromatin-bound HIRA is more stable than soluble HIRA once it has been released from chromatin by ATAD2.

      (14) Page 6, second paragraph, line 5, ko: KO should be capitalized.

      Done

(15) Page 6, second paragraph, line 2 from the bottom, chromatin dynamics: Throughout the text, the authors used "chromatin dynamics". However, all that the authors analyzed in the current study is the localization of chromatin proteins. So, it would be much easier to explain the results by using "chromatin status," etc. In this context, "accessibility" is better.

We have replaced the term “chromatin dynamics” with more precise terms appropriate to the context throughout the text.

      (16) Figure 3: Please provide the quantification of signals of histone H3.3 in a nucleus or nuclear cytoplasm.

      This request is not clear to us since we do not observe any H3.3 signal in the cytoplasm.

      (17) Figure 3: As the control of specificity in post-meiotic cells, please show the image and quantification of the H3.3 signals in spermatocyte, for example.

      This request is not clear to us. What specificity is meant? 

      (18) Figure 3, bottom panels: Please show what the white lines indicate? 

The white lines indicate the limits of the cell nucleus, as estimated by Hoechst staining. This is now indicated in the legend of the figure.

      (19) Figure 4A: Please explain more about what kind of data is here. Is this wild-type and/or Atad2 KO? The label of the Y-axis should be "mean expression level". What is the standard deviation (SD) here on the X-axis. Moreover, there is only one red open circle, but the number of this class is 5611. All 5611 genes in this group show NO expression. Please explain more.

      The plot displays the mean expression levels (y-axis, labeled as "mean expression level") versus the corresponding standard deviations (x-axis), both calculated from three independent biological replicates of isolated round spermatids (Atad2 wild-type and Atad2 KO). The standard deviation reflects the variability of gene expression across biological replicates. Genes were grouped into four categories (grp1: blue, grp2: cyan, grp3: green, grp4: orange) according to the quartile of their mean expression. For grp4, all genes have no detectable expression, resulting in a mean expression of zero and a standard deviation of zero; consequently, the 5611 genes in this group are represented by a single overlapping point (red open circle) at the origin. 
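A rough sketch of how such a mean-versus-SD plot with quartile groups could be produced is given below; the file name, colour mapping, and the handling of the large tie at zero expression are assumptions for illustration and do not reproduce the authors' exact grouping.

```python
import pandas as pd
import matplotlib.pyplot as plt

# DESeq2-normalized counts: rows = genes, columns = 3 biological replicates (assumed layout)
counts = pd.read_csv("round_spermatid_counts.csv", index_col=0)

mean_expr = counts.mean(axis=1)
sd_expr = counts.std(axis=1)

# Quartiles of mean expression; ranking breaks the large tie at zero arbitrarily,
# so this only approximates the original grouping (grp4 = lowest / no expression)
groups = pd.qcut(mean_expr.rank(method="first"), 4, labels=["grp4", "grp3", "grp2", "grp1"])

colors = {"grp1": "blue", "grp2": "cyan", "grp3": "green", "grp4": "orange"}
df = pd.DataFrame({"mean": mean_expr, "sd": sd_expr, "grp": groups})
for name, sub in df.groupby("grp", observed=True):
    plt.scatter(sub["sd"], sub["mean"], s=4, color=colors[str(name)], label=str(name))
plt.xlabel("standard deviation (across replicates)")
plt.ylabel("mean expression level")
plt.legend()
plt.show()
```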

      (20) Figure 4C: If possible, it would be better to have a statistical comparison between wild-type and the KO.  

      The mean profiles are displayed together with their variability (± 2 s.e.m.) across the four replicates for both ATAD2 WT (blue) and ATAD2 KO (red). For groups 1, 2, and 3, the envelopes of the curves remain clearly separated around the peak, indicating a consistent difference in signal between the two conditions. In contrast, group 4 does not present a strong signal and, accordingly, no marked difference is observed between WT and KO in this group.

(21) Figure 5, GSEA panels: Please explain more about what the GSEA is in the legend.

The legend has been updated as follows:

(A) Expression profiles of post-meiotic H3.3-activated genes. The heatmap (left panel) displays the normalized expression levels of genes identified by Fontaine and colleagues as upregulated in the absence of histone H3.3 (Fontaine et al. 2022) for Atad2 WT (WT) and Atad2 KO (KO) samples at days 20, 22, 24, and 26 PP (D20 to D26). The colour scale represents the z-score of log-transformed DESeq2-normalized counts. The middle panel box plots display pooled, normalized expression levels, aggregated across replicates and genes, for each condition (WT and KO) and each time point (D20 to D26). Statistical significance between WT and KO conditions was determined using a two-sided t-test, with p-values indicated as follows: * for p-value<0.05, ** for p-value<0.01 and *** for p-value<0.001. The right panel shows the results of gene set enrichment analysis (GSEA), which assesses whether predefined groups of genes show statistically significant differences between conditions. Here, the post-meiotic H3.3-activated genes set, identified by Fontaine et al. (2022), is significantly enriched in Atad2 KO compared with WT samples at day 26 (p < 0.05, FDR < 0.25). Coloured vertical bars indicate the “leading edge” genes (i.e., those contributing most to the enrichment signal), located before the point of maximum enrichment score. (B) As shown in (A) but for the "post-meiotic H3.3-repressed genes" gene set. (C) As shown in (A) but for the "sex chromosome-linked genes" gene set.

      (22) Figure 6. In the KO, the number of green cells is more than red and yellow cells, suggesting the delayed maturation of green (TH2B-positive) cells. It is essential to count the number of each cell and show the quantification.

      The green cells correspond to those expressing TH2B but lacking transition proteins (TP) and protamine 1 (Prm1), indicating that they are at earlier stages than elongating–condensing spermatids. Counting these green cells simply reflects the ratio of elongating/condensing spermatids to earlier-stage cells, which varies depending on the field examined. The key point in this experiment is that in wild-type mice, only red cells (elongating/condensing spermatids) and green cells (earlier stages) are observed. By contrast, in Atad2 KO testes, a significant proportion of yellow cells appears, which are never seen in wild-type tissue. The crucial metric here is the percentage of yellow cells relative to the total number of elongating/condensing spermatids (red cells). In wild-type testes, this value is consistently 0%, whereas in Atad2 KO testes it always ranges between 50% and 100% across all fields containing substantial numbers of elongating/condensing spermatids.

      (23) Figure 8A: Please show the images of sperm (heads) in the KO mice with or without decompaction.

      The requested image is now displayed in Figure S5.

      (24) Figure 8C: In the legend, it says n=5. However, there are more than 5 plots on the graph. Please explain the experiment more in detail.

      The experiment is now better explained in the legend of this Figure.

      Reviewer #2 (Recommendations for the authors): 

      While the study is rigorous and well performed, the following minor points could be addressed to strengthen the manuscript: 

      Figure 1C should indicate each of the different types of cells present in the sections. It would be of interest to show specifically the different post-meiotic germ cells.

      With this type of sample preparation, it is difficult to precisely distinguish the different cell types within the sections. Nevertheless, the staining pattern strongly indicates that most of the intensely stained cells are post-meiotic, situated near the tubule lumens and extending roughly halfway toward the basal membrane.

      In the absence of functional ATAD2, the accumulation of HIRA primarily occurs in round spermatids (Fig. 2B). If technically possible, it would be of great interest to show this by IHC of testis section. 

      Unfortunately, our antibody did not work satisfactorily in IHC.

      The increase in H3.3 signal in Atad2 KO spermatids (Fig. 3) is interpreted as a consequence of reduced turnover. However, alternative explanations (e.g., H3.3 misincorporation or altered chaperone affinity) should not be ruled out.

      The referee is correct that alternative explanations are possible. However, in our previous work (Wang et al., 2021; PMID: 34580178), we demonstrated that in the absence of ATAD2 there is reduced turnover of HIRA-bound nucleosomes, as well as reduced nucleosome turnover overall, evidenced by the appearance of nucleosomes in regions that are normally nucleosome-free at active gene TSSs. We have no evidence supporting any other alternative hypothesis.

      In the MS, the reduced accessibility at active genes (Fig. 4) is attributed to H3.3 overloading. However, global changes in histone acetylation (e.g., H4K5ac) or other remodelers in KO cells could also be considered.

      In fact, we meant that histone overloading could be responsible for the altered accessibility. This has been clearly demonstrated in the case of S. cerevisiae in the absence of Yta7 (the S. cerevisiae ATAD2) (PMID: 25406467).

      In relation with the sperm compaction assay (Fig. 8A), the DTT/heparin/Triton protocol may not fully reflect physiological decompaction. This could be validated with alternative methods (e.g., MNase sensitivity). 

      The referee is right, but since this is a subtle effect, as can be judged from the normal fertility, we doubt that milder approaches could reveal significant differences between wild-type and Atad2 KO sperm.

      It is surprising that despite the observed alterations in the genome organization of the sperm, the natural fertility of the KO mice is not affected (Fig. 8C). This warrants deeper discussion: Is functional compensation occurring (e.g., by p97/VCP)? Analysis of epididymal sperm maturation or uterine environment could provide insights.

      As detailed in the Discussion section, this work, together with our previous study (Wang et al., 2021; PMID: 34580178), highlights an overlooked level of regulation in histone chaperone activity: the release of chromatin-bound factors following their interaction with chromatin. This is an energy-dependent process, driven by ATP and the associated ATPase activity of these factors. Such activity could be mediated by various proteins, such as p97/VCP or DNAJC9–HSP70, as discussed in the manuscript, or by yet unidentified factors. However, most of these mechanisms are likely to occur during the extensive histone-to-histone variant exchanges of meiosis and post-meiotic stages. To the best of our knowledge, epididymal sperm maturation and the uterine environment do not involve substantial histone-to-histone or histone-to-protamine exchanges.

      The authors showed that MSCI genes present an enhancement of repression in the absence of ATAD2, through enhanced H3.3 function. It would also be of interest to analyze the behavior of the sex body during its silencing (zygotene to pachytene) by looking at different markers (i.e., gamma-H2AX phosphorylation, ubiquitylation, etc.).

      The referee is correct that this is an interesting question. Accordingly, in our future work, we plan to examine the sex body in more detail during its silencing, using a variety of relevant markers, including those suggested by the reviewer. However, we believe that such investigations fall outside the scope of the present study, which focuses on the molecular relationship between ATAD2 and H3.3, rather than on the role of H3.3 in regulating sex body transcription. For a comprehensive analysis of this aspect, studies should primarily focus on the H3.3 mouse models reported by Fontaine and colleagues (PMID: 35766398).

      Fig. 6: Co-staining of TH2B/TP1/PRM1 is convincing but would benefit from quantification (% cells with overlapping signals).

      The green cells correspond to those expressing TH2B but lacking transition proteins (TP) and protamine 1 (Prm1), indicating that they are at earlier stages than elongating–condensing spermatids. Counting these green cells simply reflects the ratio of elongating/condensing spermatids to earlier-stage cells, which varies depending on the field examined. The key point is that in wild-type mice, only red cells (elongating/condensing spermatids) and green cells (earlier stages) are observed. By contrast, in Atad2 KO testes, a significant proportion of yellow cells appears, which are never seen in wild-type tissue. The crucial metric is the percentage of yellow cells relative to the total number of elongating/condensing spermatids (red cells). In wild-type testes, this value is consistently 0%, whereas in Atad2 KO testes it always ranges between 50% and 100% across all fields containing substantial numbers of elongating/condensing spermatids.

    1. if there is an approach to the virtual that can interest specialists in psychic functioning – and by that I mean, first and foremost, its psychoanalysts – it is certainly his.

      Deleuzian conception of the virtual

    1. After all, is it because a piece of “fake news” has been shared by thousands of Internet users that each of them believed it?

      This rhetorical question underlines a central position of the article: sharing a fake news item does not necessarily imply endorsement of its content.

    1. We’re bringing a social experience to Anytype by making spaces more interactive. We start with the concept of one space = one group = one chat. Then we’ll expand to include discussions on objects, enabling forum-like use cases. It will significantly improve collaborative use cases. You’ll chat and discuss your pages and files in the same end-to-end encrypted and local-first way.

      Here there are transitions in the following quadrants:

      Cardumem takes an alternative and simpler route to explore similar transitions.

      1. It starts with the wiki, as asynchronous documentation software.
      2. It will connect with HedgeDoc as synchronous documentation software.
      3. It will connect with Hypothesis as asynchronous dialogical software.
      4. It will progressively implement synchronous functionality via real-time hypermedia systems.

      The local-first idea will be realized because the server can run either locally or remotely.

    1. For their operation, the devices contain a battery (typically lithium), which may or may not be rechargeable, and which supplies power to a microprocessor that controls the heat and light of the LED elements that usually indicate the battery

      The image I mention could go here.

  6. information.serenechessur.xyz
    1. Education occupies a central place in the cultural evolution of humanity. All the knowledge that humanity develops about the surrounding world, about human beings, and about the universe in general must be transmitted to the coming generations in order to ensure the permanence and continuity of civilization. In this sense, education is a universal matter, because it represents a social construction that ensures the preservation of culture.

      Someone please help me with this.

    1. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment:

      This is an important study, supported by solid to convincing data, that suggests a model for diet selection in C. elegans. The significance is that while C. elegans has long been known to be attracted to bacterial volatiles, what specific bacterial volatiles may signify to C. elegans is largely unknown. This study also provides evidence for a possible odorant/GPCR pairing.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Siddiqui et al. investigate the question of how bacterial metabolism contributes to the attraction of C. elegans to specific bacteria. They show that C. elegans prefers three bacterial species when cultured in a leucine-enriched environment. These bacterial species release more isoamyl alcohol, a known C. elegans attractant, when cultured with leucine supplementation than without. The study shows correlative evidence that isoamyl alcohol is produced from leucine by the Ehrlich pathway. In addition, they show that SRD-12 (SNIF-1) is likely a receptor for isoamyl alcohol because a null mutant of this receptor exhibits lower chemotaxis to isoamyl alcohol and lower preference for leucine-enriched bacteria.

      Strengths:

      (1) This study takes a creative approach to examine the question of what specific volatile chemicals released by bacteria may signify to C. elegans by examining both bacterial metabolism and C. elegans preference behavior. Although C. elegans has long been known to be attracted to bacterial metabolites, this study may be one of the first to examine the role of a specific bacterial metabolic pathway in mediating attraction.

      (2)  A strength of the paper is the identification of SRD-12 (SNIF-1) as a likely receptor for isoamyl alcohol. The ligands for very few olfactory receptors have been identified in C. elegans and so this is a significant addition to the field. The srd-12 (snif-1) null mutant strain will likely be a useful reagent for many labs examining olfactory and foraging behaviors.

      Weaknesses:

      (1) The authors write that the leucine metabolism via the Ehrlich pathway is required for the production of isoamyl alcohol by three bacteria (CEent1, JUb66, BIGb0170), but their evidence for this is correlation and not causation. They write that the gene ilvE is a bacterial homolog of the first gene in the yeast Ehrlich pathway (it would be good to include a citation for this) and that the gene is present in these three bacterial strains. In addition, they show that this gene, ilvE, is upregulated in CEent1 bacteria upon exposure to leucine. To show causation, they need to knock out ilvE from one of these strains, show that the bacteria do not have increased isoamyl alcohol production when cultured on leucine, and that the bacteria are no longer attractive to C. elegans.

      Thank you for the comment. We have added the appropriate citation [1,2]. We agree that worms’ diet preference for the preferred strains upon ilvE knockout will further strengthen the claim for IAA being used as a proxy for leucine-enriched diet. Currently, protocols and tools for genetic manipulations for CeMbio strains are not available, making this experiment not feasible at this time.  

      (2) The authors examine three bacterial strains that C. elegans showed increased preference when grown with leucine supplementation vs. without leucine supplementation. However, there also appears to be a strong preference for another strain, JUb0393, when grown on plus leucine (Figure 1B). It would be good to include statistics and criteria for selecting the three strains.

      Thanks for your comment. We agree that for Pantoea nemavictus, JUb393, worms seem to prefer the leucine supplemented (+ LEU) bacteria over unsupplemented (-LEU). However, when given a choice between the individual CeMbio bacteria and E. coli OP50, worms showed preference for only CEent1, JUb66, and BIGb0170 (Figure 1F). Consequently, CEent1, JUb66, and BIGb0170 were selected for further analyses. We have included statistics for Figure 1B-C and Figure S1A-G with details mentioned in the figure legend. 

      (3) Although the behavioral evidence that srd-12 (snif-1) gene encodes a receptor for isoamyl alcohol is compelling, it does not meet the standard for showing that it is an olfactory receptor in C. elegans. To show it is indeed a likely receptor one or more of the following should be done:

      (a) Calcium imaging of AWC neurons in response to isoamyl alcohol in the receptor mutant with the expectation that the response would be reduced or abolished in the mutant compared to wildtype.

      (b)"A receptor swap" experiment where the SRD-12 (SNIF-1) receptor is expressed in AWB repulsive neuron in SRD-12 (SNIF-1) receptor mutant background with the expectation that with receptor swap C. elegans will now be repulsed from isoamyl alcohol in chemotaxis assays (experiment from Sengupta et al., 1996 odr-10 paper).

      Thanks for all your comments and suggestions. While the lab currently does not have the necessary expertise to conduct calcium imaging of neurons, we have performed additional experiments to confirm the requirement of AWC neurons for SNIF-1 function. We generated transgenic worms with extrachromosomal arrays expressing snif-1 under (a) the AWC-specific promoter, odr-1, and (b) the AWB-specific promoter, str-1. As shown in new panel 6H in the revised manuscript and Author response image 1, we found that overexpression of snif-1 in AWC neurons completely rescues the chemotaxis defect of the snif-1 mutant (referred to as VSL2401), whereas upon the "receptor swap" in AWB neurons, IAA is sensed as a repellent.

      Author response image 1.

      (A) Chemotaxis index (CI) of WT, VSL2401, VSL2401 [AWCp::snif-1] and VSL2401 [AWBp::snif-1] worms to IAA at 1:1000 dilution. Significant differences are indicated as **** P ≤ 0.0001 determined by one-way ANOVA followed by post hoc Dunnett’s multiple comparison test. Error bars indicate SEM (n≥15).
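
      As a hedged illustration of this statistical design (one-way ANOVA followed by Dunnett's comparisons against a control group), the sketch below uses placeholder CI values and group names that merely mirror the strains in the legend; scipy.stats.dunnett requires SciPy 1.11 or later:

      ```python
      import numpy as np
      from scipy import stats

      # Placeholder chemotaxis index (CI) values; these are NOT the measured data.
      rng = np.random.default_rng(0)
      wt = rng.normal(0.8, 0.1, size=15)            # wild type (control)
      mutant = rng.normal(0.1, 0.1, size=15)        # VSL2401
      awc_rescue = rng.normal(0.75, 0.1, size=15)   # VSL2401 [AWCp::snif-1]
      awb_swap = rng.normal(-0.2, 0.1, size=15)     # VSL2401 [AWBp::snif-1]

      f_stat, p_anova = stats.f_oneway(wt, mutant, awc_rescue, awb_swap)
      dunnett = stats.dunnett(mutant, awc_rescue, awb_swap, control=wt)  # SciPy >= 1.11

      print(f"one-way ANOVA: F = {f_stat:.2f}, p = {p_anova:.3g}")
      for name, p in zip(["VSL2401", "AWC rescue", "AWB swap"], dunnett.pvalue):
          print(f"{name} vs WT: adjusted p = {p:.3g}")
      ```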

      (4) The authors conclude that C. elegans cannot detect leucine in chemotaxis assays. It is important to add the method for how leucine chemotaxis assay was done in order to interpret these results. Because leucine is not volatile if leucine is put on the plates immediately before the worms are added (as in a traditional odor chemotaxis assay), there is no leucine gradient for the worm to detect. It would be good to put leucine on the plate several hours before worms are introduced so worms have the possibility to be able to detect the gradient of leucine (for example, see Wakabayashi et al., 2009).

      Previously, the chemotaxis assays with leucine were performed like traditional odor chemotaxis assays. We also performed a chemotaxis assay as detailed in Shingai et al. 2005 [3]. Leucine was spotted on the assay plates 5 hours prior to the introduction of worms on the plates. As shown in new panel S1H in the revised manuscript, wild-type worms do not show a response to leucine in the modified chemotaxis assay.

      We have included the experimental details for leucine chemotaxis assays in the revised manuscript.  

      (5) The bacterial preference assay entitled "odor-only assay" is a misleading name. In the assay, C. elegans is exposed to both volatile chemicals (odors) and non-volatile chemicals because the bacteria are grown on the assay plate for 12 hours before the worms are introduced to the assay plate. In that time, the bacteria is likely releasing non-volatile metabolites into the plate which may affect the worm's preference. A true odor-only assay would have the bacteria on the lid and the worms on the plate.

      The ‘odor-only’ diet preference assay does not allow for non-volatile chemicals to reach worms. We achieved this by using tripartite dishes where the compartments containing worms and bacterial odors are separated by polystyrene barriers. At the time of the assay, worms were spotted in a separate compartment from that of bacteria (as shown in schematic 1A). The soluble metabolites released by the bacteria during their growth will accumulate in the agar within the bacterial compartment alone such that worms only encounter the volatile metabolites produced by bacteria wafting past the polystyrene barrier.

      (6) The findings of the study should be discussed more in the context of prior literature. For example, AWC neurons have been previously shown to be involved in bacterial preference (Harris et al., 2014; Worthy et al., 2018). In addition, CeMbio bacterial strains (the strains examined in this study) have been previously shown to release isoamyl alcohol (Chai et al. 2024).

      Thanks for the suggestion. We have modified the Discussion section to discuss the study in the light of relevant prior literature.  

      Reviewer #2 (Public review):

      Summary:

      Siddiqui et al. show that C. elegans prefers certain bacterial strains that have been supplemented with the essential amino acid (EAA) leucine. They convincingly show that some leucine-enriched bacteria stimulate the production of isoamyl alcohol (IAA). IAA is an attractive odorant that is sensed by the AWC. The authors identify a receptor, SRD-12 (SNIF-1), that is expressed in the AWC chemosensory neurons and is required for chemotaxis to IAA. The authors propose that IAA is a predominant olfactory cue that determines diet preference in C. elegans. Since leucine is an EAA, the authors propose that IAA sensing provides the animal with a proxy mechanism to identify EAA-rich diets.

      Strengths:

      The authors propose IAA as a predominant olfactory cue that determines diet preference in C. elegans, providing a molecular mechanism underlying diet selection. They show that wild isolates of C. elegans have a strong chemotactic response to IAA, indicating that IAA is an ecologically relevant odor for the worm. The paper is well written, and the presented data are convincing and well organized. This is an interesting paper that connects chemotactic response with bacterially produced odors and thus provides an understanding of how animals adapt their foraging behavior through the perception of molecules that may indicate nutritional value.

      Weaknesses:

      Major:

      While I do like the way the authors frame C. elegans IAA sensing as mechanisms to identify leucine (EAA) rich diets it is not fully clear whether bacterial IAA production is a proxy for bacterial leucine levels.

      (1) Can the authors measure leucine (or other EAA) content of the different CeMbio strains? This would substantiate the premise in the way they frame this in the introduction. While the authors convincingly show that leucine supplementation induces IAA production in some strains, it is not clear if there are lower leucine levels in the different non-preferred strains.

      Thanks for your suggestion. Estimating leucine levels in various bacteria will provide useful information, and we hope to do so in future studies.

      (2) It is not clear whether the non-preferred bacteria in Figure 1A and 1B have the ability to produce IAA. To substantiate the claim that C. elegans prefers CEent1, JUb66, and BIGb0170 due to their ability to generate IAA from leucine, it would help to measure IAA levels in non-preferred bacteria (with and without leucine supplementation). If the authors have these data it would be good to include them.

      Thanks for the suggestion. We have included the table indicating the presence or absence of IAA production by all the bacteria under + LEU and – LEU conditions (Table S2). Some of the non-preferred bacteria indeed produce isoamyl alcohol. However, the abundance of IAA in these strains is significantly lower than in the preferred bacteria.

      Using the available genomic sequence data, we found that all CeMbio strains encode IlvE-like transaminase enzymes[4]. This suggests that presumably all the bacteria have the metabolic capacity to make alpha-ketoisocaproate (an intermediate in IAA biosynthetic pathway) from leucine. However, the regulation of metabolic flux is likely to be quite complex in various bacteria.  

      (3) The authors would strengthen their claim if they could show that deletion or silencing ilvE enzyme reduces IAA levels and eliminates the increased preference upon leucine supplementation.

      We agree that testing worms’ diet preference for the preferred strains upon ilvE knockout will further strengthen the claim for IAA being crucial for finding a leucine-enriched diet. Currently, the lab does not have the necessary expertise and standardized protocols to perform genetic manipulations in the CeMbio strains.

      (4) While the three preferred bacteria possess the ilvE gene, it is not clear whether this enzyme is present in the other non-preferred bacterial strains. As far as I know, the CeMbio strains have been sequenced so it should be easy to determine if the non-preferred bacteria possess the capacity to make IAA. Does the expression of ilvE in e.g. E. coli increase its preference index or are the other genes in the biosynthesis pathway missing?

      Thanks for the suggestion. Using the available genomic sequence data, we find that all the bacteria in the CeMbio collection possess an IlvE-like transaminase necessary for the synthesis of alpha-ketoisocaproate, a key metabolite in leucine turnover as well as a precursor for IAA [4]. E. coli has an IlvE-encoding gene in its genome [2]. However, we do not find IAA in the headspace of E. coli either with or without leucine supplementation. This indicates either (i) E. coli lacks enzymes for subsequent steps in IAA biosynthesis or (ii) leucine provided under the experimental regime is not sufficient to shift the metabolic flux to IAA production.

      Previous studies have suggested that in yeast, the final two steps leading to IAA production are catalyzed by decarboxylase and dehydrogenase enzymes [1]. The genomic and metabolic flux data available for CeMbio do not describe specific enzymes leading up to IAA synthesis [4].

      (5) It is strongly implied that leucine-rich diets are beneficial to the worm. Do the authors have data to show the effect on leucine supplementation on C. elegans healthspan, life-span or broodsize?

      Edwards et al. 2015 reported a 15% increase in the lifespan of worms upon 1 mM leucine supplementation [5]. Wang et al. 2018 also showed lifespan extension upon 1 mM and 10 mM leucine supplementation. They also reported that while leucine supplementation did not have any effect on brood size, it did make worms more resistant to heat, paraquat, and UV stress [6]. These studies have been included in the Discussion section.

      Other comments:

      Page 6. Figure 2c. While the authors' conclusions are correct based on AWC expts., it would be good at this stage to include the possibility that odors that are enriched in the absence of leucine may be aversive.

      Thanks for the comment. We have tested the chemotaxis response of the worms for most of the odors produced by CeMbio strains without leucine supplementation. We did not find any odor that is aversive to worms. However, we cannot completely rule out the possibility that a low abundance of aversive odor in the headspace of the bacteria was missed.

      Interestingly, we did identify 2-nonanone, a known repellent, in the headspace of the preferred bacteria upon leucine supplementation. However, the abundance of 2-nonanone in the headspace of the bacteria is relatively low (less than 1% for CEent1 and JUb66, and ~10% for BIGb0170). This suggests that the relative abundance of odors in an odor bouquet may be a relevant factor in determining worms’ preference.

      Page 6. IAA increases 1.2-4 fold upon leucine supplementation. If the authors perform a chemotaxis assay with just IAA with 1.2-4 fold differences, do you get the shift in preference index as seen with the bacteria? i.e., is the difference in IAA concentration sufficient to explain the shift in bacterial PI upon leucine supplementation? Other attractants such as acetoin and isobutanol go up in -LEU conditions.

      Thanks for the suggestion. As shown in Figures S2H and S2I, when given a choice between a concentration of IAA (1:1000 dilution) attractive to worms and a 4-fold higher amount of IAA, worms chose the latter. This result suggests that worms can distinguish between relatively small differences in IAA concentration.

      We agree that the relative abundance of acetoin and isobutanol is high in -LEU conditions. The presence of other attractants in -LEU conditions should skew the preference of worms towards -LEU bacteria. However, we found that worms prefer +LEU bacteria (Figure 1B), suggesting that the abundance of IAA mainly influences the diet preference of the worms.

      Page 14-15. The authors identify a putative IAA receptor based on expression studies. I compliment the authors for isolating two CRISPR deletion alleles. They show that the srd-12 (snif-1) mutants have obvious defects in IAA chemotaxis. Very few ligand-odorant receptor combinations have been identified, so this is an important discovery. CeNGEN data indicate that srd-12 (snif-1) is expressed in a limited set of neurons. Did the authors generate a reporter to show the expression of srd-12 (snif-1)? This is a simple experiment that would add to the characterization of the SRD-12 (SNIF-1) receptor. Rescue experiments would be nice even though the authors have independent alleles. To truly claim that SRD-12 (SNIF-1) is the receptor for IAA and activates the AWC neurons would require GCaMP experiments in the AWC neuron or a heterologous expression system. I understand that GCaMP imaging might not be part of the regular arsenal of the lab, but it would be a great addition (even in collaboration with one of the many labs that do this regularly). Comparing AWC activity using GCaMP in response to IAA-producing bacteria with high leucine levels in both wild-type and SRD-12 (SNIF-1) deficient backgrounds would further support their narrative. I leave that to the authors.

      Thanks for your comments and suggestions. To address this comment, we rescued the snif-1 mutant (referred to as VSL2401) with an extrachromosomal array expressing snif-1 under an AWC-specific promoter as well as under its native promoter. As shown in Figure 6H and Author response image 2, we find that both transgenic lines show a complete rescue of the chemotaxis response to isoamyl alcohol. To determine where snif-1 is expressed, we generated a transgenic line of worms expressing GFP under the snif-1 promoter and mCherry under the odr-1 promoter (to mark AWC neurons). As shown in Figure 6I, we found that snif-1 is expressed faintly in many neurons, with strong expression in one of the two AWC neurons marked by odr-1::mCherry. This result suggests that SNIF-1 is expressed in the AWC neuron.

      We hope to perform GCaMP assay and further characterization of SNIF-1 in the future.

      Author response image 2.

      Chemotaxis index (CI) of WT, VSL2401, VSL2401 [AWCp:: snif-1] and VSL2401 [snif-1p::snif-1] worms to IAA at 1:1000 dilution. Significant differences are indicated as **** P ≤ 0.0001 determined by one-way ANOVA followed by post hoc Dunnett’s multiple comparison test. Error bars indicate SEM (n≥15).

      Minor:

      Page 4 "These results suggested that worms can forage for diets enriched in specific EAA, leucine...." More precise at this stage would be to state " These results indicated that worms can forage for diets supplemented with specific EAA...".

      We have changed the statement in the revised manuscript.

      Page 5."these findings suggested that worms not only rely on odors to choose between two bacteria but also to find leucine enriched bacteria" This statement is not clear to me and doesn't follow the data in Fig. S2. Preferred diets in odorant assays are the IAA producing strains.

      Thanks for your comment. We have revised the manuscript to make it clear: “Altogether, these findings suggested that worms rely on odors to distinguish different bacteria and find leucine-enriched bacteria”. This statement concludes all the data shown in Figure 1 and Figure S1.

      Page 5. Figure S2A provides nice and useful data that can be part of the main Figure 1.

      Thanks for the comment. We have incorporated the data from Figure S2A to main Figure 1.

      Reviewer #3 (Public review):

      Summary:

      The authors first tested whether EAA supplementation increases olfactory preference for bacterial food for a variety of bacterial strains. Of the EAAs, they found only leucine supplementation increased olfactory preference (within a bacterial strain), and only for 3 of the bacterial strains tested. Leucine itself was not found to be intrinsically attractive.

      They determined that leucine supplementation increases isoamyl alcohol (IAA) production in the 3 preferred bacterial strains. They identify the biochemical pathway that catabolizes leucine to IAA, showing that a required enzyme for this pathway is upregulated upon supplementation.

      Consistent with earlier studies, they find that AWC olfactory neuron is primarily responsible for increased preference for IAA-producing bacteria.

      Testing volatile compounds produced by bacteria and identified by GC/MS, they found several to be attractive, with most of them requiring AWC for the full effect. Adaptation assays were used to show that odorant levels produced by bacterial lawns were sufficient to induce olfactory adaptation, and adaptation to IAA reduced chemotaxis to leucine-supplemented lawns. They then showed that IAA attractiveness is conserved across wild strains, while other compounds are more variable, suggesting IAA is a principal foraging cue.

      Finally, using the CeNGEN database, they developed a list of candidate IAA receptors. Using behavioral tests, they show that mutation of srd-12 (snif-1) greatly impairs IAA chemotaxis without affecting locomotion or attraction to another AWC-sensed odor, PEA.

      Comments:

      This study will be of great interest in the field of C. elegans behavior, chemical senses and chemical ecology, and understanding of the sensory biology of foraging.

      Strengths:

      The identification of a receptor for IAA is an excellent finding. The combination of microbial metabolic chemistry and the use of natural bacteria and nematode strains makes an extremely compelling case for the ecological and adaptive relevance of the findings.

      Weaknesses:

      AWC receives synaptic input from other chemosensory neurons, and thus could potentially mediate navigation behaviors to compounds detected in whole or in part by those neurons. Language concluding detection by AWC should be moderated (e.g. p9 "worms sense an extensive repertoire...predominantly using AWC") unless it has been demonstrated.

      Thanks for your comment. We have modified the manuscript to incorporate the suggestion.

      srd-12 (snif-1) is not exclusively expressed in AWC. Normally, cell-specific rescue or knockdown would be used to demonstrate function in a specific cell. The authors should provide such a demonstration or explain why they are confident srd-12 (snif-1) acts in AWC.

      Thanks for the comment. We have performed AWC-specific rescue of snif-1 in mutant worms. As shown in Figure 6H, we found that AWC neuron-specific rescue completely rescued the chemotaxis defect of the snif-1 mutant (referred to as VSL2401) for IAA. In addition, snif-1 is expressed in one of the AWC neurons.

      A comparison of AWC's physiological responses between WT and srd-12 (snif-1), preferably in an unc-13 background, would be nice. Even further, the expression of srd-12 (snif-1) in a different neuron type and showing that it confers responsiveness to IAA (in this case, inhibition) would be very convincing.

      Thanks for the suggestion. We have performed a receptor swap experiment, where snif-1 is misexpressed in AWB neurons. We find that these worms show slight but significant repulsion to IAA compared to WT and snif-1 mutant worms (Author response image 1).

      Recommendations for the authors:

      Reviewing Editor:

      Please consider all of the reviewer comments. In particular, as noted in the individual reviews, the strength of the evidence would be bolstered by additional experiments to demonstrate that the ilvE enzyme affects IAA levels in the preferred bacteria. The reviewers note that the authors haven't shown that IAA production is a reflection of leucine content. Are the non-preferred bacteria low on leucine, or lacking ilvE or IAA synthesis pathways? Further, more direct evidence that SRD-12 (SNIF-1) is in fact the primary IAA receptor would further strengthen the study. The authors should also be aware that geographic distance for wild isolate C. elegans may not directly correlate with phylogenetic distance. This should be assessed/discussed for the strains used.

      Thanks for the suggestions. Some of these have been addressed in the responses to reviewers. Thanks for your comment about the possible disconnect between geographical and phylogenetic distances amongst the natural isolates used here.

      By analyzing the phylogenetic tree generated using the neighbor-joining algorithm available at the CaeNDR database, we found that QX1211 and JU3226 are phylogenetically close, but the remaining isolates fall into different clades separated by long phylogenetic distances [7,8].
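
      As an illustration of this kind of check (not the analysis actually performed), the sketch below computes pairwise patristic distances between isolates on a Newick tree with Biopython; the tree string and tip labels are made-up placeholders standing in for the CaeNDR neighbor-joining tree:

      ```python
      from io import StringIO
      from Bio import Phylo

      # Placeholder Newick tree; in practice one would load the tree downloaded from CaeNDR.
      newick = "((QX1211:0.02,JU3226:0.03):0.40,(N2:0.10,CB4856:0.12):0.35);"
      tree = Phylo.read(StringIO(newick), "newick")

      isolates = ["QX1211", "JU3226", "N2", "CB4856"]
      for i, a in enumerate(isolates):
          for b in isolates[i + 1:]:
              # Patristic distance: sum of branch lengths along the path between two tips.
              print(a, b, round(tree.distance(a, b), 3))
      ```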

      Reviewer #1 (Recommendations for the authors):

      (1) In the first sentence of the third paragraph of the introduction, C. elegans are described as "soil-dwelling." Although C. elegans has been described as soil-dwelling in the past, current research indicates they are most often found on rotten fruit, compost heaps and other bacteria-rich environments, not soil. "All Caenorhabditis species are colonizers of nutrient- and bacteria-rich substrates and none of them is a true soil nematode." from Kiontke, K. and Sudhaus, W. Ecology of Caenorhabditis species (WormBook).

      Your specific comment about C. elegans’ habitat is well received. However, in that sentence we are referring to the chemosensory system of soil-dwelling animals in general, and not particularly C. elegans.

      (2) Figure 3K, the model would be clearer if leucine-rich diet -> volatile chemicals ->AWC (instead of leucine-rich diet -> AWC <- volatile chemicals). The leucine-rich diet results in the production of volatile chemicals which are detected by AWC.

      We have modified the figure to make it clearer.

      (3) Figure 4 - it would help to include a table summarizing the volatile chemicals that each bacteria releases. Then the reader could more easily evaluate whether the adaptation to each specific odor is consistent with the change in preference for the specific bacteria based on what it releases in its headspace. In addition, Figure 4 would help to clarify whether bacteria in these experiments were cultured with or without leucine supplementation.

      Table S2 summarizes the odors released by all the bacteria under + LEU and – LEU conditions.

      In Figure 4, adaptation was performed with odors of bacteria cultured under leucine-unsupplemented conditions.

      Reviewer #2 (Recommendations for the authors):

      Page 9. Previous studies, e.g., Bargmann, Hartwieg, and Horvitz, have shown IAA is sensed by the AWC. It would be good to cite appropriately.

      Thanks for the comment. The reference has been cited at p9 and p16.

      References:

      (1) Yuan, J., Mishra, P., and Ching, C.B. (2017). Engineering the leucine biosynthetic pathway for isoamyl alcohol overproduction in Saccharomyces cerevisiae. Journal of Industrial Microbiology and Biotechnology 44, 107-117. 10.1007/s10295-016-1855-2.

      (2) Kanehisa, M., Furumichi, M., Sato, Y., Matsuura, Y., and Ishiguro-Watanabe, M. (2025). KEGG: biological systems database as a model of the real world. Nucleic Acids Res 53, D672-D677. 10.1093/nar/gkae909.

      (3) Shingai, R., Wakabayashi, T., Sakata, K., and Matsuura, T. (2005). Chemotaxis of Caenorhabditis elegans during simultaneous presentation of two water-soluble attractants, llysine and chloride ions. Comparative biochemistry and physiology. Part A, Molecular & integrative physiology 142, 308-317. 10.1016/j.cbpa.2005.07.010.

      (4) Dirksen, P., Assié, A., Zimmermann, J., Zhang, F., Tietje, A.M., Marsh, S.A., Félix, M.A., Shapira, M., Kaleta, C., Schulenburg, H., and Samuel, B.S. (2020). CeMbio - The Caenorhabditis elegans Microbiome Resource. G3 (Bethesda, Md.) 10, 3025-3039. 10.1534/g3.120.401309.

      (5) Edwards, C., Canfield, J., Copes, N., Brito, A., Rehan, M., Lipps, D., Brunquell, J., Westerheide, S.D., and Bradshaw, P.C. (2015). Mechanisms of amino acid-mediated lifespan extension in Caenorhabditis elegans. BMC genetics 16, 8. 10.1186/s12863-015-0167-2.

      (6) Wang, H., Wang, J., and Zhang, Z. (2018). Leucine Exerts Lifespan Extension and Improvement in Three Types of Stress Resistance (Thermotolerance, Anti-Oxidation and Anti-UV Irradiation) in C. elegans. Journal of Food and Nutrition Research 6, 665-673.

      (7) Crombie, T.A., McKeown, R., Moya, N.D., Evans, K.S., Widmayer, S.J., LaGrassa, V., Roman, N., Tursunova, O., Zhang, G., Gibson, S.B., et al. (2023). CaeNDR, the Caenorhabditis Natural Diversity Resource. Nucleic Acids Research 52, D850-D858. 10.1093/nar/gkad887.

      (8) Cook, D.E., Zdraljevic, S., Roberts, J.P., and Andersen, E.C. (2017). CeNDR, the Caenorhabditis elegans natural diversity resource. Nucleic Acids Res 45, D650-D657. 10.1093/nar/gkw893.

    1. Within the language sciences, Critical Discourse Analysis (CDA) draws on the contributions of recent studies on text (discourse analysis, pragmatics, sociolinguistics, ethnography of communication, enunciation theory, etc.) to define the purposes and methodology of critical analysis.

      Please make a citation from this for me.

    1. Author response:

      eLife Assessment

      This important study fills a major geographic and temporal gap in understanding Paleocene mammal evolution in Asia and proposes an intriguing "brawn before bite" hypothesis grounded in diverse analytical approaches. However, the findings are incomplete because limitations in sampling design - such as the use of worn or damaged teeth, the pooling of different tooth positions, and the lack of independence among teeth from the same individuals - introduce uncertainties that weaken support for the reported disparity patterns. The taxonomic focus on predominantly herbivorous clades also narrows the ecological scope of the results. Clarifying methodological choices, expanding the ecological context, and tempering evolutionary interpretations would substantially strengthen the study.

      We thank Dr. Rasmann for the constructive evaluation of our manuscript. Considering the reviewers’ comments, we plan to implement revisions to our study focusing on (1) expansion of the fossil sample description, including a detailed account of the process of excluding extremely worn or damaged teeth from all analyses, (2) expanded reporting of the analyses done on individual tooth positions, and tempering the interpretation of the pooled samples in light of the issues raised by reviewers, and (3) providing a more comprehensive introduction that includes an overview of the Paleocene mammal faunas in south China, which unevenly sample certain clades while others are extremely rare, and why the currently available fossil samples would not permit a whole-fauna analysis to be adequately conducted across the three land mammal age time bins of the Paleocene in China. We believe these revisions would substantially strengthen the study’s robustness and impact for understanding the ecomorphological evolution of the earliest abundant placental mammals during the Paleocene in Asia.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work provides valuable new insights into the Paleocene Asian mammal recovery and diversification dynamics during the first ten million years post-dinosaur extinction. Studies that have examined the mammalian recovery and diversification post-dinosaur extinction have primarily focused on the North American mammal fossil record, and it's unclear if patterns documented in North America are characteristic of global patterns. This study examines dietary metrics of Paleocene Asian mammals and finds that there is a body size disparity increase before dietary niche expansion and that dietary metrics track climatic and paleobotanical trends of Asia during the first 10 million years after the dinosaur extinction.

      Strengths:

      The Asian Paleocene mammal fossil record is greatly understudied, and this work begins to fill important gaps. In particular, the use of interdisciplinary data (i.e., climatic and paleobotanical) is really interesting in conjunction with observed dietary metric trends.

      Weaknesses:

      While this work has the potential to be exciting and contribute greatly to our understanding of mammalian evolution during the first 10 million years post-dinosaur extinction, the major weakness is in the dental topographic analysis (DTA) dataset.

      There are several specimens in Figure 1 that have broken cusps, deep wear facets, and general abrasion. Thus, any values generated from DTA are not accurate and cannot be used to support their claims. Furthermore, the authors analyze all tooth positions at once, which makes this study seem comprehensive (200 individual teeth), but it's unclear what sort of noise this introduces to the study. Typically, DTA studies will analyze a singular tooth position (e.g., Pampush et al. 2018 Biol. J. Linn. Soc.), allowing for more meaningful comparisons and an understanding of what value differences mean. Even so, the dataset consists of only 48 specimens. This means that even if all the specimens were pristinely preserved and generated DTA values could be trusted, it's still only 48 specimens (representing 4 different clades) to capture patterns across 10 million years. For example, the authors note that their results show an increase in OPCR and DNE values from the middle to the late Paleocene in pantodonts. However, if a singular tooth position is analyzed, such as the lower second molar, the middle and late Paleocene partitions are only represented by a singular specimen each. With a sample size this small, it's unlikely that the authors are capturing real trends, which makes the claims of this study highly questionable.

      We thank Reviewer 1 for their careful review of our manuscript. A major external limitation of the application of DTA to fossil samples is the availability of specimens. Whereas a typical study design using extant or geologically younger/more abundant fossil species would preferably sample much larger quantities of teeth from each treatment group (time bins, in our case), the rarity of well-preserved Paleocene mammalian dentitions in Asia necessitates the analysis of small samples in order to make observations regarding major trends in a region and time period otherwise impossible to study (see Chow et al. 1977). That said, we plan to clarify methodological details in response to the reviewer’s comments, including a more comprehensive explanation of our criteria for exclusion of broken tooth crowns from the analyses. We also plan to expand our results reporting on individual tooth position analysis, potentially including resampling and/or simulation analyses to assess the effect of small and uneven samples on our interpretation of results. Lastly, we plan to revise the discussion and conclusion accordingly, including more explicit distinction between well-supported findings that emerge from various planned sensitivity analyses, versus those that are more speculative and tentative in nature.
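
      One possible form such a resampling analysis could take (a sketch with placeholder values, not results from the study) is a simple bootstrap of DTA values within each time bin to put confidence intervals on a disparity measure such as the variance:

      ```python
      import numpy as np

      # Placeholder DNE values per time bin (sample sizes deliberately small and uneven).
      rng = np.random.default_rng(0)
      bins = {
          "early Paleocene": rng.normal(150, 20, size=18),
          "middle Paleocene": rng.normal(160, 25, size=12),
          "late Paleocene": rng.normal(175, 30, size=18),
      }

      def bootstrap_disparity(values, n_boot=10_000):
          """95% bootstrap interval for the sample variance of a set of DTA values."""
          idx = rng.integers(0, len(values), size=(n_boot, len(values)))
          boot_var = values[idx].var(axis=1, ddof=1)
          return np.percentile(boot_var, [2.5, 97.5])

      for name, values in bins.items():
          low, high = bootstrap_disparity(values)
          print(f"{name}: variance = {values.var(ddof=1):.1f}, 95% CI [{low:.1f}, {high:.1f}]")
      ```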

      Chow, M., Zhang, Y., Wang, B., and Ding, S. (1977). Paleocene mammalian fauna from the Nanxiong Basin, Guangdong Province. Paleontol. Sin. New Ser. C 20, 1–100.

      Reviewer #2 (Public review):

      Summary:

      This study uses dental traits of a large sample of Chinese mammals to track evolutionary patterns through the Paleocene. It presents and argues for a 'brawn before bite' hypothesis - mammals increased in body size disparity before evolving more specialized or adapted dentitions. The study makes use of an impressive array of analyses, including dental topographic, finite element, and integration analyses, which help to provide a unique insight into mammalian evolutionary patterns.

      Strengths:

      This paper helps to fill in a major gap in our knowledge of Paleocene mammal patterns in Asia, which is especially important because of the diversification of placentals at that time. The total sample of teeth is impressive and required considerable effort for scanning and analyzing. And there is a wealth of results for DTA, FEA, and integration analyses. Further, some of the results are especially interesting, such as the novel 'brawn before bite' hypothesis and the possible link between shifts in dental traits and arid environments in the Late Paleocene. Overall, I enjoyed reading the paper, and I think the results will be of interest to a broad audience.

      Weaknesses:

      I have four major concerns with the study, especially related to the sampling of teeth and taxa, that I discuss in more detail below. Due to these issues, I believe that the study is incomplete in its support of the 'brawn before bite' hypothesis. Although my concerns are significant, many of them can be addressed with some simple updates/revisions to analyses or text, and I try to provide constructive advice throughout my review.

      (1) If I understand correctly, teeth of different tooth positions (e.g., premolars and molars), and those from the same specimen, are lumped into the same analyses. And unless I missed it, no justification is given for these methodological choices (besides testing for differences in proportions of tooth positions per time bin; L902). I think this creates some major statistical concerns. For example, DTA values for premolars and molars aren't directly comparable (I don't think?) because they have different functions (e.g., greater grinding function for molars). My recommendation is to perform different disparity-through-time analyses for each tooth position, assuming the sample sizes are big enough per time bin. Or, if the authors maintain their current methods/results, they should provide justification in the main text for that choice.

      We thank Reviewer 2 for raising several issues worthy of clarification. Separate analyses for individual tooth positions were performed but not emphasized in the first version of the study. In our revised version we plan to highlight the nuances of the results from premolar versus molar partition analyses.

      Also, I think lumping teeth from the same specimen into your analyses creates a major statistical concern because the observations aren't independent. In other words, the teeth of the same individual should have relatively similar DTA values, which can greatly bias your results. This is essentially the same issue as phylogenetic non-independence, but taken to a much greater extreme.

      It seems like it'd be much more appropriate to perform specimen-level analyses (e.g., Wilson 2013) or species-level analyses (e.g., Grossnickle & Newham 2016) and report those results in the main text. If the authors believe that their methods are justified, then they should explain this in the text.

      We plan to emphasize individual tooth position analyses in our revisions, and provide a stronger justification for our current treatment of multiple teeth from the same individual specimens as independent samples. We recognize the statistical nonindependence raised by Reviewer 2, but we would point out that from an ecomorphological perspective, it is unclear to us that the heterodont dentition of these early Cenozoic placental mammals should represent a single ecological signal (and thus warrant using only a single tooth position as representative of an individual’s DTA values). We plan to closely examine the nature of nonindependence in the DTA data within individuals, to assess a balanced approach to maximize information content from the relatively small and rare fossil samples used, while minimizing signal nonindependence across the dentition.
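
      One simple way to reduce this nonindependence (offered here as a hedged sketch with hypothetical specimen IDs and values, not as the authors' planned method) is to collapse tooth-level DTA values to a single value per specimen before any disparity analysis:

      ```python
      import pandas as pd

      # Placeholder tooth-level table; specimen IDs, positions, and values are hypothetical.
      teeth = pd.DataFrame({
          "specimen_id": ["V001", "V001", "V002", "V002", "V003"],
          "tooth_position": ["m1", "m2", "m1", "m2", "m1"],
          "DNE": [145.0, 152.0, 170.0, 168.0, 160.0],
          "OPCR": [48.0, 51.0, 60.0, 58.0, 55.0],
      })

      # Collapse to one row per specimen (here, the mean across its teeth) so that teeth
      # from the same individual are not treated as independent observations downstream.
      specimen_level = teeth.groupby("specimen_id")[["DNE", "OPCR"]].mean()
      print(specimen_level)
      ```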

      (2) Maybe I misunderstood, but it sounds like the sampling is almost exclusively clades that are primarily herbivorous/omnivorous (Pantodonta, Arctostylopida, Anagalida, and maybe Tillodonta), which means that the full ecomorphological diversity of the time bins is not being sampled (e.g., insectivores aren't fully sampled). Similarly, the authors say that they "focused sampling" on those major clades and "Additional data were collected on other clades ... opportunistically" (L628). If they favored sampling of specific clades, then doesn't that also bias their results?

      If the study is primarily focused on a few herbivorous clades, then the Introduction should be reframed to reflect this. You could explain that you're specifically tracking herbivore patterns after the K-Pg.

      We plan to revise the introduction section to more accurately reflect the emphasis on those clades. However, we would note that conventional dietary ecomorphology categories used to characterize later-branching placental mammals are likely to be less informative when applied to their Paleocene counterparts. Although there are dental morphological traits that began to characterize major placental clades during the Paleocene, distinctive dietary ecologies have not been demonstrated for most of the clade representatives studied. Thus, insectivory was probably not restricted to “Insectivora”, nor carnivory to early Carnivoramorpha or “Creodonta”, each of which represented less than 5% of the taxonomic richness during the Paleocene in China (Wang et al. 2007).

      Wang, Y., Meng, J., Ni, X., and Li, C. (2007). Major events of Paleogene mammal radiation in China. Geol. J. 42, 415–430.

      (3) There are a lot of topics lacking background information, which makes the paper challenging to read for non-experts. Maybe the authors are hindered by a short word limit. But if they can expand their main text, then I strongly recommend the following:

      (a) The authors should discuss diets. Much of the data are diet correlates (DTA values), but diets are almost never mentioned, except in the Methods. For example, the authors say: "An overall shift towards increased dental topographic trait magnitudes ..." (L137). Does that mean there was a shift toward increased herbivory? If so, why not mention the dietary shift? And if most of the sampled taxa are herbivores (see above comment), then shouldn't herbivory be a focal point of the paper?

      We plan to revise the text to make clearer connections between DTA and dietary inferences, and at the same time advise caution in making one-to-one linkages between them. Broadly speaking, dental indices such as DTA are phenotypic traits, and as with other phenotypic traits, the strength of structure-function relationships needs to be explicitly established before dietary ecological inferences can be confidently made. There is, to date, no consistent connection between dental topography and tooth use proxies and biomechanical traits in extant non-herbivorous species (e.g., DeSantis et al. 2017, Tseng and DeSantis 2024), and in our analyses, FEA and DTA generally did not show strong correlations with each other. Thus, we plan to continue to exercise care in interpreting DTA data as dietary data.

      DeSantis LRG, Tseng ZJ, Liu J, Hurst A, Schubert BW, Jiangzuo Q. Assessing niche conservatism using a multiproxy approach: dietary ecology of extinct and extant spotted hyenas. Paleobiology. 2017;43(2):286-303. doi:10.1017/pab.2016.45

      Tseng ZJ, DeSantis LR. Relationship between tooth macrowear and jaw morphofunctional traits in representative hypercarnivores. PeerJ. 2024 Nov 11;12:e18435.

      (b) The authors should expand on "we used dentitions as ecological indicators" (L75). For non-experts, how/why are dentitions linked to ecology? And, again, why not mention diet? A strong link between tooth shape and diet is a critical assumption here (and one I'm sure that all mammalogists agree with), but the authors don't provide justification (at least in the Introduction) for that assumption. Many relevant papers cited later in the Methods could be cited in the Introduction (e.g., Evans et al. 2007).

      Thank you for this suggestion. We plan to expand the introduction section to better contextualize the methodological basis for the work presented.

      (c) Include a better introduction of the sample, such as explicitly stating that your sample only includes placentals (assuming that's the case) and is focused on three major clades. Are non-placentals like multituberculates or stem placentals/eutherians found at Chinese Paleocene fossil localities and not sampled in the study, or are they absent in the sampled area?

      We thank Reviewer 2 for raising this important point worthy of clarification. Multituberculates are completely absent from the first two land mammal ages in the Paleocene of Asia, and non-placentals are rare in general (Wang et al. 2007). We plan to provide more context for the taxonomic sampling choices made in the study.

      Wang, Y., Meng, J., Ni, X., and Li, C. (2007). Major events of Paleogene mammal radiation in China. Geol. J. 42, 415–430.

      (d) The way in which "integration" is being used should be defined. That is a loaded term which has been defined in different ways. I also recommend providing more explanation on the integration analyses and what the results mean.

      If the authors don't have space to expand the main text, then they should at least expand on the topics in the supplement, with appropriate citations to the supplement in the main text.

      We plan to clarify our usage of “integration” to enable readers to accurately interpret what we mean by it.

      (4) Finally, I'm not convinced that the results fully support the 'brawn before bite' hypothesis. I like the hypothesis. However, the 'brawn before ...' part of the hypothesis assumes that body size disparity (L63) increased first, and I don't think that pattern is ever shown. First, body size disparity is never reported or plotted (at least that I could find) - the authors just show the violin plots of the body sizes (Figures 1B, S6A). Second, the authors don't show evidence of an actual increase in body size disparity. Instead, they seem to assume that there was a rapid diversification in the earliest Paleocene, and thus the early Paleocene bin has already "reached maximum saturation" (L148). But what if the body size disparity in the latest Cretaceous was the same as that in the Paleocene? (Although that's unlikely, note that papers like Clauset & Redner 2009 and Grossnickle & Newham 2016 found evidence of greater body size disparity in the latest Cretaceous than is commonly recognized.) Similarly, what if body size disparity increased rapidly in the Eocene? Wouldn't that suggest a 'BITE before brawn' hypothesis? So, without showing when an increase in body size diversity occurred, I don't think that the authors can make a strong argument for 'brawn before [insert any trait]".

      Although it's probably well beyond the scope of the study to add Cretaceous or Eocene data, the authors could at least review literature on body size patterns during those times to provide greater evidence for an earliest Paleocene increase in size disparity.

      We plan to provide a broader discussion and any supporting evidence from the Cretaceous and Eocene to either make a stronger case for “brawn before bite”, or to refine what we mean by brawn/size/size disparity.
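
      As a hedged illustration of how a difference in body size disparity between time bins could be tested (placeholder values, not data from the study), one option is Levene's test for equality of variances on log body mass:

      ```python
      import numpy as np
      from scipy import stats

      # Placeholder ln(body mass) values for two time bins.
      rng = np.random.default_rng(0)
      earliest_paleocene = rng.normal(7.0, 1.2, size=25)
      late_paleocene = rng.normal(7.4, 1.6, size=25)

      stat, p = stats.levene(earliest_paleocene, late_paleocene)
      print(f"variance earliest = {earliest_paleocene.var(ddof=1):.2f}, "
            f"late = {late_paleocene.var(ddof=1):.2f}, Levene p = {p:.3g}")
      ```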