  1. Feb 2026
    1. Reviewer #2 (Public review):

      Summary:

      This is a very interesting study focusing on a remarkable oligomerization domain, the LisH-CTLH-CRA module. The module is found in a diverse set of proteins across evolution. The present manuscript focuses on the extraordinary elaboration of this domain in GID/CTLH RING E3 ubiquitin ligases, which assemble into a gigantic, highly ordered, oval-shaped megadalton complex with strict subunit specificity. The arrangement of LisH-CTLH-CRA modules from several distinct subunits is required to form the oval on the outside of the assembly, allowing functional entities to recruit and modify substrates in the center. Although previous structures had shown that CTLH-CRA dimerization interfaces share a conserved helical architecture, the molecular rules that govern subunit pairing had not been explored. This was a daunting task in protein biochemistry that was achieved in the present study, which defines this "assembly specificity code" at the structural and residue-specific level.

      The authors used X-ray crystallography to solve high-resolution structures of mammalian CTLH-CRA domains, including RANBP9, RANBP10, TWA1, MAEA, and the heterodimeric complex between RANBP9 and MKLN. They further examined and characterized assemblies by quantitative methods (ITC and SEC-MALS) and qualitatively using nondenaturing gels. Some of their ITC measurements were particularly clever and involved competitive titrations and titrations of varying partners depending on protein behavior. The experiments allowed the authors to discover that affinities for interactions between partners are exceptionally tight, in the pM-nM range, and to distill the basis for specificity while also inferring that additional interactions beyond the LisH-CTLH-CRA modules likely also contribute to stability. Beyond discovering how the native pairings are achieved, the authors were able to use this new structural knowledge to reengineer interfaces to achieve different preferred partnerings.

      Strengths:

      Nearly everything about this work is exceptionally strong.

      (1) The question is interesting for the native complexes, and even beyond that, has potential implications for the design of novel molecular machines.

      (2) The experimental data and analyses are quantitative, rigorous, and thorough.

      (3) The paper is a great read - scholarly and really interesting.

      (4) The figures are exceptional in every possible way. They present very complex and intricate interactions with exquisite clarity. The authors are to be commended for outstanding use of color and color-coding throughout the study, including in cartoons to help track what was studied in what experiments. And the figures are also outstanding aesthetically.

      Weaknesses:

      There are no major weaknesses of note, but I can make a few recommendations for editing the text.

    1. Reviewer #2 (Public review):

      The work by Spokaite et al describes the discovery of a novel Rab5 binding site present in complex II of class III PI3K using a combination of HDX and Cryo EM. Extensive mutational and sequence analysis define this as the primordial Rab5 interface. The data presented are convincing that this is indeed a biologically relevant interface, and is important in defining mechanistically how VPS34 complexes are regulated.

      This paper is a very nice expansion of their previous cryo-ET work from 2021, and is an excellent companion piece on high-resolution cryo-EM of the complex I class III complex bound to Rab1 from the Hurley lab in 2025. Overall, this work is of excellent technical quality and answers important unexplained observations on some unexpected mutational analysis from the previous work.

      They used their increased-affinity VPS34 mutant to determine the 3.2 Å structure of Rab5 bound to VPS34-CII. Clear density was seen for the original Rab5 interface, but an additional site was observed. Based on this structure, they mutated out the VPS34 interface, allowing for a high-resolution structure of Rab5 bound at the VPS15 interface.

      They extensively validated the VPS15 interface in the yeast variant of VPS34, showing that the VPS15-Rab5 (VPS21) interface identified is critical in controlling complex II VPS34 recruitment.

      The major strengths of this paper are that the experiments appear to be done carefully and rigorously, and I have very few experimental suggestions.

      Here is what I recommend, based on some very minor weaknesses I observed:

      (1) My main concern is with presentation, specifically how the authors describe mutants. They clearly indicate the mutant sequence in the human isoform (for example, Figure 2A describes VPS15 as 579-SHMIT-583>DDMIE); however, when they shift to the yeast version, they refer only to a "VPS15 mutant" without defining it (Figure 2G). I would recommend they include the same sequence numbering and WT-to-mutant replacement every time a new mutant (or species) is described. It is always easier to interpret what is being shown when the exact mutant is included, especially when the authors are jumping between species. This is particularly important in this paper, where we are jumping between different subunits and different species, so a clear description in the figures and figure legends makes it much easier to read for non-specialists.

      (2) The HDX data very clearly show that Rab5 is likely able to bind at both sites, which backs up the cryo-EM data nicely. I am slightly confused by some of the HDX statements described in the methods.

      (3) The authors state, "Only statistically significant peptides showing a difference greater than 0.25 Da and greater than 5% for at least two timepoints were kept." It is unclear why multiple timepoints were required, and earlier they also state that they required a p-value of less than 0.05. It might be clearer to state that significant differences required a 0.25 Da difference, a 5% difference, and a p-value of <0.05 (n=3). Also, what do they mean by "kept"? Does this mean that they only fully processed the peptides with differences?

      (4) They show peptide traces for a selection in the supplement, but it would be ideal to include the full set of HDX data as an Excel file, including peptides with no differences, as there is a lot of additional information (deuteration levels for everything) that would be useful to share, as recommended from the Masson et al 2019 recommendations paper. This may be attached, but this reviewer could not see an example of it in the shared data dropbox folder.
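
      The filtering rule quoted in point (3) can be read in more than one way, which is part of the concern. A minimal Python sketch of one reading follows, assuming that all three criteria (0.25 Da, 5%, p < 0.05) must hold at the same timepoint and that at least two such timepoints are needed. The `TimepointDiff` record, the function names, and the example values are hypothetical and are not taken from the manuscript.

```python
from dataclasses import dataclass


@dataclass
class TimepointDiff:
    """Deuterium uptake difference for one peptide at one timepoint (hypothetical record)."""
    seconds: float
    delta_da: float   # absolute uptake difference, in Da
    delta_pct: float  # uptake difference, in % deuteration
    p_value: float    # e.g. from a t-test across n=3 replicates


def is_significant(tp, min_da=0.25, min_pct=5.0, alpha=0.05):
    # Assumption: all three criteria must be met at the same timepoint.
    return abs(tp.delta_da) > min_da and abs(tp.delta_pct) > min_pct and tp.p_value < alpha


def keep_peptide(timepoints, min_timepoints=2, **thresholds):
    # A peptide is flagged only if enough individual timepoints pass the filter.
    passing = sum(is_significant(tp, **thresholds) for tp in timepoints)
    return passing >= min_timepoints


# Made-up example values, for illustration only.
peptide_a = [
    TimepointDiff(3, 0.31, 6.2, 0.010),
    TimepointDiff(30, 0.40, 8.0, 0.004),
    TimepointDiff(300, 0.10, 2.0, 0.300),
]
peptide_b = [
    TimepointDiff(3, 0.31, 6.2, 0.010),
    TimepointDiff(30, 0.20, 4.0, 0.020),
    TimepointDiff(300, 0.10, 2.0, 0.300),
]

print(keep_peptide(peptide_a), keep_peptide(peptide_b))
```

      Under this reading, a peptide with a single strongly protected timepoint would be dropped, which is why stating the rule explicitly (and sharing the unfiltered table) matters.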

    1. Joint Public review:

      Summary

      This interesting work by Shuhao Li and colleagues suggests that developmental sleep and feeding behavior in larval flies are genetically programmed to prepare the animal for adult contingencies, as in the case of flies living in harsh ecological environments such as deserts. Thus, the work proposes that desert-dwelling flies such as Drosophila mojavensis sleep less and feed more than D. melanogaster as larvae, which allows them to feed less and sleep more as adults in the harsh desert conditions where they live. The authors argue that this is evidence for developmental sleep reallocation, which helps the adult flies survive in the desert. In general, their results support this compelling hypothesis, so this work provides a new perspective on how sleep might be differentially programmed across developmental stages according to the requirements of an ecological niche. This work is particularly innovative for several reasons. First, it extends the Drosophila sleep field beyond D. melanogaster and directly addresses questions about the evolution of sleep that remain largely unexplored. Second, it investigates the possibility that changes in sleep across development may be adaptive, rather than sleep being a static trait. Overall, this work opens new avenues of research, effectively bridges the fields of sleep biology and evolutionary ecology, and should be of broad interest to a general readership. The manuscript is scientifically sound and clearly written for a generalist audience.

      There are, however, two important weaknesses that should be addressed. The first is the implicit assumption that all observed behavioral differences are adaptive; this would benefit from a more cautious framing. Second, the manuscript would be strengthened by a more detailed discussion, and potentially additional data, regarding the ecological differences experienced by D. mojavensis and D. melanogaster at distinct life-cycle stages.

      Strengths:

      (1) The study astutely uses desert Drosophila species as models to understand how sleep is optimized in a challenging environment. The manuscript is rigorous, experiments are well controlled, the work is very clearly presented, and the results support the main conclusions, which are quite exciting.

      (2) The manuscript examines previously unexplored sleep differences in a non-melanogaster species.

      (3) The study provides evidence that selective pressure can be restricted to specific developmental stages.

      (4) This work offers evolutionary insights into the trade-offs between sleep and feeding across development.

      Weaknesses

      (1) The authors should soften interpretations so that it is not assumed that any observed difference between mojavensis and melanogaster is necessarily adaptive, or evolved due to food availability or temperature stress.

      (2) The study relies on comparisons and correlations. While it seems likely that the observed differences in sleep explain the increased food consumption and energy storage in the larvae of desert flies, demonstrating this through sleep manipulation would strengthen the authors' conclusions.

      (3) The question arises regarding whether transiently quiescent larvae are always really sleeping, and whether it is appropriate to treat sleep as a stochastic population-level phenomenon rather than as an individual trait.

      (4) The manuscript would benefit from comparative analysis beyond mojavensis and melanogaster.

      (5) A deeper discussion of the ecological differences between the 2 Drosophila species would place the results in a broader context.

      (6) The feeding parameters used in adults and larvae measure different aspects of feeding, confounding comparisons.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      We thank the reviewer for great suggestions.

      (1) The X-axis labels in some panels in Figure 2C and Supplementary Figure 2B overlap, making them difficult to read. Adjusting the label spacing or formatting would improve clarity.

      We thank the reviewer for the comment. All panels, including Figure 2C and Supplementary Figure 2B, have now been reorganized so that the X-axis labels are easy to read.

      (2) In the scatter dot plot bar diagrams, it appears that n=3 for most of the data. Does this represent the number of mice used or the number of tissue sections per sample? This should be clarified in the figure legends for better transparency. 

      Great suggestion. In Results (page 7, lines 135-136), we have now clarified that quantification was performed on every tenth section of the brain from 3 female and 3 male mice. Additionally, in the legends for the scatter dot plot bar diagrams, we now state that n=3 represents the number of mice used.

      (3) In Supplemental Figure 2B, the positive signals are not clearly visible. Providing higher-magnification images is recommended.

      Great suggestion. The revised Supplemental Figure 2B and Figure 2A now provide higher-magnification inset images with distinctive positive signals.

      Reviewer #2:

      We thank the reviewer for great and critical suggestions.

      (1) Introduction:

      Line 58: References should be provided for this statement as it is based on a robust field of research, not on a new concept.

      We thank the reviewer for the comment. We have now included relevant references as suggested (page 4, line 58).

      (2) Lines 100-102: This sentence seems to present as new an idea that has been well documented since the late 1970s. The posterior pituitary hormones oxytocin and vasopressin have long been known to have multiple peripheral targets, and at least a subset of vasopressin and oxytocin neurons have robust central projections. The central targets have been the focus of study for numerous labs. Reference 34 does not relate to posterior pituitary hormones and seems mis-cited.

      We have changed this sentence, excluded the reference that does not relate to posterior pituitary hormones and added 4 further references reporting other non-traditional roles of vasopressin and oxytocin (page 6, lines 100-102).

      (3) Lines 102-108: While the regulation of bone is an interesting example of an under-appreciated impact of vasopressin, the example does not build to the rationale for examining central Avp and Avpr1a expression.

      We mean no disrespect here, but we have recently reported neural brain-bone connections using the SNS-specific PRV152 virus (Ryu et al., 2024; PMID: 38963696) and submitted Single Transcript Level Atlas of Oxytocin and the Oxytocin Receptor in the Mouse Brain (doi: https://doi.org/10.1101/2024.02.15.580498). Surprisingly, we detected Avpr1a and Oxtr expression in certain brain areas (for example, PVH and MPOM) that connect to both bone and adipose tissue through the SNS—raising an important question regarding a central role of Avpr1a and Oxtr in bodily mass and fat regulation. 

      (4) Line 111: Avp expression and Avpr1a expression have both been studied using in situ hybridization. Thus, the overall concept is less novel than hinted at in the text. Avp expression has been studied quite extensively. Avpr1a expression has not been studied in an exhaustive fashion. 

      We thank the reviewer for this comment and fully agree that brain AVP expression has been studied extensively. As for Avpr1a, we believe that the RNAscope probe design and signal amplification system employed in our study allow more specific and sensitive detection of individual RNA targets at the single-transcript level, with much lower background noise than the in situ hybridization method.

      (5) Results:

      Line 143: RNAscope is indeed a powerful method of detecting mRNA at the single transcript level. However, using that single transcript resolution only to provide transcript per brain region analysis, losing all of the nuance of the individual transcript expression, seems like a poor use of the method potential.

      This is a good point, and we did notice that Avpr1a transcript expression in several brain nuclei displayed distinct patterns, in contrast to the more ubiquitous expression in most other brain areas. We noted this finding in Results (page 9, lines 164-168); however, because of the word limit in the Discussion, we are not sure what would be dropped to make more room and whether this is truly necessary.

      (6 & 7) Line 135: Sections were coded from 3 males and 3 females. I would argue that there is not enough statistical power to make inferences regarding sex differences or regional differences. In fact, the authors did not provide any statistical analysis in the manuscript at all, even though they stated in the Methods that they had completed statistical tests.

      150-157: All statements regarding sex differences in expression are made without statistical analyses, which, if conducted, would be underpowered. Given the limitations of performing and analyzing RNAscope data en masse a low n is understandable, but it requires a much more precise description of the data and a more careful look at how the results can be interpreted.

      We thank the reviewer for these comments. We mean no disrespect here, but while statistical analysis for main brain regions is relevant, it is not meaningful as far as nuclei, sub-nuclei and regions are concerned. It is worth noting that RNAscope data analysis in the whole mouse brain is an extremely drawn-out process, requiring almost 2 months of exhaustive manual counting of single Avpr1a transcripts in a single mouse brain; we analyzed 6 brains. That said, statistical tests have been performed and exact P values are now shown in graphs.
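
      To give a sense of scale for the power concern raised in points (6 & 7), the following is a rough Monte Carlo sketch of the power of a two-sided, two-sample t-test with 3 animals per sex. It assumes normally distributed data with equal variances and uses hypothetical standardized effect sizes; it is an illustration only and not the analysis performed in the manuscript.

```python
import math
import random

N_PER_GROUP = 3
# Two-sided 5% critical value of Student's t with 4 degrees of freedom
# (pooled-variance t-test, 3 + 3 - 2 = 4).
T_CRIT_DF4 = 2.776


def _mean(xs):
    return sum(xs) / len(xs)


def _sample_var(xs):
    m = _mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def simulated_power(effect_size, n_sim=10000, seed=0):
    """Fraction of simulated experiments in which |t| exceeds the critical value.

    effect_size is the true group difference in units of the common SD
    (a hypothetical input, not a value estimated from the study).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        a = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        b = [rng.gauss(effect_size, 1.0) for _ in range(N_PER_GROUP)]
        pooled_var = (_sample_var(a) + _sample_var(b)) / 2
        t = (_mean(b) - _mean(a)) / math.sqrt(pooled_var * 2 / N_PER_GROUP)
        if abs(t) > T_CRIT_DF4:
            hits += 1
    return hits / n_sim


for d in (0.0, 1.0, 2.0):
    print(f"effect size {d}: estimated power {simulated_power(d):.2f}")
```

      With an effect size of zero the estimate should sit near the nominal 5% false-positive rate, which serves as a sanity check on the simulation.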

      (8) Line 146: I am flagging this instance, but it should be corrected everywhere it occurs. Since we cannot know the gender of a given mouse, I would recommend referring to the mouse's "sex" rather than its "gender."

      Good suggestion. We have made the appropriate changes throughout the manuscript.

      (9) Line 153: The authors switch to discussing cell numbers. Why is this data relegated to the supplemental material?

      The main figures in the manuscript report Avp and Avpr1a transcript density, which has greater biological significance in terms of signal efficiency and cellular response dynamics. Because of the large number of graphs in the main text, we included all graphs with Avp and Avpr1a transcript counts in the supplemental material.

      (10) Methods:

      Line 369: "For simplicity and clarity, exact test results and exact P values are not presented." Simplicity or clarity is not a scientific rationale not to provide accurate statistics.

      We now provide exact P values in the graphs and the sentence in line 369 has been corrected accordingly (page 18, lines 379-380).

      (11) Line 362: The description of how data were analyzed is inadequate. More detail is needed.

      Agreed. We have now included a detailed description of how the data were analyzed (page 18, lines 365-374).

      (12) Discussion:

      Line 321: "This contrasts the rudimentary attribution of a single function per brain area." While brain function is often taught in such rudimentary terms to make the information palatable to students, I do not think the scientific literature on vasopressin function published over the past 50 years would suggest that we are so naïve in interpreting the functional role of vasopressin in the brain. Clearly, vasopressin is involved in numerous brain functions that likely cross behavioral modalities.

      Agreed and we removed this sentence.

      (13) Line 322: "The approach of direct mapping of receptor expression in the brain and periphery provides the groundwork." On its face, this statement is true, but the present data build on the groundwork laid by others (multiple papers from Ostrowski et al. in the early 1990s).

      Agreed.

      (14) Figures:

      Figure 1: 1B, I do not know the purpose of creating graphs with single bars (3V, ic, pir-male, and pir-female); there are no comparisons made in the graph. In the graphs with many brain regions, very little data can be effectively represented with the scale as it is. I recommend using tables to provide the count/density data and making graphs of only the most robust areas. In addition, although there is no statistical comparison, combining males and females in the same graphs might be beneficial to make a visual comparison easier. Why were cell counts only included in the supplemental material? Is that data not relevant?

      We thank the reviewer for this comment. Now all figures are presented in a more effective and aesthetically pleasing way.

      (15) There is a real missed opportunity to highlight some of the findings. For example, cell counts and density measures are provided for regions in the hippocampus, thalamus, and cortex that are not typically reported to contain vasopressin-expressing cells. Photomicrographs of these locations showing the RNAscope staining would be far more impactful in reporting these data. Are there cells expressing Avp, or is the Avp mRNA in these areas contained in fibers projecting to these areas from hypothalamic and forebrain sources?

      Great suggestion. Figure 1D now shows novel Avp transcript expression in the hippocampus, thalamus and cortex. Based on hematoxylin counterstaining, Avp mRNA transcripts were found in somata.

      (16) Figure 1C legend suggests images of the hippocampus and cortex, but all images are from the hypothalamus. Abbreviations are not defined.

      Thank you for the comment. We corrected Figure 1C legend and separately included Figure 1D showing novel Avp mRNA expression in the hippocampus and cortex.

      (17) Figure 2: The analysis of Avpr1a suffers from some of the same issues as the Avp analysis. In Figure 2A, the photomicrographs do not do a very good job of illustrating representative staining. The central canal image does not appear to have any obvious puncta, but the density of Avpr1a puncta suggests something different. The sex difference in the arcuate is also not clearly apparent in representative images. For a project that depends so heavily on the appearance of puncta in tissue, the minimal visualization of the data, coupled with the lack of clarity in the images provided, greatly diminished the overall enthusiasm for the data presentation. The figures in 2C would be more useful as tables with graphs used to highlight specific regions; as is, most of the data points are lost against the graph axis. Photomicrographs would also provide a better understanding of the data than graphs.

      Great suggestion. The revised Figure 2A and Supplemental Figure 2B now provide higher-magnification inset images with distinctive positive signals. As for Figure 2C, we have arranged all graphs in a more effective and aesthetically pleasing manner.

      (18) Figure 3: Given the low number of animals and, therefore, low statistical power, I do not think that illustrating the ratios of male to female is a statistically valid comparison.

      Please see response to Point 6 & Point 7.

      (19) Figure 4: Pituitary is an interesting choice to analyze. However, why was only the posterior pituitary analyzed? Were Avp transcripts contained in terminals of vasopressin neuron axons or other cells? Was Avpr1a transcript present in blood vessel cells where Avp is released? A different cell type? Why not examine the anterior pituitary, which also expresses Avp receptors (although the literature suggests largely Avpr1b)?

      Thank you for the great comment. We included only the posterior pituitary because no positive Avp/Avpr1a transcripts were found in the anterior pituitary. Unfortunately, we have not performed cell type-specific staining, which would have allowed us to resolve variation in AVP and its receptor expression across cell types.

    1. Assessment in the Education System: Stakes, Mechanisms, and Prospects for Change

      Summary of the talk

      This summary document analyzes a teacher-researcher's reflections on the nature and evolution of assessment within the French education system.

      The analysis highlights the persistent unease surrounding traditional grading and proposes a transition toward "positive assessment".

      The central premise is that assessment should no longer be merely a certification tool belonging to the system, but should become a driver of learning that students gradually take ownership of.

      The ultimate goal is to turn the act of assessing into a lever for success and autonomy, moving beyond the simple "misunderstanding" of the grade to establish a genuine culture of reflection on action.

      --------------------------------------------------------------------------------

      1. Historical Perspective and Paradoxes of Grading

      Numerical grading in France is not a natural given but a historical construct tied to the functions of selection and certification.

      The roots of the grade: Grading out of 10 was introduced under Jules Ferry for the certificat d'études primaires, following a logic of rationalization inherited from the French Revolution.

      Grading out of 20, for its part, appeared with the creation of the baccalauréat in 1808 by Napoleon, marking a symbolic hierarchy between secondary and primary education.

      The changing social stakes: In 1900, only 1% of an age cohort obtained the baccalauréat, compared with more than 60% at the end of the 20th century.

      This change of scale makes academic failure (the 7% who leave school without a qualification) socially "fatal", whereas it used to be the norm.

      The "constante macabre" (macabre constant): A concept from André Antibi, cited to illustrate teachers' tendency to reproduce a Gaussian curve (a distribution of grades between strong and weak students) regardless of what students have actually learned, for fear of lacking credibility or selectivity.

      --------------------------------------------------------------------------------

      2. Deconstructing the Assessment Process

      Assessment is defined as a three-step cognitive process, often invisible, that is distinct from the mere communication of a result.

      The pillars of the process

      The referent (référent): What one refers to (the model, the criteria, the ideal objective).

      The author stresses the importance of building this referent concretely, and even of co-constructing it with students.

      The referred (référé): The student's actual performance, the object observed (written work, oral presentation, technical gesture).

      Measuring the gap: The estimate of the distance between the referred and the referent. The author points out that one never truly "measures" in education (there is no standard meter); one "cobbles together" an estimate.

      The difference between assessing and communicating

      There is a major distinction between producing the assessment (the teacher's internal analysis) and communicating it (the grade or the comment).

      The current unease often stems from a failure of communication or from an inadequate coding of this gap.

      --------------------------------------------------------------------------------

      3. Typology of Assessment Codes

      The system uses various codes to express assessment, each with specific limitations:

      | Assessment code | Characteristics and limitations |
      | --- | --- |
      | Grades (0-10 / 0-20) | Dominant system in France (decimal system). Perceived as rational but often used to rank rather than to foster learning. |
      | Open comments | Intended to advise, they are often redundant ("Very good" for a 16) or too specialized to be understood without feedback. |
      | Letters (A, B, C, D, E) | Often a failure in France because they are modeled on the average (A = above, E = below), losing their value for creating homogeneous groups. |
      | Smileys and color codes | Useful for communication internal to the class; less stigmatizing and focused on the psychological function. |
      | Assessment grids (rubrics) | The most complete tool and the closest to competencies (like a pilot's checklist), but extremely cumbersome to manage day to day. |

      --------------------------------------------------------------------------------

      4. Assessment as a Driver of Learning

      Moving toward positive assessment requires an epistemological break.

      Formative vs. summative assessment: The author refuses to choose between the two ("les deux mon colonel", that is, both at once).

      Assessment should be formative (providing information to adjust teaching) during training, and summative (certifying a level) at examination time.

      The loop of reflective action: Drawing on Philippe Perrenoud and Marguerite Altet, the author proposes a cycle: Action -> Reflection -> Theorization -> Practice -> Return to action. Assessment is the reflective activity at the heart of this cycle.

      "Dispossession": The challenge is for the teacher to no longer be the sole holder of assessment. Students must learn to self-assess in order to become autonomous. "Students have no autonomy until they are capable of self-assessment."

      --------------------------------------------------------------------------------

      5. Institutional and Professional Dimensions

      Assessment is presented as a "first professional gesture" for which teachers are, paradoxically, poorly trained.

      The lack of training: Teacher training is often split between subject knowledge and didactics, neglecting cross-cutting professional skills such as assessment and guidance.

      The role of the school: An isolated innovation in assessment (such as a class without grades) is fragile.

      To move the system, action must be carried by the school's team, together with its leadership, to create a "leverage effect".

      The reflective stance: Assessment should bear not only on students but also on teaching practices themselves.

      It is necessary to assess the assessment arrangements themselves (meta-assessment) through analyses of educational situations.

      --------------------------------------------------------------------------------

      Key Quotations

      "The paradox of the teaching profession is that one is not always trained in the first gesture of the job: assessing and guiding."

      "One cannot not assess. We are condemned to assess."

      "Assessment must be formative during training and summative during certification. I would not board an Airbus whose pilot had only ever flown a flight simulator."

      "Making assessment the driver of learning is the best path toward knowledge and knowing how to act."

      --------------------------------------------------------------------------------

      Conclusion

      Assessment in the French education system stands at a crossroads between a selective legacy from the 19th century and the social needs of the 21st.

      Moving from assessment that is endured to assessment as a "driver" requires clarifying the communication contract with the student, co-constructing the success criteria, and putting assessment back at the heart of the reflective practice of teachers and school heads.

      The learner's autonomy, the ultimate aim of schooling, necessarily depends on the learner's ability to assess their own progress toward knowledge.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The manuscript titled, "Sleep-Wake Transitions Are Impaired in the AppNL-G-F Mouse Model of Early Onset Alzheimer's Disease", is about a study of sleep/wake phenomena in a knockin mouse strain carrying "three mutations in the human App gene associated with elevated risk for early onset AD". Traditional, in-depth characterization of sleep/wake states, EEG parameters, and response to sleep loss are employed to provide evidence, "supporting the use of this strain as a model to investigate interventions that mitigate AD burden during early disease stages". The sleep/wake findings of earlier studies (especially Maezono et al., 2020, as noted by the authors) were extended by several important, genotype-related observations, including age-related hyperactivity onset that is typically associated with increased arousal, a normal response to loss of sleep and to multiple sleep latency testing, and a stronger AD-like phenotype in females. The authors conclude that the AppNL-G-F mice demonstrate many of the human AD prodromal symptoms and suggest that this strain may serve as a model for prodromal AD in humans, confirming the earlier results and conclusions of Maezono et al. Finally, based on state bout frequency and duration analyses, it is suggested that the AppNL-G-F mice may develop disruptions in mechanism(s) involved in state transition.

      Strengths:

      The study appears to have been, technically, rigorously conducted with high quality, in-depth traditional assessment of both state and EEG characteristics, with the concordant addition of activity and temperature. The major strengths of this study derive from observations that the AppNL-G-F mice: (1) are more hyperactive in association with decreased transitions between states; (2) maintain a normal response to sleep deprivation and have normal MSLT results; and (3) display a sex specific, "stronger" insomnia-like effect of the knockin in females.

      Weaknesses:

      The weaknesses stem from the study's impact being limited due to its being largely confirmatory of the Maezono et al. study, with advances of importance to a potentially more focused field. Further, the authors conclude that AppNL-G-F mice have disrupted mechanism(s) responsible for state transition; however, these were not directly examined. The rationale for this conclusion is stated by the authors as based on the observations that bouts of both W and NREM tend to be longer in duration and decreased in frequency in AppNL-G-F mice. Although altered mechanism(s) of state transition (it is not clear what mechanisms are referenced here) cannot be ruled out, other explanations might be considered. For example, increased arousal in association with hyperactivity would be expected to result in increased duration of W bouts during the active phase. This would also predictably result in greater sleep pressure that is typically associated with more consolidated NREM bouts, consistent with the observations of bout duration and frequency.

      Reviewer 1 succinctly summarizes the advances of this study beyond the ground-breaking Maezono et al (2020) study of this “humanized” mouse model exhibiting amyloid deposition. Whereas Maezono et al. conducted sleep/wake studies on male App<sup>NL-G-F</sup> mice at 6 and 12 months of age, we had the unusual opportunity to study both sexes of homozygous App<sup>NL-G-F</sup> mice and WT littermates at 14-18 months of age and to conduct a longitudinal assessment of many of the same individuals at 18-22 months. In addition to baseline sleep/wake and EEG spectral analyses, we (1) measured subcutaneous body temperature and activity to obtain a broader picture of the physiology and behavior of this strain at advanced ages; (2) assessed baseline sleepiness in this strain using the murine version of the clinically-relevant Multiple Sleep Latency Test (MSLT); (3) evaluated the response of App<sup>NL-G-F</sup> mice and WT littermates to a perturbation of the sleep homeostat; (4) compared the sleep/wake characteristics of male vs. female App<sup>NL-G-F</sup> mice at 18-22 months and, (5) to assess the stability of the phenotypes, analyzed these data over a continuous 14-d recording rather than the conventional 24h recordings typical of most sleep/wake studies including Maezono et al. We found that a long wake/short sleep phenotype was characteristic of homozygous App<sup>NL-G-F</sup> mice at these advanced ages which is also evident in the Maezono et al. (2020) study at 12 months of age (but not at 6 months), although the authors do not comment on this phenotype and instead focus on the reduced REM sleep which is particularly evident in female App<sup>NL-G-F</sup> mice in our study. Remarkably, despite being awake ~20% longer per day, we find that App<sup>NL-G-F</sup> mice are no sleepier than WT mice as determined by the MSLT and that their sleep homeostat is intact when challenged by 6-h sleep deprivation. 
      At both advanced ages, the long wake/short sleep phenotype is due primarily to longer Wake bouts and shorter bouts of both NREM and REM sleep during the dark phase. Moreover, hyperactivity develops in older App<sup>NL-G-F</sup> mice, particularly females, which contributes to this phenotype. We agree with Reviewer 1 that “hyperactivity would be expected to result in increased duration of W bouts during the active phase” and that this could result in more consolidated NREM bouts, and we will modify the manuscript to discuss this alternative. However, the suggestion of greater sleep pressure is not borne out by the MSLT studies, as we did not observe the shorter sleep latencies and increased sleep during the nap opportunities on the MSLT that we have observed in other mouse strains. Moreover, due to their short sleep phenotype, App<sup>NL-G-F</sup> mice would be entering the sleep deprivation study with a greater sleep debt than WT mice, yet we did not observe greater EEG Slow Wave Activity in this strain during recovery from sleep deprivation. Thus, we have suggested that App<sup>NL-G-F</sup> mice are unable to transition from Wake to sleep as readily as their WT littermates. Our observations summarized above set the stage for subsequent mechanistic studies in aged App<sup>NL-G-F</sup> mice, although realistically, mice of this age and genotype are a rare commodity.
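      The bout-level quantities discussed in this response (number of bouts and bout duration per state) can be derived from a scored recording roughly as follows. This is a minimal sketch, not the authors' scoring pipeline: the 10-s epoch length, the state labels, and both example hypnograms are hypothetical.

```python
from itertools import groupby

EPOCH_SECONDS = 10  # assumed scoring epoch length; not stated in the response


def bout_summary(hypnogram, epoch_seconds=EPOCH_SECONDS):
    """Return {state: (number_of_bouts, mean_bout_duration_s)}.

    A bout is a run of consecutive epochs scored as the same state.
    """
    durations = {}
    for state, run in groupby(hypnogram):
        n_epochs = sum(1 for _ in run)
        durations.setdefault(state, []).append(n_epochs * epoch_seconds)
    return {state: (len(d), sum(d) / len(d)) for state, d in durations.items()}


# Hypothetical hypnograms (W = Wake, N = NREM, R = REM) with identical time
# spent in each state but different degrees of consolidation.
fragmented = list("WWNNWWNNWWNNRRWWNNWW")
consolidated = list("WWWWWWWWWWNNNNNNNNRR")

frag = bout_summary(fragmented)
cons = bout_summary(consolidated)
print("fragmented Wake (bouts, mean s):", frag["W"])
print("consolidated Wake (bouts, mean s):", cons["W"])
```

      Both recordings contain the same total Wake time, so only the bout count and mean bout duration separate them; this is why bout analyses are reported alongside total time per state.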

      Reviewer #2 (Public review):

      Summary:

      The authors have used a knock-in mouse model to explore late-in-life amyloid effects on sleep. This is an excellent model as the mutated genes are regulated by the endogenous promoter system. The sleep study techniques and statistical analyses are also first-rate.

      The group finds an age-dependent increase in motor activity in advanced age in the NLGF homozygous knock-in mice (NLGF), with a parallel age-dependent increase in body temperature; both effects predominate in the dark period. Interestingly, the sleep patterns do not quite follow the sleep changes. Wake time is increased in NLGF mice, and there is no progression in increased wake over time. NREMS and REM sleep are both reduced, and there is no progression. Sleep-wake effects, however, show a robust light:dark effect with larger effects in the dark period. These findings support distinct effects of this mutation on activity and temperature and on sleep. This is the first description of the temporal pattern of these effects. NLGF mice show wake stability (longer bout durations in the dark period, their active period) and fewer brief arousals from sleep. Sleep homeostasis across the lights-on period is normal. Wake power spectral density is unaffected in NLGF mice at either age. Only REM power spectra are affected, with NLGF mice showing less theta and more delta. There are interesting sex differences, with females showing no gene difference in wake bout number, while males show a gene effect. Similarly, gene effects on NREM bout number seem larger in males than in females. Although there was no difference in homeostatic response, there was normalization of sleep-wake activity after sleep deprivation.

      Strengths:

      Approach (model extent of sleep phenotyping), analysis.

      Weaknesses:

      The weaknesses are summarized below and are viewed as "addressable".

      (1) The term insomnia. Insomnia is defined as a subjective dissatisfaction with sleep, which cannot be ascertained in a mouse model. The findings across baseline sleep in NLGF mice support increased wake consolidation in the active period. The predominant sleep period (lights on) is largely unaffected, and the active period (lights off) shows increased activity and increased wake with longer bouts. There is a fantastic clue where NLGF effects are consistent with increased hypocretinergic (orexinergic) neuron activity in the dark period, and/or increased drive to hypocretin neurons from PVH.

      (2) Sleep-wake transitions are impaired: This should not be termed an impairment. It could actually be beneficial to have greater state stability, especially wake stability in the dark or active period. There is reduced sleep in the model that can be normalized by short-term sleep loss. It is fascinating that recovery sleep normalized sleep in the NLGF in the immediate lights-on and light-off period. This is a key finding.

      Reviewer 2 suggests a provocative hypothesis to test. Curiously, although a recent Science paper suggests that hyperexcitable hypocretin/orexin neurons in aging mice result in greater sleep/wake fragmentation, hyperexcitability of this system could result in hyperactivity and longer wake bouts in aged App<sup>NL-G-F</sup> mice.

      Reviewer #3 (Public review):

      Summary:

      In this study, Tisdale et al. studied the sleep/wake patterns in the biological mouse model of Alzheimer's disease. The results in this study, together with the established literature on the relationship of sleep and Alzheimer's disease progression, guided the authors to propose this mouse model for the mechanistic understanding of sleep states that translates to Alzheimer's disease patients. However, the manuscript currently suffers from a disconnect between the physiological data and the mechanistic interpretations. Specifically, the claim of "impaired transitions" is logically at odds with the observed increase in wake-state stability or possible hyperactivity. Additionally, the description of the methods, the quantification, and the figure presentation could be substantially improved. I detail some of my concerns below.

      Strengths:

      The selection of the knock-in model is a notable strength as it avoids the artifacts associated with APP overexpression and more closely mimics human pathology. The study utilizes continuous 14-day EEG recordings, providing a unique dataset for assessing chronic changes in arousal states. The assessment of sex as a biological variable identifies a more severe "insomniac-like" phenotype in females, which aligns with the higher prevalence and severity of Alzheimer's disease in women.

      Weaknesses:

      The study seems to lack a clear hypothesis-driven approach and relies mostly on explorative investigations. Moreover, lack of quantitative analytical methods as well as shaky logical conclusions, possibly not supported by data in its current form, leaves room for major improvement.

      Since this paper studied sleep states, the "Methods" section is quite unclear on what specific criteria were used to classify sleep states. There is no quantitative description of classifying sleep based on clear, reproducible procedures. There are many reasonably well-characterized sleep scoring systems used in rat electrophysiological literature, which could be useful here. The authors are generally expected to describe movement speed and/or EMG and/or EEG (theta/delta/gamma) criteria used to classify these epochs. The subjective (manual) nature of this procedure provides no verifiable validation of the accuracy and interpretability of the results.

      One of the bigger claims is that "state transition mechanism(s)" are impaired. However, Figure 7 shows that model mice exhibit significantly more long wake bouts (>260s) and fewer short wake bouts (<60s). Logically, an "impaired switch" (the flip-flop model, Saper et al., 2010) results in state fragmentation. The data here show the opposite: the wake state has become too stable. This suggests the primary defect is not in the transition mechanism itself, but possibly in a pathological increase in arousal drive (hyper-arousal), likely linked to the dark-phase hyperactivity shown in Figures 4 and 5. Also, a point to note is that this finding is not new.

      Figure 3 heatmaps lack color bars and units. Spectral power must be quantitatively defined and methods well-explained in the Methods section. Without these, the reader cannot discern if the "reduced power" in females is a global suppression of signal or a frequency-specific shift. Additionally, the representative example used to claim shorter sleep bouts lacks the statistical weight required for a major physiological conclusion. How does a cooler color (not clear what range and what the interpretation is) mean shorter sleep bout in female mice? The authors should clearly mark the frequency ranges that support their claims. In this figure, there is a question mark following the theta/delta range. The authors should avoid speculation and state their claims based on facts. They should also add the theta and delta ranges in the plot, such that readers can draw their own conclusions.

      Figure 8 and the MSLT results show that model mice are "no sleepier than WT mice" and have a functional homeostatic rebound. This presents a logical flaw in the "insomnia" narrative. True insomnia in AD patients typically involves a failure of the homeostatic process or a debilitating accumulation of sleep debt. If these mice do not show increased sleepiness (shorter latency) despite ~19% less sleep, the authors might be describing a "reduced need" for sleep or a "hyper-aroused" state, possibly not a clinical insomnia phenotype.

      In Figure 9, LFP power shown and compared in percentages is problematic, as LFP power distribution is known to be skewed (follows power law). This is particularly problematic here because all the frequencies above ~20 Hz seem to be totally flattened or nonexistent, which makes this comparison of power severely limited and biased towards the relative frequency in the highly skewed portion of the LFP power spectrum, i.e., very low frequency ranges like delta, theta, and possibly beta. This ignores low, mid, and high gamma as well as ripple band frequencies. NREM sleep is known to have relatively greater ripple band (100-250 Hz) power bursts in hippocampal regions, and REM sleep is known to have synchronous theta-gamma relationships.

      We agree with the reviewer that the “Classification of arousal states” section was missing the key description of how we scored the recordings into arousal states based on EEG, EMG and locomotor activity; this was an oversight as the corresponding text exists in all our previous sleep/wake studies published over several decades. Reviewer 1 also points out the alternative interpretation that “the wake state has become too stable.” However, I think we are using different words to say the same thing: that the transition from wake to sleep is impaired whether it is due to hyperarousal or to a defect in the flip/flop switch that results in greater Wake stability. We will revise Fig 3 (Reviewer 2 suggests combining with Fig 14) but note that the X-axis is labelled 0-25 Hz and that this figure was intended to be descriptive -- illustrating how unusual the female App<sup>NL-G-F</sup> mice are relative to WT -- rather than a quantitative analysis of spectral power as in Fig. 14. Both Reviewers 2 and 3 suggest that we are using “insomnia” incorrectly, which we have simply used to describe less sleep per 24h period. Reviewer 2 states that “Insomnia is defined as a subjective dissatisfaction with sleep” and Reviewer 3 suggests a narrow definition of insomnia as due only to “a failure of the homeostatic process or a debilitating accumulation of sleep debt.” In a revised manuscript, we will define “insomnia” as an operational term to succinctly mean “less sleep”. Regarding the problem of presenting spectral power in percentages, we completely agree with the reviewer. However, we intentionally presented spectral power density, a measure of relative power, as in Figure 3A and 3B of Maezono et al. (2020). At the risk of making Fig. 9 even more busy, we will revise Fig. 9 to add labels for all Y-axes.
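      To make the notion of relative power concrete, the sketch below expresses each frequency bin as a percentage of total power and then sums over bands. It is an illustration only: the bin values, the band limits, and the choice of total power as the normaliser are assumptions, not the authors' analysis settings.

```python
def relative_power(power_by_bin):
    """Convert absolute power per frequency bin into percent of total power."""
    total = sum(power_by_bin.values())
    if total <= 0:
        raise ValueError("total power must be positive")
    return {freq: 100.0 * p / total for freq, p in power_by_bin.items()}


def band_power(spectrum, low_hz, high_hz):
    """Sum power over bins with low_hz <= frequency < high_hz."""
    return sum(p for f, p in spectrum.items() if low_hz <= f < high_hz)


# Hypothetical absolute power (arbitrary units), keyed by bin frequency in Hz.
absolute = {1: 40.0, 2: 30.0, 3: 10.0, 6: 8.0, 7: 7.0, 8: 3.0, 20: 2.0}

relative = relative_power(absolute)
delta_pct = band_power(relative, 0.5, 4.5)  # assumed delta band
theta_pct = band_power(relative, 6.0, 9.0)  # assumed theta band

# Doubling every bin leaves the relative spectrum unchanged.
doubled = relative_power({f: 2.0 * p for f, p in absolute.items()})
print(delta_pct, theta_pct)
```

      Because a uniform scaling of all bins cancels out in the normalisation, relative power on its own cannot separate a global change in signal amplitude from a frequency-specific shift.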

      In addition to a revised Fig. 9, in the revised manuscript, we will reformat Tables 1-3, Figs. S1 and S2 for legibility and correct an error in Fig. 7.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study addresses an important clinical challenge by proposing muscle network analysis as a tool to evaluate rehabilitation outcomes. The research direction is relevant, and the findings suggest further research. The strength of evidence supporting the claims is, however, limited: the improvements in function are not directly demonstrated, the robustness of the method is not benchmarked against already published approaches, and key terminology is not clearly defined, which reduces the clarity and impact of the work.

      Comments:

      There are several aspects of the current work that require clarification and improvement, both from a methodological and a conceptual standpoint.

      First, the actual improvements associated with the rehabilitation protocol remain unclear. While the authors report certain quantitative metrics, the study lacks more direct evidence of functional gains. Typically, rehabilitation interventions are strengthened by complementary material (e.g., videos or case examples) that clearly demonstrate improvements in activities of daily living. Including such evidence would make the findings more compelling.

      We thank the reviewer for their careful consideration of our work. We agree that direct evidence for the functional gains achieved by patients is important for establishing the efficacy of a clinical intervention and that this evidence should provide comprehensive insights for clinicians, from videos to case examples as suggested. Our aim here was to apply a novel computational framework to a cohort of patients undergoing rehabilitation and, in doing so, provide empirical support for its utility in standardised motor assessments. We have shown that our novel approach can identify distinct physiological responses to VR vs PT conditions across the post-stroke cohort (see Fig.2B and associated text). Hence, although the data contain virtual reality vs. conventional physical therapy experimental conditions, which likely hold important insights into the clinical use case of virtual reality interventions, we did not focus on such complementary evidence in this study. In future work, research groups (including our own) investigating the important question of clinical intervention efficacy will likely gain unique and useful mechanistic insights using our approach.

      Moreover, a threshold of 5 points on the FMA-UE was taken as the MCID to distinguish responder from non-responder patients, which is an accepted and applicable measure in the clinical field. From the perspective of expert clinicians, single cases provide low-level evidence of change, raising concerns about the clinical meaningfulness of the reported results. Given all this, we chose to provide stronger evidence of clinical effect (i.e. a comparison between responders and non-responders), interpreted from the perspective of muscle synergies, rather than to support our results with single selected cases, which would introduce bias when translating to the population of stroke survivors.
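      The responder definition above amounts to a simple threshold rule on the pre-to-post change in FMA-UE score. A minimal sketch follows, assuming an inclusive comparison (a gain of exactly 5 points counts as a response); the participant identifiers and scores are hypothetical.

```python
MCID_FMA_UE = 5  # points; threshold stated in the response


def is_responder(pre_score, post_score, mcid=MCID_FMA_UE):
    """True if the FMA-UE gain reaches the MCID (inclusive '>=' is an assumption)."""
    return (post_score - pre_score) >= mcid


# Hypothetical (pre, post) FMA-UE scores.
scores = {"P01": (20, 27), "P02": (35, 38), "P03": (12, 17)}

responders = sorted(
    pid for pid, (pre, post) in scores.items() if is_responder(pre, post)
)
print(responders)
```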

      Second, the claim that the proposed muscle network analysis is robust is not sufficiently substantiated. The method is introduced without adequate reference to, or comparison with, the extensive literature that has proposed alternative metrics. It is also not evident whether a simpler analysis (e.g., EMG amplitude) might produce similar results. To highlight the added value of the proposed method, it would be important to benchmark it against established approaches. This would help clarify its specific advantages and potential applications. Moreover, several studies have shown very good outcomes when using AI and latent manifold analyses in patients with neural lesions. Interpreting the latent space appears even easier than interpreting muscle networks, as the manifolds provide a simple encoding-decoding representation of what the patient can still perform and what they can no longer do.

      To address the reviewer's concerns regarding adequate evidence for the claims made about the presented framework, we have now included an application of the conventional muscle synergy analysis approach based on non-negative matrix factorisation to the post-stroke cohort (see Supplementary materials Fig.5 and associated text). We made efforts to make this comparison as fair as possible by applying the conventional approach at the population level also and clustering the activation coefficients using a similar yet more conventional approach, agglomerative clustering. Accompanying the output of this application, we have included several points where our framework improves significantly upon conventional muscle synergy analysis:

      “Comparison with conventional approaches

      To more directly illustrate the advantages of the proposed framework, we carried out a standardised pre-processing of the EMG data in line with conventional muscle synergy analysis. This included rectification, low-pass filtration (cut-off: 20Hz) and smooth resampling of EMG waveforms to 50 timepoints. All data for each participant at each session was separately normalised by channel-wise variance, concatenated together and input into non-negative matrix factorisation (NMF) ('nnmf' Matlab function, 10 replications) to extract 11 muscle synergies (W1-11 of Supplementary Materials Fig.5(Left)) and their time-varying activations. The number of components to extract was determined in a conventional way as the number of components required to explain >75% of the data variance. The extracted muscle synergies included distinct shoulder- (e.g. W2), elbow (e.g. W8) and forearm-level (e.g. W1) muscle covariation patterns along with more isolated muscle contributions (e.g. UT in W3, TL in W10).

      To compare the clustering results of our framework with conventional approaches, we applied agglomerative clustering to the time-varying activation coefficients of all participants, trials and tasks, separately for pre- and post-sessions, and employed the 'evalclusters' Matlab function (Ward linkage clustering, Calinski Harabasz criterion, Klist search = 2:21) for each session. We identified two clusters both at pre-session (Criterion = 1.69) and post-session (Criterion = 1.81) as optimal fits to the population data (see Supplementary Materials Fig.5(Right)). We found no associations between pre- or post-session cluster partitions and participants' FMA-UE scores. Nevertheless, we did identify significant associations between the pre-session clusterings and S_Pre (X<sup>2</sup> = 7.08, p = 0.008) and between post-session clusterings and conventionally-defined treatment responders (X<sup>2</sup> = 4.2, p = 0.04). These findings, along with the similar two-way clustering structure found using the NIF, highlight important commonalities between these approaches.

      To summarise the main advantages of our framework over this conventional approach:

      - Lower dimensionality and enhanced interpretability of extracted components.

      Our framework yields a lower number of population-level components that correspond more consistently to meaningful biomechanical and physiological functions.

      - Integration of pairwise muscle relationships.

      By incorporating muscle-pair level analysis, our framework captures coordinated interactions between primary and stabilising muscles—relationships that conventional NMF approaches overlook.

      - Separation of task-relevant and task-irrelevant activity.

      The NIF isolates task-relevant coordination patterns, distinguishing them from task-irrelevant interactions driven by biomechanical or task constraints. On the other hand, task-relevant and -irrelevant muscle contributions are intermixed in conventional muscle synergy analysis.

      - Ability to identify complementary functional roles.

      The NIF characterises whether muscle pairs act in similar or complementary ways, providing richer insight into motor control strategies.

      - Reduced dependence on variance-based optimisation.

      Unlike conventional methods that rely on maximising variance explained, our framework allows detection of subtle but functionally significant interactions that contribute less to total variance.

      - Improved detection of clinically relevant population structure.

      The clustering component of our framework revealed distinct post-stroke subgroups with important clinical relevance, distinguishing moderately and severely impaired cohorts and treatment responders and non-responders from pre-treatment data.”
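      As a rough check on the association statistics quoted in the comparison above, the following sketch computes a Pearson chi-square for a 2x2 table and the corresponding p-value. It assumes one degree of freedom (two clusters against two groups) and no continuity correction, neither of which is stated in the text; the example table is hypothetical. Under that assumption, the quoted statistics of 7.08 and 4.2 give p-values that round to the reported 0.008 and 0.04.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, col in enumerate(col_totals):
            expected = r * col / n
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical cluster-by-responder counts.
example_stat = chi_square_2x2([[10, 5], [5, 10]])

# Statistics quoted in the text, evaluated under the df = 1 assumption.
p_pre = p_value_df1(7.08)
p_post = p_value_df1(4.2)
print(round(p_pre, 3), round(p_post, 2))
```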

      This supplementary analysis is referred to in the Methods section of the main text with reference to previous similar comparisons between our framework and conventional approaches:

      “Towards finding an effective approach to clustering participants in this data based on differences in impairment severity and therapeutic (non-)responsiveness, we found that conventional clustering algorithms (e.g. agglomerative, k-means etc.) could not provide substantive outputs (see Supplementary Materials Fig.5 and associated text for a direct comparison with conventional approaches), perhaps resulting from the complex interdependencies between the modular activations.”

      “To facilitate comparisons with existing approaches, we performed a conventional muscle synergy analysis on the post-stroke cohort (see Supplementary Materials Fig.5 and associated text). Further comparisons with conventional approaches can be found in our previous work (O’Reilly & Delis, 2022).”
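      The component-count rule used in the conventional analysis (the number of components required to explain more than 75% of the data variance) can be sketched as below. The variance-explained curve is hypothetical, and the function name is introduced here for illustration only; in practice each value would come from fitting a factorisation with that many components.

```python
def n_components_for_threshold(variance_explained, threshold=0.75):
    """Smallest number of components whose variance explained exceeds threshold.

    variance_explained[k - 1] is the fraction of variance explained by a
    model with k components. Returns None if no model exceeds the threshold.
    """
    for k, fraction in enumerate(variance_explained, start=1):
        if fraction > threshold:
            return k
    return None


# Hypothetical variance-explained values for models with 1..6 components.
curve = [0.30, 0.45, 0.58, 0.69, 0.76, 0.81]
chosen = n_components_for_threshold(curve)
print(chosen)
```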

      Further, we have also referred to a previous analysis of this post-stroke dataset using the conventional approach in the discussion section, where we point out how our approach can identify salient features of post-stroke physiological responses that conventional approaches cannot:

      “Further, the NIF demonstrated here an enhanced capability over traditional approaches to identify these crucial patterns, as earlier work on related versions of this dataset could not identify any differentiable fractionation events across the cohort (Pregnolato et al., 2025).”

      Overall, the utility of conventional muscle synergy analysis is well recognised across the field (Hong et al 2021). Our proposed approach builds on this conventional method by addressing key limitations to further enhance this clinical utility. We also agree that manifold learning approaches are an exciting area of research that we aim to incorporate into our framework in future research. Specifically, manifold learning methods like Laplacian eigenmaps can readily be applied to the co-membership matrix produced by our clustering algorithm, exploiting the geometry of this matrix to provide a continuous rather than discrete representation of population structure. We have highlighted this possibility in the discussion section:

      “Indeed, in future work, we aim to apply manifold learning approaches to the co-membership matrix derived from this clustering algorithm, providing a continuous representation of the population structure.”

      Third, the terminology used throughout the manuscript is sometimes ambiguous. A key example is the distinction made between "functional" and "redundant" synergies. The abstract states: "Notably, we identified a shift from redundancy to synergy in muscle coordination as a hallmark of effective rehabilitation-a transformation supported by a more precise quantification of treatment outcomes."

      However, in motor control research, redundancy is not typically seen as maladaptive. Rather, it is a fundamental property of the CNS, allowing the same motor task to be achieved through different patterns of muscle activity (e.g., alternative motor unit recruitment strategies). This redundancy provides flexibility and robustness, particularly under fatiguing conditions, where new synergies often emerge. Several studies have emphasized this adaptive role of redundancy. Thus, if the authors intend to use "redundancy" differently, it is essential to define the term explicitly and justify its use to avoid misinterpretation.

      We appreciate the reviewer's concerns regarding the terminology employed in this study. Indeed, we agree that redundancy is seen in the motor control literature as a positive feature of biological systems, appearing to contradict the interpretations of the redundancy-to-synergy information conversion result we have presented. We also wish to highlight that across the motor control literature and beyond, the idea of redundancy is often conflated with the related but distinct notion of degeneracy. Traditional motor control research has also recognised this difference; for example, Latash has outlined this difference in the seminal work on motor abundance (https://doi.org/10.1007/s00221-012-3000-4). A key reference discussing this conflation and these two concepts in an information-theoretic way is found here: https://doi.org/10.1093/cercor/bhaa148. To summarise what their arguments mean for our work:

      - System degeneracy relates to the ability of different system components to contribute towards the same task in a context-specific way.

      - System redundancy corresponds to the degree of functional overlap among system components.

      Hence, conceptually speaking, informational redundancy as employed in our study (i.e. functionally-similar muscle interactions) links with system redundancy in that it quantifies the functional overlap of system components. This definition of system redundancy implies that it is an unavoidable by-product of degenerate systems (inefficient use of degrees of freedom) which should be minimised where possible. As a result of stroke, patients in our study and in related previous work displayed increased informational redundancy. This links, for example, with the abnormal co-activations they typically experience and with previous results from traditional muscle synergy analysis showing fewer components extracted as a function of motor impairment post-stroke (i.e. higher informational redundancy) (Clark et al. 2010). Our novel contribution here is to convey how effective rehabilitation is underpinned by a redundancy-to-synergy information conversion across the muscle networks, relating in a loose sense conceptually to a reduction in system redundancy and enhancement of system degeneracy (i.e. functionally differentiated system components contributing towards task performance).
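      The distinction between redundant (functionally-similar) and synergistic (functionally-complementary) information can be illustrated with a toy calculation on discrete variables. The sketch below is not the estimator used in the study; it computes interaction information as I(X;Y|Z) - I(X;Y), with the sign convention (positive for net synergy, negative for net redundancy) chosen here as an assumption, since conventions differ across the literature.

```python
import math
from collections import Counter


def entropy(*columns):
    """Shannon entropy in bits of the joint distribution of the given columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * math.log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def interaction_information(x, y, z):
    """I(X;Y|Z) - I(X;Y): positive = net synergy, negative = net redundancy
    (sign convention assumed for this illustration)."""
    return conditional_mutual_information(x, y, z) - mutual_information(x, y)


# Synergy: the target is the XOR of two independent binary sources, so the
# sources are informative about each other only once the target is known.
x = [0, 0, 1, 1]
y = [0, 1, 0, 1]
xor_target = [a ^ b for a, b in zip(x, y)]
synergy_bits = interaction_information(x, y, xor_target)

# Redundancy: both sources and the target are copies of the same bit.
copy = [0, 1, 0, 1]
redundancy_bits = interaction_information(copy, copy, copy)
print(synergy_bits, redundancy_bits)
```

      In the XOR case the two sources carry complementary information about the target, whereas in the copy case they carry the same information, which is the sense in which the response uses "synergistic" and "redundant".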

      Together with the mathematical descriptions of redundant (functionally-similar) and synergistic (functionally-complementary) information and the types of functional relationships they capture, we believe the intuition behind this finding has clear links with previous research showing a) the merging of muscle synergies in response to post-stroke impairment (i.e. functional de-differentiation) and b) a reduction in abnormal couplings with effective rehabilitation (i.e. functional re-differentiation). To communicate this more clearly to readers, we have included the following in the corresponding discussion section:

      “Previous research has shown that functional redundancy increases post-stroke (Cheung et al., 2012; Clark et al., 2010), reflecting the characteristic loss of functional specificity (i.e. functional de-differentiation) of muscle interactions post-stroke. Enhanced synergy with treatment here thus reflects the functional re-differentiation of predominantly flexor-driven muscle networks towards different, complementary task-objectives across the seven upper-limb motor tasks performed (Kim et al., 2024b), leading to improved motor function among responders.”

      Finally, we have screened the updated manuscript for consistent use of terminology including functional/redundant/synergistic.

      References

      Clark DJ, Ting LH, Zajac FE, Neptune RR, Kautz SA. Merging of healthy motor modules predicts reduced locomotor performance and muscle coordination complexity post-stroke. Journal of neurophysiology. 2010 Feb;103(2):844-57.

      Hong YN, Ballekere AN, Fregly BJ, Roh J. Are muscle synergies useful for stroke rehabilitation?. Current Opinion in Biomedical Engineering. 2021 Sep 1;19:100315.

      Latash ML. The bliss (not the problem) of motor abundance (not redundancy). Experimental brain research. 2012 Mar;217(1):1-5.

      O'Reilly D, Delis I. Dissecting muscle synergies in the task space. Elife. 2024 Feb 26;12:RP87651.

      Sajid N, Parr T, Hope TM, Price CJ, Friston KJ. Degeneracy and redundancy in active inference. Cerebral Cortex. 2020 Nov;30(11):5750-66.

      Reviewer #2 (Public review):

      Summary:

      This study analyzes muscle interactions in post-stroke patients undergoing rehabilitation, using information-theoretic and network analysis tools applied to sEMG signals with task performance measurements. The authors identified patterns of muscle interaction that correlate well with therapeutic measures and could potentially be used to stratify patients and better evaluate the effectiveness of rehabilitation.

      However, I found that the Methods and Materials section, as it stands, lacks sufficient detail and clarity for me to fully understand and evaluate the quality of the method. Below, I outline my main points of concern, which I hope the authors will address in a revision to improve the quality of the Methods section. I would also like to note that the methods appear to be largely based on a previous paper by the authors (O'Reilly & Delis, 2024), but I was unable to resolve my questions after consulting that work.

      I understand the general procedure of the method to be: (1) defining a connectivity matrix, (2) refining that matrix using network analysis methods, and (3) applying a lower-dimensional decomposition to the refined matrix, which defines the sub-component of muscle interaction. However, there are a few steps not fully explained in the text.

      (1) The muscle network is defined as the connectivity matrix A. Is each entry in A defined by the co-information? Is this quantity estimated for each time point of the sEMG signal and task variable? Given that there are only 10 repetitions of the measurement for each task, I do not fully understand how this is sufficient for estimating a quantity involving mutual information.

      We acknowledge the confusion here regarding how many datapoints were incorporated into the estimation of II. The number of datapoints in each variable was in fact no. of timepoints x 10 repetitions. Hence, for the EMGs employed in this analysis, sampled at 2000 Hz, the length of each variable could easily extend beyond 20,000 datapoints. We have clarified this more specifically in the corresponding section of the methods:

      “We carried out this application in the spatial domain (i.e. interactions between muscles across time (Ó’Reilly & Delis, 2022)) by concatenating the 10 repetitions of each task executed on a particular side (i.e. variables of length no. of timepoints x 10 trials) and quantifying II with respect to this discrete task parameter codified to describe the motor task performed at each timepoint for each trial included.”
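      As an illustration only (this is not the authors' code), the concatenation described above can be sketched as follows. The 2000 Hz sampling rate and the 10 repetitions come from the response; the 1 s trial duration, the two task codes, and the way tasks are pooled are assumptions made purely so that the example runs.

```python
# Toy sketch: pool repeated trials into one long variable per muscle, with a
# matching discrete task label for every timepoint.
SAMPLING_RATE_HZ = 2000   # stated in the response
N_REPETITIONS = 10        # stated in the response
TRIAL_SECONDS = 1.0       # assumption: trial duration is not given in the text


def concatenate_trials(trials_by_task):
    """Return (signal, labels) built by concatenating every trial of every task.

    trials_by_task maps a discrete task code to a list of trials, each trial
    being a list of samples for one muscle.
    """
    signal, labels = [], []
    for task_code, trials in trials_by_task.items():
        for trial in trials:
            signal.extend(trial)
            labels.extend([task_code] * len(trial))
    return signal, labels


samples_per_trial = int(SAMPLING_RATE_HZ * TRIAL_SECONDS)
# Placeholder EMG values (zeros) for two hypothetical tasks coded 1 and 2.
toy_trials = {
    task_code: [[0.0] * samples_per_trial for _ in range(N_REPETITIONS)]
    for task_code in (1, 2)
}
signal, labels = concatenate_trials(toy_trials)
```

      Under these assumptions each task contributes 2000 x 10 = 20,000 timepoints, which is the order of magnitude quoted in the response.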

      In the previous paper (O'Reilly & Delis, 2024), the authors initially defined the co-information (Equation 1.3) but then referred to mutual information (MI) in the subsequent text, which I found confusing. In addition, while the matrix A is symmetrical, it should not be orthogonal (the authors wrote A<sup>T</sup>A = I) unless some additional constraint was imposed?

      We thank the reviewer for spotting this typo in the previous paper, which described a symmetric matrix as A<sup>T</sup>A = I; this expression in fact defines orthogonality. To correct this error, in the current study we have described the symmetric matrix as A = A<sup>T</sup>:

      “We carried out this application in the spatial domain (i.e. interactions between muscles across time (Ó’Reilly & Delis, 2022)) by concatenating the 10 repetitions of each task executed on a particular side (i.e. variables of length no. of timepoints x 10 trials) and quantifying II with respect to this discrete task parameter codified to describe the motor task performed at each timepoint for each trial included. This computation was performed on all unique m<sub>x</sub> and m<sub>y</sub> pairings, generating symmetric matrices (A) (i.e. A = A<sup>T</sup>) composed separately of non-negative redundant and synergistic values (Fig.5).”
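      The distinction between the two conditions can be checked numerically. The sketch below is illustrative only; the matrix values are invented and the helper names are hypothetical.

```python
def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def is_symmetric(m, tol=1e-12):
    """True when A = A^T."""
    t = transpose(m)
    n = len(m)
    return all(abs(m[i][j] - t[i][j]) <= tol for i in range(n) for j in range(n))


def is_orthogonal(m, tol=1e-12):
    """True when A^T A = I."""
    n = len(m)
    p = matmul(transpose(m), m)
    return all(
        abs(p[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(n)
        for j in range(n)
    )


# Invented coupling values for three muscles: symmetric, but not orthogonal.
A = [[0.0, 0.4, 0.1],
     [0.4, 0.0, 0.7],
     [0.1, 0.7, 0.0]]

# A 90-degree rotation: orthogonal, but not symmetric.
R = [[0.0, -1.0],
     [1.0, 0.0]]
```

      A matrix of pairwise muscle couplings satisfies the first condition by construction, because the coupling between m<sub>x</sub> and m<sub>y</sub> equals the coupling between m<sub>y</sub> and m<sub>x</sub>; nothing forces it to satisfy the second.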

      Regarding the reviewer’s point about the reference to MI after equation 1.3 of the previous paper, where co-information is defined: we were referring collectively, in a general sense, to both the task-relevant and task-irrelevant estimates analysed there as ‘MI estimates’, as both are derived from mutual information. The task-irrelevant estimate is the MI between two muscles conditioned on a task variable (conditional mutual information), and the task-relevant estimate is the difference between two MI values (co-I is a higher-order MI estimate). This removed the need to refer to each separately throughout the paper, which may in its own way cause some confusion. For clarity, in the results of that paper we also provided context on how each MI estimate was computed (see the beginning of the “Task-irrelevant muscle couplings”, “Task-redundant muscle couplings” and “Task-synergistic muscle couplings” results sections), referring throughout to the Venn diagrams depicting them (see Fig.1 of the previous paper). In the present study, however, for brevity and focus we did not perform an analysis on task-irrelevant muscle interactions and so decided to focus our terminology on co-I (II), a higher-order MI estimate. We acknowledge that this may have caused some confusion but highlight the efforts made to communicate each measure throughout the previous and present study. We have explicitly pointed out this specific focus on task-dependent muscle couplings at the end of the introduction of the updated manuscript:

      “To do so, here we focussed our analysis on quantifying task-dependent muscle couplings (collectively referred to as II), extracting functionally-similar (i.e. redundant) and -complementary (i.e. synergistic) modules…”
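      To make the relationship between these quantities concrete, the following minimal sketch estimates mutual information, conditional mutual information and their difference from discrete samples using plug-in entropies. It is not the estimator used in the study: the toy variables are binary, whereas EMG signals are continuous, and the sign convention (positive for net synergy, negative for net redundancy) is an assumption made here for illustration.

```python
from collections import Counter
from math import log2


def entropy(*variables):
    """Joint Shannon entropy (bits) of one or more equal-length discrete sequences."""
    joint = list(zip(*variables))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def conditional_mutual_information(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def interaction_information(x, y, z):
    """Difference between two MI terms: I(X;Y|Z) - I(X;Y).

    Assumed sign convention: positive = net synergy, negative = net redundancy.
    """
    return conditional_mutual_information(x, y, z) - mutual_information(x, y)


# Synergistic toy case: the "task" z is the XOR of two independent bits.
x_syn, y_syn = [0, 0, 1, 1], [0, 1, 0, 1]
z_syn = [a ^ b for a, b in zip(x_syn, y_syn)]

# Redundant toy case: both "muscles" carry the same bit as the task.
x_red = y_red = z_red = [0, 1, 0, 1]

synergy_value = interaction_information(x_syn, y_syn, z_syn)
redundancy_value = interaction_information(x_red, y_red, z_red)
```

      In the XOR case neither variable alone is informative about the other, but together they determine the task label, so the conditional term dominates. In the copy case the shared information is already present in the unconditioned MI, so the difference is negative.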

      (2) The authors should clarify what the following statement means: "Where a muscle interaction was determined to be net redundant/synergistic, their corresponding network edge in the other muscle network was set to zero."

      We acknowledge this sentence was unclear/misleading and have now clarified this statement in the following way:

      “This computation was performed on all unique m<sub>x</sub> and m<sub>y</sub> pairings, generating sparse symmetric matrices (A) (i.e. A = A<sup>T</sup>) composed separately of non-negative redundant and synergistic values (Fig.5).” Additionally, we have now included an additional figure (fig.5) describing this text graphically.
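      One way to read the clarified statement is that each muscle pair contributes to only one of the two networks, depending on the sign of its net interaction. The sketch below illustrates that reading; the function name is hypothetical, the values are invented, and the sign convention (negative for net redundancy, positive for net synergy) is an assumption.

```python
def split_net_interactions(ii):
    """Split a signed symmetric matrix into two non-negative sparse matrices.

    Each off-diagonal entry is placed in exactly one output; the matching
    entry of the other output stays at zero.
    """
    n = len(ii)
    redundant = [[0.0] * n for _ in range(n)]
    synergistic = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-coupling
            value = ii[i][j]
            if value < 0:
                redundant[i][j] = -value   # store the magnitude
            elif value > 0:
                synergistic[i][j] = value
    return redundant, synergistic


# Invented net interaction values for three muscles.
II = [[0.0, -0.3, 0.2],
      [-0.3, 0.0, 0.0],
      [0.2, 0.0, 0.0]]

A_red, A_syn = split_net_interactions(II)
```

      Both outputs remain symmetric and non-negative, and no muscle pair has a non-zero edge in both networks.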

      (3) It should be clarified what the 'm' values are in Equation 1.1. Are these the co-information values after the sparsification and applying the Louvain algorithm to the matrix 'A'? Furthermore, since each task will yield a different co-information value, how is the information from different tasks (r) being combined here?

      We thank the reviewer for their attention to detail. For clarity, at the related section of Equation 1.1, we have clarified that the input matrix is composed of co-I estimates:

      “The input matrix for PNMF consisted of the sparsified A on both affected and unaffected sides from all participants at both pre- and post-sessions concatenated in their vectorised forms. More specifically, the input matrix composed of redundant or synergistic values was configured such that the set of unique muscle pairings (1 … K) on affected and unaffected sides (m<sub>aff</sub> and m<sub>unaff</sub> respectively)…”.

      The co-I estimates in this input matrix are indeed those that survived sparsification in previous steps, however, for determining the number of modules to extract using the Louvain algorithm, this step has no direct impact or transformation on the co-I estimates and is simply employed to derive an empirical input parameter for dimensionality reduction. We refer the reviewer to the following part of this paragraph where this is described:

      “The number of muscle network modules identified in this final consensus partition was used as the input parameter for dimensionality reduction, namely projective non-negative matrix factorisation (PNMF) (Fig.1(D)) (Yang & Oja, 2010). The input matrix for PNMF consisted of the sparsified A on both affected and unaffected sides from all participants at both pre- and post-sessions concatenated together in their vectorised form.”
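      The vectorisation and concatenation step can be sketched as below. This is an illustrative assumption about the layout (rows as unique muscle pairs, columns as individual networks); the exact arrangement of affected and unaffected sides in the authors' input matrix is not fully specified in the quoted text, and the helper names are hypothetical.

```python
def vectorise_unique_pairs(a):
    """Return the strict upper triangle of a symmetric matrix as a flat list."""
    n = len(a)
    return [a[i][j] for i in range(n) for j in range(i + 1, n)]


def build_input_matrix(networks):
    """Stack vectorised networks as columns.

    Rows index the K unique muscle pairs; columns index networks
    (e.g. one per participant, side, session and task).
    """
    columns = [vectorise_unique_pairs(a) for a in networks]
    return [list(row) for row in zip(*columns)]


# Two invented 3-muscle networks, giving K = 3 unique pairs each.
net_1 = [[0, 1, 2],
         [1, 0, 3],
         [2, 3, 0]]
net_2 = [[0, 4, 5],
         [4, 0, 6],
         [5, 6, 0]]

X = build_input_matrix([net_1, net_2])
```

      A matrix assembled this way would then be passed to the factorisation, with the number of components fixed in advance by the community detection step described above.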

      Finally, as the reviewer has mentioned, the co-I estimates from the same muscles pairings but for different tasks, experimental sessions and participants are indeed different, reflecting their task-specific tuning, changes with rehabilitation and individual differences. To combine these representations into low-dimensional components, we employed projective non-negative matrix factorisation (PNMF). As outlined in the previous paper and earlier work on this framework (O’ Reilly & Delis, 2022), application of dimensionality reduction here can generate highly generalisable motor components, highlighting their ability to effectively represent large populations of participants, tasks and sessions, while allowing interesting individual differences mentioned by the reviewer to be buffered into the corresponding activation coefficients. These activation coefficients are for this reason the focus of the cluster analyses in the present study to characterise the post-stroke cohort. We have explicitly provided this reason in the methods section of the updated manuscript:

      “We focussed on $a$ here as the extraction of population-level functional modules enabled the buffering of individual differences into the space of modular activations, making them an ideal target for identifying population structure.”

      (4) In general, I recommend improving the clarity of the Methods section, particularly by being more precise in defining the quantities that are being calculated. For example, the adjacency matrix should be defined clearly using co-information at the beginning, and explain how it is changed/used throughout the rest of the section.

      We thank the reviewer for their constructive advice and have gone to lengths to improve the clarity of the methods section. Firstly, we have addressed all the reviewer’s comments on specific sections of the methods, explaining more clearly the ‘why’ and ‘how’ of what was performed. Secondly, we have now included an additional figure illustrating how co-information was quantified at the network level and separated into redundant and synergistic values (see Fig.5 of updated manuscript). Finally, we have re-structured several paragraphs of the methods section to enhance flow with additional subheadings for clarity.

      (5) In the previous paper (O'Reilly & Delis, 2024), the authors applied a tensor decomposition to the interaction matrix and extracted both the spatial and temporal factors. In the current work, the authors simply concatenated the temporal signals and only chose to extract the spatial mode instead. The authors should clarify this choice.

      The reviewer is correct in that a different dimensionality reduction approach was employed in the previous paper. In the present study, we instead chose to employ projective non-negative matrix factorisation, as was employed in a preliminary paper on this framework (O’Reilly & Delis, 2022). This decision was made simply based on aiming to maintain brevity and simplicity in the analysis and presentation of results as we introduce other tools to the framework (i.e. the clustering algorithm). Indeed, we could have just as easily employed the tensor decomposition to extract both spatial and temporal components, however we believed the main take away points for this paper could be more easily communicated using spatial networks only. To clarify this difference for readers we have included the following in the methods section:

      “The choice of PNMF here, in contrast to the space-time tensor decomposition employed in the parent study (O’Reilly & Delis, 2024), was chosen simply to maintain brevity by focussing subsequent analyses on the spatial domain.”

      References

      Ó’Reilly D, Delis I. A network information theoretic framework to characterise muscle synergies in space and time. Journal of Neural Engineering. 2022 Feb 18;19(1):016031.

      O'Reilly D, Delis I. Dissecting muscle synergies in the task space. Elife. 2024 Feb 26;12:RP87651.

      Recommendations for the authors:

      Reviewing Editor Comments:

      Both reviewers are concerned with the manuscript in its current form. They questioned the relevance of the current approach in providing functional or mechanistic explanations about the rehabilitation process of post-stroke patients. Our eLife Assessment would change if you include comparisons between your current method and classical ones, in addition to improving the description of your method to strengthen the evidence of its robustness.

      Reviewer #1 (Recommendations for the authors):

      There is a minor typographical error in Figure 2 ("compononents" should be corrected).

      This error has been rectified.

      Reviewer #2 (Recommendations for the authors):

      The authors should be able to address most of my concerns by providing a substantially improved version of the Methods section.

      See above responses to the reviewer’s comments regarding the methods section.

      However, I would like the authors to explain in full detail (potentially including a simulation or power analysis) the procedure for estimating the co-information quantity, and to clarify whether it is robust given the sample size used in this paper.

      We refer the reviewer to our previous responses outlining with greater clarity the number of samples included in the estimation of co-I. We would also like to mention here that our framework does not make inferences on the statistical significance of individual muscle couplings (i.e. co-I estimates). Instead, these estimates are employed collectively for the sole purpose of pattern recognition. Nevertheless, to generate reliable estimates of the muscle couplings, we have employed a substantial number of samples for each co-I estimate (>20k samples in each variable), addressing the reviewer’s main concern here.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study by Wu et al. uses endogenous bruchpilot expression in a cell-type-specific manner to assess synaptic heterogeneity in adult Drosophila melanogaster mushroom body output neurons. The authors performed genomic on locus tagging of the presynaptic scaffold protein bruchpilot (BRP) with one part of splitGFP (GFP11) using the CRISPR/Cas9 methodology and co-expressed the other part of splitGFP (GFP1-10) using the GAL4/UAS system. Upon expression of both parts of splitGFP, fluorescent GFP is assembled at the N-terminus of BRP, exactly where BRP is endogenously expressed in active zones. For manageable analysis, a high-throughput pipeline was developed. This analysis evaluated parameters like location of BRP clusters, volume of clusters, and cluster intensity as a direct measure of the relative amount of BRP expression levels on site, using publicly available 3D analysis tools that are integrated in Fiji. Analysis was conducted for different mushroom body cell types in different mushroom body lobes using various specific GAL4 drivers. To test this new method of synapse assessment, Wu et al. performed an associative learning experiment in which an odor was paired with an aversive stimulus and found that, in a specific time frame after conditioning, the new analysis solidly revealed changes in BRP levels at specific synapses that are associated with aversive learning.

      Strengths:

      Expression of splitGFP bound to BRP enables intensity analysis of BRP expression levels as exactly one GFP molecule is expressed per BRP. This is a great tool for synapse assessment. This tool can be widely used for any synapse as long as driver lines are available to co-express the other part of splitGFP in a cell-type-specific manner. As neuropils and thus the BRP label can be extremely dense, the analysis pipeline developed here is very useful and important. The authors have chosen an exceptionally dense neuropil - the mushroom bodies - for their analysis and convincingly show that BRP assessment can be achieved with such densely packed active zones. The result that BRP levels change upon associative learning in an experiment with odor presentation paired with punishment is likewise convincing, and strongly suggests that the tool and pipeline developed here can be used in an in vivo context.

      Weaknesses:

      Although BRP is an important scaffold protein and its expression levels were associated with function and plasticity, I am still somewhat reluctant to accept that synapse structure profiling can be inferred from only assessing BRP expression levels and BRP cluster volume. Also, is it guaranteed that synaptic plasticity is not impaired by the large GFP fluorophore? Could the GFP10 construct that is tagged to BRP in all BRP-expressing cells, independent of GAL4, possibly hamper neuronal function? Is it certain that only active zones are labeled? I do see that plastic changes are made visible in this study after an associative learning experiment with BRP intensity and cluster volume as read-out, but I would be reassured by direct measurement of synaptic plasticity with splitGFP directly connected to BRP, maybe at a different synapse that is more accessible.

      We appreciate the reviewer’s comments. In the revised manuscript, we have clarified that Brp is an important, but not the only, player in the active zone. We have included new data to demonstrate that split-GFP tagging does not severely affect the localization and plasticity of Brp or the function of synapses by showing: (1) nanoscopic localization of Brp::rGFP using STED imaging; (2) colocalization between Brp::rGFP and anti-Brp signals/VGCCs; (3) activity-dependent Brp remodeling in R8 photoreceptors; (4) no defect in memory performance when labeling Brp::rGFP in KCs. These four lines of additional evidence further corroborate our approach to characterize endogenous Brp as a proxy of active zone structure.

      Reviewer #2 (Public review):

      Summary:

      The authors developed a cell-type specific fluorescence-tagging approach using a CRISPR/Cas9 induced split-GFP reconstitution system to visualize endogenous Bruchpilot (BRP) clusters as presynaptic active zones (AZ) in specific cell types of the mushroom body (MB) in the adult Drosophila brain. This AZ profiling approach was implemented in a high-throughput quantification process, allowing for the comparison of synapse profiles within single cells, cell types, MB compartments, and between different individuals. The aim is to analyse in more detail neuronal connectivity and circuits in this centre of associative learning. These are notoriously difficult to investigate due to the density of cells and structures within a cell. The authors detect and characterize cell-type-specific differences in BRP-dependent profiling of presynapses in different compartments of the MB, while intracellular AZ distribution was found to be stereotyped. Next to the descriptive part characterizing various AZ profiles in the MB, the authors apply an associative learning assay and detect consequent AZ re-organisation.

      Strengths:

      The strength of this study lies in the outstanding resolution of synapse profiling in the extremely dense compartments of the MB. This detailed analysis will be the entry point for many future analyses of synapse diversity in connection with functional specificity to uncover the molecular mechanisms underlying learning and memory formation and neuronal network logics. Therefore, this approach is of high importance for the scientific community and a valuable tool to investigate and correlate AZ architecture and synapse function in the CNS.

      Weaknesses:

      The results and conclusions presented in this study are, in many aspects, well-supported by the data presented. To further support the key findings of the manuscript, additional controls, comments, and possibly broader functional analysis would be helpful. In particular:

      (1) All experiments in the study are based on split-GFP lines (BRP:GFP11 and UAS-GFP1-10). The Materials and Methods section does not contain any cloning strategy (gRNA, primer, PCR/sequencing validation, exact position of tag insertion, etc.) and only refers to a bioRxiv publication. It might be helpful to add a Materials and Methods section (at least for the BRP:GFP11 line). Additionally, as this is an on locus insertion in the BRP-ORF, it needs a general validation of this line, including controls (Western Blot and correlative antibody staining against BRP) showing that overall BRP expression is not compromised due to the GFP insertion and localizes as BRP in wild type flies, that flies are viable, have no defects in locomotion and learning and memory formation, and that MB morphology is not affected compared to wild type animals.

      We thank the reviewer for suggesting these important validations. We included details of the design of the construct and insertion site to the Methods section, performed several new experiments to validate the split-GFP tagging of Brp, and present the data in the revision.

      First, to examine whether the transcription of the brp gene is unaffected by the insertion of GFP<sub>11</sub>, we conducted qRT-PCR to compare the brp mRNA levels between brp::GFP<sub>11</sub>, UAS-GFP1-10 and UAS-GFP1-10 and found no difference (Figure 1 - figure supplement 1A).

      To further verify the effect of GFP<sub>11</sub> tagging at the protein level, we performed anti-Brp (nc82) immunohistochemistry of brains where GFP is reconstituted pan-neuronally. We found unaltered neuropile localization of nc82 signals (Figure 1 - figure supplement 1C). In presynaptic terminals of the mushroom body calyx, we found integration of Brp::rGFP to nc82 accumulation (Figure 1D). We performed super-resolution microscopy to verify the configuration of Brp::rGFP and confirmed the donut-shape arrangement of Brp::rGFP in the terminals of motor neurons (see Wu, Eno et al., 2025 PLOS Biology), corroborating the nanoscopic assembly of Brp::rGFP at active zones (Kittel et al., 2006 Science).

      Furthermore, co-expression of RFP-tagged voltage-gated calcium channel alpha subunit Cacophony (Cac) and Brp::rGFP in PAM-γ5 dopaminergic neurons revealed strong presynaptic colocalization of their punctate clusters (Figure 1E), suggesting that rGFP tagging of Brp did not damage key protein assembly at active zones (Kawasaki et al., 2004 J Neuroscience; Kittel et al., 2006 Science).

      These lines of evidence suggest that the localization of endogenous Brp is barely affected by the C-terminal GFP<sub>11</sub> insertion or GFP reconstitution therewith. This is in line with a large body of studies confirming that the N-terminal region and coiled-coil domains, but not the C-terminal region, of Brp are necessary and sufficient for active zone localization (Fouquet et al., 2009 J Cell Biol; Oswald et al., 2010 J Cell Biol; Mosca and Luo, 2014 eLife; Kiragasi et al., 2017 Cell Rep; Akbergenova et al., 2018 eLife; Nieratschker et al., 2009 PLoS Genet; Johnson et al., 2009 PLoS Biol; Hallermann et al., 2010 J Neurosci). We nevertheless report homozygous lethality and found decreased immunoreactive signals in flies carrying the GFP<sub>11</sub> insertion (Figure 1 - figure supplement 1B).

      For these reasons, we always use heterozygotes for all the experiments; heterozygotes show no conspicuous defect in locomotion, as reported in the original study (Wagh et al., 2005 Neuron). To functionally validate the heterozygotes, we measured the aversive olfactory memory performance of flies where GFP reconstitution was induced in Kenyon cells using R13F02-GAL4. We found that these transgenes did not alter mushroom body morphology (Figure 7 - figure supplement 1) or memory performance as compared to wild-type flies (Figure 7 - figure supplement 2), suggesting the synapse function required for short-term memory formation is not affected by split-GFP tagging of Brp.

      (2) Several aspects of image acquisition and high-throughput quantification data analysis would benefit from a more detailed clarification.

      (a) For BRP cluster segmentation, it is stated in the Materials and Methods that intensity threshold and noise tolerance were "set" - this setting has a large effect on the quantification, and it should be specified and the setting criteria named and justified (if set manually (how and why) or automatically (to what)). Additionally, if Python was used for "Nearest Neighbor" analysis, the code should be made available within this manuscript; otherwise, it is difficult to judge the quality of this quantification step.

      (b) To better evaluate the quality of both the imaging analysis and image presentation, it would be important to state, if presented and analysed images are deconvolved and if so, at least one proof of principle example of a comparison of original and deconvoluted file should be shown and quantified to show the impact of deconvolution on the output quality as this is central to this study.

      We thank the reviewer for suggesting these clarifications. We have added more description to the revised manuscript to clarify the segmentation settings, which were manually adjusted to optimize the F-score (previous Figure 1D, now moved to Figure 1 - figure supplement 5). We have included the code used for analyzing nearest neighbor distance, AZ density and local Brp density in the revised manuscript (Supplementary file 1), together with a pre-processed sample data sheet (Supplementary file 2).
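      For readers unfamiliar with these measures, a minimal brute-force sketch of a nearest neighbor distance and a radius-based local count is given below. It is not the code in Supplementary file 1; the function names, the centroid coordinates and the 1.5-unit radius are invented for illustration, and the authors' actual definitions of AZ density and local Brp density may differ.

```python
from math import dist, inf


def nearest_neighbor_distances(points):
    """For each 3D centroid, return the Euclidean distance to its closest other centroid."""
    distances = []
    for i, p in enumerate(points):
        best = inf
        for j, q in enumerate(points):
            if i != j:
                best = min(best, dist(p, q))
        distances.append(best)
    return distances


def neighbors_within_radius(points, radius):
    """For each centroid, count the other centroids lying within the given radius."""
    return [
        sum(1 for j, q in enumerate(points) if i != j and dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


# Invented cluster centroids (x, y, z) in arbitrary units.
centroids = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (5.0, 5.0, 5.0)]

nnd = nearest_neighbor_distances(centroids)
local_counts = neighbors_within_radius(centroids, radius=1.5)
```

      The pairwise loop scales quadratically with the number of clusters, so for image stacks with many thousands of puncta a spatial index such as a k-d tree would be the more practical choice.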

      Regarding image deconvolution, we have clarified the differential use of deconvolved and not-deconvolved images in the revised manuscript. We have also included a quantitative evaluation of Richardson-Lucy iterative deconvolution (Figure 1 - figure supplement 4). We used 20 iterations due to only marginal FWHM improvement beyond this point (Figure 1 - figure supplement 4).

      (3) The major part of this study focuses on the description and comparison of the divergent synapse parameters across cell-types in MB compartments, which is highly relevant and interesting. Yet it would be very interesting to connect this new method with functional aspects of the heterogeneous synapses. This is done in Figure 7 with an associative learning approach, which is, in part, not trivial to follow for the reader and would profit from a more comprehensive analysis.

      (a) It would be important for the understanding and validation of the learning induced changes, if not (only) a ratio (of AZ density/local intensity) would be presented, but both values on their own, especially to allow a comparison to the quoted, previous AZ remodelling analysis quantifying BRP intensities (ref. 17, 18). It should be elucidated in more detail why only the ratio was presented here.

      We thank the reviewer for the suggestion on the presentation of learning-induced Brp remodeling. The reported values in Figure 7C are the correlation coefficient of AZ density and local intensity in each compartment, but not the ratio. These results suggest that subcompartment-sized clusters of AZs with high Brp accumulation (Figure 6) undergo local structural remodeling upon associative learning (Figure 7). For clarity, we have included a schematic of this correlation and an example scatter plot to Figure 6. Unlike the previous studies (refs 17 and 18), we did not observe robust learning-dependent changes in the Brp intensity, possibly due to some confounding factors such as overall expression levels and conditioning protocols as described in the previous and following points, respectively.
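      The difference between a ratio and a correlation coefficient can be illustrated with a short sketch. The Pearson coefficient below is a generic implementation, not the authors' analysis code; the choice of Pearson rather than another coefficient, and the paired values, are assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented paired measurements within one compartment.
az_density = [1.0, 2.0, 3.0, 4.0]
local_intensity_up = [2.0, 4.0, 6.0, 8.0]      # rises with density
local_intensity_down = [8.0, 6.0, 4.0, 2.0]    # falls with density

r_up = pearson_r(az_density, local_intensity_up)
r_down = pearson_r(az_density, local_intensity_down)
```

      A correlation coefficient of this kind is bounded between -1 and 1 and describes how tightly the two measures co-vary within a compartment, whereas a ratio would depend on their absolute magnitudes.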

      (b) The reason why a single instead of a dual odour conditioning was performed could be clarified and discussed (would that have the same effects?).

      (c) Additionally, "controls" for the unpaired values - that is, in flies receiving neither shock nor odour - it would help to evaluate the unpaired control values in the different MB compartments.

      We use single odor conditioning because it is the simplest way to examine the effect of odor-shock association by comparing the paired and unpaired group. Standard differential conditioning with two odors contains unpaired odor presentation (CS-) even in the ‘paired’ group. We now show that single-odor conditioning induces memory that lasts one day as in differential conditioning (Figure 7B; Tully and Quinn, J Comp Phys A 1985).

      (d) The temporal resolution of the effect is very interesting (Figure 7D), and at more time points, especially between 90 and 270 min, this might raise interesting results.

      The sampling time points after training were chosen at approximately logarithmic intervals, as the memory decay is roughly exponential (Figure 7B). This transient remodeling is consistent with previous studies reporting that Brp plasticity was short-lived (Zhang et al., 2018 Neuron; Turrel et al., 2022 Current Biol).

      (e) Additionally, it would be very interesting and rewarding to have at least one additional assay, relating structure and function, e.g. on a molecular level by a correlative analysis of BRP and synaptic vesicles (by staining or co-expression of SV-protein markers) or calcium activity imaging or on a functional level by additional learning assays.

      We thank the reviewer for raising this important point. We have performed calcium imaging of KC presynaptic terminals to correlate the structure and function in another study (see Figure 2 in Wu, Eno et al., 2025 PLOS Biology for more detail). The basal presynaptic calcium pattern along the γ compartments is strikingly similar to the compartmental heterogeneity of Brp accumulation (see also Figure 2 in this study). Considering colocalization of other active-zone components, such as Cac (Figure 1E), we propose that the learning-induced remodeling of local Brp clusters should transiently modulate synaptic properties.

      As a response to other reviewers’ interest, we used Brp::rGFP to measure different forms of Brp-based structural plasticity upon constant light exposure in the photoreceptors and upon silencing rab3 in KCs. Since these experiments nicely reproduced the results of previous studies (Sugie et al., Neuron 2013; Graf et al., Neuron 2009), we believe the learning-induced plasticity of Brp clustering in KCs has a transient nature.

      Reviewer #3 (Public review):

      Summary:

      The authors develop a tool for marking presynaptic active zones in Drosophila brains, dependent on the GAL4 construct used to express a fragment of GFP, which will incorporate with a genome-engineered partial GFP attached to the active zone protein bruchpilot - signal will be specific to the GAL4-expressing neuronal compartment. They then use various GAL4s to examine innervation onto the mushroom bodies to dissect compartment-specific differences in the size and intensity of active zones. After a description of these differences, they induce learning in flies with classic odour/electric shock pairing and observe changes after conditioning that are specific to the paired conditioning/learning paradigm.

      Strengths:

      The imaging and analysis appear strong. The tool is novel and exciting.

      Weaknesses:

      I feel that the tool could do with a little more characterisation. It is assumed that the puncta observed are AZs with no further definition or characterisation.

      We performed additional validation of the tool, including (1) nanoscopic localization of Brp::rGFP using STED imaging; (2) colocalization between Brp::rGFP and anti-Brp signals/VGCCs (Figure 1D-E); (3) activity-dependent active zone remodeling in R8 photoreceptors (Figure 1F). These will be detailed in our point-by-point response below.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The authors keep stating, they profile or assess synaptic structure by analyzing BRP localization, cluster volume, and intensity. However, I do not think that BRP cluster volume and intensity warrant an educated statement about presynaptic structure as a whole. I do not challenge the usefulness of BRP cluster analysis for synapse evaluation, but as there are so many more players involved in synaptic function, BRP analysis certainly cannot explain it all. This should at least be discussed.

      It is correct that Brp is not the only player in the active zone. We have included more discussion on the specific role of Brp (line 84 to 89) and other synaptic markers (line 250) and edited potentially misleading text.

      (2) I do see that changes in BRP expression were observed following associative learning, but is it certain, that synaptic plasticity is generally unaffected by the large GFP fluorophore? BRP is grabbing onto other proteins, both with its C- and N-termini. As the GFP is right before the stop codon, it should be at the N-terminus. How far could BRP function be hampered by this? Is there still enough space for other proteins to interact?

      We thank the reviewer for sharing these concerns. Here we provide three lines of evidence to demonstrate that the Brp assembly at active zones required for synaptic plasticity is unaffected by split-GFP tagging.

      First, we assessed olfactory memory of flies that have Brp::rGFP labeled in Kenyon cells and found the performance comparable to wild-type (Figure 7 - figure supplement 2), suggesting the Brp function required for olfactory memory (Knapek et al., J Neurosci 2011) is unaffected by split-GFP tagging.

      Second, we measured Brp remodeling in photoreceptors induced by constant light exposure (LL; Sugie et al., 2015 Neuron). Consistent with the previous study, we found that LL decreased the number of Brp::rGFP clusters in R8 terminals in the medulla, as compared to the constant dark condition (DD). This result validates the synaptic plasticity involving dynamic Brp rearrangement in the photoreceptors. We have included this result in the revised manuscript (Figure 1F).

      Third, to further validate protein interaction of Brp::rGFP, we focused on Rab3, as it was previously shown to control Brp allocation at active zones (Graf et al., 2009 Neuron). To this end, we silenced rab3 expression in Kenyon cells using RNAi and measured the intensity of Brp::rGFP clusters in γ Kenyon cells. As previously reported in the neuromuscular junction, we found that rab3 knock-down increased Brp::rGFP accumulation at the active zones, suggesting that Brp::rGFP reflects the interaction with Rab3. We have included all the new data in the revised manuscript (Figure 1 - figure supplement 3).

      (3) It may well be that not only active-zone-associated BRP is labeled but possibly also BRP molecules elsewhere in the neuron. I would like to see more validation, e.g., the percentage of tagged endogenous BRP associated with other presynaptic proteins.

      To determine to what extent Brp::rGFP clusters represent active zones, we double-labelled Brp::rGFP and Cac::tdTomato (Cacophony, the alpha subunit of the voltage-gated calcium channels). We found that 97% of Brp::rGFP clusters showed co-localization with Cac::tdTomato in PAM-γ5 dopamine neuron terminals (Figure 1E), suggesting most Brp::rGFP clusters represent functional AZs.

      (4) Z-size is ~200 nm, while x/y pixel size is ~75 nm during acquisition. How far down does the resolution go after deconvolution?

      The Z-step was 370 nm and XY pixel size was 79 nm for image acquisition. We performed 20 iterations of Richardson-Lucy deconvolution using an empirical point spread function (PSF). We found that the effect of deconvolution on the full-width at half maximum (FWHM) of Brp::rGFP clusters improves only marginally beyond 20 iterations, when the XY FWHM is around 200 nm and the XZ FWHM is around 450 nm (Figure 1 - figure supplement 4).
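
      The deconvolution and FWHM measurement described in this response can be illustrated with a minimal one-dimensional sketch. It is not the authors' pipeline: the Gaussian PSF, its width, the point-like test cluster, and all function names are assumptions chosen for illustration, and only the 79 nm pixel size and the 20 iterations are taken from the response.

```python
import math

def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian, used here as a stand-in for an empirical PSF."""
    k = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(k)
    return [v / total for v in k]

def convolve_same(signal, kernel):
    """Zero-padded convolution returning an output as long as the signal."""
    r = len(kernel) // 2
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + r - j
            if 0 <= idx < n:
                acc += signal[idx] * w
        out[i] = acc
    return out

def richardson_lucy(observed, psf, n_iter=20):
    """Multiplicative Richardson-Lucy update, starting from a flat estimate."""
    estimate = [sum(observed) / len(observed)] * len(observed)
    psf_mirror = psf[::-1]
    for _ in range(n_iter):
        blurred = convolve_same(estimate, psf)
        ratio = [o / max(b, 1e-12) for o, b in zip(observed, blurred)]
        correction = convolve_same(ratio, psf_mirror)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate

def fwhm(profile, step=1.0):
    """Full width at half maximum, with linear interpolation at both edges."""
    half = max(profile) / 2.0
    above = [i for i, v in enumerate(profile) if v >= half]
    left, right = above[0], above[-1]
    x_left, x_right = float(left), float(right)
    if left > 0:
        x_left = left - (profile[left] - half) / (profile[left] - profile[left - 1])
    if right < len(profile) - 1:
        x_right = right + (profile[right] - half) / (profile[right] - profile[right + 1])
    return (x_right - x_left) * step

PIXEL_NM = 79.0                       # XY pixel size stated in the response
true_cluster = [0.0] * 81
true_cluster[40] = 1.0                # hypothetical point-like cluster
psf = gaussian_kernel(sigma=3.0, radius=12)   # assumed PSF width, in pixels
observed = [v + 1e-6 for v in convolve_same(true_cluster, psf)]
restored = richardson_lucy(observed, psf, n_iter=20)

fwhm_before = fwhm(observed, PIXEL_NM)
fwhm_after = fwhm(restored, PIXEL_NM)
print(f"FWHM before: {fwhm_before:.0f} nm, after 20 iterations: {fwhm_after:.0f} nm")
```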

      (5) Figure Legend 7: What is a "cytoplasm membrane marker"? Does this mean membrane-bound tdTom is sticking into the cytoplasm?

      We apologize for the typo and have corrected it to “plasma membrane marker”.

      (6) At the end of the introduction: "characterizing multiple structural parameters..." - which were these parameters? I was under the assumption that BRP localization, cluster volume, and intensity were assessed. I do not see how these are structural parameters. Please define what exactly is meant by "structural parameters".

      We apologize for the confusion. By “structural parameters”, we indeed referred to the volume, intensity and molecular density of Brp::rGFP clusters. We have revised the sentence to “Characterizing the distinct parameters and localization of Brp::rGFP cluster.”

      (7) Next to last sentence of the introduction: "Characterizing multiple structural parameters revealed a significant synaptic heterogeneity within single neurons and AZ distribution stereotypy across individuals." What do the authors mean by "significant synaptic heterogeneity"?

      By “synaptic heterogeneity”, we refer to the intracellular variability of active zone cytomatrices reported by Brp clusters. For instance, the intensities of Brp::rGFP clusters within Kenyon cell subtypes were variable among compartments (Figure 2). Intracellular variability of the Brp concentration of individual active zones was higher in DPM and APL neurons than in Kenyon cells (Figure 3). These variabilities demonstrate intracellular synaptic heterogeneity. We have revised the sentence to be more specific about the different characteristics of Brp clusters.

      (8) I do not understand the last sentence of the introduction. "These cell-type-specific synapse profiles suggest that AZs are organized at multiple scales, ranging from neighboring synapses to across individuals." What do the authors mean by "ranging from neighboring synapses to across individuals"? Does this mean that even neighboring synapses in the same cell can be different?

      We have revised the sentence to “These cell-type-specific synapse profiles suggest that AZs are spatially organized at multiple scales, ranging from interindividual stereotypy to neighboring synapses in the same cells.”

      By “neighboring synapses”, we refer to the nearest neighbor similarity in Brp levels in some cell types (Figure 6A-C), and also the sub-compartmental dense AZ clusters with high Brp levels in Kenyon cells (Figure 6D-H). By “across individuals”, we refer to the individually conserved active zone distribution patterns in some neurons (Figure 5).

      (9) The title talks about cell-type-specific spatial configurations. I do not understand what is meant by "spatial configurations"? Do you mean BRP cluster volume? I think the title is a little misleading.

      By “spatial configuration”, we refer to the arrangement of Brp clusters within individual mushroom body neurons. This statement is based on our findings on the intracellular synaptic heterogeneity (see also response to comment #7). We have streamlined the text description in the revised manuscript for clarity.

      Reviewer #2 (Recommendations for the authors):

      (1) For Figure 3A: two exemplary AZs are compared here; a histogram comparing more AZs would help make the point that, in general, AZs of similar size have different BRP levels (intensities), and would show how much variation exists.

      We have included histograms for Brp::rGFP intensity and cluster volumes in Figure 3 in the revised manuscript.

      (2) Line 52: "endogenous synapses" is a confusing term; it's probably meant that the protein levels within the synapse are endogenous and not overexpressed. 

      We apologize for the confusion and have revised the term to “endogenous synaptic proteins.”

      (3) It is not clear from the Materials and Methods section, whether and where deconvolved or not-deconvolved images were used for the quantification pipeline. Please comment on this. 

      We have now revised the Methods section to clarify where deconvolved and non-deconvolved images were used in the pipeline.

      (4) Line 664 (C) not bold.

      We have corrected the error.

      (5) 725 "Files" should be Flies.

      We have corrected the error.

      (6) 727 two times "first".

      We have corrected the error.

      (7) Figure 7. All (A) etc., not bold - there should be consistent annotation. 

      We thank the reviewer for the detailed proofreading and have corrected all the errors spotted.

      Reviewer #3 (Recommendations for the authors):

      (1) Has there been an expression of the construct in a non-neuronal cell? Astrocyte-like cell? Any glia? As some sort of control for background and activity?

      As the reviewer suggested, we verified the neuronal expression specificity of Brp::rGFP. Using R86E01-GAL4 and Amon-GAL4, we compared Brp::rGFP in astrocyte-like glia and neuropeptide-releasing neurons. In contrast to neurons, we found no Brp::rGFP puncta in the neuropils in astrocyte-like glia, suggesting Brp::rGFP is specific to neurons. We have included this new dataset in the revised manuscript (Figure 1 - figure supplement 2).

      (2) Similarly, expression of the construct co-expressed with a channelrhodopsin, and induction of a 'learning'-like regime of activity, similarly in a control type of experiment, expression of an inwardly rectifying channel (e.g. Kir2.1) to show that increases in size of the BRP puncta are truly activity dependent? The NMJ may be an optimal neuron to use to see the 'donut' structures of the AZs and their increase with activity. Also, are these truly AZs we are seeing here? Perhaps try co-expressing cacophony-dsRed? If the GFP Puncta are active zones, then they should be surrounded by cacophony.

      We would like to clarify that we did not find Brp::rGFP size increase upon learning. Instead, we demonstrated that associative training transiently remodelled sub-compartment-sized AZ “hot spots” in Kenyon cells, indicated by the correlation of local intensity and AZ density (Figure 6-7).

      To demonstrate split-GFP tagging does not affect activity-dependent plasticity associated with Brp, we measured Brp remodeling in photoreceptors induced by constant light exposure (LL; Sugie et al., 2015 Neuron). Consistent with the previous study, we found that LL decreased the numbers of Brp::rGFP clusters in R8 terminals in the medulla, as compared to constant dark condition (DD). This result validates the synaptic plasticity involving dynamic Brp rearrangement in the photoreceptors (Figure 1F).

      As the reviewer suggested, we performed STED microscopy on larval motor neurons and confirmed the donut-shaped arrangement of Brp::rGFP (Wu, Eno et al., PLOS Biol 2025).

      Also following the reviewer’s suggestion, we double-labelled Brp::rGFP and Cac::tdTomato (Cacophony, the alpha subunit of the voltage-gated calcium channels). We found that 97% of Brp::rGFP clusters showed co-localization with Cac::tdTomato in PAM-γ5 dopamine neuron terminals (Figure 1E), suggesting most Brp::rGFP clusters represent functional AZs.

      (3) In the introduction: Intro, a sentence about BRP - central organiser of the active zone, so a key regulator of activity.

      We have included a few more sentences about the role of Brp at active zones in the revised manuscript.

      (4) Figure 1 E, line 650 'cite the resource here'. 

      We thank the reviewer for pointing out the error and we have corrected it.

      (5) Many readers may not be MB aficionados, and to make the data more accessible, perhaps use a cartoon of an MB with the cell bodies of the neurons around the MB expressing the constructs highlighted so that the reader can have a wider idea of the anatomy in relation to the MB.

      We appreciate these comments and have appended cartoons of the MB to figures to help readers understand the anatomy.

    1. Configure the Trigger
       Trigger type: When record matches conditions
       Table: User Tests
       Add condition 1: Status → is → Failed
       Add condition 2: Finalize → is → checked ✅
       3 Add Action: Create Record
       Click + Add advanced logic or action
       Select: Create record
       Table: Bugs
       4 Map the Field
       Click + Choose field
       Select: Relevant Test
       Click the blue + insert button on the right
       Choose: Airtable record ID (from the trigger step "When record matches conditions")

      End step missing:

      Examples = -> Press confirm -> Move onto next step or move onto step 5

    1. Reviewer #1 (Public review):

      Summary:

      This study focuses on characterizing the EEG correlates of item-specific proportion congruency effects. Two types of learned associations are characterized, one being associations between stimulus features and control states (SC), and the other being stimulus features and responses (SR). Decoding methods are used to identify time-resolved SC and SR correlates, which are used to test properties of their dynamics.

      The conclusion is reached that SC and SR associations can independently and simultaneously guide behavior. This conclusion is based on results showing SC and SR correlates are: (1) not entirely overlapping in cross-decoding; (2) simultaneously observed on average over trials in overlapping time bins; (3) independently correlate with RT; and (4) have a positive within-trial correlation.

      Strengths:

      Fearless, creative use of EEG decoding to test tricky hypotheses regarding latent associations.

      Nice idea to orthogonalize ISPC condition (MC/MI) from stimulus features.

      Weaknesses:

      I still have my concern from the first round that the decoders are overfit to temporally structured noise. As I wrote before, the SC and SR classes are highly confounded with phase (chunk of session). I do not see how the control analyses conducted in the revision adequately deal with this issue.

      In the figures, there are several hints that these decoders are biased. Unfortunately, the figures are also constructed in such a way that hides or diminishes the salience of the clues of bias. This bias and lack of transparency discourage trust in the methods and results.

      I have two main suggestions:

      (1) Run a new experiment with a design that properly supports this question.

      I don't make this suggestion lightly, and I understand that it may not be feasible to implement given constraints; but I feel that this suggestion is warranted. The desired inferences rely on successful identification of SC and SR representations. Solidly identifying SC and SR representations necessitates an experimental design wherein these variables are sufficiently orthogonalized, within-subject, from temporally structured noise. The experimental design reported in this paper unfortunately does not meet this bar, in my opinion (and the opinion of a colleague I solicited).

      An adequate design would have enough phases to properly support "cross-phase" cross-validation. Deconfounding temporal noise is a basic requirement for decoding analyses of EEG and fMRI data (see e.g., leave-one-run-out CV that is effectively necessary in fMRI; in my experience, EEG is not much different, when the decoded classes are blocked in time, as here). In a journal with a typical acceptance-based review process, this would be grounds for rejection.

      Please note that this issue of decoder bias would seem to weaken the rest of the downstream analyses that are based on the decoded values. For instance, if the decoders are biased, in the within-trial correlation analysis, how can we be sure that co-fluctuations along certain dimensions within their projected values are driven by signal or noise? A similar issue clouds the LMM decoding-RT correlations.

      (2) Increase transparency in the reporting of results throughout main text.

      Please do not truncate stimulus-aligned timecourses at time=0. Displaying the baseline period is very useful to identify bias, that is, to verify that stimulus-dependent conditions cannot be decoded pre-stimulus. Bias is most expected to be revealed in the baseline interval when the data are NOT baseline-corrected, which is why I previously asked to see the results omitting baseline correction. (But also note that if the decoders are biased, baseline-correcting would not remove this bias; instead, it would spread it across the rest of the epoch, while the baseline interval would, on average, be centered at zero.)

      Please use a more standard p-value correction threshold, rather than Bonferroni-corrected p<0.001. This threshold is unusually conservative for this type of study. And yet, despite this conservativeness, stimulus-evoked information can be decoded from nearly every time bin, including at t=0. This does not encourage trust in the accuracy of these p-values. Instead, I suggest using permutation-based cluster correction, with corrected p<0.05. This is much more standard and would therefore allow for better comparison to many other studies.

      I don't think these things should be done as control analyses, tucked away in the supplemental materials, but instead should be done as a part of the figures in the main text -- including decoding, RSA, cross-trial correlations, and RT correlations.
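
      For readers unfamiliar with the permutation-based cluster correction suggested above, the following is a minimal sketch of one common variant (one-sided, sign-flip permutations, cluster mass as the statistic). The cluster-forming threshold, the number of permutations, the simulated data, and all function names are assumptions for illustration and do not come from the manuscript or the review.

```python
import math
import random

def t_values(data):
    """One-sample t-statistic against zero at each time point.

    data: one list per subject, holding e.g. decoding accuracy minus chance.
    """
    n = len(data)
    out = []
    for t in range(len(data[0])):
        col = [subj[t] for subj in data]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def clusters(tvals, threshold):
    """Return (start, stop, mass) for each run of contiguous t > threshold."""
    found, start, mass = [], None, 0.0
    for i, t in enumerate(tvals):
        if t > threshold:
            if start is None:
                start, mass = i, 0.0
            mass += t
        elif start is not None:
            found.append((start, i, mass))
            start = None
    if start is not None:
        found.append((start, len(tvals), mass))
    return found

def cluster_permutation_test(data, threshold=2.0, n_perm=500, seed=0):
    """Compare each observed cluster mass with the null of maximum cluster masses."""
    rng = random.Random(seed)
    observed = clusters(t_values(data), threshold)
    null_max = []
    for _ in range(n_perm):
        # Under the null, each subject's time course is symmetric around zero,
        # so its sign can be flipped as a whole.
        signs = [rng.choice((-1.0, 1.0)) for _ in data]
        flipped = [[s * v for v in subj] for s, subj in zip(signs, data)]
        masses = [m for _, _, m in clusters(t_values(flipped), threshold)]
        null_max.append(max(masses, default=0.0))
    results = []
    for start, stop, mass in observed:
        p = (1 + sum(m >= mass for m in null_max)) / (n_perm + 1)
        results.append((start, stop, p))
    return results

# Simulated data: 16 subjects, 40 time points, an effect in time points 15 to 24.
sim = random.Random(1)
data = [[sim.gauss(0, 1) + (1.5 if 15 <= t < 25 else 0.0) for t in range(40)]
        for _ in range(16)]
results = cluster_permutation_test(data)
for start, stop, p in results:
    print(f"cluster [{start}, {stop}) corrected p = {p:.3f}")
```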

      Other issues:

      Regarding the analysis of the within-trial correlation of RSA betas, and "Cai 2019" bias:

      The correction that authors perform in the revision -- estimating the correlation within the baseline time interval and subtracting this estimate from subsequent timepoints -- assumes that the "Cai 2019" bias is stationary. This is a fairly strong assumption, however, as this bias depends not only on the design matrix, but also on the structure of the noise (see the Cai paper), which can be non-stationary. No data were provided in support of stationarity. It seems safer and potentially more realistic to assume non-stationarity.

      This analysis was included in the supplemental material. However, given that the correlation analysis presented in the Results is subject to the "Cai 2019" bias, it would seem to be more appropriate to replace that analysis, rather than supplement it.

      Regardless, this seems to be a moot issue, given that the underlying decoders seem to be overfit to temporally structured noise (see point above regarding weakening of downstream analyses based on decoder bias).

      Outliers and t-values:

      More outliers with beta coefficients could be because the original SD estimates from the t-values are influenced more by extreme values. When you use a threshold on the median absolute deviation instead of mean +/-SD, do you still get more outliers with beta coefficients vs t-values?
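
      The contrast drawn here between a mean +/- SD rule and a median absolute deviation (MAD) rule can be sketched as follows. The example values, the cutoff of 3, and the function names are hypothetical; the factor 1.4826 is the usual scaling that makes the MAD comparable to the SD under normality.

```python
import statistics

def sd_outliers(values, n_sds=3.0):
    """Flag values more than n_sds standard deviations from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [abs(v - mean) > n_sds * sd for v in values]

def mad_outliers(values, n_mads=3.0):
    """Flag values more than n_mads scaled MADs from the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    scaled_mad = 1.4826 * mad  # comparable to the SD for normal data
    return [abs(v - med) > n_mads * scaled_mad for v in values]

# Hypothetical beta coefficients with two extreme values at the end.
betas = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0, 0.2, -0.15, 8.0, 9.0]

n_sd_flagged = sum(sd_outliers(betas))
n_mad_flagged = sum(mad_outliers(betas))
print(n_sd_flagged, n_mad_flagged)
```

      In this toy example the two extreme values inflate the SD enough that the mean +/- SD rule flags nothing, whereas the MAD rule flags both.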

      Random slopes:

      Were random slopes (by subject) for all within-subject variables included in the LMMs? If not, please include them, and report this in the Methods.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This useful study uses creative scalp EEG decoding methods to attempt to demonstrate that two forms of learned associations in a Stroop task are dissociable, despite sharing similar temporal dynamics. However, the evidence supporting the conclusions is incomplete due to concerns with the experimental design and methodology. This paper would be of interest to researchers studying cognitive control and adaptive behavior, if the concerns raised in the reviews can be addressed satisfactorily.

      We thank the editors and the reviewers for their positive assessment of our work and for providing us with an opportunity to strengthen this manuscript. Please see below our responses to each comment raised in the reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study focuses on characterizing the EEG correlates of item-specific proportion congruency effects. In particular, two types of learned associations are characterized. One being associations between stimulus features and control states (SC), and the other being stimulus features and responses (SR). Decoding methods are used to identify SC and SR correlates and to determine whether they have similar topographies and dynamics.

      The results suggest SC and SR associations are simultaneously coactivated and have shared topographies, with the inference being that these associations may share a common generator.

      Strengths:

      Fearless, creative use of EEG decoding to test tricky hypotheses regarding latent associations. Nice idea to orthogonalize the ISPC condition (MC/MI) from stimulus features.

      Thank you for acknowledging the strength in EEG decoding and design. We have addressed all your concerns raised below point by point.

      Weaknesses:

      (1a) I'm relatively concerned that these results may be spurious. I hope to be proven wrong, but I would suggest taking another look at a few things.

      While a nice idea in principle, the ISPC manipulation seems to be quite confounded with the trial number. E.g., color-red is MI only during phase 2, and is MC primarily only during Phase 3 (since phase 1 is so sparsely represented). In my experience, EEG noise is highly structured across a session and easily exploited by decoders. Plus, behavior seems quite different between Phase 2 and Phase 3. So, it seems likely that the classes you are asking the decoder to separate are highly confounded with temporally structured noise.

      I suggest thinking of how to handle this concern in a rigorous way. A compelling way to address this would be to perform "cross-phase" decoding, however I am not sure if that is possible given the design.

      Thank you for raising this important issue. To test whether decoding might be confounded by temporally structured noise, we performed a control decoding analysis. As the reviewer correctly pointed out, cross-phase decoding is not possible due to the experimental design. Instead, to maximize temporal separation between the training and test data, we divided the EEG data in phase 2 and in phases 1 and 3 into first and second halves chronologically. Phases 1 and 3 were combined because they share the same MC and MI assignments. We then trained the decoders on one half and tested them on the other half. Finally, we averaged the decoding results across all possible assignments of training and test data. The similar patterns observed (Supplementary Fig. 1) confirmed that the decoding results are unlikely to be driven by temporally structured noise in the EEG data. The clarification has been added to page 13 of the revised manuscript.
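
      One possible reading of this chronological split-half scheme is sketched below. The interpretation of "all possible assignments" as every combination of which half of each phase group is used for training, the toy session layout, and the function names are assumptions, not details confirmed by the response.

```python
from itertools import product

def chronological_halves(trial_indices):
    """Split trial indices (in acquisition order) into first and second half."""
    ordered = sorted(trial_indices)
    mid = len(ordered) // 2
    return ordered[:mid], ordered[mid:]

def split_half_folds(phases):
    """Build (train, test) index lists for every assignment of halves.

    phases: phase label (1, 2, or 3) of each trial, in acquisition order.
    Phases 1 and 3 are pooled, as in the response, and split separately from phase 2.
    """
    groups = {
        "phase2": [i for i, p in enumerate(phases) if p == 2],
        "phase1and3": [i for i, p in enumerate(phases) if p in (1, 3)],
    }
    halves = [chronological_halves(idx) for idx in groups.values()]
    folds = []
    # pick[g] selects which half of group g is used for training (assumed scheme).
    for pick in product((0, 1), repeat=len(halves)):
        train, test = [], []
        for pair, chosen in zip(halves, pick):
            train += pair[chosen]
            test += pair[1 - chosen]
        folds.append((sorted(train), sorted(test)))
    return folds

# Toy session: 2 trials in phase 1, 4 in phase 2, 4 in phase 3.
phases = [1] * 2 + [2] * 4 + [3] * 4
folds = split_half_folds(phases)
for train, test in folds:
    print("train", train, "test", test)
```

      A decoder would then be fit on each training set and scored on the matching test set, with accuracies averaged over the folds.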

      (1b) The time courses also seem concerning. What are we to make of the SR and SC timecourses, which have aggregate decoding dynamics that look to be <1Hz?

      As detailed in the response to your next comment, some new results using data without baseline correction show a narrower time window of above-chance decoding. We speculate that the remaining results of long-lasting above-chance decoding could be attributed to trials with slow responses (some responses were made near the response deadline of 1500 ms). Additionally, as shown in Figure 6a, the long-lasting above-chance decoding seems to be driven by color and congruency representations. Thus, it is also possible that the binding of color and congruency contributes to decoding. This interpretation has been added to page 17 of the revised manuscript.

      (1c) Some sanity checks would be one place to start. Time courses were baselined, but this is often not necessary with decoding; it can cause bias (10.1016/j.jneumeth.2021.109080), and can mask deeper issues. What do things look like when not baselined? Can variables be decoded when they should not be decoded? What does cross-temporal decoding look like - everything stable across all times, etc.?

      As the reviewer mentioned, baseline-corrected data may introduce bias to the decoding results. Thus, we cited the van Driel et al. (2021) paper in the revised manuscript to justify the use of EEG data without baseline correction in the decoding analysis (Page 27 of the revised manuscript), and re-ran all decoding analyses accordingly. The new analyses revealed largely similar results (Fig. 2, 4, 6 and 8 in the revised manuscript) with the following exceptions: a narrower time window for separable SC and SR subspaces (Fig. 4b), a narrower time window for concurrent representations of SC and SR (Fig. 6a-b), and a wider time window for the correlations of SC/SR representations with RTs (Fig. 8).

      (2) The nature of the shared features between SR and SC subspaces is unclear.

      The simulation is framed in terms of the amount of overlap, revealing the number of shared dimensions between subspaces. In reality, it seems like it's closer to 'proportion of volume shared', i.e., a small number of dominant dimensions could drive a large degree of alignment between subspaces.

      What features drive the similarity? What features drive the distinctions between SR and SC? Aside from the temporal confounds I mentioned above, is it possible that some low-dimensional feature, like EEG congruency effect (e.g., low-D ERPs associated with conflict), or RT dynamics, drives discriminability among these classes? It seems plausible to me - all one would need is non-homogeneity in the size of the congruency effect across different items (subject-level idiosyncrasies could contribute: 10.1016/j.neuroimage.2013.03.039).

      Thank you for this question. To test what dimensions are shared between SC and SR subspaces, we first identified which factors can be shared across SC and SR subspaces. For SC, the eight conditions are the four colors × ISPC. Thus, the possible shared dimensions are color and ISPC. Additionally, because the four colors and words are divided into two groups (e.g., red-blue and green-yellow, counterbalanced across subjects, see Methods), the group is a third potential shared dimension. Similarly, for SR decoders, potential shared dimensions are word, ISPC and group. Note that each class in SC and SR decoders has both congruent and incongruent trials. Thus, congruency is not decodable from SC/SR decoders and hence unlikely to be a shared dimension in our analysis. To test the effect of sharing for each of the potential dimensions, we performed RSA on decoding results of the SC decoder trained on SR subspace (SR | SC) (Supplementary Fig. 4a) and the SR decoder trained on SC subspace (SC | SR) (Supplementary Fig. 4b), where the decoders indicated the decoding accuracy of shared SC and SR representations. In the SC classes of SR | SC, the words red and blue were mixed within the same class, as were the words yellow and green. The similarity matrix for “Group” of SR | SC (Supplementary Fig. 4a) shows the comparison between two word groups (red & blue vs. yellow & green). The similarity matrix for “Group” of SC | SR (Supplementary Fig. 4b) shows the comparison between two color groups (red & blue vs. yellow & green).

      The RSA results revealed that the contributions of group to the SC decoder (Supplementary Fig. 5a) and the SR decoder (Supplementary Fig. 5b) were significant. Meanwhile, a wider time window showed a significant effect of color on the SC decoder (approximately 100 - 1100 ms post-stimulus onset, Supplementary Fig. 5a) and a narrower time window showed a significant effect of word on the SR decoder (approximately 100 - 500 ms post-stimulus onset, Supplementary Fig. 5b). However, we found no significant effect of ISPC on either the SC or the SR decoder. We also performed the same analyses on response-locked data from the time window -800 to 200 ms. The results showed shared representation of color in the SC decoder (Supplementary Fig. 5c) and group in both decoders (Supplementary Fig. 5c-d). Overall, the above results demonstrated that color, word and group information are shared between SC and SR subspaces.

      Lastly, we would like to stress that our main hypothesis for the cross-subspace decoding analysis is that SR and SC subspaces are not identical. This hypothesis was supported by lower decoding accuracy for cross-subspace than within-subspace decoders and enables the subsequent analyses that treated SC and SR as separate representations.

      We have added the interpretation to pages 13-14 of the revised manuscript.

      (3) The time-resolved within-trial correlation of RSA betas is a cool idea, but I am concerned it is biased. Estimating correlations among different coefficients from the same GLM design matrix is, in general, biased, i.e., when the regressors are non-orthogonal. This bias comes from the expected covariance of the betas and is discussed in detail here (10.1371/journal.pcbi.1006299). In short, correlations could be inflated due to a combination of the design matrix and the structure of the noise. The most established solution, to cross-validate across different GLM estimations, is unfortunately not available here. I would suggest that the authors think of ways to handle this issue.

      Thank you for raising this important issue. Because the bias comes from the covariance between the regressors and the same GLM was applied to all time points in our analysis, we assume that the inflation would be similar at different time points. Therefore, we calculated the correlation of SC and SR betas ranging from -200 to 0 ms relative to stimulus onset as a baseline (i.e., no SC or SR representation is expected before the stimulus onset) and compared the post-stimulus onset correlation coefficients against this baseline. We hypothesized that if the positive within-trial correlation of SC and SR betas resulted from simultaneous representation instead of inflation, we should observe a significantly higher correlation when compared with the baseline. To examine this hypothesis, we first performed the linear discriminant analysis (Supplementary Fig. 7a) and RSA regression (Supplementary Fig. 7b) on the -200 - 0 ms window relative to stimulus onset. We then calculated the average r<sub>baseline</sub> of SC and SR betas on that time window for each participant (group results at each time point are shown in Supplementary Fig. 7c) and computed the relative correlation at each post-stimulus onset time point using (Fisher-z(r) - Fisher-z(r<sub>baseline</sub>)). Finally, we performed a simple t test at the group level on baseline-corrected correlation coefficients with Bonferroni correction. The results (Fig. 6c) showed a significantly more positive correlation from 100 - 500 ms post-stimulus onset compared with baseline, supporting our hypothesis that the positive within-trial correlation of SC and SR betas arises from simultaneous representation rather than inflation. The related interpretation was added to page 17 of the revised manuscript.
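
      The baseline-corrected correlation described in this response can be written out as a short sketch. The per-participant correlation values and the number of tested time points below are made up for illustration, and the function names are hypothetical; only the Fisher-z difference, the group-level one-sample t test, and the Bonferroni correction follow the description above.

```python
import math

def fisher_z(r):
    """Fisher z-transform of a correlation coefficient."""
    return math.atanh(r)

def baseline_corrected_z(r_post, r_baseline):
    """Fisher-z(r) minus Fisher-z(r_baseline) for one participant and time point."""
    return fisher_z(r_post) - fisher_z(r_baseline)

def one_sample_t(values):
    """t-statistic testing whether the mean of values differs from zero."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(var / n)

# Hypothetical per-participant SC/SR beta correlations at one post-stimulus
# time point, and the same participants' average pre-stimulus correlations.
r_post = [0.30, 0.25, 0.40, 0.35, 0.20, 0.45]
r_base = [0.10, 0.05, 0.15, 0.20, 0.00, 0.10]

diffs = [baseline_corrected_z(rp, rb) for rp, rb in zip(r_post, r_base)]
t_stat = one_sample_t(diffs)

# Bonferroni correction: divide alpha by the number of tested time points
# (the count used here is an assumption).
n_timepoints = 50
alpha_corrected = 0.05 / n_timepoints
print(f"t({len(diffs) - 1}) = {t_stat:.2f}, corrected alpha = {alpha_corrected}")
```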

      (4) Are results robust to running response-locked analyses? Especially the EEG-behavior correlation. Could this be driven by different RTs across trials & trial-types? I.e., at 400 ms poststim onset, some trials would be near or at RT/action execution, while others may not be nearly as close, and so EEG features would differ & "predict" RT.

      Thanks for this question. We now pair each of the stimulus-locked EEG analyses in the manuscript with a response-locked analysis. To control for RT variations among trial types, when using the linear mixed model (LMM) to predict RTs from trial-wise RSA results, we included a separate intercept for each of the eight trial types in SC or SR. Furthermore, at each time point, we only included trials that had not yet generated a response (for stimulus-locked analysis) or had already started (for response-locked analysis). All the results (Fig. 3, 5, 7, 9 in the revised manuscript) are in support of our hypothesis. We added these details to page 31 of the revised manuscript.

      (5) I suggest providing more explanation about the logic of the subspace decoding method - what trialtypes exactly constitute the different classes, why we would expect this method to capture something useful regarding ISPC, & what this something might be. I felt that the first paragraph of the results breezes by a lot of important logic.

      In general, this paper does not seem to be written for readers who are unfamiliar with this particular topic area. If authors think this is undesirable, I would suggest altering the text.

      To improve clarity, we revised the first paragraph of the SC and SR association subspace analysis to list the conditions for each of the SC and SR decoders and to explain in more detail how separability can be tested by cross-decoding between the SC and SR subspaces. The revised paragraph now reads:

      “Prior to testing whether controlled and non-controlled associations were represented simultaneously, we first tested whether the two representations were separable in the EEG data.

      In other words, we reorganized the 16 experimental conditions into 8 conditions for SC (4 colors × MC/MI, while collapsing across SR levels) and SR (4 words × 2 possible responses per word, while collapsing across SC levels) associations separately. If SC and SR associations are not separable, it follows that they encode the same information, such that both SC and SR associations can be represented in the same subspace (i.e., by the same information encoded in both associations). For example, because (1) the word can be determined by the color and congruency and (2) the most-likely response can be determined by color and ISPC, the SR association (i.e., association between word and most-likely response) can in theory be represented using the same information as the SC association. On the other hand, if SC and SR associations are separable, they are expected to be represented in different subspaces (i.e., the information used to encode the two associations is different). Notably, if some, but not all, information is shared between SC and SR associations, they are still separable by the unique information encoded. In this case, the SC and SR subspaces will partially overlap but still differ in some dimensions. To summarize, whether SC and SR associations are separable is operationalized as whether the associations are represented in the same subspace of EEG data. To test this, we leveraged the subspace created by the LDA (see Methods). Briefly, to capture the subspace that best distinguishes our experimental conditions, we trained SC and SR decoders using their respective aforementioned 8 experimental conditions. We then projected the EEG data onto the decoding weights of the LDA for each of the SC and SR decoders to obtain its respective subspace. We hypothesized that if SC and SR subspaces are identical (i.e., not separable), SC/SR decoding accuracy should not differ by which subspace (SC or SR) the decoder is trained on. 
For example, SC decoders trained in SC subspace should show similar decoding performance as SC decoders trained in SR subspace. On the other hand, if SC and SR association representations are in different subspaces, the SC/SR subspace will not encode all information for SR/SC associations. As a result, decoding accuracy should be higher using its own subspace (e.g., decoding SC using the SC subspace) than using the other subspace (e.g., decoding SC using the SR subspace). We used cross-validation to avoid artificially higher decoding accuracy for decoders using their own subspace (see Methods).” (Page 11-12).

      We also explicitly tested what information is shared between SC and SR representations (see response to comment #2). Lastly, to help the readers navigate the EEG results, we added a section “Overview of EEG analysis” to summarize the EEG analysis and their relations in the following manner:

      “EEG analysis overview. We started by validating that the 16 experimental conditions (8 unique stimuli × MC/MI) were represented in the EEG data. Evidence of representation was provided by above-chance decoding of the experimental conditions (Fig. 2-3). We then examined whether the SC and SR associations were separable (i.e., whether SC and SR associations were different representations of equivalent information). As our results supported separable representations of SC and SR association (Fig. 4-5), we further estimated the temporal dynamics of each representation within a trial using RSA. This analysis revealed that the temporal dynamics of SC and SR association representations overlapped (Fig. 6a-b, Fig. 7a-b). To explore the potential reason behind the temporal overlap of the two representations, we investigated whether SC and SR associations were represented simultaneously as part of the task representation, independently from each other, or competitively/exclusively (e.g., on some trials only SC association was represented, while on other trials only SR association was represented). This was done by assessing the correlation between the strength of SC and SR representations across trials (Fig. 6c, Fig. 7c). Lastly, we tested how SC and SR representations facilitated performance (Fig.8-9).” (Page 8-9).

      Minor suggestions:

      (6) I'd suggest using single-trial RSA beta coefficients, not t-values, as they can be more stable (it's a t-value based on 16 observations against 9 or so regressors.... the SE can be tiny).

      Thank you for your suggestion. To choose between using betas and t-values, we calculated the proportion of outliers (defined as values beyond mean ± 5 SD) for each predictor of the design matrix and each subject. We found that outliers were less frequent for t-values than for beta coefficients (t-values: mean = 0.07%, SD = 0.009%; beta-values: mean = 0.19%, SD = 0.033%). Thus, we decided to stay with t-values.
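      As a minimal sketch of the outlier criterion, the proportion of values beyond mean ± 5 SD for one predictor and one subject could be computed as below. The function name and the toy data are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def outlier_proportion(values, n_sd=5.0):
    """Fraction of values lying more than n_sd standard deviations from the mean."""
    m = mean(values)
    s = stdev(values)  # sample SD; assumed here, not specified in the response
    n_out = sum(1 for v in values if abs(v - m) > n_sd * s)
    return n_out / len(values)


# Toy single-trial estimates: 1000 well-behaved values plus one extreme value.
estimates = [-1.0, 1.0] * 500 + [100.0]
prop = outlier_proportion(estimates)
```

      The same function would be applied separately to the beta and t-value series of each predictor and subject before averaging the proportions.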

      (7) Instead of prewhitening the RTs before the HLM with drift terms, try putting those in the HLM itself, to avoid two-stage regression bias.

      Thank you for your suggestion. Because our current LMM included each of the eight trial types in SC or SR as separate predictors with their own intercepts (as mentioned above), adding regressors of trial number and mini blocks (1-100 blocks) introduced collinearity (as ISPC flipped during the experiment). We therefore excluded these regressors from the current LMM (Page 31).

      (8) The text says classical MDS was performed on decoding *accuracy* - is this accurate?

      We now clarify in the manuscript that it is the decoders’ probabilistic classification results (Page 28).

      (9) At a few points, it was claimed that a negative correlation between SC and SR would be expected within single trials, if the two were temporally dissociable. Wouldn't it also be possible that they are not correlated/orthogonal?

      We agree with the reviewer and revised the null hypothesis in the cross-trial correlation analysis to include no correlation as SC and SR association representations may be independent from each other (Page 17, 22).

      Reviewer #2 (Public review):

      Summary:

      In this EEG study, Huang et al. investigated the relative contribution of two accounts to the process of conflict control, namely the stimulus-control association (SC), which refers to the phenomenon that the ratio of congruent vs. incongruent trials affects the overall control demands, and the stimulus-response association (SR), stating that the frequency of stimulus-response pairings can also impact the level of control. The authors extended the Stroop task with novel manipulation of item congruencies across blocks in order to test whether both types of information are encoded and related to behaviour. Using decoding and RSA, they showed that the SC and SR representations were concurrently present in voltage signals, and they also positively co-varied. In addition, the variability in both of their strengths was predictive of reaction time. In general, the experiment has a solid design, but there are some confounding factors in the analyses that should be addressed to provide strong support for the conclusions.

      Strengths:

      (1) The authors used an interesting task design that extended the classic Stroop paradigm and is potentially effective in teasing apart the relative contribution of the two different accounts regarding item-specific proportion congruency effect, provided that some confounds are addressed.

      (2) Linking the strength of RSA scores with behavioural measures is critical to demonstrating the functional significance of the task representations in question.

      Thank you for your positive feedback. We hope our responses below address your concerns.

      Weakness:

      (1) While the use of RSA to model the decoding strength vector is a fitting choice, looking at the RDMs in Figure 7, it seems that SC, SR, ISPC, and Identity matrices are all somewhat correlated. I wouldn't be surprised if some correlations would be quite high if they were reported. Total orthogonality is, of course, impossible depending on the hypothesis, but from experience, having highly covaried predictors in a regression can lead to unexpected results, such as artificially boosting the significance of one predictor in one direction, and the other one to the opposite direction. Perhaps some efforts to address how stable the time-resolved RSA correlations for SC and SR are with and without the other highly correlated predictors will be valuable to raising confidence in the findings.

      Thank you for this important point. The proportion of variability explained, shown in Author response table 1 below, indicated relatively higher correlations of SC/SR with Color and Identity. We agree that it is impossible to fully orthogonalize them. To address the issue of collinearity, we performed a control RSA after removing predictors highly correlated with others. Specifically, we calculated the variance inflation factor (VIF) for each predictor. The Identity predictor had a high VIF of 5 and was removed from the RSA. All other predictors had VIFs < 4 and were kept in the RSA. The results (Supplementary Fig. 6) showed patterns similar to the results with the Identity predictor, suggesting that the findings are not significantly influenced by collinearity. We have added the interpretation to page 17 of the revised manuscript.
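      For readers unfamiliar with the VIF, the sketch below shows one standard way to compute it: the VIF of predictor j equals 1 / (1 - R<sub>j</sub><sup>2</sup>), which is also the j-th diagonal element of the inverse of the predictors' correlation matrix. The helper names and the toy predictor vectors are hypothetical stand-ins for vectorized model RDMs; this is not the code used in the analysis.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(predictors):
    """Map each predictor name to its variance inflation factor."""
    names = list(predictors)
    corr = [[pearson(predictors[a], predictors[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}


# Toy predictors: two orthogonal vectors, then two strongly correlated ones.
orthogonal = vif({"A": [1, -1, 1, -1], "B": [1, 1, -1, -1]})
collinear = vif({"A": [1, 2, 3, 4], "B": [1, 2, 3, 5]})
```

      Orthogonal predictors give a VIF of 1, and the value grows without bound as a predictor becomes a linear combination of the others.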

      Author response table 1.

      Proportion of variability explained (r<sup>2</sup>) of RSA predictors.

      (2) In "task overview", SR is defined as the word-response pair; however, in the Methods, lines 495-496, the definition changed to "the pairing between word and ISPC" which is in accordance with the values in the RDMs (e.g., mccbb and mcirb have similarity of 1, but they are linked to different responses, so should they not be considered different in terms of SR?). This needs clarification as they have very different implications for the task design and interpretation of results, e.g., how correlated the SC and SR manipulations were.

      Thank you for pointing out this important issue with how our operationalization captures the concept in question. In the revised manuscript, we clarified that the stimulus-response (SR) association is the link between the word and the most-likely response (i.e., not necessarily the actual response on the current trial). This association is likely to be encoded based on statistical learning over several trials. On each trial, the association is updated based on the stimulus and the actual response. Over multiple trials, the accumulated association will be driven towards the most-common (i.e., most-likely) response. In our ISPC manipulation, a color is presented in mostly congruent/incongruent (MC/MI) trials, which will also pair a word with a most-likely response. For example, if the color blue is MC, the color blue, which leads to the response blue, will co-occur with the word blue with high frequency. In other words, the SR association here is between the word blue and the response blue. As the actual response is not part of the SR association, in the RDM two trial types with different responses may share the same SR association, as long as they share the same word and the same ISPC manipulation, which, by the logic above, will produce the same most-likely response. These clarifications have been added to pages 4 and 29 of the revised manuscript.

      In the revised manuscript (Page 17), we addressed how much the correlated SC and SR predictors in the RDM could affect the correlation analysis between SC and SR association representation strength. Specifically, we conducted the RSA using the same GLM on EEG data prior to stimulus onset (Supplementary Fig. 7a-b). As no SC and SR associations are expected to be present before stimulus onset, the correlation between SC and SR representations would serve as a baseline of inflation due to correlated predictors in the GLM (Supplementary Fig. 7c, also see comment #3 of R1). The SC-SR correlation coefficients following stimulus onset were then compared to the baseline to control for potential inflation (Fig. 6c). Significantly above-baseline correlation was still observed between ~100-500 ms post-stimulus onset, providing support for the hypothesis that SC and SR are encoded in the same task representation.

      Minor suggestions:

      (3) Overall, I find that calling SC-controlled and SR-uncontrolled representations unwarranted. How is the level controlledness defined? Both are essentially types of statistical expectation that provide contextual information for the block of tasks. Is one really more automatic and requires less conscious processing than the other? More background/justification could be provided if the authors would like to use these terms.

      Following your advice, we have added more discussion on how controlledness is conceptualized in this work and in the literature, which reads:

      “We consider SC and SR as controlled and uncontrolled respectively based on the literature investigating the mechanism of the ISPC effect. The SC account posits that the ISPC effect results from conflict and involves conflict adaptation, which requires the regulation of attention or control (Bugg & Hutchison, 2013; Bugg et al., 2011; Schmidt, 2018; Schmidt & Besner, 2008). On the other hand, the SR account argues that the ISPC effect does not require conflict adaptation but instead reflects contingency learning. That is, the response can be directly retrieved from the association between the stimulus and the most-likely response without top-down regulation of attention or control. As more empirical evidence emerged, researchers advocating the control view began to acknowledge the role of associative learning in cognitive control regarding the ISPC effect (Abrahamse et al., 2016). The SC association has been thought to include both automatic processes that are fast and resource-saving and controlled processes that are flexible and generalizable (Chiu, 2019). Overall, we do not intend to claim that SC is entirely controlled or SR is completely automatic. We use SC-controlled and SR-uncontrolled representations to align with the original theoretical motivation and to highlight the conceptual difference between SC and SR associations.” (Page 24-25)

      (4) Figures 3c and d: the figures could benefit from more explanation of what they try to show to the readers. Also for 3d, the dimensions were aligned with color sets and congruencies, but word identities were not linearly separable, at least for the first 3 axes. Shouldn't one expect that words can be decoded in the SR subspace if word-response pairs were decodable (e.g., Figure 3b)?

      Thank you for the insightful observation. We have now clarified that Fig. 3c and d in the original manuscript (Fig. 4c and d in the current manuscript) aim to show how each of the 8 trial types is represented in the SC and SR subspaces. The MDS approach we used for visualization tries to preserve the dissimilarity between trial types when projecting data from a high-dimensional to a low-dimensional space. However, such a projection may also make patterns that are linearly separable in the high-dimensional space not linearly separable in the low-dimensional space. For example, if the word blue has two points (-1, -1) and (1, 1) and the word red has two points (-1, 1) and (1, -1), they are not linearly separable in the 2D space. Yet, if they are projected from a 3D space with coordinates of (-1, -1, -0.1), (1, 1, -0.1), (-1, 1, 0.1) and (1, -1, 0.1), the two words are linearly separable using the 3<sup>rd</sup> dimension. Thus, a better way to test whether words can be linearly separated in the SR subspace is to perform RSA on the original high-dimensional space. We performed the RSA with word (Supplementary Fig. 2) on the SR decoder trained on the SR subspace. Note that in Fig. 3c and d of the original manuscript (Fig. 4c and d in the current manuscript) there are two pairs of words that are not linearly separable: red-blue and yellow-green. Thus, we specifically tested the separability within the two pairs using one predictor for each pair, as shown in Supplementary Fig. 2. The results showed that within both word pairs individual words were represented above chance level (Supplementary Fig. 3). Considering that the decoders are linear, this finding indicates linear separability of the word pairs in the original SR subspace. The clarification has been added to page 13 (the end of the second paragraph) of the revised manuscript.
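      The blue/red example can be checked directly with a short sketch. The coordinates are the ones given in the response; the helper function and the grid of candidate linear rules are hypothetical and only serve as a demonstration. The 2D case cannot be separated by any line because both classes share the same centroid, so any linear score has the same mean in both classes.

```python
from itertools import product

# Coordinates taken from the example above.
blue_2d = [(-1, -1), (1, 1)]
red_2d = [(-1, 1), (1, -1)]
blue_3d = [(-1, -1, -0.1), (1, 1, -0.1)]
red_3d = [(-1, 1, 0.1), (1, -1, 0.1)]


def separates(weights, bias, class_a, class_b):
    """True if the rule w . x + b puts the two classes strictly on opposite sides."""
    def score(point):
        return sum(w * x for w, x in zip(weights, point)) + bias

    a = [score(p) for p in class_a]
    b = [score(p) for p in class_b]
    return max(a) < 0 < min(b) or max(b) < 0 < min(a)


# In 3D, a rule that reads out only the third dimension separates the words.
separable_3d = separates((0, 0, 1), 0, blue_3d, red_3d)

# In 2D, a search over a grid of candidate lines finds no separating rule.
grid = range(-5, 6)
separable_2d = any(
    separates((w1, w2), b, blue_2d, red_2d) for w1, w2, b in product(grid, grid, grid)
)
```

      Dropping the third coordinate, as a low-dimensional visualization may effectively do, is what turns the separable 3D configuration into the non-separable 2D one.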

      References

      Abrahamse, E., Braem, S., Notebaert, W., & Verguts, T. (2016). Grounding cognitive control in associative learning. Psychological Bulletin, 142(7), 693-728. doi:10.1037/bul0000047.

      Bugg, J. M., & Hutchison, K. A. (2013). Converging evidence for control of color-word Stroop interference at the item level. Journal of Experimental Psychology: Human Perception and Performance, 39(2), 433-449. doi:10.1037/a0029145.

      Bugg, J. M., Jacoby, L. L., & Chanani, S. (2011). Why it is too early to lose control in accounts of item-specific proportion congruency effects. Journal of Experimental Psychology: Human Perception and Performance, 37(3), 844-859. doi:10.1037/a0019957.

      Chiu, Y.-C. (2019). Automating adaptive control with item-specific learning. In Psychology of Learning and Motivation (Vol. 71, pp. 1-37).

      Schmidt, J. R. (2018). Evidence against conflict monitoring and adaptation: An updated review. Psychonomic Bulletin & Review, 26(3), 753-771. doi:10.3758/s13423-018-1520-z.

      Schmidt, J. R., & Besner, D. (2008). The Stroop effect: Why proportion congruent has nothing to do with congruency and everything to do with contingency. Journal of Experimental Psychology: Learning, Memory, and Cognition, 34(3), 514-523. doi:10.1037/0278-7393.34.3.514.

    1. Each circle identifies what students do. Students 1) imagine, examine, and perceive; 2) explore, experiment, and develop craft; 3) create; 4) reflect, assess, and revise, and 5) share their products with others. The arrows indicate the ways teachers can guide students through the creative process.

      I appreciate having these steps to help students through the creative process. It is so important that students take ownership of their creativity, but I have often asked, "How do you do this?" These steps really help lay it out and help us know how to motivate them.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      General Response to Review

      We would like to thank all three reviewers for their encouraging comments on our manuscript. We now submit our revised study after considerable efforts to address each of the reviewers' concerns. I will first provide a response related to a major change we have made in the revision that addressed a concern common to all three reviewers, followed by a point-by-point response to individual comments.

      Replacing LRRK2ARM data with a LRRK2-specific type II kinase inhibitor: The most critical issue for all 3 reviewers was the use of our new CRISPR-generated truncation mutant of LRRK2 that we called LRRK2ARM. We had not provided direct evidence of the protein product of this truncation, which was a significant limitation. To address this, we performed proteomics analysis of all clones, and to our surprise, we identified 7 peptides that were C-terminal to the "predicted" stop codon we had engineered into the CRISPR design. A repeat of the deep sequencing analysis in both directions then more clearly revealed site-specific mutations leading to 4 amino acid changes at the junction of exon 19, without introducing a stop codon. Given that we could not detect the protein by western blot (even though proteomics now indicated the region of LRRK2 recognized by our antibodies was present), we decided to remove this clone from the manuscript. In the meantime, we had compared the ineffectiveness of MLi-2 to block Rab8 phosphorylation during iron overload in the LRRK2G2019S cells with a type II kinase inhibitor called rebastinib. The data showed very clearly that treatment with rebastinib reversed the iron-induced phospho-Rab8 at the plasma membrane (and by western blot, in new Fig 3). Since this inhibitor is very broad spectrum, inhibiting ~30% of the kinome, we reached out to Sam Reck-Peterson and Andres Leschziner, experts in LRRK2 structure/function, who recently developed much more selective LRRK2-specific type II kinase inhibitors, which they called RN341 and RN277 (developed with Stefan Knapp PMID: 40465731). These compounds effectively coupled the MLi-2 compound through an indole ring to a rebastinib type II compound to provide LRRK2 binding specificity to the efficient DYG "out" type II inhibitor. As with rebastinib, the new LRRK2-specific kinase inhibitors also effectively reversed the cell surface p-Rab8 seen in LRRK2G2019S, iron loaded cells. 
These new data provide the first biological paradigm where the kinase activity of LRRK2 is resistant to type I MLi-2, yet remains highly sensitive to type II inhibitors. While the loss of our LRRK2ARM clone marks a significant change in the manuscript we believe the main message is stronger with the addition of the new LRRK2 specific type II kinase inhibitor. Our data show that it is indeed the active kinase function of LRRK2G2019S that is impacting the iron phenotypes we observe but highlight the conformational specificity upon iron overload such that MLi-2 is ineffective. The overall phenotypes we observe in LRRK2G2019S macrophages remain unchanged and are now expanded within the manuscript. We hope reviewers will agree that our work provides important new insights into LRRK2 function in iron homeostasis while opening new avenues of research in future studies.

      Given this new information we have changed the title from "LRRK2G2019S acts as a dominant interfering mutant in the context of iron overload" to the more accurate "LRRK2G2019S interferes with NCOA4 trafficking in response to iron overload leading to oxidative stress and ferroptotic cell death."

      Response to Reviewer 1

      Reviewer 1 (R1): There are two major concerns with the data in their present form. In brief, first, the G2019S cells express much less LRRK2 and more Rab8 than the WT cells and this severely affects interpretability.

      Heidi McBride (HM): We agree that the LRRK2G2019S lines express lower levels of LRRK2 than wild type, which is a previously documented phenomenon, presumably as the cell attempts to downregulate the increased kinase activity by reducing protein expression. However, the levels of Rab8 across 10s of experiments do not consistently show any differences between the wild type, G2019S and KO. We have provided more comprehensive quantifications of the blots in the revised version, and the Rab8 levels are consistent across all the blots presented in the manuscript (Figure 1A and 1B).

      R1: Second, the investigators used CRISPR to truncate the endogenous LRRK2 locus to produce a hypothetical truncated LRRK2-ARM polypeptide. This appears to have robust effects on NCOA4, in particular, which drives the overall interpretation of the data. However, the expression of this novel LRRK2 species is not confirmed nor compared to WT or G2019S in these cells (although admittedly the investigators did seek to address this with subsequent KO in the ARM cells). It would be premature to account for the changes reported without evidence of protein expression. This latter issue may be more easily addressed and could provide very strong support for a novel function/finding, see more detailed comments below, most seeking clarifications beyond the above.

      HM: As described in my common response above, we have removed the LRRK2ARM data from the manuscript.

      R1: Need to make clear in the results whether the G2019S CRISPR mutant is heterozygous or homozygous (presumably homozygous, same for ARM)

      HM: The RAW cell line we generated is homozygous for the G2019S and the KO alleles. We added this to the beginning of the results section and methods.

      R1: The text of the results implies that MLi2 was used in both WT and G2019S Raw cells, but it's only shown for G2019S. Given the premise for the use of RAW cells, it's important to show that there is basal LRRK2 kinase activity in WT cells to go along with its high protein expression. This is particularly important as the G2019S blot suggests minor LRRK2-independent phosphorylation of Rab8a (and other detected pRabs). One would imagine that pRab8 levels in both WT and G2019S would reduce to the same base line or ratio of total Rab in the presence of MLi2, but WT untreated is similar to G2019S with MLi2. This suggests no basal LRRK2 activity in the Raw cells, but I don't think that is the case.

      HM: We have included the data from MLi-2 treatment of wild type cells in Fig 3C, quantified in D. Again, the baseline levels of Rab8 are unchanged across the genotypes. However, the reviewer is correct that there is some baseline LRRK2 kinase activity that is sensitive to MLi2 in wild type cells. This is seen most clearly in the autophosphorylation of LRRK2 at S1292 in Fig 3C. The pRab8 blot is not as clear in wild type cells. It is likely that LRRK2 must be actively recruited to membranes (as seen by others with LLOME, etc) to easily visualize p-Rabs in wild type cells. Nevertheless, we do clearly see the activity of autophosphorylation in wild type cells. Therefore, while we understand the reviewer's point that there should be some Rab8 phosphorylation in wild type cells, we don't see a significant, or very convincing, amount of it in our RAW macrophages.

      R1: Also, in terms of these cells, the levels of LRRK2 are surprisingly unmatched (Fig 1A, 1D, 1H, S1D, etc.) as are total levels of Rab8 (but in opposite directions) between the WT and G2019S. This is not mentioned in the Results text and is clearly reproducible and significant. Why do the investigators think this is? If Rab8 plays a role in iron, how do these differences affect the interpretation of the G2019S cells (especially given that MLi2 does not rescue)? Are other LRRK2-related Rabs affected at the protein (not phosphorylation level)? Could reduced levels of LRRK2 or increase Rab 8 alone or together account for some of these differences? Substantial further characterization is required as this seriously affects the interpretability of the data. Since pRab8 is not normalized to total Rab8, this G2019S model may not reflect a total increase in LRRK2 kinase activity, and could in fact have both less LRRK2 protein and less cellular kinase activity than WT (in this case).

      HM: In our hands, the RAW cells with homozygous LRRK2G2019S mutations show clearly that the total protein level of LRRK2 is reduced compared to wild type, which is likely a compensatory effect to reduce cellular kinase activity overall. We understand that some of our previous blots were not so clear on the total Rab8 levels across the different experiments. We have repeated many of these experiments and hope the reviewer can see in Figs 1A, 3C, 3E, 3J, and Sup3A that the total Rab8 levels are stable across the conditions. We also present quantifications from 3 independent experiments normalizing the pRab8/Rab8 levels in all three genotypes in untreated and iron-loaded conditions (Supp Fig 3A and B), and upon MLi2 treatment (Fig 3C). In 3C and D the data show the effectiveness of MLi-2 in reducing pRab8 in control conditions, but the resistance to MLi-2 in FAS treated cells.

      R1: Presumably, the blots in 1H are whole cell lysates and account for the pooled soluble and insoluble NCOA4 (increased in G2019S), as there is no difference in soluble NCOA4 (Fig 2H). I suspect the prior difference is nicely reflected in the insoluble fraction (Fig 2H). This should be better explained in the Results text. This is a very interesting finding and I wonder what the investigators believe is driving this phenotype? Is the NCOA4 partitioning into a detergent-inaccessible compartment? Does this replicate with other detergents, those perhaps better at solubilizing lipid rafts? Is this a phenotype reversible with MLi2? Very interesting data.

      HM: We apologize for not being clearer in the text describing the behavior of NCOA4. The reviewer is correct that the major change in G2019S is the increased Triton X-100 insoluble NCOA4. Previous work has established that NCOA4 segregates into detergent-insoluble foci upon iron overload as a way to release it from ferritin cages, and this fraction is then internalized into lysosomes through a microautophagy pathway (see Mizushima's work PMID: 36066504). In Fig 1I we show that the elevation in NCOA4 and ferritin heavy chain seen in untreated G2019S cells can be cleared upon iron chelation with DFO, indicating that the canonical NCOA4 mediated ferritinophagy (macroautophagy) pathway remains intact to recycle the iron in conditions of iron starvation. However, in Figure 2 we show that in conditions of iron overload, when NCOA4 segregates from ferritin (to allow cytosolic storage of iron), this form of NCOA4 cannot be degraded within the lysosome through the microautophagy pathway, and begins to accumulate. We see this with our live and fixed imaging compared to wild type cells (Fig 2A,D), and by the lack of clearance seen by western blot (Fig 2E). As for the impact of MLi-2, we observe some reversal of NCOA4 accumulation in untreated cells at 4 and 8 hrs after MLi-2 treatment (Supp Fig 2F). However, in iron loaded conditions the high NCOA4 levels in G2019S cells are MLi2 insensitive, while the elevated NCOA4 in wild type cells is reduced upon MLi2 addition (Fig. 2F, compare lanes 3 vs 4 in wild type with lanes 7 vs 8 in G2019S). This is consistent with a block in the microautophagy pathway of phase-separated NCOA4 degradation in G2019S cells.

      R1: Figure 2 describes the increased NCOA4-positive iron structures after iron load, but does not emphasize that the G2019S cells begin preloaded with more NCOA4. How do the investigators account for differential NCOA4 in this interpretation? Is this simply a reflection of more NCOA4 available in G2019S cells? This seems reasonable.

      HM: The reviewer is correct. We showed that there is some turnover of NCOA4 in untreated conditions through canonical ferritinophagy, but in iron overload this appears to be blocked: the NCOA4 segregates from ferritin and remains within insoluble, phase-separated structures that cannot be degraded through microautophagy. We have rewritten the text to be clearer on these points.

      R1: These are very long exposures to iron, some as high as 48 hr which will then take into account novel transcriptomic and protein changes. Did the investigators evaluate cell death? Iron uptake would be trackable much quicker.

      HM: We agree that many things will change after our FAS treatments and now provide a full proteomics dataset on wild type and G2019S cells with and without iron overload, which is presented in Figure 4A-B. Indeed Figure 4 is entirely new to this revised submission. The proteomics highlighted a series of cellular changes that reflect major cell stress responses including the upregulation of HMOX1 (western blots to validate in Supp Fig 4A), an NRF2 transcriptional target consistent with our observation that NRF2 is stabilized and translocated to the nucleus in G2019S iron loaded cells (Supp Fig 4B,C). There are several interesting changes, and we highlighted the three major nodes: changes in iron response proteins, in lysosomal proteins (particularly a loss of catalytic enzymes like lysozymes and granzymes, consistent with the loss of hydrolytic capacity we show in Fig. 4C,D), and in cytoskeletal proteins, which we suspect is consistent with the "blebbing" of the plasma membrane we see decorated with pRab8 in Fig 3. To test the activation of lipid oxidation likely resulting from the elevation in Fe2+ and oxidation signatures we employed the C11-bodipy probe and observe strong signal specific to the G2019S iron-loaded cells, particularly labelling endocytic compartments and the cell surface (Fig. 4E-G).

      Lastly, an analysis of SYTOX green uptake experiments was done to monitor the uptake of the dye into cells that have died of cell membrane rupture, commonly used to examine ferroptotic cell death. We now show the G2019S cells are very susceptible to this form of death (Fig 4H,I). These data add new functional evidence for the consequence of the G2019S mutation in an increased susceptibility to iron stress.

      R1: The legend for 2F is awkward (BSADQRED)

      HM: We have changed this to BSA-DQRed, which is a widely used probe to monitor the hydrolytic capacity of the lysosome.

      R1: Why are WT cells not included in Fig 2G?

      HM: We have now included new panels in Fig 3C,D showing wild type and G2019S +/- FAS and +/- MLi-2 with quantifications of pRab8/Rab8.

      R1: The biochemical characterization of NCOA4 in the LRRK2-arm cells is a great experiment and strength of the paper. The field would benefit by a bit further interrogation, other detergents, etc.

      HM: We have removed all of the LRRK2ARM data given our confusion over the impact of the 4 amino acid changes in exon 19 and our inability to monitor this protein by western blot. The concept that NCOA4 enters into TX100 insoluble, phase separated compartments has been well established, so we didn't explore other detergents at this point.

      R1: Have the investigators looked for aberrant Rab trafficking to lysosomes in the LRRK2-arm cells? Is pRab8 mislocalized compared to WT? Other pRabs?

      HM: We did initially show that pRab8 was also at the plasma membrane in the LRRK2ARM cells, and we still focus on this finding for the G2019S, seen in Fig 3A,B,F,H. We did try to look at other p-Rabs known to be targets of LRRK2 but none of them worked in immunofluorescence so we couldn't easily monitor specific traffic and/or localization changes for them.

      R1: The expression levels and therefore stability of the ARM fragment is not shown. This is necessary for interpretation. While very intriguing, the data in Aim 3 rely on the assumption that the ARM fragment is expressed, and at comparable levels to G2019S to account for phenotypes. The generation of second clone is admirable, but the expression of the protein must be characterized. This is especially true because of the different LRRK2 levels between WT and G2019S. One could easily conceive of exogenous expression of a tagged-ARM fragment into LRRK2 KO cells, for example, as another proof-of-concept experiment. If it is truly dominant, does this effect require or benefit from some FL LRRK2? It seems easy enough to express the LRRK2-ARM in at least WT and KO RAW cells.

      HM: We agree and our attempts to understand this clone resulted in its removal from the manuscript. We did also express cDNA encoding our ARM domain (up to exon 19), but it didn't phenocopy the CRISPR clone, which of course made sense once we had better proteomics and repeated our deep sequencing.

      In our further efforts to understand why our phenotype was MLi-2 resistant upon iron overload we expanded to examine the impact of pan-specific type II kinase inhibitors, and then reached out to the Reck-Peterson and Leschziner labs to obtain a newly developed LRRK2-selective type II kinase inhibitor. These all very efficiently reversed the pRab8 signals seen at the plasma membrane of G2019S cells upon iron overload (Fig 3E-K). Therefore the G2019S is not dominant negative, as we had initially supposed; rather, there is a specific conformation of LRRK2 in high iron that potentially opens the ATP binding pocket to bind the type II inhibitors, but not MLi-2. We do not understand exactly what this conformation is, but it likely involves new protein interactions specific to high iron, or perhaps LRRK2 binds iron directly as a sensor somehow that ultimately leads to the differential sensitivity we observe between type I and type II kinase inhibitors. Our data indicate that MLi-2 treatment in the clinic will not be protective against iron toxicity phenotypes that may contribute to PD, whereas these newer selective type II LRRK2 kinase inhibitors would be effective in this conformation-specific context of iron toxicity.

      R1: Does iron overload induce Rab8a phosphorylation in a LRRK2 KO cell? This would be a solid extension on the ARM data and support the important finding that an additional kinase(s) can phosphorylate Rab8a under these conditions, and while not unexpected, this may not have been demonstrated by others as clearly. It also addresses whether the ARM domain is important to this other putative kinase(s), which may add value to the authors' model.

      HM: Iron overload does not induce pRab8 in LRRK2 KO cells, as seen by immunofluorescence in Fig 3A,B, and western blot in Supp Fig 3 A,B. With our new type II kinase inhibitor data we can confirm that the plasma membrane localized Rab8 is indeed phosphorylated by LRRK2.

      R1: Minor concern - the abstract but not the introduction emphasizes a hypothesis that loss of neuromelanin may promote cell loss in PD (through loss of iron chelation), while post mortem studies are by definition only correlative, early works suggested that the higher melanized DA neurons were preferentially lost when compared to poorly melanized neurons in PD. This speculation in the abstract is not necessary to the novel findings of the paper.

      HM: We appreciate that the links to iron in PD are correlative, we have maintained some of our discussion on this point within the manuscript given the lack of attention the field has paid to the cell biology of iron homeostasis in PD models. If there is a cell autonomous nature to the loss of DA neurons in PD, iron is very likely to be a part of this specificity in our opinion. Most of the newer MRI studies looking at iron levels in patient brains are showing higher free iron and working on this as potential biomarkers of disease. The precise timing of this relative to the stability/loss of neuromelanin is, I agree, not really clear.

      R1: (Significance (Required)): This study could shed light on a both novel and unexpected behavior of the LRRK2 protein, and open new insights into how pathogenic mutations may affect the cell. While studied in one cell line known for unusually high LRRK2 expression levels, data in this cell type have been broadly applicable elsewhere. Given the link to Parkinson's disease, Rab-dependent trafficking, and iron homeostasis, the findings could have import and relevance to a rather broad audience.

      HM: We are so very appreciative that reviewer 1 feels our work will be of interest to the PD and cell biology communities.

      Response to Reviewer 2

      Reviewer 2 (R2): Major: Please confirm that the observed phenotype is conserved within bone marrow-derived macrophages of LRRK2 G2019S mice. These mice are widely available within the community and frozen bone marrow could be sent to the labs. The main reason for this experiment is that CRISPR macrophage cell lines do sometimes acquire weird phenotypes (at least in our lab they sometimes do!) and it would strengthen the validity of the observations.

      HM: We did a series of experiments on primary BMDM derived from 3 pairs of wild type, LRRK2G2019S and LRRK2KO mice. We examined levels of ferritin heavy and light chains in steady state and with FAS treatment experiments. Unfortunately the data did not phenocopy the RAW macrophage lines we present here since FTL and FTH were mostly unchanged. We did observe an increase in NCOA4 levels, consistent with potential issues with microautophagy as observed in our RAW system.

      While we understand the danger that our phenotypes are nonspecific and linked to a CRISPR-based anomaly, there are a number of arguments we would make that these data and pathways are potentially very important to our understanding of LRRK2 mutant phenotypes and pathology. The first point is that we now include a LRRK2-specific type II kinase inhibitor that reverses the iron-overload pRab8 accumulation at the plasma membrane in LRRK2G2019S cells, showing that this is at least directly linked to LRRK2 kinase activity, even though it is resistant to MLi2.

      Second, Suzanne Pfeffer recently published their single cell RNAseq datasets from brains of untreated LRRK2G2019S mice (PMID: 39088390). She reported major changes in Ferritin heavy chain (it is lost) in very specific cell types of the brain, astrocytes, microglia and oligodendrocytes, with no changes in other cell types at all (her Fig 6 included left). This is consistent with a very context specific impact of LRRK2 on iron homeostasis that we don't yet understand.

      Third, the Cookson, Mamais and LaVoie labs have been working on the impact of LRRK2 mutations on iron handling in a few different model systems, including iPSCs, and see changes in transferrin recycling and iron accumulation. Those studies did not go into much detail on ferritin, NCOA4 and other readouts of iron homeostasis but are roughly in agreement with our work here. In the latest bioRxiv study, submitted after we sent this work for review, they concluded their phenotypes were reversed by MLi-2 treatment; however, they required 7 days of treatment for a ~20% restoration in iron levels. Given our work it would seem the impact of LRRK2G2019S in high iron conditions is also very resistant to MLi-2 treatment. In all these studies we do not yet know for sure whether iron overload in the brain may be a precursor to DA neuron cell death, which could be exacerbated in G2019S carriers. But we hope the reviewer will agree that our approach and findings will be useful for the field to expand on these concepts within different models of PD.

      R2: Minor comments: Supplementary Fig 1: I don't think one should normalize all controls to 1 and then do a statistical test as obviously the standard deviation of control is 0.

      HM: We agree with the reviewer that statistical testing is not appropriate when the WT control is fixed to a value of 1, as this necessarily eliminates variance in that group; accordingly, we have removed both statistical comparisons and standard deviation from the WT control while retaining variability measures for all experimental conditions. Raw densitometry values could not be pooled across independent experiments due to substantial inter-blot variability, and therefore normalization to the WT control was used solely to allow relative comparison within experiments, acknowledging the inherent quantitative limitations of Western blot densitometry. Ultimately the magnitude of the changes relative to the control lanes in each biological replicate was consistent across experiments, even if the absolute density of the bands between experiments was not always the same.
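
      The normalization issue discussed above can be illustrated with a minimal sketch. The densitometry values below are invented for illustration only and are not data from the study; the function name `normalize_to_control` is likewise hypothetical. The sketch shows that dividing each blot by its own WT lane fixes the control at exactly 1 in every replicate (zero variance), while the experimental condition retains its replicate-to-replicate spread.

```python
import statistics

# Hypothetical densitometry readings (arbitrary units) from three independent blots.
# These numbers are invented for illustration; they are not data from the study.
blots = [
    {"WT": 1200.0, "G2019S": 2500.0},
    {"WT": 300.0, "G2019S": 660.0},
    {"WT": 5400.0, "G2019S": 10300.0},
]


def normalize_to_control(blot, control="WT"):
    """Express every lane of one blot relative to that blot's control lane."""
    return {lane: value / blot[control] for lane, value in blot.items()}


normalized = [normalize_to_control(b) for b in blots]

wt_ratios = [n["WT"] for n in normalized]
mut_ratios = [n["G2019S"] for n in normalized]

# The control equals 1 in every replicate, so it has no spread to test against.
wt_sd = statistics.stdev(wt_ratios)

# The experimental condition keeps its variability across replicates.
mut_mean = statistics.mean(mut_ratios)
mut_sd = statistics.stdev(mut_ratios)

print(f"WT:     mean={statistics.mean(wt_ratios):.2f}, sd={wt_sd:.2f}")
print(f"G2019S: mean={mut_mean:.2f}, sd={mut_sd:.2f}")
```

      In this toy example the raw band intensities differ several-fold between blots, yet the within-blot ratios are similar, which is the sense in which relative changes can be consistent across experiments even when absolute densities are not.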

      R2: The raw data needs to be submitted to PRIDE or similar.

      HM: All of our data are being uploaded to the GEO databases, protocols to protocols.io, and raw data deposited on the Zenodo site, in compliance with our ASAP funding requirements and the journal's.

      R2: Some of the western blots could be improved. If these are the best shown, I am a little concerned about the reproducibility. How often have they been done?

      HM: We now ensure there is quantification of all the blots for at least 3 independent experiments and have worked to improve the quality of them throughout the revision period.

      R2: (Significance (Required)): Considering the importance of LRRK2 biology in Parkinson's and the new biology shown, this paper will be of great interest to the community and wider research fields.

      HM: We are so very grateful that the reviewer appreciates that the LRRK2 and PD community will find our work of interest. We hope our revisions will prove satisfactory even in the absence of ferritin changes in primary G2019S BMDM.

      Response to Reviewer 3

      Reviewer 3 (R3): What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells.

      HM: We thank the reviewer for pushing us to monitor the functional implications of the iron mishandling upon iron overload in the G2019S RAW cell system. We now add a completely new Figure 4 to get to these functional points. We employed two tools to look at established aspects of ferroptosis, first the C11-bodipy probe that labels oxidized lipids and we see significant signals specific to the G2019S iron loaded cells, where it labels endocytic membranes and the cell surface (Fig 4 E-G). This is consistent with the elevation of free iron 2+. We also used the SYTOX green death assay where the dye is internalized into cells when the cell surface is ruptured and show that G2019S cells die upon iron overload, but not the LRRK2KO or wild type cells (Fig 4 H,I). Lastly, we performed full proteomics analysis of the wt and G2019S RAW cells in iron overload conditions. These data provide a better view of the full stress response initiated in the G2019S cells, including the upregulation of HMOX1 (an NRF2 target gene), changes in lysosomal hydrolytic enzymes consistent with the reduction in BSA-DQRed signals, and in cytoskeleton, which is consistent with the plasma membrane blebbing phenotypes we see in G2019S (Fig. 4A-D and Supp. Fig 4 data). We hope these new data help to position the phenotype into a more physiological output.

      R3: Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      HM: We took this comment to heart and, as explained in the general response, we removed the LRRK2ARM clones from the study. To understand the kinase function in the iron overload conditions we first explored the pan-specific type II kinase inhibitor rebastinib, shown to inhibit LRRK2. In contrast to MLi-2, this drug effectively blocked p-Rab8 in G2019S cells exposed to high iron. However, since it is not specific and likely inhibits about 30-40% of all kinases, we reached out to the Reck-Peterson and Leschziner labs, who have developed a LRRK2-specific type II kinase inhibitor (published in June 2025, PMID: 40465731). They provided these to us (along with a great deal of discussion) and the two drugs both blocked the effect of LRRK2G2019S on p-Rab8 at the plasma membrane. These data show that the phenotypes we observe are indeed linked to the increased kinase activity of LRRK2, even though they are fully resistant to MLi-2. It suggests that high iron results in some alteration in LRRK2 conformation that alters the ability of MLi-2 to block the kinase activity, while still allowing the type II kinase inhibitors, which bind deeper in the ATP-binding pocket, to functionally block activity. We believe that these new data remove a great deal of confusion we had in the initial submission to explain the MLi-2 resistance.

      R3: There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?

      HM: We agree that the G2019S mutation leads to a reduction in total LRRK2 levels in the cell, which is likely a compensatory effect to lower kinase activity in the cell. We do show that the G2019S mutation has clear activation of phosphorylation on both Rab8 and at the autophosphorylation site S1292 of LRRK2, as seen in Fig 1A, quantified in Fig 1B. In untreated conditions, these phosphorylation events are reversible upon treatment with MLi-2. We also provide the sequencing data in the supplement to confirm the presence of the G2019S mutation in this clone, shown in Supp Fig. 1A.

      R3: The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.

      HM: The G2019S and LRRK2 KO are both homozygous. We state this early in the results section and the methods.

      R3: The transferrin phenotype validated through proteomics and western blot is solid. HM: We agree, thank you very much!

      R3: Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.

      HM: We apologize for the confusion. In control cells the Cherry tagged FTL is efficiently cycling through the lysosomes and we don't see a strong cytosolic (diffuse) pool, which likely reflects the relatively iron-poor culture conditions. However, in G2019S cells, there is a highly elevated amount of FTL, with a strong cytosolic/diffuse stain in steady state, with some flux into lysosomes. In this experiment we chelated iron to test whether this cytosolic pool of FTL was capable of clearing through the lysosomes (ferritinophagy). While there is a cytosolic (diffuse) pool that remains, the pool that fluxes into the lysosome increases in G2019S chelated cells. This is also seen by the reduction in total FTL seen by western blot (endogenous FTL). Our conclusion here is that the general ferritinophagy machinery remains functional in G2019S cells. We have changed the term "diffuse" to "cytosolic" and improved our description of this experiment in the text.

      R3: Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability. HM: We have improved the text to be more descriptive of the conclusions and added new sections.

      R3: If the effect is armadillo-dependent, where does LRRK2 G2019S is implicated since there is no kinase domain in these cells?

      HM: Our new data employing the LRRK2-specific type II kinase inhibitors now confirm that the effects of the G2019S on iron overload are indeed kinase dependent, it's just insensitive to MLi2.

      R3: The authors do not show any controls (PCR, sequencing) confirming knockout or truncation. HM: We did higher resolution proteomics and deep sequencing and learned that the "Arm" mutation was not a truncation but a series of 4 point mutations around exon 19. Therefore we removed all data referring to this clone and replaced it with the use of the type II kinase inhibitor experiments. We feel this removed a lot of confusion and provides much clearer conclusions on the role of the kinase activity in iron overload. We may continue to explore why the 4 amino acid mutations created such strong phenotypes, as it could reflect a critical conformational change that impacts the kinase activity. But that is for future work. We now include the sequencing files of the G2019S and KO as Supplementary Data Files 1 and 2.

      R3: The data is interesting and the image quality with the insets is very high. HM: We thank the reviewer for their positive comments!

      R3: Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear. HM: We have removed the clone from the manuscript.

      R3: The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains. HM: We agree and our new drugs allow us to confirm that the phenotypes are due to kinase activity, but there is a new conformation of LRRK2 induced in high iron that renders the kinase domain resistant to MLi-2 inhibition. We discuss this in the manuscript now.

      R3: In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown. HM: These data are now removed.

      R3: In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.

      HM: We show that the pRab8 is specific to the G2019S lines and not seen in LRRK2 KO (Fig 3A,B, Supp. Fig. 3A,B).

      R3: The vinculin bands in figure 4A are misaligned with the rest of the bands.

      HM: We now provide new blots for all of these experiments (in Fig 3) as we removed the LRRK2ARM data from the manuscript and the appropriate loading controls are all included.

      R3: The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed. HM: We now include siRNA validation of Rab8 (vs Rab10) to confirm the specificity of the antibody to pRab8 in IF where it labels the plasma membrane in G2019S iron loaded cells.

      R3: The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      HM: As mentioned above, we have added extensively to our new Fig 4 to include full proteomics analysis of the changes in iron loaded G2019S cells, we use C11-Bodipy probes to monitor lipid oxidation, and SYTOX green assays to monitor cell death through cell surface rupture (consistent with ferroptosis). We thank the reviewer for pushing us to do these experiments and provide further relevance to the potential for LRRK2 mutations to promote cell toxicity during neurodegeneration.

      R3: Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract

      HM: We thank the reviewers for alerting us to the bioRxiv papers, one of which was submitted after we sent our manuscript to review. We are excited to see the growing interest in the impact of LRRK2 function in iron homeostasis and hope our work will contribute to this. Upon reading the study from the LaVoie lab, we note that they do show some sensitivity of the iron loaded phenotype in G2019S cells to MLi-2; however, they see a ~20% reduction in lysosomal iron after 7 days of MLi-2 treatment in astrocytes (their Fig 2L). To us, this is very likely an indication of a relatively high resistance to the drug. I'm sure if they tried these new type II inhibitors the iron load would be much more rapidly reversed. The specificity of their phenotype to Rab8 is also very interesting considering the cell surface localization we see for pRab8 in our iron loaded system. Similar comments apply to the Gutierrez study in macrophages. We have included the findings of these papers within the manuscript and thank the reviewer for pointing them out.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      In this paper, the authors report an interesting phenotype of the LRRK2 G2019S mutation on iron homeostasis in RAW264.7 macrophages. The phenotype is well characterised through proteomic and western blot approaches investigating transferrin and ferritin trafficking. The study is well conducted and data of high quality. The authors also appear to have discovered a cellular context where Rab8 is phosphorylated independently of LRRK2. This is a major finding which can potentially have an important impact in the LRRK2 field. What is missing in the study is the physiological relevance of these findings, mainly whether this effect actually results in higher cell death during iron overload. Since iron overload is known to result in ferroptosis, it is surprising that the authors have not checked whether the LRRK2 G2019S and ARM cells undergo more ferroptosis relative to LRRK2 WT cells. Moreover, their conclusion of the findings as "resistant to LRRK2 kinase inhibitors" is not convincing, since in most of the studies, they have removed the kinase domain, and this description implies the use of pharmacological kinase inhibition which has not been done in this paper.

      Significance

      Major comments

      In Figure 1:

      • There is lower LRRK2 expression in LRRK2 G2019S cells, have the authors checked Rab phosphorylation to validate the mutation?
      • The authors should specify if their cells are heterozygous or homozygous since they are discussing a dominant interfering mutant.
      • The transferrin phenotype validated through proteomics and western blot is solid.
      • Quantification in figure 1F-G is problematic, not clear what they mean by "diffuse and lysosomal". Puncta is either colocalising with lysosomes or not colocalising. This needs to be clarified and re-analysed.
      • Text in the first results part called "LRRK2G2019S RAW macrophages have altered iron homeostasis" is very long. It could be divided into more sections to improve readability.

      In Figure 2:

      • If the effect is armadillo-dependent, where is LRRK2 G2019S implicated since there is no kinase domain in these cells?
      • The authors do not show any controls (PCR, sequencing) confirming knockout or truncation.
      • The data is interesting and the image quality with the insets is very high.

      In Figure 3:

      • Mutant not clearly described in text, did the authors remove just the kinase and ROC-COR domains or all the domains downstream of the Armadillo domain? This is not clear.
      • The authors cannot conclude that their phenotype is due to the independence of the kinase domain specifically as they are also interfering with the GTPase activity by removing the ROC-COR domains.
      • In Figure 3E, is the difference between the "ARM CTRL" and the "ARM FAS" conditions significant? A trend appears to be there, but the p-value is not shown.

      In Figure 4:

      • In figure 4A, it would have been important to check if Rab8 phosphorylation is also observed in LRRK2 KO cells after administration of FAS to further evaluate the mechanism through which this Rab8 phosphorylation is occurring.
      • The vinculin bands in figure 4A are misaligned with the rest of the bands.
      • The authors do not have any controls to validate the pRab8 staining in IF. This is an important caveat and needs to be addressed.
      • The authors should have checked if FAS administration in the LRRK2 G2019S and the ARM cells is leading to ferroptotic cell death (or cell death in general). This is key to validate the link between the altered iron homeostasis in LRRK2 G2019S cells and increased cytotoxicity observed during neurodegeneration.

      Regarding the literature, the authors are missing some important papers that are preprinted and these studies need to be discussed. This includes a report with opposite findings https://www.biorxiv.org/content/10.1101/2025.09.26.678370v1.full and a report showing kinase independent cell death in macrophages https://www.biorxiv.org/content/10.1101/2023.09.27.559807v1.abstract

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      We would like to thank all the reviewers for their comments and suggestions.

      Please find below our point-by-point response to the Reviewers' comments, which details the corrections already made and outlines the planned revisions, experiments, and analyses.

      Reviewer 1

      Major comments:

      • Reviewer 1 commented that the 'manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible'. We agree wholeheartedly with the reviewer and apologise. We have now revisited the text, figures, and associated figure legends to ensure that they are more easily accessible and fully comprehensible to readers from across disciplines. This includes adding labels to point out specific anatomical features on images, and ensuring figures and text align. Further specific examples are included in the points below.
      • In response to concerns raised by Reviewer 1 relating to Figure 1 and the lack of figure citations, 'the persistence of mCherry in the H2B Fucci', and 'how mCherry seems to persist longer in H1 (compare Figs 1D and 1G)':
      • We apologise for the lack of figure citations in the text. We have now reworked the figures relating to the constructs (original Figures 1 and S1) and have made these Figures 1, 2 and S1 in our updated version.
      • Figure 1 is now an introductory background figure which illustrates the differences between Fucci(SA) and Fucci(CA) reporters, with additional details provided in the associated legend, and call outs to the figure starting in the introduction.
      • Regarding 'the persistence of mCherry in the H2B Fucci', what we are trying to articulate is that the mCherry degradation that we observed in the Fucci(2A) expressing DF1 cells extended beyond the end of S phase and into G2/M, compared with what would be expected (Revised Figure 2H, arrows).
      • We have now replaced these montages with a more representative example. Additionally, the new images (Figures 2C and 2G) are synchronised (both starting at G2/M), restricted to a single cell cycle, are larger in size, and have the cell cycle stage labelled. We believe these changes will aid interpretation.
      • Specifically relating to the lack of labelling in Figure 3A, we agree that this figure was not labelled sufficiently, and neither was there enough detail included in the text or figure legend for readers to follow easily and make their own conclusions. We have now added additional labels to this figure, broken the figure down into more panels (Figures 4A-4D in revised manuscript), and included more detailed descriptions in the associated figure legend and text.
      • We thank the reviewer for making the important point that it is 'hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are' which was made more challenging by insufficient references to previous studies.
      • Firstly, as for the point above, we have now added labels to many of the panels (Figure 4 in revision), including highlighting features such as the non-proliferative dermal condensates and demarcating the proliferative retinal pigmented epithelium (Figures 4F and 4G in revision). Secondly, we have also now included additional references in the text, specifically relating to the neural tube, digits, and forming feathers, where our proliferation profiles are consistent with previous literature.
      • With regard to the Reviewer's comment on the difficulty of drawing conclusions 'about cell cycle in different tissue layers without sectioning' in original Figure 3B, we will include more sections of FuChi embryos which include structures such as mesenchymal condensates.
      • To make our data on cell cycle stages as 'cells egress from the primitive streak, to form prechordal plate' clearer, we have added additional labels to the figures (Figures 4B and 6E in revised manuscript). We will complement this by adding sections of gastrulating FuChi embryos to further demonstrate the cell cycle status of cells that form the prechordal plate.

      Minor comments

      • We have added additional references relating to the data in original Figure 3 (now Figure 4 see above), and any new descriptions of known proliferation profiles that we include will have appropriate citations.
      • In this current revision we have addressed figure call out issues, and added labels to enhance readability, clarity and data interpretation.

      Reviewer 2

      Major comments

      • Reviewer 2 rightly pointed out that the 'description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors'. We have now added additional text to properly explain the CDT1 probe dynamics, as per the cited manuscript, and also referenced the schematics to help readers.
      • To address whether the FuChi model can be accurately 'used to study embryogenesis' and following up on the suggestion to 'indicate if the size of the embryos is comparable to the wildtype' we have now included size comparisons of FuChi and wild-type/non-transgenic embryos at mid (E9) and late (E18) gestational stages demonstrating that there is no significant difference between genotypes during embryogenesis (Figure 3D in revised manuscript). For all earlier stages, we did not see any developmental or size differences. We believe if there were any differences, these would be reflected in size at the mid and late gestational stages we analysed.
      • Reviewer 2 made very valuable observations and suggestions regarding our data and interpretation of somitogenesis, specifically in response to our sentence saying that "the mesenchyme, which is predominantly in G1 as they undergo condensation". Furthermore, they noted that Supplementary Video 4 "shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1)". We were asked to review the sentence/video to clarify if this is a significant finding or if this is not representative of their observations.
      • We thank the reviewer for this suggestion. From looking again at our timelapse movies, and also analysing additional static images, we agree that presomitic mesoderm (PSM) does appear to be green (S phase), which then may transition to G1 as the somites form. To address this, we plan to quantify cell cycle status in the PSM on embryos to see if this is a significant finding.
      • We hope this quantification of the PSM may also enable us to include discussion on how our findings relate to the Cell Cycle model for somitogenesis proposed in the Collier et al, 2000 paper suggested by the Reviewer.
      • We agree with the Reviewer that "the fluorescence profiles in original Figure 4C do not seem similar regarding the Myc-tag epitope" and believe this difference is likely just a reflection of the part of the image we used. We will include a more representative image once we have repeated the staining.
      • Reviewer 2 has asked for quantitative support for our fluorescence-based interpretations. We thank the reviewer for this suggestion and are now planning to perform quantitative analyses of different tissues (similar to our quantification in germ cells) and in embryos to support our observations. These will include the PSM (see above), neural tube, intestine, and early embryos (also see Reviewer 3 response for blastoderm quantification).
      • Since our original submission, we have further refined our in situ hybridisation protocol on FuChi embryos (Figures 5A & B in revision), finding that strong reporter expression is maintained for all the fluorescent proteins of the H1-Fucci(CA)2 reporter. Therefore, the "notably fainter" appearance of the hGMNN-mVenus in Figure 4A from the first version of the paper was likely a result of the experimental protocol not being 100% optimal.

      Minor comments

      • We have reordered the paragraphs relating to the different Fucci versions in the introduction as per the suggestions by the reviewer for better clarity.
      • To address the issues with Fucci system nomenclatures which made reading difficult, we have now added a background figure (new Figure 1 in revised draft) which is cited in the introduction, made sure constructs are introduced appropriately, and ensured we are consistent with our nomenclature.
      • Supplementary Figure lettering corrected.
      • All figure panels are now mentioned in the main text, and the incorrect call outs noted by the Reviewer have been corrected.
      • Removed period and included clarifying statement in the figure legend relating to the comment regarding the extraembryonic region in Figure 5 (original) / Figure 6 (revised).
      • Other issues raised relating to reference duplication and missing words have been resolved.
      • We have corrected the legend of Figure 1 of the original paper, see related Reviewer 1 response provided above.

      Reviewer #3

      Minor comments

      • We have corrected all the figure call outs (see responses to similar comments by Reviewers 1 and 2) to ensure that all data presented is accurately reported.
      • We would like to thank the reviewer for suggesting modifications to the cell cycle montages (original figures 1D, 1G and 2F). We agree it would help the reader to enlarge the images, and have therefore reduced the montage to include just one cell cycle, and have also included annotations of cell cycle stages in Figures 2C and 2G of the revised manuscript. We have also added some labels to Figure 3E (original figure 2F) and enlarged it.
      • In response to Reviewer 3's comment regarding fluorescence intensity, we quantified fluorescence levels in multiple individual DF1 cells expressing either the H1.0-Fucci(CA)2 or H2B-Fucci(SA)2 reporters, and this is shown as the fluorescent index in Figures 2D, 2E, 2H and 2I of the revised manuscript, where reporter levels were measured across time. In terms of overall mean intensity levels, we found the reporters to be comparable in brightness, with similar mean intensity levels across the cell populations in the flow cytometry data (Figures 2F and 2J).
      • To enhance speedy interpretation, we will also process our supplementary videos to include annotations and arrows to highlight key cells and events (e.g. a cell undergoing mitosis).
      • As recommended by Reviewer 3, we have now quantified cell cycle status in blastoderm cells, confirming that a high proportion are in the G2/M phase. We will include these data in the final revision, which will complement our planned quantification of cell cycle status in other tissues (see response to Reviewer 2).
      • For our final revision, we will include higher magnification/zoomed in images of selected regions of the somites, neural tube (lumen) and retina (epithelium). Revisiting our images of the neural tube showed that dividing cells at the lumen did so in the perpendicular plane, and we will include these images in our revision to provide further evidence of the fidelity of the FuChi reporter. We thank the reviewer for this excellent idea to show the efficacy of our system.
      • To address the levels of proliferation in somites, we plan to generate a cropped video with a fixed ROI to enable proliferation in individual cells of the forming somites to be more readily visualised. This will be further complemented by the quantification of cell cycle status in forming somites (see responses to other reviewers).
      • We have added lines to the discussion regarding the use of our reporter in other conventional model systems.
    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by Sudderick and colleagues describes the development and characterisation of a new generation of cell cycle reporter that can distinguish between cells in G1, S, G2 and M phases. Furthermore, the authors have developed a transgenic chicken line incorporating this reporter and demonstrated faithful discrimination of cell cycle stages in the in vivo context of developing transgenic embryos. Of note is the addition of epitope tags, which facilitate discrimination of cell cycle stages in tissue fixed using various techniques. This is a very important paper for the following reasons:

      • The authors have achieved faithful discrimination of all four cell cycle stages, which is a major advance in itself.
      • This generation of the FuChi transgenic chick is of enormous importance. This will facilitate accurate in vivo studies in a broad range of fixed and living tissue types and is a major milestone in the further establishment of the chick as a transgenic model system.

      The characterisation of the cell cycle reporter as presented is robust and convincing. The authors further demonstrate the potential utility of the FuChi chickens through their observation of partial cell cycle synchrony during onset of development. I therefore only have minor suggestions that may facilitate easier interpretation of their data.

      Results 2

      • I can't see any mention of Figures 1C and D. Presumably the authors have carried out fluorescence intensity measurements using the two cell cycle reporters here, but this is not mentioned in the main text.
      • Figure 1D&G: I find these difficult to follow given the small size of the cells as presented. The authors may consider enlarging these and clearly annotating for cell cycle stage. They may find it helpful to focus on a single cell cycle, although I appreciate that displaying two cell cycles strengthens the claim of efficacy of the newly developed sensor. The supplementary videos associated with these figure panels are excellent as they display several cells with faithful reporter activity, but again, the authors may wish to annotate a few of these cells to enhance speedy interpretation. I have similar comments for Figure 2F and the associated movie.

      Results 4

      • The authors state that a large proportion of blastoderm cells were in G2/M. They may wish to formally quantify this, perhaps by performing simple cell counts in designated regions of interest. A similar quantification for gastrulating embryos would also be helpful.
      • It would be helpful to see zoomed in images of selected regions of the somites, neural tube and retina displayed in Figure 3B. This would be particularly appropriate in the context of the neural tube and retina (which are not discussed in the main text) as the positioning of the nucleus is defined by the stage of the cell cycle and should therefore serve to highlight the efficacy of the reporter.
      • Video 4 beautifully demonstrates the high levels of proliferation in somites, but again, it would be useful to have a zoomed in view. I appreciate the difficulty involved in doing this, given the movement of the embryo, but perhaps the authors could focus on a fixed ROI or present a separate movie of a few cells undergoing a full cell cycle.

      Discussion

      • The authors could perhaps expand on their discussion about potential utility in other conventional model systems (e.g. mouse, fish, etc).

      Significance

      General assessment: A timely piece of work that introduces a faithful cell cycle reporter that will be of broad interest.

      Advance: The ability to discriminate between all four stages of the cell cycle is a clear advance here.

      Audience: Broad interest, including those studying cell cycle and embryonic development in several tissue contexts.

      Expertise: Chick embryology, in vivo live imaging, neurogenesis, cellular developmental biology

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This work presents a novel transgenic chicken model with fluorescent reporters that allow in vivo monitoring of the four phases of the cell cycle. To achieve this, the authors clearly identify the limitations of previous Fucci systems and developed an optimised reporter construct that overcomes the major technical challenges identified. Addition of epitope tags to cell cycle stage-specific markers further enables antibody detection in fixed tissues. Proof of concept is provided by live imaging of chick embryos in early developmental stages, evidencing dynamic cell cycle states in tissues and migrating cells.

      Major comments:

      1. Introduction: Description of the bicistronic tandem-Fucci(CA) system in paragraph 6 is not consistent with what is described in the original bibliographic reference indicated by the authors. Namely: "...accumulation of the CTD1 probe..." should be expected in the G1-S transition (not S-G2) and the yellow reporter should be expected in G2 and M phases (not S and G2, as described). Please review this portion of the text.
      2. The authors state that "Of note, hatched FuChi chicks are initially smaller than wild type counterparts but grow at comparative rates and are fertile". If the model is to be used to study embryogenesis, it would be useful to indicate if the size of the embryos is comparable to the wildtype, at least for the major developmental stages mentioned in the manuscript.
      3. When referring to somitogenesis, the authors state "...the mesenchyme, which is predominantly in G1 as they undergo condensation". Suppl Video 4, however, shows distinct green fluorescence (S) in the presomitic mesoderm for the first hour or so, only then turning to magenta (G1). The authors should review the sentence/video to clarify if this is a significant finding or if this is not representative of their observations.
      4. (Optional) It would be interesting to describe if the authors' observations of cell cycle dynamics in the presomitic mesoderm support the proposed Cell Cycle model for somitogenesis (Collier et al., J.Theor.Biol.2000).
      5. The fluorescence profiles in Figure 4C do not seem similar regarding the Myc-tag epitope (contrary to what is stated). The authors should rephrase or revisit this image to clarify their findings.
      6. Quantitative support is needed for several fluorescence-based interpretations made throughout the manuscript. In some instances, conclusions are drawn from qualitative differences in signal intensity, for example the statement in Fig. 4A that hGMNN-mVenus appears "notably fainter" than the other reporters. Incorporating simple quantitative analyses would strengthen these claims and ensure that observed differences reflect biological behaviour rather than technical or optical factors.

      Minor comments:

      1. Organization of the information in the Introduction: Paragraphs 3-5 introduce sequentially improved versions of the Fucci system. Then, paragraph 6 returns to the system described in the 4th paragraph. Authors should consider including paragraph 5 (description of Fucci4 and its limitations) just prior to the description of chickens as valuable developmental models (current paragraph 8) for clarity of the text.
      2. Fucci system nomenclature. Many different Fucci systems are mentioned, but nomenclature consistency throughout the manuscript is lacking, which makes reading difficult. For example, the terms "Fucci(SA)2" and "Fucci(CA)2" should be defined in the introduction, as they are employed to describe the construction of the new biosensor in the following sections.
      3. Some figure panels are not mentioned in the main text (for ex. Figures 1B and C, Figure 2C)
      4. The legend of Figure 1 (D & G) mentions "denoted by *", but the * seems to be missing in the figure.
      5. Supplementary Figure 1 has two D panels (and is missing the E).
      6. In the main text, where it reads "...Flow cytometry analysis of three independent PGC lines... (Figures 2G & S2E)", S2E should be replaced by S1E.
      7. In the Figure 4A legend, hCDT1-mVenus should be corrected to hCDT1-mCherry. Also, it is not clear why the authors state that "hGMNN-mVenus expression is notably fainter compared with hCDT1-mVenus and H1.0-mCerulean expression".
      8. In Figure 5E, the optical sections "i" seem to pertain to the extraembryonic tissue/area opaca and not to anterior mesoderm, as stated in the figure legend. Also, there is a period between "prechordal plate" and "and" in the legend's last sentence.
      9. Discussion: The last sentence of the third paragraph lacks "to" between "used" and "interrogate".
      10. References 10 and 23 are identical.

      Referee cross-commenting

      I agree with all comments from reviewers 1 and 3

      Significance

      This is a beautiful paper, describing a long sought-after model system to study cell cycle dynamics in vivo. The methodological details are thorough, and the results obtained are clearly presented, highlighting the utility of the new model in various embryonic stages and tissues/organs.

      This work is of pivotal importance to the developmental/stem cell biology community, as well as to the wider community that employs the chicken embryo as a preclinical model to assess therapeutic or teratogenic potential of biologically- or chemically-derived products.

      My expertise is in chicken embryo development, namely gastrulation, somitogenesis and limb bud outgrowth.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript reports the development of a novel Fucci (Fluorescent Ubiquitination-based cell cycle indicator) system for cell cycle analysis, including live imaging of the cell cycle. The novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that has particularly informed (and in the future, will inform) our understanding of early stages of embryogenesis, and of the development of numerous tissues, including the neural tube, somites and limb bud. The authors conclude that the novel system has advantages over previous Fucci systems, including faithful labelling of all four cell cycle phases. Importantly, the authors have generated a stable germline of H1.0-Fucci(CA)2 transgenic chicks, enabling, for the first time, the discrimination and tracking of cells in all 4 phases of the cell cycle - i.e. in vivo studies of cell cycle progression in intact tissues and organs. Additional epitope tags mean that the biosensor can be detected in fixed tissues, enabling comparison of cell cycle with expression of mRNA and proteins that mediate other aspects of development/label particular cells and tissues. The authors map proliferation dynamics across numerous tissues in the developing chick, at numerous stages of development, and conclude in particular that transition from S phase may be a key morphogenetic event in gastrulation, as mesendoderm cells leave the primitive streak to form embryonic structures such as prechordal plate.

      Major comments:

      The novel biosensor looks to be an incredibly useful tool, and the manuscript suggests patterns of cell cycle progression in different tissues, and at different points in time, that look intriguing. But it is sometimes difficult to draw the strong conclusions suggested by the authors because the text and figures are sometimes difficult to follow. The manuscript would greatly benefit from having someone spend time on the figures, and associated text, to ensure they are fully comprehensible.

      Specifically:

      Conclusion 1: That the new FUCCI biosensor is a superior cell cycle probe, better at discriminating all cell cycle phases than previous versions. I was very convinced by the videos (video 1 and 2) but had problems with Figure 1. Potentially, this is because I am not an expert in these types of analyses - but it was not helped by the fact that components of the figure were not cited in the text. I was particularly confused by the statement remarking on 'the persistence of mCherry in the H2B Fucci' as mCherry seems to persist longer in H1 (compare Figs 1D and 1G). Please explain, in the Figure legend, why this appears to be the case.

      Conclusion 2: that the FuChi chicks are the first viable stably expressing avian cell cycle biosensor model. I agree, and the authors should be congratulated on the development of this important tool.

      Conclusion 3: the authors monitor cell cycle progression in chicks, in vivo, looking at stages from blastoderm, through gastrulation, and into organogenesis, and draw various conclusions

      For example: Fig 3A and text: 'as gastrulation progresses, the primitive streak an presomitic mesoderm display...., whereas the .... And neural plate contains...'

      Figure 3A covers an enormous range of stages and tissues. The figure is barely labelled. The text and figure need to better align, and key features in each figure panel need to be labelled so that the reader can better follow, and draw conclusions.

      Fig 3B: Reports expression in numerous tissues. There are some beautiful examples of cells segregating relative to cell cycle - for instance, in the neural tube. But I found it hard to know where the biosensor is reporting patterns that are already well established (eg neural tube), and where the biosensor is reporting patterns that are novel - and if so, what these patterns are. Again, this is not described adequately in the text (for instance, there is no mention of the neural tube). And in some cases, references are provided (allowing comparison with previous studies) - but in other cases, there are no references to previous studies. The reader must be given the opportunity to compare this study with previous studies.

      Overall - I can appreciate that there are some fascinating patterns, but it is very difficult to draw the conclusions suggested by the authors. Primarily this is due to poor labelling of figures, and lack of clarity between figures and text, and poor referencing. Additionally, it is not clear that strong conclusions can be drawn about cell cycle in different tissue layers without sectioning some embryos.

      Fig 3C: The authors remark 'The results confirm that the ... FuChi embryos recapitulate known cell cycle profiles of those tissues'. See my comments in 3B.

      Conclusion 4: Robust stability of biosensor in fixed tissues. I agree, and the authors should be congratulated for having made a construct that can be paired with in situ hybridisation and immunohistochemistry - this is invaluable.

      Conclusion 5: The authors investigate the potential of the new system for live imaging, and focus on a couple of novel dynamic examples.

      The data indicating that PGCs at initial migratory stages are not undergoing frequent cell division is clear.

      However, the data indicating that cell cycle status changes as cells egress from the primitive streak, to form prechordal plate, is not clear. The figures need to be better labelled, and the text needs to be clearer (eg 'and prechordal plate. and anterior mesoderm').

      Minor comments:

      • Specific experimental issues that are easily addressable.

      I would recommend that the authors section some embryos, to better support key conclusions (eg in Figures 3 and 5)

      • Are prior studies referenced appropriately?

      Not always - see comment above (Fig 3)

      • Are the text and figures clear and accurate?

      No - this needs work. Not all figures cited in text, or cited in wrong order; Figures are poorly labelled - making it hard to follow

      • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      Label figures more carefully and ensure figures and text align

      Referee cross-commenting

      I agree with all comments from reviewers 2 and 3

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      Technically this is a fantastic resource. As detailed above, the novel biosensor (H1.0-Fucci(CA)2) has been developed for analyses of chick cells and tissues: chick embryos are a valuable developmental model that has particularly informed (and in the future, will inform) our understanding of early stages of embryogenesis, and of the development of numerous tissues, including the neural tube, somites and limb bud. Increasingly, studies show the importance of cell cycle for development, differentiation and morphogenesis - it is a huge breakthrough to be able to perform in vivo studies of cell cycle progression in intact tissues and organs.

      • State what audience might be interested in and influenced by the reported findings.

      Broad basic research, including developmental biologists, stem cell biologists, modellers.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Developmental biologist, with expertise in chick

    1. Information science[1][2][3] (abbreviated as infosci) is an academic field that is primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information.[4]

      While this is a very clear and forceful general conception, from a subjective standpoint it remains open to change and redefinition depending on the person and the field from which it is explained, since this is such a broad field of action that it cannot be limited to a single definition.

    2. Information science[1][2][3] (abbreviated as infosci) is an academic field that is primarily concerned with the analysis, collection, classification, manipulation, storage, retrieval, movement, dissemination, and protection of information.[4] Practitioners within and outside the field take part in studying the application and use of knowledge in organizations. They also examine the interaction between people, organizations, and any existing information systems. The aim of this study is to create, replace, improve, or understand information systems.

      For me, information science is an interdisciplinary field that analyses how information is generated, collected, organised, stored, retrieved, and transmitted.

      Rather than focusing only on cables or code, it focuses on the link between people and data; its main objective is to ensure that information is accessible and useful to whoever needs it.

    1. Rejection, peer victimisation and negative emotions: A synthesis of influence dynamics in the school setting

      Executive summary

      This document presents an in-depth analysis of recent research conducted by the Institut universitaire Jeunes en difficulté on the links between social isolation, peer victimisation and negative emotions among elementary school students.

      The key points of this study are as follows:

      High prevalence: A significant number of young people, particularly girls, experience daily emotional distress and a feeling of not being accepted from the start of secondary school, trends that begin in elementary school.

      Reversal of the traditional perspective:

      Contrary to the common assumption that relationship problems cause negative emotions, the results indicate that negative emotions (sadness, hopelessness) often precede and predict victimisation.

      Feedback loop for isolation: There is a bidirectional relationship between isolation and negative emotions, creating a cycle of mutual aggravation.

      Stable traits vs. changing states: The study distinguishes students' chronic characteristics from momentary fluctuations, revealing that while social relationships can partially reset between two school years, negative emotions tend to persist, or even intensify, during transitions.

      Need for multidimensional interventions: Simply preventing bullying is considered insufficient.

      Interventions must integrate the promotion of well-being and emotion management in order to break cycles of victimisation.

      --------------------------------------------------------------------------------

      1. Current situation: A worrying picture among young people

      Statistical data from Canadian and Quebec surveys reveal a complex reality for students:

      | Indicator | Boys | Girls |
      | --- | --- | --- |
      | Daily sadness or hopelessness (start of secondary school) | 19% | 36% |
      | Feeling of not being accepted as one is | 36% | 52% |
      | Victims of bullying (past 12 months, Quebec) | ~11% | ~11% |

      Note on victimisation: Although the figure of 11% is cited, the proportion can rise to 20%, or even 40% for isolated incidents, which underlines how difficult it is to pin down this phenomenon precisely.

      --------------------------------------------------------------------------------

      2. Definition of core concepts

      The study is built around three distinct but interconnected realities:

      Negative emotions: Include sadness, feelings of hopelessness and negative thoughts.

      They are considered precursors of depression, although they do not necessarily correspond to a clinical diagnosis at this stage (elementary school).

      Peer isolation: Having few social interactions, whether by choice or through rejection by others. Rejection is the most common form of involuntary isolation.

      Victimisation: Intentional, repeated acts of aggression characterised by an imbalance of power (physical or reputational).

      It can be direct (hitting, insulting) or indirect (damaging reputation, spreading rumours).

      --------------------------------------------------------------------------------

      3. Theoretical models of the peer-emotion relationship

      Three alternative models attempt to explain the interaction between these variables:

      1. Interpersonal risk model: Difficult experiences with peers act as stressors that accumulate and generate negative emotions.

      It is the most tested and documented model to date.

      2. Symptom-driven model: Negative emotions (or negative affectivity) lead to social withdrawal or a vulnerability that makes the student a prime target for victimisation.

      3. Transactional model: Assumes reciprocal influence and mutual reinforcement between emotions and social experiences.

      --------------------------------------------------------------------------------

      4. Research methodology

      The study followed 992 students from Grade 3 to Grade 6 of elementary school (Quebec) over two school years, with four measurement points.

      The originality of the approach lies in its use of statistical models (random-intercept cross-lagged panel models) that make it possible to distinguish:

      Trait (stable/chronic): A student's tendency to be a certain way over the long term.

      State (changing): A student's fluctuations around their own stable tendency at a given moment.
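
      A minimal sketch of the trait/state distinction above, assuming hypothetical scores for one student at the four measurement points. It uses the student's own mean as a stand-in for the stable trait and the deviations from that mean as states. This is a simplification: a random-intercept cross-lagged panel model estimates the stable component as a latent variable across all students, not as a raw person mean. The function name `split_trait_state` and the data are illustrative only.

```python
from statistics import mean


def split_trait_state(scores):
    """Split one student's repeated scores into a stable part (trait)
    and wave-specific deviations from that stable part (states)."""
    trait = mean(scores)
    states = [score - trait for score in scores]
    return trait, states


# Hypothetical negative-emotion scores at the four measurement points
negative_emotions = [2.0, 3.0, 2.5, 4.5]
trait, states = split_trait_state(negative_emotions)

print("trait:", trait)
print("states:", states)
```

      In this simplified picture, cross-lagged questions such as "do negative emotions at one wave predict isolation at the next?" are asked about the state deviations, while the co-occurrence of chronic tendencies is captured by the trait values.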

      --------------------------------------------------------------------------------

      5. Analysis of the results: differentiated dynamics

      Stable interrelations (traits)

      At the chronic level, the three dimensions are linked: a student with a stable tendency toward isolation will also have a stable tendency toward victimization and negative emotions.

      These realities co-occur with no defined temporal order.

      Temporal dynamics (changing states)

      Analysis of the fluctuations from one time point to the next reveals distinct mechanisms:

      Negative emotions and isolation: These follow a transactional model.

      A high level of negative emotions at the start of the year predicts increased isolation at the end of the year, and vice versa. This is a self-amplifying loop.

      Negative emotions and victimization: These follow a symptom-driven model.

      Negative emotions at the start of the year predict increased victimization later on, but victimization does not appear to increase negative emotions in the short term.

      This link is direct and is not mediated by isolation.

      Temporal stability:

      ◦ Victimization and isolation are more stable within a single school year than between two years. A change of class or teacher weakens the reputation effect.

      ◦ Negative emotions are more stable between school years, suggesting anxious anticipation of the return to school or a persistence of internal traits despite changes in environment.

      Key finding: These mechanisms are identical for boys and girls, and for younger and older elementary students.

      --------------------------------------------------------------------------------

      6. Conclusions and directions for action

      For research

      The results of this Quebec study, although novel, have not yet reached consensus internationally: other studies sometimes show opposite or sex-specific results.

      A replication of the model is planned in Belgium (Flanders) to validate these observations.

      For school-based intervention

      The study calls into question intervention strategies that focus solely on social behavior:

      Anti-bullying efforts alone are insufficient: Removing a student from a victimization situation does not guarantee that their negative emotions will disappear.

      Multifactorial approach: It is essential to act simultaneously on the social environment and on internal psychological well-being.

      Priority on promoting well-being: Preventing depression and managing negative emotions from elementary school onward are key levers for reducing, in turn, the risks of victimization and isolation.

      "Efforts to prevent victimization are essential, but our results suggest that they are potentially not sufficient, because there is a broader dynamic at play."

    1. Briefing: Self-regulation in child victims of sexual abuse

      Executive summary

      This document summarizes the findings of doctoral research on self-regulation in children who have survived sexual abuse (SA).

      Self-regulation, defined as the capacity to modulate one's cognitive and emotional responses in order to produce adaptive behavior, is a key process that is often impaired by trauma.

      The main conclusions stress that although sexual abuse is associated overall with executive functioning difficulties (inhibition and cognitive flexibility), its impact is not uniform.

      The research identifies four distinct self-regulation profiles among victims: dysregulated, inhibited, flexible, and parent-identified regulation.

      The study also shows that factors such as the child's sex, a history of multiple forms of maltreatment, and the socioeconomic environment (neighborhood deprivation) significantly influence self-regulation capacities.

      The clinical implications point toward abandoning one-size-fits-all approaches in favor of differentiated interventions and multi-method assessments (cognitive tasks and questionnaires) involving multiple informants (parents and teachers).

      --------------------------------------------------------------------------------

      1. Theoretical framework and definitions

      Sexual abuse is a global public health problem affecting roughly one in five girls and one in ten boys before the age of 18.

      It has varied psychological consequences, including internalizing behavior problems (depression, withdrawal) and externalizing ones (aggression, opposition).

      Self-regulation

      The concept of self-regulation rests on two interdependent components:

      Emotion regulation: Strategies and skills that modulate the expression and experience of emotions.

      Executive functions: Goal-directed mental processes, including:

      ◦ Inhibition: The ability to hold back an automatic response to a stimulus (e.g., answering "night" when shown a sun).

      ◦ Cognitive flexibility: The ability to adapt to changing rules in the environment.

      The biological mechanism of trauma

      Early exposure to intense stress (maltreatment, poverty) dysregulates stress hormones, leading to structural and functional damage to the brain, which weakens self-regulation capacities.

      --------------------------------------------------------------------------------

      2. Impact of sexual abuse on executive functions

      The research presented indicates that sexual abuse is a significant predictor of executive difficulties, even after controlling for other factors such as ADHD or social deprivation.

      Findings by type of function

      Cognitive flexibility: Sexual abuse is directly associated with poorer performance on tasks measuring this ability.

      Inhibition: Child victims perform significantly worse than non-victims.

      Moderating effect of sex

      The study reveals marked differences according to the child's sex:

      Boys: Teachers report far more executive functioning difficulties in boy victims than in non-victims. Boy victims also perform worse on inhibition tasks.

      Girls: There is little significant difference between girl victims and non-victims, either in teachers' ratings of executive functions or on inhibition tasks.

      --------------------------------------------------------------------------------

      3. Typology of self-regulation profiles

      The analysis identified four typical profiles among child victims of sexual abuse (sample of 225 children):

      | Profile | Proportion | Main characteristics | Associated behavior problems |
      | --- | --- | --- | --- |
      | Dysregulated | 39% | Low cognitive performance, high emotional lability, difficulties reported by parents. | High internalizing and externalizing problems (comorbidity). |
      | Inhibited | 19% | Excellent performance on inhibition tasks, but weak emotional skills as perceived by parents. | Highest levels of internalizing problems. |
      | Flexible | ~28% | Above-average self-regulation, concordant profile (home/school), resilience. | Low symptomatology. |
      | Regulation (parents) | 14% | Low cognitive performance, but parents report very good capacities (discordant profile). | Symptoms visible to teachers but underestimated by parents. |

      Analysis of specific profiles

      The "inhibited" profile: These children appear to rely on cognitive over-regulation to control their impulses, but at the cost of considerable internal distress.

      In girls, this profile is a risk factor for internalizing problems, whereas in boys it appears to act as an apparent protective factor against externalizing problems.

      The "discordant" profile: Often associated with intrafamilial sexual abuse (80-90% of cases in this group). Parents may overestimate the child's skills out of a desire for normalcy or under the influence of an overly rigid family environment.

      --------------------------------------------------------------------------------

      4. Contextual risk and protective factors

      Self-regulation depends not only on the traumatic act but on an ecosystem of factors:

      Maltreatment history: The "dysregulated" and "inhibited" profiles are correlated with exposure to a greater number of forms of maltreatment.

      Neighborhood deprivation: Children living in advantaged neighborhoods show better self-regulation. This is thought to reflect access to resources (libraries, museums, green spaces) and lower exposure to community violence.

      Parental education: A higher level of parental education fosters the development of language skills, which directly support the child's self-regulation.

      --------------------------------------------------------------------------------

      5. Recommendations for clinical intervention

      Multidimensional assessment

      It is essential to draw on multiple sources of information:

      1. Multiple methods: Combine questionnaires (perceptions) with cognitive tasks (objective measures), because their results often diverge.

      2. Multiple informants: Systematically include teachers' perspectives to identify difficulties that may be masked within the family setting.

      Differentiated approach

      Intervention should not be identical for all profiles:

      For dysregulated children: Standard approach focused on strengthening executive functions and emotion regulation.

      For inhibited children: Avoid reinforcing inhibition (potentially harmful). Prioritize recognizing, understanding, and expressing emotions, as well as cognitive flexibility.

      For "flexible" children: Intervention targeting self-regulation may be unnecessary. Focus on psychosocial support and prevention of revictimization.

      For the discordant profile: Assess the parents' flexibility and use external sources of assessment to compensate for parental underestimation of difficulties.

      Suggested practical activities

      For inhibition: Games such as "1, 2, 3 Soleil" (similar to Red Light, Green Light), attentional coloring (stopping on a signal), or role-play in which the child must wait their turn in a frustrating situation.

      For flexibility: Games with frequent rule changes (e.g., varying who wins at "Roche-Papier-Ciseau", that is, rock-paper-scissors), problem solving with multiple solutions, or role reversals.

      Parental involvement: Work on the parents' own self-regulation and foster secure attachment, a major protective factor for the child.

      --------------------------------------------------------------------------------

      Conclusion

      The research highlights the complexity of developmental trajectories after sexual abuse.

      The main finding is that trauma does not systematically lead to dysregulation.

      Nearly 42% of children show adapted profiles.

      The clinical challenge lies in identifying the "over-regulated" or "discordant" profiles, which can go unnoticed while carrying high risks of long-term pathology.

    1. Dysregulated Parental Behaviors and the Functioning of Maltreated Children: Summary Document

      Analytical Summary

      This document summarizes the findings of a doctoral thesis on the links between dysregulated parental behaviors (CPD, from the French "comportements parentaux disrégulés") and the socio-emotional development of young children followed by youth protection services.

      The analysis highlights a cycle of intergenerational transmission of maltreatment: parents who experienced trauma in their own childhood are more likely to display atypical, frightening, or intrusive parental behaviors.

      The main conclusions of the research are that:

      1. Impact of CPD: High levels of dysregulated parental behaviors are directly associated with disorganized attachment and with behavior problems (internalizing and externalizing) in the child.

      2. Protective Effect: Secure attachment acts as a crucial moderator, protecting the child from the harmful effects of CPD on behavioral development.

      3. Effectiveness of the Intervention: The Relational Intervention (IR, from the French "Intervention Relationnelle"), which is based on video feedback, significantly reduces the severity of dysregulated parental behaviors, offering a promising clinical avenue for child protection services.

      --------------------------------------------------------------------------------

      1. Characterizing Dysregulated Parental Behaviors (CPD)

      Dysregulated parental behaviors are atypical, disruptive behaviors that occur during interactions with the child, particularly in response to the child's distress.

      These behaviors are often observed in parents reported for abuse or neglect.

      Typology of behaviors according to the AMBIANCE scale

      The research relies on the AMBIANCE measure to classify five subtypes of dysregulated behavior:

      | Behavior subtype | Description |
      | --- | --- |
      | Affective communication errors | Minimizing, ignoring, or responding inappropriately to distress (e.g., laughing at or mimicking a crying child). |
      | Role confusion | The parent approaches the child as if the child should meet the parent's own needs (role reversal) or treats the child as an intimate partner. |
      | Frightening or frightened behaviors | Displays of fright in response to the child's needs, or adoption of a threatening posture. |
      | Intrusiveness and negativity | Physical or verbal hostility, excessive control of movements or interactions. |
      | Withdrawal | Actively creating physical or verbal distance, a helpless stance, and avoidance of the child during reunions. |

      The paradox of fear without solution

      These behaviors place the child in an unresolvable paradox.

      The usual source of comfort (the parent) simultaneously becomes the source of threat or distress.

      The child therefore cannot develop a coherent strategy for regulating stress, which leads to disorganized attachment.

      --------------------------------------------------------------------------------

      2. Analysis of Developmental Impacts and Protective Factors

      The study of 70 families reported to the Montreal youth centre (centre jeunesse de Montréal) reveals the dynamics between exposure to CPD and the child's functioning.

      Correlations between CPD and dysfunction

      Exposure to high levels of CPD is associated with:

      Disorganized attachment: Present in 50% of the children in the sample.

      Behavior problems: Increased aggressive (externalizing) behaviors and symptoms of withdrawal or anxiety (internalizing).

      Social and cognitive difficulties: Distrust of others, learning difficulties, and deficits in emotion regulation.

      Secure attachment as a shield

      A central finding of the research is that secure attachment acts as a protective factor.

      • For children with insecure attachment, there is a direct and significant link between the severity of CPD and the presence of behavior problems.

      • Conversely, among children with secure attachment, this link is not significant.

      These children show fewer behavior problems despite exposure to maltreatment or CPD.

      --------------------------------------------------------------------------------

      3. The Relational Intervention (IR): Mechanisms and Effectiveness

      The research evaluated the effectiveness of the Relational Intervention compared with usual (psychoeducational) services.

      Intervention protocol

      The IR generally runs over 8 sessions of about 1.5 hours each and uses video feedback as a lever for change:

      1. Thematic discussion: Addresses the parental role and child development.

      2. Filmed play period (10-15 min): The parent carries out a specific activity with a guiding instruction (e.g., "observe your child and describe what he or she is doing").

      3. Video feedback: The practitioner highlights the parent's strengths and sensitive behaviors.

      This allows the parent to see the positive impact of their actions on the child (eye contact, laughter, soothing).

      Clinical results

      The intervention produced a significant reduction in several types of CPD compared with the control group:

      • Fewer affective communication errors.

      • Fewer intrusive behaviors.

      • Fewer withdrawal behaviors.

      • Improved overall parental regulation score.

      Note: Frightened/frightening behaviors and role confusion proved harder to change, as they are more subtle and less easily identified by the parent during video feedback.

      --------------------------------------------------------------------------------

      4. Implications for Protection Services

      The study concludes that assessment of CPD needs to be integrated into routine clinical practice.

      Use of suitable tools: Adoption of the AMBIANCE brief instrument is recommended so that frontline practitioners can identify CPD without the heavy protocols used in research.

      Targeting attachment: Interventions should primarily target attachment security as a lever for mitigating the consequences of trauma.

      Ongoing training: Train practitioners to recognize subtle signs of dysregulation (hesitations, facial expressions, postures) so that they can better support parents in repairing disrupted interactions.

      In summary, the Relational Intervention proves to be a powerful tool not only for improving parental sensitivity but also for reducing out-of-home placements by improving the fundamental quality of the parent-child bond.

    1. Summary of the Seminar on Explicit Instruction: From Behind the Scenes to the Classroom

      This briefing document summarizes the presentations given at the seminar organized by the Université de Mons (UMons) and the Institut d'administration scolaire.

      It sets out the theoretical foundations, practical arrangements, and research tools associated with explicit instruction, a proven pedagogical approach for promoting equity and effectiveness in education systems.

      Executive Summary

      Explicit instruction (EE, from the French "enseignement explicite") is a pedagogical approach derived from the observation of effective classroom practices, particularly in disadvantaged settings.

      Its central principle is to "make visible" what is invisible: the teacher's cognitive processes and the students' learning processes.

      Built on the PIC model (Preparation, Interaction, Consolidation), the method follows a rigorous progression: opening, modeling ("I do"), guided practice ("We do"), independent practice ("You do"), and closure.

      Beyond the transmission of knowledge, EE also applies to behavior management and draws on a "professional vision" that technological tools such as eye tracking now make it possible to objectify.

      Teacher training relies on close collaboration within a triad (student teacher, cooperating teacher, supervisor) aimed at turning the novice into a reflective practitioner able to adjust their professional actions to their students' needs.

      --------------------------------------------------------------------------------

      1. Frame of Reference and Fundamental Principles

      The Université de Mons's interest in explicit instruction is part of a twenty-year reflection on improving education systems.

      Aims of Education

      Equity and Effectiveness: The aim is to narrow the gaps between students and raise average results, both cognitively (instruction) and behaviorally (education).

      Freedom and Responsibility: Although freedom of teaching is guaranteed, it must rest on documented, research-informed choices in order to avoid passing fads.

      Liberation from Determinism: School must enable each individual to break free from social determinisms for which they are not responsible.

      The Effective Teacher Model

      Teaching is compared to medicine or elite sport: it is a complex profession that relies on skills that are not innate but are learned and developed through accumulated knowledge and practice.

      --------------------------------------------------------------------------------

      2. The Explicit Instruction Model

      Explicit instruction is not an abstract theory but an approach that grew out of correlational research begun in the 1970s.

      The PIC Structure (Preparation, Interaction, Consolidation)

      Preparation (planning): The teacher's work before the class.

      Interaction: The heart of the lesson, broken down into five chronological steps.

      Consolidation: Automatization of what has been learned, and assessment.

      The 5 Steps of Classroom Interaction

      | Step | Teacher's Role | Key Description |
      | --- | --- | --- |
      | Opening | Present | Announcing the objectives and lesson plan and reactivating prior knowledge. |
      | Modeling | "I do" | The teacher puts a "loudspeaker on their thinking" to make their reasoning explicit out loud. |
      | Guided Practice | "We do" | Constant checking of understanding. The teacher questions students until an 80% success rate is reached. |
      | Independent Practice | "You do" | The student works alone. The teacher circulates to provide individualized support. |
      | Closure | Objectify | Summary of the lesson, metacognition, and link to the next lesson. |

      Iterative Nature: This process is not fixed. If guided practice fails, the teacher must return to modeling. It thus allows genuine differentiated instruction based on students' needs.

      --------------------------------------------------------------------------------

      3. Classroom and Behavior Management

      Explicit instruction holds that managing learning and managing the classroom are two inseparable cogs: one cannot work without the other.

      Objectifying Understanding

      The teacher must make the students' thought process observable. Several types of comprehension check (objectivation) are distinguished:

      Stereotyped: "All good? Did you understand?" (Not very effective, because students often answer yes without any evidence.)

      Specific: "Can you rephrase that in your own words?" or "Name the characteristics of...".

      Metacognitive: Asking about the steps the student went through to arrive at an answer.

      Explicit Instruction of Behaviors

      Rather than punishing a student who does not know how to behave, the teacher teaches the social expectations.

      1. Define the values (e.g., Respect, Responsibility, Safety).

      2. Translate them into observable behaviors: Use positive wording (e.g., "I walk calmly" instead of "No running").

      3. Apply the EE process: Modeling of the expected behavior, guided practice, and reinforcement in real contexts (classroom, hallways, cafeteria).

      --------------------------------------------------------------------------------

      4. Professional Vision and Observation of Practice

      Teaching expertise lies in the ability to scan the environment, pick out the relevant cues, and reason before acting.

      Differences between Novices and Experts (Contributions of Eye Tracking)

      Using eye tracking, research at UMons has identified marked differences in how a classroom is observed:

      Expert Teachers / Teacher Educators:

      ◦ Primary focus on students, especially those who are at risk or quiet.

      ◦ Dynamic, iterative visual scanning ("glance" strategies).

      ◦ Reasoning based on anticipating consequences and on theoretical frameworks.

      Novice Teachers / Pre-service Teachers:

      ◦ Excessive focus on the teacher or on salient visual elements (noise, movement).

      ◦ Attention paid only to "hyper-participative" or highly disruptive students.

      ◦ Difficulty detaching from immediate discipline management.

      Training Tools

      Micro-teaching: Practice in a safe setting in front of peers before facing real students.

      Grille Miroir (mirror grid): A tool for coding professional actions that enables objective, video-based feedback.

      Enriched videos: Use of prompts (visual cues) to direct the novice's gaze toward the important areas.

      --------------------------------------------------------------------------------

      5. The Triad of Practicum Support

      The development of the future teacher rests on interaction among three key actors: the student teacher, the cooperating teacher (field), and the supervisor (institution).

      Collaborative Dialogue

      The research stresses the importance of moving beyond simple question-and-answer exchanges toward co-construction.

      Supervision Style: Supervisors must be able to modulate their style (directive or non-directive), much as a musician changes register.

      Challenges of Collaboration: Dialogue can be hindered by fear of evaluation or by conflicting views between the university and the field.

      Goal: Turn the practicum into a space for reflection in which the student teacher is not a mere executor but a practitioner able to analyze their own mistakes as levers for learning.

      --------------------------------------------------------------------------------

      Conclusion

      Explicit instruction is a pragmatic approach that rejects the opposition between instruction and education.

      By equipping teachers with documented professional actions and developing their professional vision, this model aims to establish a culture of success in which the teacher is fully responsible for each student's progress while retaining pedagogical freedom within a rigorous scientific framework.

    1. Reviewer #1 (Public review):

      Summary:

      Using high-precision eye tracking, the authors measure foveolar sensitivity modulations before, during, and after instructed microsaccades to a centrally cued orientation stimulus.

      Strengths:

      The article is clearly written, and the stimulus presentation method is sophisticated and well-established. The data provide interesting insights that will be useful for comparisons between trans-saccadic and trans-microsaccadic sensitivity modulations.

      Weaknesses:

      Nonetheless, I have major concerns regarding the interpretation of the measured time courses (in particular, inconsistencies in distinguishing enhancement from suppression), the attempt to disentangle these effects from endogenous attention shifts, and the overstatement of the findings' novelty.

      (1) Overstatement of novelty

      The authors motivate their study by stating that "the temporal dynamics of these pre-microsaccadic modulations remain unknown" (l. 55-56). However, Shelchkova & Poletti (2020) already report a microsaccade-aligned sensitivity time course. I understand that the present study uses shorter target durations and thus provides a more resolved estimate. Nonetheless, a fairer characterization of the study's novelty would be that observers' discrimination performance is continuously measured across the pre-, intra-, and post-movement interval, within the same observers and experimental design. Relatedly, the authors state that it is unclear whether pre-microsaccadic sensitivity modulations reflect "suppression at the non-foveated location, enhancement at the microsaccade target, or both" (l. 70). Guzhang et al. (2024) examined the spatial spread of pre-microsaccadic sensitivity modulations by measuring performance at the PRL, the movement target, and several other equidistant locations. They report that "whereas fine spatial vision is enhanced at the microsaccade goal location, it drops at the very center of gaze". The current authors' reasoning seems to be that performances at locations that are neither the target nor the PRL may behave differently. Why would that be the case? If my understanding is correct, I would recommend incorporating these clarifications into the motivation paragraph, so that readers less familiar with the literature do not overestimate the novelty of the findings. Moreover, and related to point 3, I am unsure if the current analyses provide decisive evidence to distinguish enhancement from suppression, as claimed by the authors.

      (2) Distinction from endogenous attention

      To "rule out the possible influence of covert attention" (l. 232), the authors compute a cue-aligned performance time course in addition to the movement-aligned one. A difference in alignment cannot rule out the influence of a certain mechanism; it can only dilute it. Just as endogenous attention may contribute to the movement-aligned time course, movement preparation will necessarily contribute to the cue-aligned time course, since these timelines are intrinsically correlated: as the trial progresses, observers will be in later and later stages of saccade preparation. For this and several additional reasons, an effect in the cue-aligned time course is in fact expected and, in my view, clearly present (see below). As the authors themselves note, endogenous attention has been shown to operate within the foveola and should therefore be engaged in the present experiment in addition to movement-related attentional shifts (unless the authors believe that specific design features, e.g., stimulus timing, preclude its involvement?). Regardless of the theoretical considerations, the empirical data show a pronounced, near-linear increase in performance at the target location, with d′ doubling from approximately 1 to 2. Although the interaction between condition and time does not reach significance (p = 0.09), this result should not be taken as conclusive evidence against a plausible and perhaps expected contribution of endogenous attention. I suggest an additional analysis that could more directly address these issues. In previous work (Rolfs & Carrasco, 2012; Kroell & Rolfs, 2025; see Figure 3), the relative contributions of cue-aligned influences and pre-saccadic attention were disentangled by reweighting each data point according to its position on both the cue-locked and saccade-locked timelines. Applied to the present study, the authors could compute, for each cue-to-target offset bin, its proportional contribution to each pre-movement time bin. Microsaccade-locked sensitivities could then be reweighted based on these proportions. As a result, each movement-locked time bin would contain equal contributions from all cue-locked time bins, effectively isolating the effect of microsaccade preparation.
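
      The reweighting idea can be illustrated with a minimal sketch. It is not the procedure used in the cited studies: it assumes that each trial has already been assigned to one movement-locked and one cue-locked time bin, it uses proportion correct rather than d′ for brevity, and all function and variable names are hypothetical.

```python
from collections import defaultdict

def raw_time_course(trials):
    """Proportion correct per movement-locked bin, pooling all trials.

    trials: iterable of (movement_bin, cue_bin, correct) tuples,
    where correct is 0 or 1. Names are hypothetical.
    """
    pooled = defaultdict(list)
    for movement_bin, _cue_bin, correct in trials:
        pooled[movement_bin].append(correct)
    return {m: sum(v) / len(v) for m, v in sorted(pooled.items())}

def reweighted_time_course(trials):
    """Proportion correct per movement-locked bin, with every cue-locked
    bin that contributes data to that movement bin weighted equally."""
    cells = defaultdict(list)
    for movement_bin, cue_bin, correct in trials:
        cells[(movement_bin, cue_bin)].append(correct)
    # Average within each (movement bin, cue bin) cell first ...
    per_movement_bin = defaultdict(list)
    for (movement_bin, _cue_bin), outcomes in cells.items():
        per_movement_bin[movement_bin].append(sum(outcomes) / len(outcomes))
    # ... then average the cell means, so trial counts no longer set the weights.
    return {m: sum(v) / len(v) for m, v in sorted(per_movement_bin.items())}

# Toy data (assumed): movement bin 0 is dominated by early cue-locked trials.
example_trials = [
    (0, 0, 1), (0, 0, 0), (0, 0, 0), (0, 0, 0),  # cue bin 0: 1 of 4 correct
    (0, 1, 1),                                   # cue bin 1: 1 of 1 correct
]
raw = raw_time_course(example_trials)
reweighted = reweighted_time_course(example_trials)
```

      In this toy case the pooled estimate is driven by the over-represented cue bin, whereas the reweighted estimate gives both cue bins the same influence. Cells with very few trials would need additional care in real data.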

      (3) Interpretation and analysis of the time course

      (3.1) Discrimination before microsaccade onset

      In lines 151-153, the authors state "While the enhancement at the target location did not reach significance relative to baseline, the impairment at the non-target location did", suggesting that pre-movement sensitivity advantages for information presented at the target location are due to a decrease in performance at the non-target location and not an enhancement at the target location per se. After analyzing the difference between the two locations, the authors state, "These results show that approximately 100 milliseconds before microsaccade onset, discrimination rapidly improved at the intended target location while decreasing at the non-target location." (l. 159-161). How is the statement that discrimination performance rapidly improved (which is repeated throughout the manuscript) justified by the results?

      More generally, the authors may benefit from applying bootstrapping or permutation-based analyses to their data. Such approaches would, for example, allow direct comparisons between congruent and incongruent conditions at every individual time point in Figure 3B and may be more sensitive to temporally confined sensitivity variations while requiring fewer assumptions than analyses based on manually segregated temporal bins and aggregate measures. If enhancement at the target location does not reach significance even in these analyses, all corresponding statements should be removed throughout the manuscript. The term "enhancement" should then be rephrased as "detection advantage" or "relative performance benefit" to emphasize the contrast to enhancement effects classically associated with pre-saccadic attention shifts.
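
      One possible form of such an analysis is sketched below: an exact sign-flip permutation test on the per-observer difference between congruent and incongruent performance, run separately at each time point. This is an illustration under assumptions, not the authors' pipeline. The data layout and names are hypothetical, exhaustive enumeration is only practical for small samples, and a correction for multiple comparisons across time points (for example, a cluster-based approach) would still be needed.

```python
from itertools import product

def sign_flip_p_value(differences):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis the sign of each observer's difference is
    arbitrary, so all 2**n sign assignments are enumerated.
    """
    observed = abs(sum(differences))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(differences)):
        statistic = abs(sum(s * d for s, d in zip(signs, differences)))
        # Small tolerance guards against floating-point ties.
        if statistic >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total

def pointwise_p_values(congruent, incongruent):
    """congruent, incongruent: one list per observer, each holding the
    performance at every time point (same length for all observers)."""
    n_time_points = len(congruent[0])
    p_values = []
    for t in range(n_time_points):
        differences = [c[t] - i[t] for c, i in zip(congruent, incongruent)]
        p_values.append(sign_flip_p_value(differences))
    return p_values

# Toy data (assumed): percent correct for 4 observers at 2 time points.
congruent = [[60, 50], [70, 40], [80, 50], [90, 40]]
incongruent = [[50, 40], [50, 50], [50, 40], [50, 50]]
p_values = pointwise_p_values(congruent, incongruent)
```

      With only four observers the smallest attainable two-sided p-value is 2/16, which shows why the resolution of an exact test depends on sample size.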

      Relatedly, the authors state that pre-microsaccadic enhancement peaks around 70 ms before microsaccade onset, which is earlier than sensitivity enhancements preceding large-scale saccades that often increase monotonically up until movement onset. The authors suggest potential reasons for this in the Discussion, yet an additional one seems conceivable based on Figure 3B. Performances at both the cue-congruent and incongruent location decrease leading up to the movement, reaching values even below their early baselines around 100 ms and 25 ms before movement onset for the incongruent and congruent location, respectively. A spatially non-specific decline that drives sensitivities toward a common absolute minimum may thus dictate the time course of detection advantages. In other words, a spatially widespread decrease in foveolar sensitivity likely contributes to both "suppression" at the non-target location and the decrease in "enhancement" at the target location. If this general decrease is due to saccadic suppression, as the authors suggest, it appears to exert a much more pronounced influence on sensitivity modulations than it does before large-scale saccades (which is interesting). Are there other findings suggesting an increased magnitude of micro-saccadic (as compared to saccadic) suppression? Another potentially related phenomenon is the decrease in pre-saccadic foveal detection performances reported twice before (Hanning & Deubel, 2022; Kroell & Rolfs, 2022). It is possible that whatever mechanism triggers this decrease is engaged by the preparation of microsaccadic and saccadic motor programs alike. In any case, I would ask the authors to acknowledge this general decrease early on to clarify that any currently significant advantage for the target location relies on varied degrees of suppression, and not on true enhancement similar to pre-saccadic attention shifts.

      Moreover, in Figure 3C, the final 25 ms before microsaccade onset are excluded from the aggregate measure, presumably since including this interval substantially reduces the effect size. Since the last 25 ms before movement onset is the interval most commonly associated with saccadic suppression, I think that this choice can be justified. Nonetheless, it should be mentioned explicitly in the main text. On a minor note, the authors state that "Performance (evaluated as percent of correct responses) was averaged within a 50 millisecond sliding window, advancing in 1 ms steps (with 24 ms overlap)". Why is the overlap not 49 ms?
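
      The arithmetic behind the last question can be checked directly. The sketch below assumes 1 ms sampling and counts the samples shared by two consecutive window positions; the function name is hypothetical.

```python
def overlap_ms(window_ms, step_ms):
    """Number of 1 ms samples shared by two consecutive window positions."""
    first_window = set(range(0, window_ms))
    second_window = set(range(step_ms, step_ms + window_ms))
    return len(first_window & second_window)

overlap_reported_step = overlap_ms(50, 1)    # 50 ms window, 1 ms step
overlap_larger_step = overlap_ms(50, 26)     # step that would give a 24 ms overlap
```

      A 50 ms window advanced in 1 ms steps shares 49 ms with its predecessor, whereas a 24 ms overlap would correspond to a 26 ms step.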

      (3.2) Discrimination during the microsaccade

      The authors state that "in the "during" trials the target must be presented during the peak speed of the microsaccade." Since the target was presented for 50 ms and the average microsaccade duration was around 60 ms, this implies that the intra-microsaccadic condition includes many trials in which the target overlapped with the pre- or post-movement fixation interval. Were there not enough trials to isolate purely intra-microsaccadic presentations? Are the results descriptively comparable?

      (4) Additional analyses

      Several additional analyses could strengthen the authors' conclusions. If there are enough trials in which observers erroneously saccaded to the uncued (i.e., wrong) location, these trials could experimentally isolate the influence of pre-microsaccadic attention, assuming that endogenous attention went to the cued location. In addition, the authors speculate whether differences in saccadic and microsaccadic movement latencies may underlie the differences in perceptual time courses. The latency distributions provided in the manuscript look sufficiently broad, such that intra-individual variation could be harnessed to explore this question. Do sensitivity time courses differ before microsaccades with shorter vs. longer latencies?

      (5) Clarifications regarding the design

      At 50 ms, the duration of the to-be-discriminated stimulus, although shorter than in previous investigations, is still rather long. What is the reason for this? I would encourage the authors to state in the main text that the duration of the analyzed/plotted time bins is often shorter than the stimulus duration (i.e., there is some overlap between bins that likely introduces smoothing). In Figure 3A, it would be helpful to plot raw data points computed from non-overlapping bins on top of the moving-window estimates, to allow readers to assess the degree of smoothing and potential temporal delays introduced by this analysis. Moreover, I wonder whether the abrupt onset of the target unmasked by flickering noise masks might induce saccadic inhibition, which would manifest as a transient dip in saccade execution probability. The distributions shown in Figure 2B appear too smoothed or fitted to clearly reveal such a dip. How exactly are all distributions in the manuscript computed (e.g., binning, smoothing, fitting procedures)? Finally, on a minor note, explicitly stating on line 105 that two different orientations can be presented at the cued and non-cued location would help avoid potential confusion.

    2. Reviewer #2 (Public review):

      Summary and overall evaluation:

      The authors assessed how visual discrimination of stimuli in the foveola changes before, during, and after small instructed eye movements (in the "micro" range). Consistent with (and advancing) related prior work, their main finding regards a pre-saccadic modulation of visual performance at the saccade target vs. the opposite location. This pre-saccadic modulation in foveal vision peaks ~70 ms prior to the instructed small saccade.

      Strengths:

      The study uses an impressive, technically advanced set-up and zooms in on peri-saccadic modulations in visual acuity at the micro scale. The findings build on related prior findings from the literature on smaller and larger eye movements and add temporal granularity over prior work from the same lab. The writing is easy to follow, and the figures are clear.

      Weaknesses:

      At the same time, the findings remain relatively empirical in nature and do not profoundly advance theoretical understanding beyond adding valuable granularity to existing knowledge. Relevant prior literature could be better introduced and acknowledged. In addition, there remain concerns regarding potential cue-driven attentional influences that may confound the reported effects (leaving the possibility that the reported effects may be related to cue-driven attention, rather than saccade planning/execution per se). There are also some issues regarding specific statistical inferences. I detail these points below.

      Major Points:

      (1) Novelty framing and introduction of relevant prior literature

      At times, this study is introduced as if no prior study explored the time course of changes in visual perception surrounding small (micro) saccades. Yet, it appears that a prior study from the same lab, using a very similar task, already showed a time course (Figure 5 in Shelchkova & Poletti, 2020). While this study is discussed in the introduction, it is not mentioned that at least some pre-saccade time course was already reported there, albeit a more crude one than the one in the current article. Moreover, the 2013 study by Hafed also specifically looked at "peri-microsaccade modulation in visual perception" and also already showed a temporal modulation that peaked ~50 ms before microsaccade onset. I appreciate how the current study differs in a number of ways (focusing on visual acuity in the foveola), but I was nevertheless surprised to see the first reference to this relevant prior finding in the discussion (and without any elaboration). Though more recent, the same could be argued for the 2025 study by Bouhnik et al. on pre-microsaccade modulations in visual processing in V1, which, like the Hafed study, is first mentioned only in the discussion. Perhaps these studies could be introduced in the paragraph starting at line 48, or in the next paragraph, to do better justice to the existing literature on this topic when motivating the study. This would likely also help to better point out the major advances provided by the current study.

      Relatedly, in Shelchkova & Poletti (PNAS, 2020), an apparently similar congruency effect on performance was reported >200 ms before saccade onset, as evident from Fig 5 in that article. How should readers reconcile this with the current findings? Ideally, the authors would not only acknowledge that such a time course was already reported previously, but also discuss the discrepancies between these findings further: why may the performance effects appear much earlier in this prior study compared to in the current study, where the congruency effect emerges only ~100 ms prior to the instructed small saccade?

      (2) Saccade- or cue-driven? (assumption that attention is unaltered in failed saccade trials)

      Because the authors used a cue to instruct saccade direction, it remains a possibility that the reported modulations in visual performance may be driven directly by the spatial cue (cue-related attentional allocation), rather than the instructed small saccade per se. While the authors are clearly aware of this potential confound, questions remain regarding the convincingness of the presented control analyses. In my view, a more compelling control would require an additional experiment.

      The central argument against a cue-locked (purely attentional) modulation is the absence of a performance modulation in so-called "failed" saccade trials. However, a key assumption here is that putative cue-driven attention was unaltered in these trials. This is never verified and, in my opinion, highly unlikely. Rather, trials with failed microsaccades could very well be the result of failing to process the cue in the first place (indeed, if the task is to make a saccade to the cue, failure to make a saccade equates to failure to perform the task). In such trials, any putative cue-driven influences over spatial attention would also be expected to be substantially reduced. Accordingly, just because failed saccade trials show little performance modulation does not rule out cue-driven attention effects, because attention may also have "failed" in these failed saccade trials. The control for potential cue-driven attention effects would be more convincing if the authors included a condition with the same cues, where participants are simply not instructed to make any saccades to the cues. Unfortunately, such an experimental condition appears not to have been included here. The authors may still consider adding such a control experiment.

      Another argument against a cue-driven effect is that the authors found no interaction with time in the cue-locked data, whereas they did find such an interaction in the saccade-locked data. However, the lack of significance in the cue-locked data but significance in the saccade-locked data is not strong evidence against a cue-driven influence. Statistically, there is no direct comparison here, and more importantly, with longer delays, the cue-locked data may also start to show a dip (this could potentially be tested by the authors if they have enough trials available to extend their cue-locked analysis further in time). Indeed, exogenous attention, which may have been automatically evoked by the spatial cue, is known to be transient and to eventually even reverse after a brief initial facilitation (see e.g., Klein TiCS, 2000).

      Finally, the authors consistently refer to "endogenous" attention (starting at line 221) when addressing potential cue-driven attention confounds. However, because the cue is not predictive, but is a spatial cue that differs in a bottom-up manner between left and right cues, "exogenous" attention is a more likely confound here in my view. Specifically, the spatial cue may automatically trigger attention in the direction of the target location it points to (and such exogenous effects would be expected even for unpredictive cues).

      (3) Benefit and cost, or just cost?

      Line 151 states that no statistically significant benefit for the saccade target was found compared to the neutral baseline. Yet, the claim throughout the article is distinct, such as in line 159: "These results show that approximately 100 milliseconds before microsaccade onset, discrimination rapidly improved at the intended target location". I do not question the robustness of the congruency effect, but the authors should be more careful when inferring "improved" perception at the target location because, as far as I could tell (as well as in the authors' own writing in line 151), this is not substantiated statistically when compared to the neutral baseline.

      Related to this point, in Figure 3B, it would be informative to also see the average performance in the neutral cue condition (for example, as a straight line as in some other figures). This would help to better appreciate the relative benefits and/or costs compared to the neutral condition, also in the time-resolved data.

      (4) Statistical inference for the comparison between failed and non-failed trials

      Currently, the lack of modulation in the failed saccade trials hinges on a null effect. It would be stronger to support the claims with a significant difference in the congruency effect between failed and non-failed trials. Indeed, lack of significance in failed saccade trials does by itself not constitute valid evidence that the congruency effect is larger in saccade compared to failed saccade trials. For this, a significant interaction between saccade-trial-type (failed/non-failed) and congruency (congruent/incongruent) should be established (see e.g., Nieuwenhuis et al., Nat Neurosci, 2011).
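
      The requested comparison amounts to testing a difference of differences. A minimal sketch follows, assuming per-observer performance values for each combination of trial type and congruency and using an exact sign-flip permutation test. The names and data layout are hypothetical, and a repeated-measures ANOVA or mixed model would be an equally valid route.

```python
from itertools import product

def interaction_test(executed, failed):
    """Test whether the congruency effect differs between trial types.

    executed, failed: one (congruent, incongruent) performance pair per
    observer, in the same observer order. Returns the mean interaction
    contrast and an exact two-sided sign-flip p-value.
    """
    # Per-observer contrast: congruency effect in executed-saccade trials
    # minus congruency effect in failed-saccade trials.
    contrasts = [
        (e_con - e_inc) - (f_con - f_inc)
        for (e_con, e_inc), (f_con, f_inc) in zip(executed, failed)
    ]
    observed = abs(sum(contrasts))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(contrasts)):
        statistic = abs(sum(s * c for s, c in zip(signs, contrasts)))
        if statistic >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return sum(contrasts) / len(contrasts), n_extreme / n_total

# Toy data (assumed): percent correct for 4 observers.
executed_trials = [(80, 60), (85, 60), (90, 60), (75, 60)]
failed_trials = [(70, 68), (71, 70), (72, 69), (70, 69)]
mean_contrast, p_value = interaction_test(executed_trials, failed_trials)
```

      Only a reliable non-zero contrast of this kind would license the statement that the congruency effect is larger in executed than in failed saccade trials.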

      (5) Time window justification

      While the authors nicely depict their data across the full time axis, all statistics are currently performed on data extracted from specific time windows. How exactly were these time windows determined and justified? Likewise, how were the specific times picked for visualizing and statistically quantifying the data in e.g., Figures 3D and E? It would be reassuring to add justification for these specific time windows and/or to verify (using follow-up analyses) that the presented results are robust when different time windows are chosen.

      (6) Microsaccade definition

      Microsaccades are explicitly defined as being below half a degree. This appears rather arbitrary and rigid. Does the size of saccades not ultimately depend on the task and stimulus (e.g., Otero-Millan et al., PNAS, 2013) rather than being a fixed biological property? Perhaps this could be stated less rigidly, such as by stating how microsaccades are often observed below 0.5 degrees.

      (Relatedly, one may wonder whether the type of instructed saccades that the authors studied here involves the same type of eye movements as the type of fixational microsaccades that have been the focus of ample prior studies. However, I recognize that this specific reflection may open a debate that is beyond the scope of this article.)

    1. Reviewer #1 (Public review):

      Review of the revised submission:

      I thank the authors for their detailed consideration of my comments and for the additional data, analyses, and clarifications they have incorporated. The new behavioral experiments, quantification of targeted manipulations, and expanded methodological details strengthen the manuscript and address many of my initial concerns. While some questions remain for future work, the authors' careful responses and the additional evidence provided help resolve the main issues I raised, and I am generally satisfied with the revisions.

      Review of original submission:

      Summary

      In this article, Kawanabe-Kobayashi et al., aim to examine the mechanisms by which stress can modulate pain in mice. They focus on the contribution of noradrenergic neurons (NA) of the locus coeruleus (LC). The authors use acute restraint stress as a stress paradigm and found that following one hour of restraint stress mice display mechanical hypersensitivity. They show that restraint stress causes the activation of LC NA neurons and the release of NA in the spinal cord dorsal horn (SDH). They then examine the spinal mechanisms by which LC→SDH NA produces mechanical hypersensitivity. The authors provide evidence that NA can act on alphaA1Rs expressed by a class of astrocytes defined by the expression of Hes (Hes+). Furthermore, they found that NA, presumably through astrocytic release of ATP following NA action on alphaA1Rs Hes+ astrocytes, can cause an adenosine-mediated inhibition of SDH inhibitory interneurons. They propose that this disinhibition mechanism could explain how restraint stress can cause the mechanical hypersensitivity they measured in their behavioral experiments.

      Strengths:

      (1) Significance. Stress profoundly influences pain perception; resolving the mechanisms by which stress alters nociception in rodents may explain the well-known phenomenon of stress-induced analgesia and/or facilitate the development of therapies to mitigate the negative consequences of chronic stress on chronic pain.

      (2) Novelty. The authors' findings reveal a crucial contribution of Hes+ spinal astrocytes in the modulation of pain thresholds during stress.

      (3) Techniques. This study combines multiple approaches to dissect circuit, cellular, and molecular mechanisms including optical recordings of neural and astrocytic Ca2+ activity in behaving mice, intersectional genetic strategies, cell ablation, optogenetics, chemogenetics, CRISPR-based gene knockdown, slice electrophysiology, and behavior.

      Weaknesses:

      (1) Mouse model of stress. Although chronic stress can increase sensitivity to somatosensory stimuli and contribute to hyperalgesia and anhedonia, particularly in the context of chronic pain states, acute stress is well known to produce analgesia in humans and rodents. The experimental design used by the authors consists of a single one-hour session of restraint stress followed by 30 min to one hour of habituation and measurement of cutaneous mechanical sensitivity with von Frey filaments. This acute stress behavioral paradigm corresponds to the conditions in which the clinical phenomenon of stress-induced analgesia is observed in humans, as well as in animal models. Surprisingly, however, the authors measured that this acute stressor produced hypersensitivity rather than antinociception. This discrepancy is significant and requires further investigation.

      (2) Specifically, is the hypersensitivity to mechanical stimulation also observed in response to heat or cold on a hotplate or coldplate?

      (3) Using other stress models, such as a forced swim, do the authors also observe acute stress-induced hypersensitivity instead of stress-induced antinociception?

      (4) Measurement of stress hormones in blood would provide an objective measure of the stress of the animals.

      (5) Results:

      (a) Optical recordings of Ca2+ activity in behaving rodents are particularly useful to investigate the relationship between Ca2+ dynamics and the behaviors displayed by rodents.

      (b) The authors report an increase in Ca2+ events in LC NA neurons during restraint stress: Did mice display specific behaviors at the time these Ca2+ events were observed such as movements to escape or orofacial behaviors including head movements or whisking?

      (c) Additionally, are similar increases in Ca2+ events in LC NA neurons observed during other stressful behavioral paradigms versus non-stressful paradigms?

      (d) Neuronal ablation to reveal the function of a cell population.

      (e) The proportion of LC NA neurons and LC→SDH NA neurons expressing DTR-GFP and ablated should be quantified (Figures 1G and J) to validate the methods and permit interpretation of the behavioral data (Figures 1H and K). Importantly, the nocifensive responses and behavior of these mice in other pain assays in the absence of stress (e.g., hotplate) and a few standard assays (open field, rotarod, elevated plus maze) would help determine the consequences of cell ablation on processing of nociceptive information and general behavior.

      (f) Confirmation of LC NA neuron function with other methods that alter neuronal excitability or neurotransmission instead of destroying the circuit investigated, such as chemogenetics or optogenetics, would greatly strengthen the findings. Optogenetics is used in Figure 1M, N but excitation of LC→SDH NA neuron terminals is tested instead of inhibition (to mimic ablation), and in naïve mice instead of stressed mice.

      (g) Alpha1Ars. The authors noted that "Adra1a mRNA is also expressed in INs in the SDH".

      (h) The authors should comprehensively indicate what other cell types present in the spinal cord and neurons projecting to the spinal cord express alpha1Ars and what is the relative expression level of alpha1Ars in these different cell types.

      (i) The conditional KO of alpha1Ars specifically in Hes5+ astrocytes and not in other cell types expressing alpha1Ars should be quantified and validated (Figure 2H).

      (j) Depolarization of SDH inhibitory interneurons by NA (Figure 3). The authors bath-applied NA, which presumably activates all NA receptors present in the preparation.

      (k) The authors' model (Figure 4H) implies that NA released by LC→SDH NA neurons leads to the inhibition of SDH inhibitory interneurons by NA. In other experiments (Figure 1L, Figure 2A), the authors used optogenetics to promote the release of endogenous NA in SDH by LC→SDH NA neurons. This approach would investigate the function of NA endogenously released by LC NA neurons at presynaptic terminals in the SDH and at physiological concentrations and would test the model more convincingly compared to the bath application of NA.

      (l) As for other experiments, the proportion of Hes+ astrocytes that express hM3Dq, and the absence of expression in other cells, should be quantified and validated to interpret behavioral data.

      (m) Showing that the effect of CNO is dose-dependent would strengthen the authors' findings.

      (n) The proportion of SG neurons for which CNO bath application resulted in a reduction in recorded sIPSCs is not clear.

      (o) A1Rs. The specific expression of Cas9 and guide RNAs, and the specific KD of A1Rs, in inhibitory interneurons but not in other cell types expressing A1Rs should be quantified and validated.

      (6) Methods:

      It is unclear how fiber photometry is performed using "optic cannula" during restraint stress while mice are in a 50ml falcon tube (as shown in Figure 1A).

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public review):

      Summary:

      In this article, Kawanabe-Kobayashi et al., aim to examine the mechanisms by which stress can modulate pain in mice. They focus on the contribution of noradrenergic neurons (NA) of the locus coeruleus (LC). The authors use acute restraint stress as a stress paradigm and found that following one hour of restraint stress mice display mechanical hypersensitivity. They show that restraint stress causes the activation of LC NA neurons and the release of NA in the spinal cord dorsal horn (SDH). They then examine the spinal mechanisms by which LC→SDH NA produces mechanical hypersensitivity. The authors provide evidence that NA can act on alphaA1Rs expressed by a class of astrocytes defined by the expression of Hes (Hes+). Furthermore, they found that NA, presumably through astrocytic release of ATP following NA action on alphaA1Rs Hes+ astrocytes, can cause an adenosine-mediated inhibition of SDH inhibitory interneurons. They propose that this disinhibition mechanism could explain how restraint stress can cause the mechanical hypersensitivity they measured in their behavioral experiments.

      Strengths:

      (1) Significance. Stress profoundly influences pain perception; resolving the mechanisms by which stress alters nociception in rodents may explain the well-known phenomenon of stress-induced analgesia and/or facilitate the development of therapies to mitigate the negative consequences of chronic stress on chronic pain.

      (2) Novelty. The authors' findings reveal a crucial contribution of Hes+ spinal astrocytes in the modulation of pain thresholds during stress.

      (3) Techniques. This study combines multiple approaches to dissect circuit, cellular, and molecular mechanisms including optical recordings of neural and astrocytic Ca2+ activity in behaving mice, intersectional genetic strategies, cell ablation, optogenetics, chemogenetics, CRISPR-based gene knockdown, slice electrophysiology, and behavior.

      Weaknesses:

      (1) Mouse model of stress. Although chronic stress can increase sensitivity to somatosensory stimuli and contribute to hyperalgesia and anhedonia, particularly in the context of chronic pain states, acute stress is well known to produce analgesia in humans and rodents. The experimental design used by the authors consists of a single one-hour session of restraint stress followed by 30 min to one hour of habituation and measurement of cutaneous mechanical sensitivity with von Frey filaments. This acute stress behavioral paradigm corresponds to the conditions in which the clinical phenomenon of stress-induced analgesia is observed in humans, as well as in animal models. Surprisingly, however, the authors measured that this acute stressor produced hypersensitivity rather than antinociception. This discrepancy is significant and requires further investigation.

      We thank the reviewer for evaluating our work and for highlighting both its strengths and weaknesses. As stated by the reviewer, numerous studies have reported acute stress-induced antinociception. However, as shown in a new table (Table S1) in which we have summarized previously published data using the acute restraint stress model employed in our present study, most studies reporting antinociceptive effects of acute restraint stress assessed behavioral responses to heat stimuli or formalin. This observation is consistent with the findings from our previous study (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)). The present study also confirms that acute restraint stress reduces behavioral responses to noxious heat (see also our response to Comment #2 below). In contrast to the robust and consistent antinociceptive effects observed with thermal stimuli, some studies evaluating behavioral responses to mechanical stimuli have reported stress-induced hypersensitivity (see Table S1), which aligns with our current findings. Taken together, these data support our original notion that the effects of acute stress on pain-related behaviors depend on several factors, including the nature, duration, and intensity of the stressor, as well as the sensory modality assessed in behavioral tests. We have incorporated this discussion and Table S1 into the revised manuscript (lines 344-353). Furthermore, we have slightly modified the text, including the title, replacing "pain facilitation" with "mechanical pain hypersensitivity" to more accurately reflect our research focus and the conclusion of this study that LC<sup>→SDH</sup> NAergic signaling to spinal astrocytes is required for stress-induced mechanical pain hypersensitivity. Finally, while mouse models of stress could provide valuable insights, the clinical relevance of stress-induced mechanical pain hypersensitivity remains to be elucidated and requires further investigation. We hope these clarifications address your concerns.

      (2) Specifically, is the hypersensitivity to mechanical stimulation also observed in response to heat or cold on a hotplate or coldplate?

      Thank you for your important comment. We have now conducted additional behavioral experiments to assess responses to heat using the hot-plate test. We found that mice subjected to restraint stress did not exhibit behavioral hypersensitivity to heat stimuli; instead, they displayed antinociceptive responses (Figure S2; lines 95-98). These results are consistent with our previous findings (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)) as well as numerous other reports (Table S1).

      (3) Using other stress models, such as a forced swim, do the authors also observe acute stress-induced hypersensitivity instead of stress-induced antinociception?

      As suggested by the reviewer, we conducted a forced swim test. We found that mice subjected to forced swimming, which has been reported to produce analgesic effects on thermal stimuli (Contet et al., Neuropsychopharmacology, 2006 (PMID: 16237385)), did not exhibit any changes in mechanical pain hypersensitivity (Figure S2; lines 98-99). Furthermore, a previous study demonstrated that mechanical pain sensitivity is enhanced by other stress models, such as exposure to an elevated open platform for 30 min (Kawabata et al., Neuroscience, 2023 (PMID: 37211084)). However, considering our data showing that changes in mechanosensory behavior induced by restraint stress depend on the duration of exposure (Figure S1), and that restraint stress also produced an antinociceptive effect on heat stimuli (Figure S2), stress-induced modulation of pain is a complex phenomenon influenced by multiple factors, including the stress model, intensity, and duration, as well as the sensory modality used for behavioral testing (lines 100-103).

      (4) Measurement of stress hormones in blood would provide an objective measure of the stress of the animals.

      A previous study has demonstrated that plasma levels of corticosterone, a stress hormone, are elevated following a 1-hour exposure to restraint stress in mice (Kim et al., Sci Rep, 2018 (PMID: 30104581)), using a stress protocol similar to that employed in our current study. We have included this information and cited this paper (lines 104-105).

      (5) Results:

      (a) Optical recordings of Ca2+ activity in behaving rodents are particularly useful to investigate the relationship between Ca2+ dynamics and the behaviors displayed by rodents.

      In the optical recordings of Ca<sup>2+</sup> activity in LC neurons, we monitored mouse behavior during stress exposure. We have now included a video of this in the revised manuscript (video; lines 111-114).

      (b) The authors report an increase in Ca2+ events in LC NA neurons during restraint stress: Did mice display specific behaviors at the time these Ca2+ events were observed such as movements to escape or orofacial behaviors including head movements or whisking?

      By reanalyzing the temporal relationship between Ca<sup>2+</sup> events and mouse behavior during stress exposure, we found that the Ca<sup>2+</sup> transients and escape behaviors (struggling) occurred almost simultaneously (video). A similar temporal correlation is also observed in Ca<sup>2+</sup> responses in the bed nucleus of the stria terminalis (Luchsinger et al., Nat Commun, 2021 (PMID: 34117229)). The video file has been included in the revised manuscript (video; lines 111-113, 552-553, 573-575).

      Additionally, as described in the Methods section and shown in Figure S2 of the initial version (now Figure S3), non-specific signals or artifacts—such as those caused by head movements—were corrected (although such responses were minimal in our recordings).

      (c) Additionally, are similar increases in Ca2+ events in LC NA neurons observed during other stressful behavioral paradigms versus non-stressful paradigms?

      We appreciate the reviewer's valuable suggestion. Since our manuscript focuses on acute restraint stress, we did not measure Ca<sup>2+</sup> events in LC-NA neurons in other stress models. However, a recent study has shown that social defeat stress increases Ca<sup>2+</sup> responses in LC-NA neurons (Seiriki et al., bioRxiv, https://www.biorxiv.org/content/10.1101/2025.03.07.641347v1).

      (d) Neuronal ablation to reveal the function of a cell population.

      This method has been widely used in numerous previous studies as an effective experimental approach to investigate the role of specific neuronal populations—including SDH-projecting LC-NA neurons (Ma et al., Brain Res, 2022 (PMID: 34929182); Kawanabe et al., Mol Brain, 2021 (PMID: 33971918))—in CNS function.

      (e) The proportion of LC NA neurons and LC→SDH NA neurons expressing DTR-GFP and ablated should be quantified (Figures 1G and J) to validate the methods and permit interpretation of the behavioral data (Figures 1H and K). Importantly, the nocifensive responses and behavior of these mice in other pain assays in the absence of stress (e.g., hotplate) and a few standard assays (open field, rotarod, elevated plus maze) would help determine the consequences of cell ablation on processing of nociceptive information and general behavior.

      As suggested, we conducted additional experiments to quantitatively analyze the number of LC<sup>→SDH</sup>-NA neurons. We used WT mice injected with AAVretro-Cre into the SDH (L4 segment) and AAV-FLEx[DTR-EGFP] into the LC. In these mice, 4.4% of total LC-NA neurons [positive for tyrosine hydroxylase (TH)] expressed DTR-GFP, representing the LC<sup>→SDH</sup>-NA neuronal population (Figure S4; lines 126-127). Furthermore, treatment with DTX successfully ablated the DTR-expressing LC<sup>→SDH</sup>-NA neurons. Importantly, the neurons quantified in this analysis were specifically those projecting to the L4 segment of the SDH; therefore, the total number of SDH-projecting LC-NA neurons across all spinal segments is expected to be much higher.

      We also performed the rotarod and paw-flick tests to assess motor function and thermal sensitivity following ablation of LC<sup>→SDH</sup>-NA neurons. No significant differences were observed between the ablated and control groups (Figure S5; lines 131-134), indicating that ablation of these neurons does not produce non-specific behavioral deficits in motor function or other sensory modalities.

      (f) Confirmation of LC NA neuron function with other methods that alter neuronal excitability or neurotransmission instead of destroying the circuit investigated, such as chemogenetics or optogenetics, would greatly strengthen the findings. Optogenetics is used in Figure 1M, N but excitation of LC<sup>→SDH</sup> NA neuron terminals is tested instead of inhibition (to mimic ablation), and in naïve mice instead of stressed mice.

      We appreciate the reviewer’s comment. The optogenetic approach is useful for manipulating neuronal excitability; however, prolonged light illumination (> tens of seconds) can lead to undesirable tissue heating, ionic imbalance, and rebound spikes (Wiegert et al., Neuron, 2017 (PMID: 28772120)), making it difficult to apply in our experiments, in which mice are exposed to stress for 60 min. For this reason, we decided to employ the cell-ablation approach in stress experiments, as it is more suitable than optogenetic inhibition. In addition, as described in our response to weakness (1)-a) by Reviewer 3 (Public review), we have now demonstrated the specific expression of DTRs in NA neurons in the LC, but not in A5 or A7 (Figure S4; lines 127-128), confirming the specificity of LC<sup>→SDH</sup>-NAergic pathway targeting in our study. Chemogenetics represents another promising approach to further strengthen our findings on the role of LC<sup>→SDH</sup>-NA neurons, but this will be an important subject for future studies, as it will require extensive experiments to assess, for example, the effectiveness of chemogenetic inhibition of these neurons during 60 min of restraint stress, as well as optimization of key parameters (e.g., systemic DCZ doses).

      (g) Alpha1Ars. The authors noted that "Adra1a mRNA is also expressed in INs in the SDH".

      The expression of α<sub>1A</sub>Rs in inhibitory interneurons in the SDH is consistent with our previous findings (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215)) as well as with scRNA-seq data (http://linnarssonlab.org/dorsalhorn/, Häring et al., Nat Neurosci, 2018 (PMID: 29686262)).

      (h) The authors should comprehensively indicate what other cell types present in the spinal cord and neurons projecting to the spinal cord express alpha1Ars and what is the relative expression level of alpha1Ars in these different cell types.

      According to the scRNA-seq data (https://seqseek.ninds.nih.gov/genes, Russ et al., Nat Commun, 2021 (PMID: 34588430); http://linnarssonlab.org/dorsalhorn/, Häring et al., Nat Neurosci, 2018 (PMID: 29686262)), we confirmed that α<sub>1A</sub>Rs are predominantly expressed in astrocytes and inhibitory interneurons in the spinal cord. Also, an α<sub>1A</sub>R-expressing excitatory neuron population (Glut14) expresses Tacr1, GPR83, and Tac1 mRNAs, markers that are known to be enriched in projection neurons of the SDH. This raises the possibility that α<sub>1A</sub>Rs may also be expressed in a subset of projection neurons, although further experiments are required to confirm this. In DRG neurons, α<sub>1A</sub>R expression was detected to some extent, but its level seems to be much lower than in the spinal cord (http://linnarssonlab.org/drg/, Usoskin et al., Nat Neurosci, 2015 (PMID: 25420068)). Consistent with this, primary afferent glutamatergic synaptic transmission has been shown to be unaffected by α<sub>1A</sub>R agonists (Kawasaki et al., Anesthesiology, 2003 (PMID: 12606912); Li and Eisenach, JPET, 2001 (PMID: 11714880)). This information has been incorporated into the Discussion section (lines 317-319).

      (i) The conditional KO of alpha1Ars specifically in Hes5+ astrocytes and not in other cell types expressing alpha1Ars should be quantified and validated (Figure 2H).

      We have previously shown a selective KO of α<sub>1A</sub>R in Hes5<sup>+</sup> astrocytes in the same mouse line (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This information has been included in the revised text (lines 166-167).

      (j) Depolarization of SDH inhibitory interneurons by NA (Figure 3). The authors bath-applied NA, which presumably activates all NA receptors present in the preparation.

      We believe that the reviewer’s concern may pertain to the possibility that NA acts on non-Vgat<sup>+</sup> neurons, thereby indirectly causing depolarization of Vgat<sup>+</sup> neurons. As described in the Methods section of the initial version, in our electrophysiological experiments we added four antagonists of excitatory and inhibitory neurotransmitter receptors, namely CNQX (AMPA receptor), MK-801 (NMDA receptor), bicuculline (GABA<sub>A</sub> receptor), and strychnine (glycine receptor), to the artificial cerebrospinal fluid to block synaptic inputs from other neurons to the recorded Vgat<sup>+</sup> neurons. Since this method has been widely used for this purpose in previous studies (Wu et al., J Neurosci, 2004 (PMID: 15140934); Liu et al., Nat Neurosci, 2010 (PMID: 20835251)), it is reasonable to conclude that NA acts directly on the recorded SDH Vgat<sup>+</sup> interneurons to produce excitation (lines 193-196).

      (k) The authors' model (Figure 4H) implies that NA released by LC→SDH NA neurons leads to the inhibition of SDH inhibitory interneurons by NA. In other experiments (Figure 1L, Figure 2A), the authors used optogenetics to promote the release of endogenous NA in SDH by LC→SDH NA neurons. This approach would investigate the function of NA endogenously released by LC NA neurons at presynaptic terminals in the SDH and at physiological concentrations and would test the model more convincingly compared to the bath application of NA.

      We appreciate the reviewer’s valuable comment. As noted, optogenetic stimulation of LC<sup>→SDH</sup>-NA neurons would indeed be useful to test this model. However, in our case, it is technically difficult to investigate the responses of Vgat<sup>+</sup> inhibitory neurons and Hes5<sup>+</sup> astrocytes to NA endogenously released from LC<sup>→SDH</sup>-NA neurons. This would require the use of Vgat-Cre or Hes5-CreERT2 mice, but employing these lines precludes the use of NET-Cre mice, which are necessary for specific and efficient expression of ChrimsonR in LC<sup>→SDH</sup>-NA neurons. Nevertheless, all of our experimental data consistently support the proposed model, and we hope the reviewer will agree that this support holds without additional experiments that are difficult to conduct because of these technical limitations (lines 382-388).

      (l) As for other experiments, the proportion of Hes5+ astrocytes that express hM3Dq, and the absence of expression in other cells, should be quantified and validated to interpret behavioral data.

      We thank the reviewer for raising this point. In our experiments, we used an HA-tag (fused with hM3Dq) to confirm hM3Dq expression. However, it is difficult to precisely analyze individual astrocytes because, as shown in Figure 3J, the boundaries of many HA-tag<sup>+</sup> astrocytes are indistinguishable. This seems to be due to the membrane localization of HA-tag, the complex morphology of astrocytes, and their tile-like distribution pattern (Baldwin et al., Trends Cell Biol, 2024 (PMID: 38180380)). Nevertheless, our previous study demonstrated that ~90% of astrocytes in the superficial laminae are Hes5<sup>+</sup> (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), and intra-SDH injection of AAV-hM3Dq labeled the majority of superficial astrocytes (Figure 3J). Thus, AAV-FLEx[hM3Dq] injection into Hes5-CreERT2 mice allows efficient expression of hM3Dq in Hes5<sup>+</sup> astrocytes in the SDH. Importantly, our previous studies using Hes5-CreERT2 mice have confirmed that hM3Dq is not expressed in other cell types (neurons, oligodendrocytes, or microglia) (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Kagiyama et al., Mol Brain, 2025 (PMID: 40289116)). This information regarding the cell-type specificity has now been briefly described in the revised version (lines 218-219).

      (m) Showing that the effect of CNO is dose-dependent would strengthen the authors' findings.

      Thank you for your comment. We have now demonstrated a dose-dependent effect of CNO on Ca<sup>2+</sup> responses in SDH astrocytes (Figure S7; lines 225-228; please see our response to Major Point (4) from Reviewer #2, Recommendations for the authors). In addition, we confirmed that the effect of CNO is not nonspecific, as CNO application did not alter sIPSCs in spinal cord slices prepared from mice lacking hM3Dq expression in astrocytes (Figure S7; lines 225-228).

      (n) The proportion of SG neurons for which CNO bath application resulted in a reduction in recorded sIPSCs is not clear.

      We have included individual data points in each bar graph to more clearly illustrate the effect of CNO on each neuron (Figure 3L, N).

      (o) A1Rs. The specific expression of Cas9 and guide RNAs, and the specific KD of A1Rs, in inhibitory interneurons but not in other cell types expressing A1Rs should be quantified and validated.

      In addition to the data demonstrating the specific expression of SaCas9 and sgAdora1 in Vgat<sup>+</sup> inhibitory neurons shown in Figure 3G of the initial version, we have now conducted the same experiments with a different sample and confirmed this specificity: SaCas9 (detected via HA-tag) and sgAdora1 (detected via mCherry) were expressed in PAX2<sup>+</sup> inhibitory neurons (Author response image 1). Furthermore, as shown in Figure 3H and I in the initial version, the functional reduction of A<sub>1</sub>Rs in inhibitory neurons was validated by electrophysiological recordings. Together, these results support the successful deletion of A<sub>1</sub>Rs in inhibitory neurons.

      Author response image 1.

      Expression of HA-tag and mCherry in inhibitory neurons (a different sample from Figure 3G). SaCas9 (yellow, detected by HA-tag) and mCherry (magenta) expression in PAX2<sup>+</sup> inhibitory neurons (cyan) at 3 weeks after intra-SDH injection of AAV-FLEx[SaCas9-HA] and AAV-FLEx[mCherry]-U6-sgAdora1 in Vgat-Cre mice. Arrowheads indicate genome-edited Vgat<sup>+</sup> cells. Scale bar, 25 µm.

      (6) Methods:

      It is unclear how fiber photometry is performed using "optic cannula" during restraint stress while mice are in a 50ml falcon tube (as shown in Figure 1A).

      We apologize for the omission of this detail in the Methods section. To monitor Ca<sup>2+</sup> events in LC-NA neurons during restraint stress, we created a narrow slit on the top of the conical tube, allowing mice to undergo restraint stress while connected to the optic fiber (see video). This information has now been added to the Methods section (lines 552-553).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Scientific rigor:

      It is unclear if the normal distribution of the data was determined before selecting statistical tests.

      We apologize for omitting this description. For all statistical analyses in this study, we first assessed the normality of the data and then selected appropriate statistical tests accordingly. We have added this information to the revised manuscript (lines 711-712).
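
      The workflow described in this response (check normality first, then choose the test family) can be sketched as follows. This is only an illustration: the response does not state which normality test was used, so the Jarque-Bera test here is an assumption chosen because it needs nothing beyond the Python standard library, and the function names `jarque_bera_p` and `choose_test` are hypothetical. For the small group sizes typical of behavioral experiments, a test such as Shapiro-Wilk is generally preferred.

```python
import math
from statistics import fmean


def jarque_bera_p(values):
    """Asymptotic Jarque-Bera p-value (chi-square with 2 degrees of freedom)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = fmean(values)
    # Central moments (population form, as used by the Jarque-Bera statistic).
    m2 = fmean([(v - mean) ** 2 for v in values])
    if m2 == 0:
        raise ValueError("sample has zero variance")
    m3 = fmean([(v - mean) ** 3 for v in values])
    m4 = fmean([(v - mean) ** 4 for v in values])
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    # The chi-square survival function with 2 df is exactly exp(-x / 2).
    return math.exp(-jb / 2.0)


def choose_test(group_a, group_b, alpha=0.05):
    """Return 'parametric' only if neither group rejects normality at alpha."""
    both_normal = all(jarque_bera_p(g) > alpha for g in (group_a, group_b))
    return "parametric" if both_normal else "nonparametric"
```

      Under these assumptions, a result of "parametric" would point to a t-test or ANOVA and "nonparametric" to a rank-based alternative such as the Mann-Whitney U test; the specific tests used in the manuscript are those given in its Methods.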

      (2) Nomenclature:

      (a) Mouse Genome Informatics (MGI) nomenclature should be used to describe mouse genotypes (i.e., gene name in italic, only first letter is capitalized, alleles in superscript).

      (b) FLEx should be used instead of flex.

      Thank you for the suggestion. We have corrected these terms (including FLEx) according to MGI nomenclature.

      Reviewer #2 (Public review):

      Summary:

      This study investigates the role of spinal astrocytes in mediating stress-induced pain hypersensitivity, focusing on the LC (locus coeruleus)-to-SDH (spinal dorsal horn) circuit and its mechanisms. The authors aimed to delineate how LC activity contributes to spinal astrocytic activation under stress conditions, explore the role of noradrenaline (NA) signaling in this process, and identify the downstream astrocytic mechanisms that influence pain hypersensitivity.

      The authors provide strong evidence that 1-hour restraint stress-induced pain hypersensitivity involves the LC-to-SDH circuit, where NA triggers astrocytic calcium activity via alpha1a adrenoceptors (alpha1aRs). Blockade of alpha1aRs on astrocytes - but not on Vgat-positive SDH neurons - reduced stress-induced pain hypersensitivity. These findings are rigorously supported by well-established behavioral models and advanced genetic techniques, uncovering the critical role of spinal astrocytes in modulating stress-induced pain.

      However, the study's third aim - to establish a pathway from astrocyte alpha1aRs to adenosine-mediated inhibition of SDH-Vgat neurons - is less compelling. While pharmacological and behavioral evidence is intriguing, the ex vivo findings are indirect and lack a clear connection to the stress-induced pain model. Despite these limitations, the study advances our understanding of astrocyte-neuron interactions in stress-pain contexts and provides a strong foundation for future research into glial mechanisms in pain hypersensitivity.

      Strengths:

      The study is built on a robust experimental design using a validated 1-hour restraint stress model, providing a reliable framework to investigate stress-induced pain hypersensitivity. The authors utilized advanced genetic tools, including retrograde AAVs, optogenetics, chemogenetics, and subpopulation-specific knockouts, allowing precise manipulation and interrogation of the LC-SDH circuit and astrocytic roles in pain modulation. Clear evidence demonstrates that NA triggers astrocytic calcium activity via alpha1aRs, and blocking these receptors effectively reduces stress-induced pain hypersensitivity.

      Weaknesses:

      Despite its strengths, the study presents indirect evidence for the proposed NA-to-astrocyte(alpha1aRs)-to-adenosine-to-SDH-Vgat neurons pathway, as the link between astrocytic adenosine release and stress-induced pain remains unclear. The ex vivo experiments, including NA-induced depolarization of Vgat neurons and chemogenetic stimulation of astrocytes, are challenging to interpret in the stress context, with the high CNO concentration raising concerns about specificity. Additionally, the role of astrocyte-derived D-serine is tangential and lacks clarity regarding its effects on SDH Vgat neurons. The astrocyte calcium signal "dip" after LC optostimulation-induced elevation is presented without any interpretation.

      We appreciate the reviewer's careful reading of our paper. In response to the reviewer's comments, we have performed additional experiments and expanded the discussion in the revised manuscript (please see the point-by-point responses below).

      Reviewer #2 (Recommendations for the authors):

      The astrocyte-mediated pathway of NA-to-astrocyte (alpha1aRs)-to-adenosine-to-SDH Vgat neurons (A1R) in the context of stress-induced pain hypersensitivity requires more direct evidence. While the data showing that the A1R antagonist CPT inhibits stress-induced hypersensitivity and that stress combined with Aβ fiber stimulation increases pERK in the SDH are intriguing, these findings primarily support the involvement of A1R on Vgat neurons and are only behaviorally consistent with SDH-Vgat neuronal A1R knockdown. The role of astrocytes in this pathway in vivo remains indirect. The ex vivo chemogenetic Gq-DREADD stimulation of SDH astrocytes, which reduced sIPSCs in Vgat neurons in a CPT-dependent manner, needs revision with non-DREADD+CNO controls to validate specificity. Furthermore, the ex vivo bath application of NA causing depolarization in Vgat neurons, blocked by CPT, adds complexity to the data, leaving me wondering how astrocytes are involved in such processes, and it does not directly connect to stress-induced pain hypersensitivity. These findings are potentially useful but require additional refinement to establish their relevance to the stress model.

      We thank the reviewer for the insightful feedback. First, regarding the role of astrocytes in this pathway in vivo, we showed in the initial version that mechanical pain hypersensitivities induced by intrathecal NA injection and by acute restraint stress were attenuated by both pharmacological blockade and Vgat<sup>+</sup> neuron-specific knockdown of A<sub>1</sub>Rs (Figure 4A, B). Given that NA- and stress-induced pain hypersensitivity is mediated by α<sub>1A</sub>R-dependent signaling in Hes5<sup>+</sup> astrocytes (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); this study), these findings provide in vivo evidence supporting the involvement of the NA → Hes5<sup>+</sup> astrocyte (via α<sub>1A</sub>Rs) → adenosine → Vgat<sup>+</sup> neuron (via A<sub>1</sub>Rs) pathway. As noted in the reviewer’s major comment (2), in vivo monitoring of adenosine dynamics in the SDH during stress exposure would further substantiate the astrocyte-to-neuron signaling pathway. However, we did not detect clear signals, potentially due to several technical limitations (see our response below). Acknowledging this limitation, we have now added a new paragraph at the end of the Discussion section to address this issue. Second, the specificity of the effect of CNO has now been validated by additional experiments (see our response to major point (4)). Third, the reviewer’s concern regarding the action of NA on Vgat<sup>+</sup> neurons has also been addressed (see our response to major point (3) below).

      Major points:

      (1) The in vivo pharmacology using DCK to antagonize D-serine signaling from alpha1a-activated astrocytes is tangential, as there is limited evidence on how Vgat neurons (among many others) respond to D-serine. This aspect requires more focused exploration to substantiate its relevance.

      We propose that the site of action of D-serine in our neural circuit model is the NMDA receptors (NMDARs) on excitatory neurons, a notion supported by our previous findings (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Kagiyama et al., Mol Brain, 2025 (PMID: 40289116)). However, we cannot exclude the possibility that D-serine also acts on NMDARs expressed by Vgat<sup>+</sup> inhibitory neurons. Nevertheless, given that intrathecal injection of D-serine in naïve mice induces mechanical pain hypersensitivity (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), it appears that the pronociceptive effect of D-serine in the SDH is primarily associated with enhanced pain processing and transmission, presumably via NMDARs on excitatory neurons. We have added this point to the Discussion section in the revised manuscript (lines 325-330).

      (2) Additionally, employing GRAB-Ado sensors to monitor adenosine dynamics in SDH astrocytes during NA signaling would significantly strengthen conclusions about astrocyte-derived adenosine's role in the stress model.

      We agree with the reviewer’s comment. Following this suggestion, we attempted to visualize NA-induced adenosine (and ATP) dynamics using GRAB-ATP and GRAB-Ado sensors (Wu et al., Neuron, 2022 (PMID: 34942116); Peng et al., Science, 2020 (PMID: 32883833)) in acutely isolated spinal cord slices from mice after intra-SDH injection of AAV-hSyn-GRAB<sub>ATP1.0</sub> and AAV-hSyn-GRAB<sub>Ado1.0</sub>. We confirmed expression of these sensors in the SDH (Author response image 2a) and observed increased signals after bath application of ATP (0.1 or 1 µM) or adenosine (1 µM) (Author response image 2b, c). However, we were unable to detect clear signals following NA stimulation (Author response image 2b, c). The reason for this lack of detectable changes remains unclear. If the release of adenosine from astrocytes is a highly localized phenomenon, it may be measurable using high-resolution microscopy capable of detecting adenosine levels at the synaptic level and more sensitive sensors. Further investigation will therefore be required (lines 340-341).

      Author response image 2.

      Ex vivo imaging of GRAB-ATP and GRAB-Ado sensors. (a) Representative images of GRAB<sub>ATP1.0</sub> (left, green) or GRAB<sub>Ado1.0</sub> (right, green) expression in the SDH at 3 weeks after SDH injection of AAV-hSyn-GRAB<sub>ATP1.0</sub> or AAV-hSyn-GRAB<sub>Ado1.0</sub> in Hes5-CreERT2 mice. Scale bar, 200 µm. (b) Left: Representative fluorescence images showing GRAB<sub>ATP1.0</sub> responses before and after perfusion with NA or ATP. Right: Representative traces showing responses to ATP (0.1 and 1 µM) or NA (10 µM). (c) Left: Representative fluorescence images showing GRAB<sub>Ado1.0</sub> responses before and after perfusion with NA or adenosine (Ado). Right: Representative traces showing responses to Ado (0.01, 0.1, and 1 µM), NA (10 µM), or no application (negative control).

      (3) The interpretation of Figure 3D is challenging. The manuscript implies that 20 μM NA acts on Adra1a receptors on Vgat neurons to depolarize them, but this concentration should also activate Adra1a on astrocytes, leading to adenosine release and potential inhibition of depolarization. The observation of depolarization despite these opposing mechanisms requires explanation, as does the inhibition of depolarization by bath-applied A1R agonist. Of note, 20 μM NA is a high concentration for Adra1a activation, typically responsive at nanomolar levels. The discussion should reconcile this with prior studies indicating dose-dependent effects of NA on pain sensitivity (e.g., Reference 22).

      Like the reviewer, we also considered that bath-applied NA could activate α<sub>1A</sub>Rs expressed on Hes5<sup>+</sup> astrocytes. To clarify this point, we performed additional patch-clamp recordings and found that knockdown of A<sub>1</sub>Rs in Vgat<sup>+</sup> neurons tended to increase the proportion of Vgat<sup>+</sup> neurons with NA-induced depolarizing responses (Figure S8). Therefore, it is conceivable that NA-induced excitation of Vgat<sup>+</sup> neurons reflects both a direct effect of NA activating α<sub>1A</sub>Rs in Vgat<sup>+</sup> neurons and indirect inhibitory signaling from NA-stimulated Hes5<sup>+</sup> astrocytes via adenosine (lines 298-300).

      The concentration of NA used in our ex vivo experiments is higher than that typically used in vitro with α<sub>1A</sub>R-expressing cell lines or primary cultured cells, but is comparable to concentrations used in other studies employing spinal cord slices (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652); Baba et al., Anesthesiology, 2000 (PMID: 10691236); Lefton et al., Science, 2025 (PMID: 40373122)). In slice experiments, drugs must diffuse through the tissue to reach target cells, resulting in a concentration gradient. Therefore, higher drug concentrations are generally necessary in slice experiments, in contrast to cultured cell experiments, where drugs are directly applied to target cells. Importantly, we have previously shown that the pharmacological effects of 20 μM NA on Vgat<sup>+</sup> neurons and Hes5<sup>+</sup> astrocytes are abolished by loss of α<sub>1A</sub>Rs in these cells (Uchiyama et al., Mol Brain, 2022 (PMID: 34980215); Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), confirming the specificity of these NA actions.

      Regarding the dose-dependent effect of NA on pain sensitivity, NA-induced pain hypersensitivity is abolished in Hes5<sup>+</sup> astrocyte-specific α<sub>1A</sub>R-KO mice (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), indicating that this behavior is mediated by α<sub>1A</sub>Rs expressed on Hes5<sup>+</sup> astrocytes. In contrast, the suppression of pain sensitivity by high doses of NA was unaffected in the KO mice (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), suggesting that other adrenergic receptors may contribute to this phenomenon. Clarifying the responsible receptors will require future investigation.

      (4) In Figure 3K-M, the CNO concentration used (100 μM) is unusually high compared to standard doses (1 to a few μM), raising concerns about potential off-target effects. Including non-hM3Dq controls and using lower CNO concentrations are essential to validate the specificity of the observed effects. Similarly, the study should clarify whether astrocyte hM3Dq stimulation alone (without NA) would induce hyperpolarization in Vgat neurons and how this interacts with NA-induced depolarization.

      We acknowledge that the concentration of CNO used in our experiments is relatively high compared to that used in other reports. However, in our experiments, application of CNO at 1, 10, and 100 μM induced Ca<sup>2+</sup> increases in GCaMP6-expressing astrocytes in spinal cord slices in a concentration-dependent manner (Figure S7). Among these, 100 μM CNO most effectively replicated the NA-induced Ca<sup>2+</sup> signals in astrocytes. Based on these findings, we selected this concentration for use in both the current and previous studies (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). Importantly, to rule out non-specific effects, we conducted control experiments using spinal cord slices from mice that did not express hM3Dq in astrocytes and confirmed that CNO had no effect on Ca<sup>2+</sup> responses in astrocytes or on sIPSCs in substantia gelatinosa (SG) neurons (Figure S7; lines 223-228). Thus, although the CNO concentration used is relatively high, the observed effects of CNO are not non-specific but result from the chemogenetic activation of hM3Dq-expressing astrocytes.

      In this study, we used Hes5-CreERT2 and Vgat-Cre mice to manipulate gene expression in Hes5<sup>+</sup> astrocytes and Vgat<sup>+</sup> neurons, respectively. In order to fully address the reviewer’s comment, the use of both Cre lines is necessary. However, simultaneous and independent genetic manipulation in each cell type using Cre activity alone is not feasible with the current genetic tools. We have mentioned this as a technical limitation in the Discussion section (lines 382-388).

      (5) The role of D-serine released by hM3Dq-stimulated astrocytes in (separately) modulating sub-types of neurons including excitatory neurons and Vgat positives needs more detailed discussion. If no effect of D-serine on Vgat neurons is observed, this should be explicitly stated, and the discussion should address why this might be the case.

      As mentioned in our response to Major Point (1) above, we have added a discussion of this point in the revised manuscript (lines 325-330).

      (6) Finally, the observed "dip" in astrocyte calcium signals below baseline following the large peaks with LC optostimulation should be discussed further, as understanding this phenomenon could provide valuable insights into astrocytic signaling dynamics in the context of single acute or repetitive chronic stress.

      Thank you for your comment. We found that this phenomenon was not affected by pretreatment with the α<sub>1A</sub>R-specific antagonist silodosin (Author response image 3), which effectively suppressed Ca<sup>2+</sup> elevations evoked by stimulation of LC-NA neurons (Figure 2F). This implies that the phenomenon is independent of α<sub>1A</sub>R signaling. Elucidating the detailed underlying mechanism remains an important direction for future investigation.

      Author response image 3.

      The observed "dip" in astrocyte Ca<sup>2+</sup> signals was not affected by pretreatment with the α<sub>1A</sub>R-specific antagonist silodosin. Representative traces of astrocytic GCaMP6m signals in response to optogenetic stimulation of LC<sup>→SDH</sup>-NAergic axons/terminals in a spinal cord slice. Each trace shows the GCaMP6m signal before and after optogenetic stimulation (625 nm, 1 mW, 10 Hz, 5 ms pulse duration, 10 s). Slices were pretreated with silodosin (40 nM) for 5 min prior to stimulation.

      Reviewer #3 (Public review):

      Summary:

      This is an exciting and timely study addressing the role of descending noradrenergic systems in nocifensive responses. While it is well-established that spinally released noradrenaline (aka norepinephrine) generally acts as an inhibitory factor in spinal sensory processing, this system is highly complex. Descending projections from the A6 (locus coeruleus, LC) and the A5 regions typically modulate spinal sensory processing and reduce pain behaviours, but certain subpopulations of LC neurons have been shown to mediate pronociceptive effects, such as those projecting to the prefrontal cortex (Hirshberg et al., PMID: 29027903).

      The study proposes that descending cerulean noradrenergic neurons potentiate touch sensation via alpha-1 adrenoceptors on Hes5+ spinal astrocytes, contributing to mechanical hyperalgesia. This finding is consistent with prior work from the same group (dd et al., PMID:). However, caution is needed when generalising about LC projections, as the locus coeruleus is functionally diverse, with differences in targets, neurotransmitter co-release, and behavioural effects. Specifying the subpopulations of LC neurons involved would significantly enhance the impact and interpretability of the findings.

      Strengths:

      The study employs state-of-the-art molecular, genetic, and neurophysiological methods, including precise CRISPR and optogenetic targeting, to investigate the role of Hes5+ astrocytes. This approach is elegant and highlights the often-overlooked contribution of astrocytes in spinal sensory gating. The data convincingly support the role of Hes5+ astrocytes as regulators of touch sensation, coordinated by brain-derived noradrenaline in the spinal dorsal horn, opening new avenues for research into pain and touch modulation.

      Furthermore, the data support a model in which superficial dorsal horn (SDH) Hes5+ astrocytes act as non-neuronal gating cells for brain-derived noradrenergic (NA) signalling through their interaction with substantia gelatinosa inhibitory interneurons. Locally released adenosine from NA-stimulated Hes5+ astrocytes, following acute restraint stress, may suppress the function of SDH-Vgat+ inhibitory interneurons, resulting in mechanical pain hypersensitivity. However, the spatially restricted neuron-astrocyte communication underlying this mechanism requires further investigation in future studies.

      Weaknesses

      (1) Specificity of the LC Pathway targeting

      The main concern lies with how definitively the LC pathway was targeted. Were other descending noradrenergic nuclei, such as A5 or A7, also labelled in the experiments? The authors must convincingly demonstrate that the observed effects are mediated exclusively by LC noradrenergic terminals to substantiate their claims (i.e. "we identified a circuit, the descending LC→SDH-NA neurons").

      (a) For instance, the direct vector injection into the LC likely results in unspecific effects due to the extreme heterogeneity of this nucleus and retrograde labelling of the A5 and A7 nuclei from the LC (i.e., Li et al., PMID: 26903420).

      We appreciate the reviewer's valuable comments. To address this point, we performed additional experiments and demonstrated that intra-SDH injection of AAVretro-Cre followed by intra-LC injection of AAV2/9-EF1α-FLEx[DTR-EGFP] specifically results in DTR expression in NA neurons of the LC, but not of the A5 or A7 regions (Figure S4; lines 127-128). These results confirm the specificity of targeting the LC<sup>→SDH</sup>-NAergic pathway in our study.

      (b) It is difficult to believe that the intersectional approach described in the study successfully targeted LC→SDH-NA neurons using AAVrg vectors. Previous studies (e.g., PMID: 34344259 or PMID: 36625030) demonstrated that similar strategies were ineffective for spinal-LC projections. The authors should provide detailed quantification of the efficiency of retrograde labelling and specificity of transgene expression in LC neurons projecting to the SDH.

      Thank you for your comment. As we described in our response to the weakness (5)-e) of Reviewer #1 (Public review), our additional analysis showed that, under our experimental conditions, expression of genes (for example DTR) was observed in 4.4% of NA (TH<sup>+</sup>) neurons in the LC (Figure S4; lines 126-127).

      The reason for this difference between the previous studies and our current study is unclear; however, it is likely attributable to methodological differences, including the type of viral vectors employed, species differences (mouse (PMID: 34344259, our study) vs. rat (PMID: 36625030)), the amount of AAV injected into the SDH (300 nL at three sites (PMID: 34344259), and 300 nL at a single site (our study)) and LC (500 nL at a single site (PMID: 34344259), and 300 nL at a single site (our study)), as well as the depth of AAV injection in the SDH (200–300 µm from the dorsal surface of the spinal cord (PMID: 34344259), and 120–150 µm in depth from the surface of the dorsal root entry zone (our study)).

      (c) Furthermore, it is striking that the authors observed a comparably strong phenotypical change in Figure 1K despite fewer neurons being labelled, compared to Figure 1H and 1N with substantially more neurons being targeted. Interestingly, the effect in Figure 1K appears more pronounced but shorter-lasting than in the comparable experiment shown in Figure 1H. This discrepancy requires further explanation.

      Although only a representative section of the LC was shown in the initial version, LC<sup>→SDH</sup>-NA neurons are distributed rostrocaudally throughout the LC, as previously reported (Llorca-Torralba et al., Brain, 2022 (PMID: 34373893)). Our additional experiments analyzing multiple sections of the anterior and posterior regions of the LC have now revealed that approximately sixty LC<sup>→SDH</sup>-NA neurons express DTR, and these neurons are eliminated following DTX treatment (Figure S4; lines 126-128) (it should be noted that these neurons specifically project to the L4 segment of the SDH, and the total number of LC<sup>→SDH</sup>-NA neurons is likely much higher). Considering the specificity of LC<sup>→SDH</sup>-NAergic pathway targeting demonstrated in our study (as described above), together with the fact that primary afferent sensory fibers from the plantar skin of the hindpaw predominantly project to the L4 segment of the SDH, these data suggest that the observed behavioral changes are attributable to the loss of these neurons and that ablation of even a relatively small number of NA neurons in the LC can have a significant impact on behavior. We have added this hypothesis in the Discussion section (lines 373-382).

      Regarding the data in Figures 1H and 1K, as the reviewer pointed out, a statistically significant difference was observed at 90 min in mice with ablation of LC-NA neurons, but not in those with LC<sup>→SDH</sup>-NA neuron ablation. This is likely due to a slightly higher threshold in the control group at this time point (Figure 1K), and it remains unclear whether there is a mechanistic difference between the two groups at this specific time point.

      (d) A valuable addition would be staining for noradrenergic terminals in the spinal cord for the intersectional approach (Figure 1J), as done in Figures 1F/G. LC projections terminate preferentially in the SDH, whereas A5 projections terminate in the deep dorsal horn (DDH). Staining could clarify whether circuits beyond the LC are being ablated.

      As suggested, we performed DTR immunostaining in the SDH; however, we did not detect any DTR immunofluorescence there. A similar result was also observed in the spinal terminals of DTR-expressing primary afferent fibers (our unpublished data). The reason for this is unclear, but to the best of our knowledge, no studies have clearly shown DTR expression at presynaptic terminals, which may be because the action of DTX on the neuronal cell body is necessary for cell ablation. Nevertheless, as described in our response to the weakness (5)-f) by Reviewer 1 (Public review), we have now confirmed the specific expression of DTR in the LC, but not in the A5 and A7 regions (Figure S4; lines 127-128).

      (e) Furthermore, different LC neurons often mediate opposite physiological outcomes depending on their projection targets. For example, dorsal LC neurons projecting to the prefrontal cortex (PFCx) are pronociceptive, while ventral LC neurons projecting to the SC are antinociceptive (PMIDs: 29027903, 34344259, 36625030). Given this functional diversity, direct injection into the LC is likely to result in nonspecific effects.

      To avoid behavioral outcomes resulting from a mixture of facilitatory and inhibitory effects caused by activating the entire population of LC-NA neurons, we employed a specific manipulation targeting LC<sup>→SDH</sup>-NA neurons using AAV vectors. The specificity of this manipulation was confirmed in our previous study (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)) and in the current study (Figure S4). Using this approach, we previously demonstrated that LC neurons can exert pronociceptive effects via astrocytes in the SDH (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This pronociceptive role is further supported by the current study, which uses a more selective manipulation of LC<sup>→SDH</sup>-NA neurons through a NET-Cre mouse line. In addition, intrathecal administration of relatively low doses of NA in naïve mice clearly induces mechanical pain hypersensitivity. Nevertheless, we have also acknowledged that several recent studies have reported an inhibitory role of LC<sup>→SDH</sup>-NA neurons in spinal nociceptive signaling. The reason for these differing behavioral outcomes remains unclear, but several methodological differences may underlie the discrepancy. First, the degree of LC<sup>→SDH</sup>-NA neuronal activity may play a role. Although direct comparisons between studies reporting pro- and anti-nociceptive effects are difficult, our previous studies demonstrated that intrathecal administration of high doses of NA in naïve mice does not induce mechanical pain hypersensitivity (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). Second, the sensory modality used in behavioral testing may be a contributing factor as the pronociceptive effect of NA appears to be selectively observed in responses to mechanical, but not thermal, stimuli (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)). This sensory modality-selective effect is also evident in mice subjected to acute restraint stress (Table S1). 
      Therefore, the role of LC<sup>→SDH</sup>-NA neurons in modulating nociceptive signaling in the SDH is more complex than previously appreciated, and their contribution to pain regulation should be reconsidered in light of factors such as NA levels, sensory modality, and experimental context. In revising the manuscript, we have included some points described above in the Discussion (lines 282-291).

      Conclusion on Specificity: The authors are strongly encouraged to address these limitations directly, as they significantly affect the validity of the conclusions regarding the LC pathway. Providing more robust evidence, acknowledging experimental limitations, and incorporating complementary analyses would greatly strengthen the manuscript.

      We appreciate the reviewer’s comments. We fully acknowledge the limitations raised and agree that addressing them directly is important for the rigor of our conclusions on the LC pathway. To this end, we have performed additional experiments (e.g., Figure A and S4), which are now included in the revised manuscript. Furthermore, we have added a new paragraph on experimental limitations at the end of the Discussion section (lines 373-408). We believe these new data substantially strengthen the validity of our findings and have clarified these points in the Discussion section.

      (2) Discrepancies in Data

      (a) Figures 1B and 1E: The behavioural effect of stress on PWT (Figure 1E) persists for 120 minutes, whereas Ca2+ imaging changes (Figure 1B) are only observed in the first 20 minutes, with signal attenuation starting at 30 minutes. This discrepancy requires clarification, as it impacts the proposed mechanism.

      Thank you for your important comment. As pointed out by the reviewer, there is a difference between the duration of behavioral responses and Ca<sup>2+</sup> events, although the exact time point at which the PWT begins to decline remains undetermined (as behavioral testing cannot be conducted during stress exposure). A similar temporal difference was also observed following intraplantar injection of capsaicin (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)); while LC<sup>→SDH</sup>-NA neuron-mediated astrocytic Ca<sup>2+</sup> responses in SDH astrocytes last for 5–10 min after injection, behavioral hypersensitivity peaks around 60 min post-injection and gradually returns to baseline over the subsequent 60–120 min. These findings raise the possibility that astrocyte-mediated pain hypersensitivity in the SDH may involve a sustained alteration in spinal neural function, such as central sensitization. We have added this hypothesis to the Discussion section of the revised manuscript (lines 399-408), as it represents an important direction for future investigation.

      (b) Figure 4E: The effect is barely visible, and the tissue resembles "Swiss cheese," suggesting poor staining quality. This is insufficient for such an important conclusion. Improved staining and/or complementary staining (e.g., cFOS) are needed. Additionally, no clear difference is observed between Stress+Ab stim. and Stress+Ab stim.+CPT, raising doubts about the robustness of the data.

      As suggested, we performed c-FOS immunostaining and obtained clearer results (Figure 4E,F; lines 243-252). We also quantitatively analyzed the number of c-FOS<sup>+</sup> cells in the superficial laminae, and the results are consistent with those obtained from the pERK experiments.

      (c) Discrepancy with Existing Evidence: The claim regarding the pronociceptive effect of LC→SDH-NAergic signalling on mechanical hypersensitivity contrasts with findings by Kucharczyk et al. (PMID: 35245374), who reported no facilitation of spinal convergent (wide-dynamic range) neuron responses to tactile mechanical stimuli, but potent inhibition to noxious mechanical von Frey stimulation. This discrepancy suggests alternative mechanisms may be at play and raises the question of why noxious stimuli were not tested.

      In our experiments, ChrimsonR expression was observed in the superficial and deeper laminae of the spinal cord (Figure S6). Due to the technical limitations of the optical fibers used for optogenetics, the light stimulation could only reach the superficial laminae; therefore, it may not have affected the activity of neurons (including WDR neurons) located in the deeper laminae. Furthermore, the study by Kucharczyk et al. (Brain, 2022 (PMID: 35245374)) employed a stimulation protocol that differed from ours, applying continuous stimulation over several minutes. Given that the levels of NA released from LC<sup>→SDH</sup>-NAergic terminals in the SDH increase with the duration of terminal stimulation (as shown in Figure 2B), longer stimulation may result in higher levels of NA in the SDH. Considering also our data indicating that the pro- and anti-nociceptive effects of NA are dose dependent (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), these differences may be related to LC<sup>→SDH</sup>-NA neuron activity, NA levels in the SDH, and the differential responses of SDH neurons in the superficial versus deeper laminae (lines 388-395).

      (3) Sole reliance on Von Frey testing

      The exclusive use of von Frey as a behavioural readout for mechanical sensitisation is a significant limitation. This assay is highly variable, and without additional supporting measures, the conclusions lack robustness. Incorporating other behavioural measures, such as the adhesive tape removal test to evaluate tactile discomfort, the needle floor walk corridor to assess sensitivity to uneven or noxious surfaces, or the kinetic weight-bearing test to measure changes in limb loading during movement, could provide complementary insights. Physiological tests, such as the Randall-Selitto test for noxious pressure thresholds or CatWalk gait analysis to evaluate changes in weight distribution and gait dynamics, would further strengthen the findings and allow for a more comprehensive assessment of mechanical sensitisation.

      Thank you for your suggestion. Based on our previous findings that Hes5<sup>+</sup> astrocytes in the SDH selectively modulate mechanosensory signaling (Kohro et al., Nat Neurosci, 2020 (PMID: 33020652)), the present study focused on behavioral responses to mechanical stimuli using von Frey filaments. As we have not previously conducted most of the behavioral tests suggested by the reviewers, and as we currently lack the necessary equipment for these tests (e.g., Randall–Selitto test, CatWalk gait analysis, and weight-bearing test), we were unable to include them in this study. However, it will be of great interest in future research to investigate whether activation of the LC<sup>→SDH</sup>-NA neuron-to-SDH Hes5<sup>+</sup> astrocyte signaling pathway similarly sensitizes behavioral responses to other types of mechanical stimuli, and to investigate the sensory modality-selective pro- and antinociceptive role of LC<sup>→SDH</sup>-NAergic signaling in the SDH (lines 396-399).

      Overall Conclusion

      This study addresses an important and complex topic with innovative methods and compelling data. However, the conclusions rely on several assumptions that require more robust evidence. Specificity of the LC pathway, experimental discrepancies, and methodological limitations (e.g., sole reliance on von Frey) must be addressed to substantiate the claims. With these issues resolved, this work could significantly advance our understanding of astrocytic and noradrenergic contributions to pain modulation.

      We have made every effort to address the reviewer’s concerns through additional experiments and analyses. Based on the new control data presented, we believe that our explanation is reasonable and acceptable. Although additional data cannot be provided on some points due to methodological constraints and limitations of the techniques currently available in our laboratory, we respectfully submit that the evidence presented sufficiently supports our conclusions.

      Reviewer #3 (Recommendations for the authors):

      A lot of beautiful and challenging-to-collect data is presented. Sincere congratulations to all the authors on this achievement!

      Notwithstanding, please carefully reconsider the conclusions regarding the LC pathway, as additional evidence is required to ensure their specificity and robustness.

      We thank the reviewer for the kind comments and for raising an important point regarding the LC pathway. The reviewer’s feedback prompted us to conduct additional investigations to further strengthen the validity of our conclusions. We have incorporated these new data and analyses into the revised manuscript, and we believe that these revisions substantially enhance the robustness and reliability of our findings.

    1. Reviewer #3 (Public review):

      Disclaimer:

      My expertise is in live single-molecule imaging of RNA and transcription, as well as associated data analysis and modeling. While this aligns well with the technical aspects of the manuscript, my background in translation is more limited, and I am not best positioned to assess the novelty of the biological conclusions.

      Summary:

      This study combines live-cell imaging of nascent proteins on single mRNAs with time-series analysis to investigate the kinetics of mRNA translation. The authors (i) used a calibration method for estimating absolute ribosome counts, and (ii) developed a new Bayesian approach to infer ribosome counts over time from run-off experiments, enabling estimation of elongation rates and ribosome density across conditions.

      They report (i) translational bursting at the single-mRNA level, (ii) low ribosome density (~10% occupancy ± a few percent), (iii) that ribosome density is minimally affected by perturbations of elongation (using a drug and/or different coding sequences in the reporter), suggesting a homeostatic mechanism potentially involving a feedback of elongation onto initiation, although (iv) this coupling breaks down upon knockout of elongation factor eIF5A.

      Strengths:

      (1) The manuscript is well written and the conclusions are in general appropriately cautious (besides the few improvements I suggest below).

      (2) The time-series inference method is interesting and promising for broader application.

      (3) Simulations provide convincing support for the modeling (though some improvements are possible).

      (4) The reported homeostatic effect on ribosome density is surprising and carefully validated with multiple perturbations.

      (5) Imaging quality and corrections (e.g., flat-fielding, laser power measurements) are robust.

      (6) Mathematical modeling is clearly described and precise; a few clarifications could improve it further.

      Weaknesses:

      (1) The absolute quantification of ribosome numbers (via the measurement of $i_{MP}$) should be improved. This only affects the finding that ribosome density is low, not that it appears to be under homeostatic control. However, if $i_{MP}$ turns out to be substantially overestimated (hence ribosome density underestimated), then "ribosomes queuing up to the initiation site and physically blocking initiation" could become a relevant hypothesis. In my first review of this work, I made recommendations, which the authors did not follow. In my view, the robustness of this particular aspect of this study remains moderate.

      (2) The proposed initiation-elongation coupling is plausible, but alternative explanations such as changes in abortive elongation frequency should be considered. In their response to my previous comments, the authors indicate that this is "beyond the scope of the present work".

      (3) More an opportunity for improvement than a weakness: It is unclear what the single-mRNA nature of the inference method brings, since it is only used here to report _average_ ribosome elongation rate and density (averaged across mRNAs and across time during the run-off experiments), although the method, in principle, has the power to resolve these two aspects. In response to my previous comment, the authors note that such analyses could be incorporated in future work.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review): 

      Summary:

      In this study, Lamberti et al. investigate how translation initiation and elongation are coordinated at the single-mRNA level in mammalian cells. The authors aim to uncover whether and how cells dynamically adjust initiation rates in response to elongation dynamics, with the overarching goal of understanding how translational homeostasis is maintained. To this end, the study combines single-molecule live-cell imaging using the SunTag system with a kinetic modeling framework grounded in the Totally Asymmetric Simple Exclusion Process (TASEP). By applying this approach to custom reporter constructs with different coding sequences, and under perturbations of the initiation/elongation factor eIF5A, the authors infer initiation and elongation rates from individual mRNAs and examine how these rates covary.

      The central finding is that initiation and elongation rates are strongly correlated across a range of coding sequences, resulting in consistently low ribosome density (≤12% of the coding sequence occupied). This coupling is preserved under partial pharmacological inhibition of eIF5A, which slows elongation but is matched by a proportional decrease in initiation, thereby maintaining ribosome density. However, a complete genetic knockout of eIF5A disrupts this coordination, leading to reduced ribosome density, potentially due to changes in ribosome stalling resolution or degradation.
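
      The statement that a proportional decrease in initiation and elongation maintains ribosome density can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' model: it assumes a low-density regime without queuing, in which the mean number of ribosomes on an mRNA equals the initiation rate multiplied by the transit time (Little's law). The function name, the 10-codon footprint, and all numerical values are hypothetical and chosen only for illustration.

```python
def ribosome_occupancy(init_rate_per_s, elong_rate_codons_per_s,
                       cds_length_codons, footprint_codons=10):
    """Fraction of the coding sequence covered by ribosomes.

    Low-density approximation (assumption): ribosomes do not queue, so the
    mean ribosome number is initiation rate x transit time (Little's law).
    """
    transit_time_s = cds_length_codons / elong_rate_codons_per_s
    mean_ribosomes = init_rate_per_s * transit_time_s
    return mean_ribosomes * footprint_codons / cds_length_codons


# Hypothetical rates, not taken from the manuscript.
baseline = ribosome_occupancy(0.03, 3.0, 1500)

# Halving initiation and elongation together leaves occupancy unchanged.
both_halved = ribosome_occupancy(0.015, 1.5, 1500)

# Halving elongation alone doubles occupancy.
elong_halved = ribosome_occupancy(0.03, 1.5, 1500)

print(baseline, both_halved, elong_halved)
```

      Under this approximation the occupancy reduces to the initiation rate times the footprint divided by the elongation rate, so it depends only on the ratio of the two rates and not on the length of the coding sequence.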

      Strengths:

      A key strength of this work is its methodological innovation. The authors develop and validate a TASEP-based Hidden Markov Model (HMM) to infer translation kinetics at single-mRNA resolution. This approach provides a substantial advance over previous population-level or averaged models and enables dynamic reconstruction of ribosome behavior from experimental traces. The model is carefully benchmarked against simulated data and appropriately applied. The experimental design is also strong. The authors construct matched SunTag reporters differing only in codon composition in a defined region of the coding sequence, allowing them to isolate the effects of elongation-related features while controlling for other regulatory elements. The use of both pharmacological and genetic perturbations of eIF5A adds robustness and depth to the biological conclusions. The results are compelling: across all constructs and conditions, ribosome density remains low, and initiation and elongation appear tightly coordinated, suggesting an intrinsic feedback mechanism in translational regulation. These findings challenge the classical view of translation initiation as the sole rate-limiting step and provide new insights into how cells may dynamically maintain translation efficiency and avoid ribosome collisions.

      We thank the reviewer for their constructive assessment of our work, and for recognizing the methodological innovation and experimental rigor of our study.

      Weaknesses:

      A limitation of the study is its reliance on exogenous reporter mRNAs in HeLa cells, which may not fully capture the complexity of endogenous translation regulation. While the authors acknowledge this, it remains unclear how generalizable the observed coupling is to native mRNAs or in different cellular contexts.

      We agree that the use of exogenous reporters is a limitation inherent to the SunTag system, for which there is currently no simple alternative for single-mRNA translation imaging. However, we believe our findings are likely generalizable for several reasons.

      As discussed in our introduction and discussion, there is growing mechanistic evidence in the literature for coupling between elongation (ribosome collisions) and initiation via pathways such as the GIGYF2-4EHP axis (Amaya et al. 2018, Hickey et al. 2020, Juszkiewicz et al. 2020), which might operate on both exogenous and endogenous mRNAs.

      As already acknowledged in our limitations section, our exogenous reporters may not fully recapitulate certain aspects of endogenous translation (e.g., ER-coupled collagen processing), yet the observed initiation-elongation coupling was robust across all tested constructs and conditions.

      We have now expanded the Discussion (L393-395) to cite complementary evidence from Dufourt et al. (2021), who used a CRISPR-based approach in Drosophila embryos to measure translation of endogenous genes. We also added a reference to Choi et al. 2025, who use an ER-specific SunTag reporter to visualize translation at the ER (L395-397).

      Additionally, the model assumes homogeneous elongation rates and does not explicitly account for ribosome pausing or collisions, which could affect inference accuracy, particularly in constructs designed to induce stalling. While the model is validated under low-density assumptions, more work may be needed to understand how deviations from these assumptions affect parameter estimates in real data.

      We agree with the reviewer that the assumption of homogeneous elongation rates is a simplification, and that our work represents a first step towards rigorous single-trace analysis of translation dynamics. We have explicitly tested the robustness of our model to violations of the low-density assumption through simulations (Figure 2 - figure supplement 2). These show that while parameter inference remains accurate at low ribosome densities, accuracy slightly deteriorates at higher densities, as expected. In fact, our experimental data do provide evidence for heterogeneous elongation: the waiting times between termination events deviate significantly from an exponential distribution (Figure 3 - figure supplement 2C), indicating the presence of ribosome stalling and/or bursting, consistent with the reviewer's concern. We acknowledge in the Limitations section (L402-406) that extending the model to explicitly capture transcript-dependent elongation rates and ribosome interactions remains challenging. The TASEP is difficult to solve analytically under these conditions, but we note that simulation-based inference approaches, such as particle filters to replace HMMs, could provide a path forward for future work to capture this complexity at the single-trace level.
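
      As an illustration of what a deviation from exponential waiting times can look like, the sketch below compares the coefficient of variation (CV) of simulated waiting times; the CV equals 1 for an exponential distribution. This is not the authors' analysis: the simulated data, the mixture used to mimic pausing, and all parameter values are assumptions, and the CV is only one of several possible diagnostics.

```python
import random
import statistics


def coefficient_of_variation(waiting_times):
    """Standard deviation divided by the mean (1 for an exponential distribution)."""
    return statistics.pstdev(waiting_times) / statistics.mean(waiting_times)


rng = random.Random(0)  # fixed seed so the example is reproducible

# Homogeneous case: waiting times from a Poisson process (mean 30 s, hypothetical).
poisson_waits = [rng.expovariate(1 / 30) for _ in range(20000)]

# Heterogeneous case (hypothetical): mostly short waits (mean 10 s) mixed with
# occasional long pauses (mean 200 s), mimicking stalling or bursting.
bursty_waits = [
    rng.expovariate(1 / 200) if rng.random() < 0.1 else rng.expovariate(1 / 10)
    for _ in range(20000)
]

cv_poisson = coefficient_of_variation(poisson_waits)
cv_bursty = coefficient_of_variation(bursty_waits)

print(f"CV, homogeneous: {cv_poisson:.2f}")
print(f"CV, heterogeneous: {cv_bursty:.2f}")
```

      A CV well above 1, as in the mixed case, indicates more variability than a single-rate process would produce.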

      Furthermore, although the study observes translation "bursting" behavior, this is not explicitly modeled. Given the growing recognition of translational bursting as a regulatory feature, incorporating or quantifying this behavior more rigorously could strengthen the work's impact.

      While we do not explicitly model the bursting dynamics in the HMM framework, we have quantified bursting behavior directly from the data. Specifically, we measure the duration of translated (ON) and untranslated (OFF) periods across all reporters and conditions (Figure 1G for control conditions and Figure 4G-H for perturbed conditions), finding that active translation typically lasts 10-15 minutes interspersed with shorter silent periods of 5-10 minutes. This empirical characterization demonstrates that bursting is a consistent feature of translation across our experimental conditions. The average duration of silent periods is similar to what was inferred by Livingston et al. 2023 for a similar SunTag reporter, whereas the average duration of active periods is substantially shorter (~15 min instead of ~40 min), which is consistent with the shorter trace duration in our system compared to theirs (~15 min compared to ~80 min, on average). Incorporating an explicit two-state or multi-state bursting model into the TASEP-HMM framework would indeed be computationally intensive and represents an important direction for future work, as it would enable inference of switching rates alongside initiation and elongation parameters. We have added this point to the Discussion (L415-417).
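
      The measurement of ON and OFF period durations described above can be sketched as a run-length calculation on a binarized trace. This is a minimal illustration, not the authors' pipeline: the function name, the 0.5 min frame interval, and the example trace are hypothetical, and the step that converts fluorescence intensity into a translating/non-translating call is assumed to have been done already.

```python
from itertools import groupby


def on_off_durations(is_translating, frame_interval_min):
    """Return (on_durations, off_durations) in minutes from a binary trace."""
    on, off = [], []
    for state, run in groupby(is_translating):
        # Length of each run of identical states, converted to minutes.
        duration = sum(1 for _ in run) * frame_interval_min
        (on if state else off).append(duration)
    return on, off


# Hypothetical trace sampled every 0.5 min: True = translation signal detected.
trace = [False] * 4 + [True] * 20 + [False] * 10 + [True] * 30 + [False] * 6
on_durations, off_durations = on_off_durations(trace, frame_interval_min=0.5)

print(on_durations)
print(off_durations)
```

      Runs that touch the start or end of a trace are truncated by the observation window, which is one reason shorter traces would be expected to yield shorter apparent ON periods.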

      Assessment of Goals and Conclusions:

      The authors successfully achieve their stated aims: they quantify translation initiation and elongation at the single-mRNA level and show that these processes are dynamically coupled to maintain low ribosome density. The modeling framework is well suited to this task, and the conclusions are supported by multiple lines of evidence, including inferred kinetic parameters, independent ribosome counts, and consistent behavior under perturbation.

      Impact and Utility:

      This work makes a significant conceptual and technical contribution to the field of translation biology. The modeling framework developed here opens the door to more detailed and quantitative studies of ribosome dynamics on single mRNAs and could be adapted to other imaging systems or perturbations. The discovery of initiation-elongation coupling as a general feature of translation in mammalian cells will likely influence how researchers think about translational regulation under homeostatic and stress conditions.

      The data, models, and tools developed in this study will be of broad utility to the community, particularly for researchers studying translation dynamics, ribosome behavior, or the effects of codon usage and mRNA structure on protein synthesis.

      Context and Interpretation:

      This study contributes to a growing body of evidence that translation is not merely controlled at initiation but involves feedback between elongation and initiation. It supports the emerging view that ribosome collisions, stalling, and quality control pathways play active roles in regulating initiation rates in cis. The findings are consistent with recent studies in yeast and metazoans showing translation initiation repression following stalling events. However, the mechanistic details of this feedback remain incompletely understood and merit further investigation, particularly in physiological or stress contexts. 

      In summary, this is a thoughtfully executed and timely study that provides valuable insights into the dynamic regulation of translation and introduces a modeling framework with broad applicability. It will be of interest to a wide audience in molecular biology, systems biology, and quantitative imaging.

      We appreciate the reviewer's thorough and positive assessment of our work, and that they recognize both the technical innovation of our modeling framework and its potential broad utility to the translation biology community. We agree that further mechanistic investigation of initiation-elongation feedback under various physiological contexts represents an important direction for future research.

      Reviewer #2 (Public review):

      Summary:

      This manuscript uses single-molecule run-off experiments and TASEP/HMM models to estimate biophysical parameters, i.e., ribosomal initiation and elongation rates. Combining inferred initiation and elongation rates, the authors quantify ribosomal density. TASEP modeling was used to simulate the mechanistic dynamics of ribosomal translation, and the HMM is used to link ribosomal dynamics to microscope intensity measurements. The authors' main conclusions and findings are:

      (1) Ribosomal elongation rates and initiation rates are strongly coordinated.

      (2) Elongation rates were estimated between 1-4.5 aa/sec. Initiation rates were estimated between 0.5-2.5 events/min. These values agree with previously reported values.

      (3) Ribosomal density was determined below 12% for all constructs and conditions.

      (4) eIF5A-perturbations (KO and GC7 inhibition) resulted in non-significant changes in translational bursting and ribosome density.

      (5) eIF5A perturbations resulted in increases in elongation and decreases in initiation rates.

      Strengths:

      This manuscript presents an interesting scientific hypothesis to study ribosome initiation and elongation concurrently. This topic is highly relevant for the field. The manuscript presents a novel quantitative methodology to estimate ribosomal initiation rates from Harringtonine run-off assays. This is relevant because run-off assays have been used to estimate, exclusively, elongation rates.

      We thank the reviewer for their careful evaluation of our work and for recognizing the novelty of our quantitative methodology to extract both initiation and elongation rates from harringtonine run-off assays, extending beyond the traditional use of these experiments.

      Weaknesses:

      The conclusion of the strong coordination between initiation and elongation rates is interesting, but some results are unexpected, and further experimental validation is needed to ensure this coordination is valid. 

      We agree that some of our findings need further experimental investigation in future studies. However, we believe that the coordination between initiation and elongation is supported by multiple results in our current work: (1) the strong correlation observed across all reporters and conditions (Figure 3E), and (2) the consistent maintenance of low ribosome density despite varying elongation rates. While additional experimental validation would be valuable, we note that directly manipulating initiation or elongation independently in mammalian cells remains technically challenging. Nevertheless, our findings are consistent with emerging mechanistic understanding of collision-sensing pathways (GIGYF2-4EHP) that could mediate such coupling, as discussed in our manuscript.

      (1) eIF5A perturbations resulted in a non-significant effect on the fraction of translating mRNA, translation duration, and bursting periods. Given the central role of eIF5A, I would have expected a different outcome. I would recommend that the authors expand the discussion and review more literature to justify these findings.

      We appreciate this comment. This finding is indeed discussed in detail in our manuscript (Discussion, paragraphs 6-7). As we note there, while eIF5A plays a critical role in elongation, the maintenance of bursting dynamics and ribosome density upon perturbation can be explained by compensatory feedback mechanisms. Specifically, a coordinated decrease in initiation rates counterbalances slower elongation to maintain homeostatic ribosome density. We also discuss several factors that complicate interpretation: (1) potential RQC-mediated degradation masking stronger effects in proline-rich constructs, (2) differences between GC7 treatment and genetic knockout suggesting altered stalling resolution kinetics, and (3) the limitations of using exogenous reporters that lack ER-coupled processing, which may be critical for eIF5A function in endogenous collagen translation (as suggested by Rossi et al., 2014; Mandal et al., 2016; Barba-Aliaga et al., 2021). The mechanistic complexity and tissue-specific nature of eIF5A function in mammals, which differs substantially from the better-characterized yeast system, likely contribute to the nuanced phenotype we observe. We believe our Discussion adequately addresses these points.

      (2) The AAG construct leading to slow elongation is very surprising. It is the opposite of the field consensus, where codon-optimized gene sequences are expected to elongate faster. More information about each construct should be provided. I would recommend more bioinformatic analysis on this, for example, calculating CAI for all constructs, or predicting the structures of the proteins.

      We agree that the slow elongation of the AAG construct is counterintuitive and indeed surprising. Following the reviewer's suggestion, we have now calculated the Codon Adaptation Index (CAI) for all constructs (Renilla 0.89, Col1a1 0.78, Col1a1 mutated 0.74). It is therefore unlikely that codon bias explains the slow translation, particularly since we designed the mutated Col1a1 construct with alanine codons selected to respect human codon usage bias, thereby minimizing changes in codon optimality. As we discuss in the manuscript, we hypothesize that the proline-to-alanine substitutions disrupted co-translational folding of the collagen-derived sequence. Prolines are critical for collagen triple-helix formation (Shoulders and Raines, 2009), and their replacement with alanines likely generates misfolded intermediates that cause ribosome stalling (Barba-Aliaga et al., 2021; Komar et al., 2024). This interpretation is supported by the high frequency (>30%) of incomplete run-off traces for AAG, suggesting persistent stalling events. Our findings thus illustrate an important potential caveat: "optimizing" a sequence based solely on codon usage can be detrimental when it disrupts functionally important structural features or co-translational folding pathways.

      This highlights that elongation rates depend not only on codon optimality but also on the interplay between nascent chain properties and ribosome progression.
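
      For readers unfamiliar with the metric, the Codon Adaptation Index mentioned above is the geometric mean of per-codon "relative adaptiveness" weights, where each weight is a codon's frequency in a reference set divided by the frequency of the most used synonymous codon. The following is a minimal sketch of that calculation, not the authors' script: the function names, the two synonymous groups, and the reference counts are hypothetical toy values chosen for illustration and are not real human codon usage.

```python
import math

def relative_adaptiveness(codon_counts, synonymous_groups):
    """Weight of each codon = its count / count of the most used synonymous codon."""
    weights = {}
    for codons in synonymous_groups:
        top = max(codon_counts.get(c, 0) for c in codons)
        if top == 0:
            continue  # no reference data for this amino acid
        for c in codons:
            weights[c] = codon_counts.get(c, 0) / top
    return weights

def cai(cds, weights):
    """Geometric mean of the weights over all codons of `cds` that have a weight."""
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))

# Hypothetical reference counts (toy numbers, NOT real codon usage tables).
groups = [("GCT", "GCC", "GCA", "GCG"),   # alanine
          ("CCT", "CCC", "CCA", "CCG")]   # proline
counts = {"GCT": 26, "GCC": 40, "GCA": 23, "GCG": 10,
          "CCT": 28, "CCC": 32, "CCA": 27, "CCG": 11}
w = relative_adaptiveness(counts, groups)

optimal = cai("GCCGCCCCC", w)   # only the most frequent codons of each group
mixed = cai("GCCGCG", w)        # one optimal and one rare alanine codon
```

      In this toy setting a sequence built only from the most frequent codons scores 1, and swapping in rarer synonymous codons lowers the score, which is why similar CAI values across constructs argue against codon bias as the explanation.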

      (3) The authors should consider using their methodology to study the effects of modifying the 5'UTR, resulting in changes in initiation rate and bursting, such as previously shown in reference Livingston et al., 2023. This may be outside of the scope of this project, but the authors could add this as a future direction and discuss if this may corroborate their conclusions. 

      We thank the reviewer for this excellent suggestion. We agree that applying our methodology to 5'-UTR variants would provide a complementary test of initiation-elongation coupling, and we have now added this as a future direction in the Discussion (L417-420).

      (4) The mathematical model and parameter inference routines are central to the conclusions of this manuscript. In order to support reproducibility, the computational code should be made available and well-documented, with a requirements file indicating the dependencies and their versions. 

      We have added the Github link in the manuscript (https://github.com/naef-lab/suntag-analysis) and have also deposited the data (.ome.tif) on Zenodo (https://zenodo.org/records/17669332).

      Reviewer #3 (Public review):

      Disclaimer:

      My expertise is in live single-molecule imaging of RNA and transcription, as well as associated data analysis and modeling. While this aligns well with the technical aspects of the manuscript, my background in translation is more limited, and I am not best positioned to assess the novelty of the biological conclusions.

      Summary:

      This study combines live-cell imaging of nascent proteins on single mRNAs with time-series analysis to investigate the kinetics of mRNA translation.

      The authors (i) used a calibration method for estimating absolute ribosome counts, and (ii) developed a new Bayesian approach to infer ribosome counts over time from run-off experiments, enabling estimation of elongation rates and ribosome density across conditions.

      They report (i) translational bursting at the single-mRNA level, (ii) low ribosome density (~10% occupancy ± a few percent), (iii) that ribosome density is minimally affected by perturbations of elongation (using a drug and/or different coding sequences in the reporter), suggesting a homeostatic mechanism potentially involving a feedback of elongation onto initiation, although (iv) this coupling breaks down upon knockout of elongation factor eIF5A.

      Strengths:

      (1) The manuscript is well written, and the conclusions are, in general, appropriately cautious (besides the few improvements I suggest below).

      (2) The time-series inference method is interesting and promising for broader applications. 

      (3) Simulations provide convincing support for the modeling (though some improvements are possible). 

      (4) The reported homeostatic effect on ribosome density is surprising and carefully validated with multiple perturbations.

      (5) Imaging quality and corrections (e.g., flat-fielding, laser power measurements) are robust.

      (6) Mathematical modeling is clearly described and precise; a few clarifications could improve it further.

      We thank the reviewer for recognizing the novelty of the approach and its rigour, and for providing suggestions to improve it further.

      Weaknesses:

      (1) The absolute quantification of ribosome numbers (via the measurement of $i_{MP}$) should be improved. This only affects the finding that ribosome density is low, not that it appears to be under homeostatic control. However, if $i_{MP}$ turns out to be substantially overestimated (hence ribosome density underestimated), then "ribosomes queuing up to the initiation site and physically blocking initiation" could become a relevant hypothesis. In my detailed recommendations to the authors, I list points that need clarification in their quantifications and suggest an independent validation experiment (measuring the intensity of an object with a known number of GFP molecules, e.g., MS2-GFP-labeled RNAs, or individual GEMs).

      We agree with the reviewer that the estimation of the number of ribosomes is central to our finding that translation happens at low density on our reporters. This result derives from our measurement of the intensity of one mature protein (i<sub>MP</sub>), which we obtained by using a SunTag reporter with an RH1 domain at the C-terminus of the mature protein, allowing us to stabilise mature proteins via actin-tethering. In addition, in line with the reviewer's suggestion, we had already validated this result with an independent estimate of the mature protein intensity (Figure 5 - figure supplement 2B), which was obtained by adding the mature protein intensity directly as a free parameter of the HMM. The inferred value of mature protein intensity for each construct (10-15 a.u.) was remarkably close to the experimental calibration result (14 ± 2 a.u.). Therefore, we have confidence that our absolute quantification of ribosome numbers is accurate.

      (2) The proposed initiation-elongation coupling is plausible, but alternative explanations, such as changes in abortive elongation frequency, should be considered more carefully. The authors mention this possibility, but should test or rule it out quantitatively. 

      We thank the reviewer for the comment, but we consider that ruling out alternative explanations through new perturbation experiments is beyond the scope of the present work.

      (3) The observation of translational bursting is presented as novel, but similar findings were reported by Livingston et al. (2023) using a similar SunTag-MS2 system. This prior work should be acknowledged, and the added value of the current approach clarified.

      We did cite Livingston et al. (2023) in several places, but we recognize that a few additional citations in key places would make clear that the observation of bursting is not novel but agrees with previous results. We have now added these in the Results and Discussion sections.

      (4) It is unclear what the single-mRNA nature of the inference method is bringing since it is only used here to report _average_ ribosome elongation rate and density (averaged across mRNAs and across time during the run-off experiments - although the method, in principle, has the power to resolve these two aspects).

      While decoding individual traces, our model infers shared (population-level) rates. Inferring transcript-specific parameters would be more informative, but it is highly challenging due to the uncertainty in the initial ribosome distribution on single transcripts. Pooling multiple transcripts together allows us to use some assumptions on the initial distribution and infer average elongation and initiation-rate parameters, while revealing substantial mRNA-to-mRNA variability in the posterior decoding (e.g. Figure 3 - figure supplement 2C). Indeed, the inference still informs on the single-trace run-off time distribution (Figure 3A) and the waiting time between termination events (Figure 3 - figure supplement 2C), suggesting the presence of stalling and bursting. In addition, the transcript-to-transcript heterogeneity is likely accounted for by our model better than by previous methods (linear fit of the average run-off intensity), as suggested by their comparison (Figure 3 - figure supplement 2A). In the future, the model could be refined by introducing transcript-specific parameters, possibly in a hierarchical way, alongside shared parameters.

      (5) I did not find any statement about data availability. The data should be made available. Their absence limits the ability to fully assess and reproduce the findings.

      We have added the Github link in the manuscript (https://github.com/naef-lab/suntag-analysis) and have also deposited the data (.ome.tif) on Zenodo (https://zenodo.org/records/17669332).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors): 

      Major Comments:

      (1) Lack of Explicit Bursting Model

      Although translation "bursts" are observed, the current framework does not explicitly model initiation as a stochastic ON/OFF process. This limits insight into regulatory mechanisms controlling burst frequency or duration. The authors should either incorporate a two-state/more-state (bursting) model of initiation or perform statistical analysis (e.g., dwell-time distributions) to quantify bursting dynamics. They should clarify how bursting influences the interpretation of initiation rate estimates.

      We agree with the reviewer that an explicit bursting model (e.g., a two-state telegraph model) would be the ideal theoretical framework. However, integrating such a model into the TASEP-HMM inference framework is computationally intensive and complex. As a robust first step, we have opted to quantify bursting empirically based on the decoded single-mRNA traces. As shown in Figure 1G (control) and Figure 4G (perturbed conditions), we explicitly measured the duration of "ON" (translated) and "OFF" (untranslated) periods. This statistical analysis provides a quantitative description of the bursting dynamics without relying on the specific assumptions of a telegraph model. We have clarified this in the text (L123-125) and, as suggested, added a discussion (L415-417) on the potential extensions of the model to include explicit switching kinetics in the Outlook section.
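
      The empirical ON/OFF quantification described in this response amounts to a run-length analysis of a binarized trace. The sketch below illustrates that idea under stated assumptions: the trace has already been classified frame by frame as translated or untranslated (how that call is made is not specified here), the function name and the 1-minute frame interval are hypothetical, and runs touching the start or end of the trace are discarded because their true duration is unknown.

```python
from itertools import groupby

def dwell_times(is_on, frame_interval_min):
    """Return (on_durations, off_durations) in minutes for one binarized trace.

    The first and last runs are dropped: they are censored by the start and
    end of the recording, so their full duration was not observed.
    """
    runs = [(bool(state), sum(1 for _ in grp)) for state, grp in groupby(is_on)]
    interior = runs[1:-1]
    on = [n * frame_interval_min for state, n in interior if state]
    off = [n * frame_interval_min for state, n in interior if not state]
    return on, off

# Hypothetical trace: 1 = translated (ON), 0 = untranslated (OFF).
trace = [0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0]
on_durations, off_durations = dwell_times(trace, frame_interval_min=1.0)
```

      Pooling such durations over many traces gives the dwell-time distributions from which average ON and OFF periods can be read off without committing to a telegraph model.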

      (2) Assumption of Uniform Elongation Rates

      The model assumes homogeneous elongation across coding sequences, which may not hold for stalling-prone inserts (e.g., PPG). This simplification could bias inference, particularly in cases of sequence-specific pausing. The authors should add simulations or a sensitivity analysis to assess how non-uniform elongation affects the accuracy of inferred parameters. They should also explicitly discuss how ribosome stalling, collisions, or heterogeneity might skew model outputs (see point 4).

      A strong stalling sequence that affects all ribosomes equally should not deteriorate the inference of the initiation rate, provided that the low-density assumption holds. The scenario where stalling events lead to higher density, and thus increased ribosome-ribosome interactions, is comparable to the conditions explored in Figure 2E. In those simulations, we tested the inference on data generated with varying initiation and elongation rates, resulting in ribosome densities ranging from low to high. We demonstrated that the inference remains robust at low ribosome densities (<10%). At higher densities, the accuracy of the initiation rate estimate decreases, whereas the elongation rate estimate remains comparatively robust. Additionally, the model tends to overestimate ribosome density under high-density conditions, likely because it neglects ribosome interference at the initiation site (Figure 2 - figure supplement 2C). We agree that a deeper investigation into the consequences of stochastic stalling and bursting would be beneficial, and we have explicitly acknowledged this in the Limitations section.
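
      To make the low-density regime concrete, the fraction of the coding sequence covered by ribosomes can be estimated from the two inferred rates. The sketch below assumes the standard low-density TASEP approximation in which ribosomes enter at rate α and move at speed λ without interfering, so that the coverage is roughly αℓ/λ. The function name and the footprint of 10 codons are assumptions made for illustration; the manuscript's own value of ℓ may differ.

```python
def ribosome_density(alpha_per_min, lambda_aa_per_s, footprint_codons=10.0):
    """Approximate fraction of the CDS covered by ribosomes at low density.

    alpha_per_min    : initiation rate (events per minute)
    lambda_aa_per_s  : elongation rate (codons per second)
    footprint_codons : codons occupied by one ribosome (assumed value)
    """
    alpha_per_s = alpha_per_min / 60.0
    ribosomes_per_codon = alpha_per_s / lambda_aa_per_s  # valid only without collisions
    return ribosomes_per_codon * footprint_codons

# Mid-range values from the rates quoted in the reviews (1.5 events/min, 3 aa/s).
density = ribosome_density(1.5, 3.0)
```

      With these mid-range values the coverage is about 8%, in the low-density range discussed above; slower elongation at fixed initiation raises it, which is the regime where the approximation, and according to the response the inference, becomes less reliable.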

      (3) Interpretation of eIF5A Knockout Phenotype

      The observation that eIF5A KO reduces initiation more than elongation, leading to decreased ribosome density, is biologically intriguing. However, the explanation invoking altered RQC kinetics is speculative and not directly tested. The authors should consider validating the RQC hypothesis by monitoring reporter mRNA stability, ribosome collision markers, or translation termination intermediates.

      We thank the reviewer for the comment, but we consider that ruling out alternative explanations through new experiments is beyond the scope of the present work.

      (4) To strengthen the manuscript, the authors should incorporate insights from three studies.

      - Livingston et al. (PMC10330622) found that translation occurs in bursts, influenced by mRNA features and initiation factors, supporting the coupling of initiation and elongation.

      - Madern et al. (PMID: 39892379) demonstrated that ribosome cooperativity enhances translational efficiency, highlighting coordinated ribosome behavior.

      - Dufourt et al. (PMID: 33927056) observed that high initiation rates correlate with high elongation rates, suggesting a conserved mechanism across cell cultures and organisms.

      Integrating these studies could enrich the manuscript's interpretation and stimulate new avenues of thought.

      We thank the reviewer for the valuable comment. We added citations of Livingston et al. in the context of translational bursting. We already cited Madern et al. in multiple places and, although its observations of ribosome cooperativity are very compelling, they cannot be linked with our observations of a feedback between initiation and elongation, and it would be very challenging to see a similar effect on our reporters. This is why we did not expressly discuss cooperativity. We also integrated Dufourt et al. in the Discussion about the possibility of designing a genetically encoded reporter. We also added a sentence about the possibility of using an ER-specific SunTag reporter, as done recently in Choi et al., Nature (2025) (https://doi.org/10.1038/s41586-025-09718-0).

      Minor Comments:

      (1) Use consistent naming for SunTag reporters (e.g., "PPG" vs "proline-rich") throughout.

      Thank you for the comment. However, the term proline-rich always appears together with PPG, so we believe that the naming is clear and consistent.

      (2) Consider a schematic overview of the experimental design and modeling pipeline for accessibility.

      Thank you for the suggestion. We consider that the experimental design and modeling are now described clearly enough that an additional schematic is not needed.

      (3) Clarify how incomplete run-off traces are handled in the HMM inference.

      Incomplete run-off traces are treated identically to complete traces in our HMM inference. This is possible because our model relies on the probability of transitions occurring per time step to infer rates. It does not require observing the final "empty" state to estimate the kinetic parameters α and λ. The loss of signal (e.g., mRNA moving out of the focal volume or photobleaching) does not invalidate the kinetic information contained in the portion of the trace that was observed. We have clarified this in the Methods section.
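
      The reason a truncated trace remains usable can be seen in a generic HMM forward pass: the likelihood is accumulated one time step at a time and summed over whatever hidden state the trace ends in. The following toy example is a sketch of that principle only, not the authors' TASEP-HMM. The three hidden states, the transition and emission matrices, and the discretized observations are all hypothetical.

```python
import numpy as np

def forward_loglik(obs, init, trans, emit):
    """Log-likelihood of a discrete observation sequence (scaled forward pass).

    The final step sums over all hidden states, so the trace may stop in any
    state; the 'empty' state never needs to be observed.
    """
    alpha = init * emit[:, obs[0]]
    loglik = 0.0
    for o in obs[1:]:
        c = alpha.sum()
        loglik += np.log(c)
        alpha = ((alpha / c) @ trans) * emit[:, o]
    return loglik + np.log(alpha.sum())

# Hypothetical toy model: hidden state = ribosomes left (2, 1, 0), run-off only.
init = np.array([1.0, 0.0, 0.0])
trans = np.array([[0.8, 0.2, 0.0],
                  [0.0, 0.8, 0.2],
                  [0.0, 0.0, 1.0]])
# Rows: hidden states; columns: intensity bins (0 = dark, 1 = dim, 2 = bright).
emit = np.array([[0.05, 0.15, 0.80],
                 [0.10, 0.80, 0.10],
                 [0.90, 0.08, 0.02]])

complete = [2, 2, 1, 1, 0, 0]   # trace observed until the signal is gone
incomplete = complete[:3]       # same trace, lost before run-off finished

ll_complete = forward_loglik(complete, init, trans, emit)
ll_incomplete = forward_loglik(incomplete, init, trans, emit)
```

      Both calls go through exactly the same code path, so complete and incomplete traces can be pooled in one likelihood; the shorter trace simply contributes fewer transition steps.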

      Reviewer #2 (Recommendations for the authors):

      (1) Reproducibility:

      (1.1) The authors should use a GitHub repository with a timestamp for the release version.

      The code is available on GitHub (https://github.com/naef-lab/suntag-analysis).

      (1.2) Make raw images and data available in a figure repository like Figshare.

      The raw images (.ome.tif) are now available on Zenodo (https://zenodo.org/records/17669332).

      (2) Paper reorganization and expansion of the intensity and ribosome quantification:

      (2.1) Given the relevance of the initiation and elongation rates for the conclusions of this study, and the fact that the authors inferred these rates from the spot intensities, I recommend that the authors move Figure 1 Supplement 2 to the main text and expand the description of the process to relate spot intensity and number of ribosomes. Please also expand the figure caption for this image.

      We agree with the importance of this validation. We have expanded the description of the calibration experiment in the main text and in the figure caption.

      (2.2) I suggest the authors explicitly mention the use of HMM in the abstract.

      We have now explicitly mentioned the TASEP-based HMM in the abstract.

      (2.3) In line 492, please add the frame rate used to acquire the images for the run-off assays.

      We have added the specific frame rate (one frame every 20 seconds) to the relevant section.

      (3) Figures and captions:

      (3.1) Figure 1, Supplement 2. Please add a description of the colors used in plots B, C. 

      We have expanded the caption and added the color description.

      (3.2) In the Figure 2 caption. It is not clear what the authors mean by "traceseLife". Please ensure it is not a typo.

      Thank you for spotting this. We have corrected the typo.

      (3.3) Figure 1 A, in the cartoon N(alpha)->N-1, shouldn't the transition also depend on lambda?

      The transition probability was explicitly derived in the “Bayesian modeling of run-off traces” section (Eqs. 17-18), and does not depend on λ, but only on the initiation rate under the low-density assumption.

      (3.4) Figure 3, Supplement 2. "presence of bursting and stalling.." has a typo.

      Corrected.

      (3.5) Figure 5, panel C, the y-axis label should be "run-off time (min)."

      Corrected.

      (3.6) For most figures, add significance bars.

      (3.7) In the figure captions, please add the total number of cells used for each condition.

      We have systematically indicated the number of traces (n<sub>t</sub>) and the number of independent experiments (n<sub>e</sub>) in the captions in this format (n<sub>t</sub>, n<sub>e</sub>).

      (4) Mathematical Methods:

      We greatly thank the reviewer for their detailed attention to the mathematical notation. We have addressed all points below.

      (4.1) In lines 555, Materials and Methods, subsection, Quantification of Intensity Traces, multiple equations are not numbered. For example, after Equation (4), no numbers are provided for the rest of the equations. Please keep consistency throughout the whole document.

      We have ensured that all equations are now consistently numbered throughout the document.

      (4.2) In line 588, the authors mention "$X$ is a standard normal random variable with mean $\mu$ and standard deviation $s_0$". Please ensure this is correct. A standard normal random variable has a 0 mean and std 1. 

      Thank you for the suggestion. We have corrected the text (L678).

      (4.3) Line 546, Equation 2. The authors use mu(x,y) to describe a 2d Gaussian function. But later in line 587, the authors reuse the same variable name in equation 5 to redefine the intensity as mu = b_0 + I.

      We have renamed the 2D Gaussian function to $\mu_{2D}(x,y)$ in the spot tracking section.

      (4.4) For the complete document, it could be beneficial to the reader if the authors expand the definition of the relationship between the signal "y" and the spot intensity "I". Please note how the paragraph in lines 582-587 does not properly introduce "y".

      We have added an explicit definition of y and its relationship to the underlying spot intensity I in the text to improve readability and clarity.

      (4.5) Please ensure consistency in variable names. For example, "I" is used in line 587 for the experimental spot intensity, then line 763 redefines I(t) as the total intensity obtained from the TASEP model; please use "I_sim(t)" for simulated intensities. Please note that reusing the variable "I" for different contexts makes it hard for the reader to follow the text. 

      We agree that this was confusing. We have implemented the suggestion and now distinguish simulated intensities using the notation I<sub>S</sub>.

      (4.6) Line 555 "The prior on the total intensity I is an "uninformative" prior" I ~ half_normal(1000). Please ensure it is not "I_0 ~ half_normal(1000)."? 

      We confirm that “I” is the correct variable representing the total intensity in this context; we do not use an “I<sub>0</sub>” variable here.

      (4.7) In lines 595, equation 6. Ensure that the equation is correct. Shouldn't it be: s_0^2 = ln ( 1 + (sigma_meas^2 / ⟨y⟩^2) )? Please ensure that this is correct and it is not affecting the calculated values given in lines 598.

      Thank you for catching this typo. We have corrected the equation in the manuscript. We confirm that the calculations performed in the code used the correct formula, so the reported values remain unchanged.
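
      The form proposed by the reviewer is the usual moment-matching relation for a log-normal variable. As a brief check, assuming the signal $y$ is log-normally distributed with log-scale mean $m$ and log-scale standard deviation $s_0$ (the symbol $m$ is introduced here only for this illustration, and whether the manuscript's noise model is exactly of this form is an assumption):

```latex
\begin{align}
\langle y \rangle &= e^{\,m + s_0^2/2}, \\
\sigma_{\mathrm{meas}}^2 &= \left(e^{s_0^2} - 1\right) e^{\,2m + s_0^2}
  = \left(e^{s_0^2} - 1\right) \langle y \rangle^2, \\
s_0^2 &= \ln\!\left(1 + \frac{\sigma_{\mathrm{meas}}^2}{\langle y \rangle^2}\right).
\end{align}
```

      The last line follows by dividing the variance by the squared mean, which eliminates $m$, and then solving for $s_0^2$.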

      (4.8) In line 597, "the mean intensity square ^2". Please ensure it is not "the square of the temporal mean intensity."

      We have corrected the text to "the square of the temporal mean intensity."

      (4.9) In lines 602-619, Bayesian modeling of run-off traces, please ensure to introduce the constant "\ell". Used to define the ribosomal footprint?

      We have added the explicit definition of 𝓁 as the ribosome footprint size (length of transcript occupied by one ribosome) in the "Bayesian modeling of run-off traces" section.

      (4.10) Line 687 has a minor typo "[...] ribosome distribution.. Then, [...]"

      We have corrected the punctuation.

      (4.11) In line 678, Equation 19 introduces the constant "L_S", Please ensure that it is defined in the text.

      We have added the explicit definition of L<sub>S</sub> (the length of the SunTag) to the text surrounding Equation 19.

      (4.12) In line 695, Equation 22, please consider using a subscript to differentiate the variance due to ribosome configuration. For example, instead of "sigma (...)^2" use something like "sigma_c ^2 (...)". Ensure that this change is correctly applied to Equation 24 and all other affected equations.

      Thank you, we have implemented the suggestions.

      (4.13) In line 696, please double-check equations 26 and 27. Specifically, the denominator ^2. Given the previous text, it is hard to follow the meaning of this variable. 

      We have revised the notation in Equations 26 and 27 to ensure the denominator is consistent with the definitions provided in the text.

      (4.14) In lines 726, the authors mention "[...], but for the purposes of this dissertation [...]", it should be "[...], but for the purposes of this study [...]"

      Thank you for spotting this. We have replaced "dissertation" with "study."

      (4.15) Equations 5, 28, 37, and the unnumbered equation between Equations 16 and 17 are similar, but in some, "y" does not explicitly depend on time. Please ensure this is correct. 

      We have verified these equations and believe they are correct.

      (4.16) Please review the complete document and ensure that variables and constants used in the equations are defined in the text. Please ensure that the same variable names are not reused for different concepts. To improve readability and flow in the text, please review the complete Materials and Methods sections and evaluate if the modeling section can be written more clearly and concisely. For example, Equation 28 is repeated in the text.

      We have performed a comprehensive review of the Materials and Methods section. To improve conciseness and flow, we have merged the subsection “Observation model and estimation of observation parameters” with the “Bayesian modeling of run-off traces” section. This allowed us to remove redundant definitions and repeated equations (such as the previous Equation 28). We have also checked that all variables and constants are defined upon first use and that variable names remain consistent throughout the manuscript.

      Reviewer #3 (Recommendations for the authors):

      (1) Data Presentation

      (1.1) In main Figures 1D and 4E, the traces appear to show frequent on-off-on transitions ("bursting"), but in supplementary figures (1-S1A and 4-S1A), this behavior is seen in only ~8 of 54 traces. Are the main figure examples truly representative?

      We acknowledge the reviewer's point. In Figure 1D, we selected some of the longest and most illustrative traces to highlight the bursting dynamics. We agree that the term "representative" might be misleading if interpreted as "average." We have updated the text to state "we show bursting traces" to more accurately reflect the selection.

      (1.2) There are 8 videos, but I could not identify which is which.

      Thank you for pointing this out. We have renamed the video files to clearly correspond to the figures and conditions they represent.

      (2) Data Availability:

      As noted above, the data should be shared. This is in accordance with eLife's policy: "Authors must make all original data used to support the claims of the paper, or that are required to reproduce them, available in the manuscript text, tables, figures or supplementary materials, or at a trusted digital repository (the latter is recommended). [...] eLife considers works to be published when they are posted as preprints, and expects preprints we review to meet the standards outlined here." Access to the time traces would have been helpful for reviewers.

      We have now added the GitHub link for the code (https://github.com/naef-lab/suntag-analysis) and deposited the raw data (.ome.tif files) on Zenodo (DOI: 10.5281/zenodo.17669332).

      (3) Model Assumptions:

      (3.1) The broad range of run-off times (Figure 3A) suggests stalling, which may be incompatible with the 'low-density' assumption used on the TASEP model, which essentially assumes that ribosomes do not bump into each other. This could impact the validity of the assumptions that ribosomes behave independently, elongate at constant speed (necessary for the continuum-limit approximation), and that the rate-limiting step is the initiation. How robust are the inferences to this assumption?

      We agree that the deviation of waiting times from an exponential distribution (Figure 3 - figure supplement 2C) suggests the presence of stalling, which challenges the strict low-density assumption and constant elongation speed. We explicitly explored the robustness of our model to higher ribosome densities in simulations. As shown in Figure 2 - figure supplement 2, while the model accuracy for single parameters deteriorates at very high densities (overestimating density due to neglected interference), it remains robust for estimating global rates in the regime relevant to our data. We have expanded the discussion on the limitations of the low density and homogeneous elongation rate assumptions in the text (L404-408).

      (3.2) Since all constructs share the same SunTag region, elongation rates should be identical there and diverge only in the variable region. This would affect $\gamma (t)$ and hence possibly affect the results. A brief discussion would be helpful.

      This is a valid point. Currently, our model infers a single average elongation rate that effectively averages the behavior over the SunTag and the variable CDS regions. Modeling distinct rates for these regions would be a valuable extension but adds significant complexity. While our current "effective rate" approach might underestimate the magnitude of differences between reporters, it captures the global kinetic trend. We have added a brief discussion acknowledging this simplification (L408-412).

      (3.3) A similar point applies to the Gillespie simulations: modeling the SunTag region with a shared elongation rate would be more accurate.

      We agree. Simulating distinct rates for the SunTag and CDS would increase realism, though our current homogeneous simulations serve primarily to benchmark the inference framework itself. We have noted this as a potential future improvement (L413-414).

      (3.4) Equation (13) assumes that switching between bursting and non-bursting states is much slower than the elongation time. First, this should be made explicit. Second, this is not quite true (~5 min elongation time on Figure 3-s2A vs ~5-15min switching times on Figure 1). It would be useful to show the intensity distribution at t=0 and compare it to the expected mixture distribution (i.e., a Poisson distribution + some extra 'N=0' cells). 

      We thank the reviewer for this insightful comment. We have added a sentence to the text explicitly stating the assumption that switching dynamics are slower than the translation time. While the timescales are indeed closer than ideal (5 min vs. 5-15 min), this assumption allows for a tractable approximation of the initial conditions for the run-off inference. Comparing the intensity distribution at t=0 to a zero-inflated Poisson distribution is an excellent suggestion for validation, which we will consider for future iterations of the model.
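      The mixture distribution the reviewer describes (a Poisson distribution plus extra 'N=0' cells) can be sketched as follows. This is a minimal illustration, not the authors' model: the function name, the off-state fraction `p_off`, and the mean ribosome number `lam` are hypothetical, and the numerical values are placeholders rather than estimates from the data.

```python
import math


def zero_inflated_poisson_pmf(n, lam, p_off):
    """P(N = n) when a fraction p_off of mRNAs carries no ribosomes
    and the remaining mRNAs carry a Poisson(lam) number of ribosomes."""
    if n < 0:
        return 0.0
    poisson = math.exp(-lam) * lam**n / math.factorial(n)
    extra_zero = p_off if n == 0 else 0.0
    return extra_zero + (1.0 - p_off) * poisson


# Hypothetical values, for illustration only (not fitted to any data).
lam, p_off = 3.0, 0.4
pmf = [zero_inflated_poisson_pmf(n, lam, p_off) for n in range(60)]
```

      Under these assumptions, a histogram of intensities at t=0, rescaled to units of single-ribosome intensity, could be compared against `pmf` to judge whether the slow-switching approximation is adequate.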

      (4) Microscopy Quantifications:

      (4.1) Figure 1-S2A shows variable scFv-GFP expression across cells. Were cells selected for uniform expression in the analysis? Or is the SunTag assumed to be saturated, which would then need to be demonstrated?

      All cell lines used are monoclonal, and cells were selected via FACS for consistent average cytoplasmic GFP signal. We assume the SunTag is saturated based on the established characterization of the system by Tanenbaum et al. (2014), where the high affinity of the scFv-GFP ensures saturation at expression levels similar to ours.

      (4.2) As translation proceeds, free scFv-GFP may become limiting due to the accumulation of mature SunTag-containing proteins. This would be difficult to detect (since mature proteins stay in the cytoplasm) and could affect intensity measurements (newly synthesized SunTag proteins getting dimmer over time).

      This effect can occur with very long induction times. To mitigate this, we optimized the Doxycycline (Dox) incubation time for our harringtonine experiments to prevent excessive accumulation of mature protein. We also monitor the cytoplasmic background for granularity, which would indicate aggregation or accumulation.

      (4.3) The statements "for some traces, the mRNA signal was lost before the run-off completion" (line 195) and "we observed relatively consistent fractions of translated transcripts and trace duration distributions across reporters" (line 340) should be supported by a supplementary figure.

      The first statement is supported by Figure 2 - figure supplement 1, which shows representative run-off traces for all constructs, including incomplete ones.

      The second statement regarding consistency is supported by the quantitative data in Figure 1E and G, which summarize the fraction of translated transcripts and trace durations across conditions.

      (4.4) Measurements of single mature protein intensity $i_{MP}$:

      (4.4.1) Since puromycin is used to disassemble elongating ribosomes, calibration may be biased by incomplete translation products (likely a substantial fraction, since the Dox induction is only 20min and RNAs need several minutes to be transcribed, exported, and then fully translated).

      As mentioned in the “Live-cell imaging” paragraph, the imaging takes place 40 min after the end of Dox incubation. This provides ample time for mRNA export and full translation of the synthesized proteins. Consequently, the fraction of incomplete products generated by the final puromycin addition is negligible compared to the pool of fully synthesized mature proteins accumulated during the preceding hour.

      (4.4.2) Line 519: "The intensity of each spot is averaged over the 100 frames". Do I understand correctly that you are looking at immobile proteins? What immobilizes these proteins? Are these small aggregates? It would be surprising that these aggregates have really only 1, 2, or 3 proteins, as suggested by Figure 1-S2A.

      We are visualizing mature proteins that are specifically tethered to the actin cytoskeleton. This is achieved using a reporter where the RH1 domain is fused directly to the C-terminus of the Renilla protein (SunTag-Renilla-RH1). The RH1 domain recruits the endogenous Myosin Va motor, which anchors the protein to actin filaments, rendering it immobile. Since each Myosin Va motor interacts with one RH1 domain (and thus one mature protein), the resulting spots represent individual immobilized proteins rather than aggregates. We have now revised the text and Methods section to make this calibration strategy and the construct design clearer (L130-140).

      (4.4.3) Estimating the average intensity $i_{MP}$ of single proteins rests entirely on seeing discrete modes in the histogram of Figure 1-S2B, which is not very convincing. A complementary experiment, measuring *on the same microscope* the intensity of an object with a known number of GFP molecules (e.g., MS2-GFP labeled RNAs, or individual GEMs https://doi.org/10.1016/j.cell.2018.05.042 (only requiring a single transfection)) would be reassuring to convince the reader that we are not off by an order of magnitude.

      While a complementary calibration experiment would be valuable, we believe our current estimate is robust because it is independently validated by our model. When we inferred $i_{MP}$ as a free parameter in the HMM (Figure 5 - figure supplement 2B), the resulting value (10-15 a.u.) was remarkably consistent with our experimental calibration (14 ± 2 a.u.). We have clarified this independent validation in the text to strengthen the confidence in our quantification (L264-272).

      (4.4.4) Further on the histogram in Figure 1-S2B:

      - The gap between the first two modes is unexpectedly sharp. Can you double-check? It means that we have a completely empty bin between two of the most populated bins.

      We have double-checked the data; the plot is correct, though the sharp gap is likely due to the small sample size (n=29).

      - I am surprised not to see 3 modes or more, given that panel A shows three levels of intensity (the three colors of the arrows).

      As noted below, brighter foci exist but fall outside the displayed range of the histogram.

      - It is unclear what the statistical test is and what it is supposed to demonstrate.

      The Student's t-test compares the means of the two identified populations to confirm they are statistically distinct intensity groups.

      - I count n = 29, not 31. (The sample is small enough that the bars of the histogram show clear discrete heights, proportional to 1, 2, 3, 4, and 5 --adding up all the counts, I get 29). Is there a mistake somewhere? Or are some points falling outside of the displayed x-range?

      You are correct. Two brighter data points fell outside the displayed range. The total number of foci in the histogram is 29. We have corrected the figure caption and the text accordingly.

      (5) Miscellaneous Points: 

      (5.1) Panel B in Figure 2-s1 appears to be missing.

      The figure contains only one panel.

      (5.2) In Equation (7), $l$ is not defined (presumably ribosome footprint length?). Instead, $J$ is defined right after eq (7), as if it were used in this equation.

      Thank you for pointing this out; we have corrected it.

      (5.3) Line 703, did you mean to write something else than "Equation 26" (since equation 26 is defined after)?

      Yes, this was a typo. We have corrected the cross-reference.

    1. Summary of the Associations Morning Briefing: Taxation, Patronage (Mécénat), and Endowment Funds

      Executive Summary

      This document summarizes the presentations given by the Île-de-France Regional Directorate of Public Finances (Direction Régionale des Finances Publiques, DRFIP) during a webinar on current tax issues for non-profit organizations (organismes sans but lucratif, OSBL).

      The tax management of associations and endowment funds (fonds de dotation) is marked by a growing search for legal certainty, illustrated by a steady rise in requests for tax rulings (rescrits fiscaux); nearly 50% of all requests concern the association sector.

      The key points to remember are the tightening of controls on the issuance of tax receipts following the law of 24 August 2021, the rigorous application of the non-profit criteria (the "4P" rule and disinterested management), and the mandatory distinction between patronage (mécénat) and commercial sponsorship (parrainage).

      Finally, the framework for endowment funds, although more flexible, imposes strict reporting obligations and a strict minimum endowment (€15,000).

      --------------------------------------------------------------------------------

      I. The DRFIP's Scope of Action and Legal Certainty

      The Île-de-France Regional Directorate of Public Finances, and in particular its tax audit and legal affairs division, is responsible for safeguarding tax expenditure.

      1. The growing use of tax rulings

      A tax ruling (rescrit) is a voluntary procedure that allows an organization to obtain a formal position from the tax administration on its tax regime.

      Statistics: In 2025, the DRFIP expects to process about 1,140 ruling requests, of which 493 specifically concern associations (about 43%).

      Purpose: To secure the issuance of tax receipts to donors and avoid later challenges during audits.

      Limits: A ruling protects the organization only if the information provided is complete and accurate. It does not prevent a later tax audit.

      2. Tighter controls (law of 24 August 2021)

      The law reinforcing respect for the principles of the Republic changed the nature of the controls:

      Before 2021: A simple check that the amounts matched.

      Since 2021: A substantive validity check. The administration verifies whether the organization is actually entitled to issue tax receipts under the general-interest criteria.

      --------------------------------------------------------------------------------

      II. Assessing For-Profit Status: Criteria and Methodology

      By default, an association is exempt from commercial taxes, based on a rebuttable presumption of non-profit status.

      The administration can nevertheless prove the contrary by following a step-by-step analysis.

      1. Disinterested management

      This is the essential precondition. It rests on three pillars:

      No remuneration of directors: Directors must be volunteers.

      A tolerance exists for remuneration not exceeding three-quarters of the statutory minimum wage (SMIC), assessed annually.

      No distribution of resources: No profit may be paid out to members.

      No allocation of shares of assets: Members may not appropriate the association's assets, even upon its dissolution.

      2. The competition test and the "4P" rule

      If an association operates in a competitive sector, the administration compares the way it is run with that of commercial businesses, using the set of indicators known as the "4Ps" (in decreasing order of importance):

      | Criterion | Analysis |
      | --- | --- |
      | Product (Produit) | The social utility of the service provided (e.g., methods adapted for "dys" learning disorders). |
      | Public | Is the service aimed at people who could not normally access it (social criteria)? |
      | Price (Prix) | Are prices clearly below market rates or adjusted according to income? |
      | Advertising (Publicité) | Does the association use commercial promotion methods or simply provide information? |

      3. The concept of community of interest

      An association may be deemed for-profit if it is an extension of a commercial business or provides that business with market outlets.

      "Audace" case law (2016): An association that served as a "client funnel" for a legal-assistance company run by the same person was reclassified as a for-profit organization.

      Privileged relationships: This concept applies when the association enables member businesses to reduce their costs (e.g., lower-cost market studies), thereby giving them a competitive advantage.

      --------------------------------------------------------------------------------

      III. The Rules Governing Patronage and Sponsorship

      The patronage (mécénat) scheme was liberalized by the law of December 2023 (in force since January 2024) but remains subject to strict definitions.

      1. General interest in the tax sense

      General interest in the tax sense (Articles 200 and 238 bis of the French General Tax Code, CGI) differs from the everyday meaning. It requires:

      • Disinterested management.

      • A non-profit activity.

      • No benefit accruing to a "restricted circle" of people.

      2. Distinguishing patronage from sponsorship

      The distinction rests on the value of what the donor receives in return:

      Patronage (mécénat): There must be a marked disproportion between the donation and the benefits the donor receives in return (e.g., a simple mention of the donor's name).

      Sponsorship (parrainage): If the benefits in return (advertising, logos on jerseys, premium cocktail receptions, reserved seats) are worth close to the amount paid, the payment is a taxable commercial service.

      3. The special case of the performing arts

      The legislature allows certain for-profit organizations (e.g., commercial companies owned by public entities) to benefit from patronage for performing arts, cinema, or contemporary art exhibition activities, provided that management remains disinterested.

      --------------------------------------------------------------------------------

      IV. Endowment Funds: A Specific Tool

      Created by the law of 2008, endowment funds (fonds de dotation) are intended to encourage patronage to finance general-interest missions.

      1. Operating models

      Operating fund: Carries out general-interest activities itself.

      Redistributing fund: Collects funds and passes them on to other general-interest organizations.

      Mixed: Combines both activities.

      2. Obligations and taxation

      Minimum endowment: €15,000.

      Reporting obligations: An annual declaration to the prefecture stating the amounts collected and redistributed.

      Consumability: If the statutes allow the endowment to be spent, the fund loses certain tax advantages on its investment income (which becomes subject to corporate income tax (IS) at a reduced rate).

      Payroll tax (taxe sur les salaires): Endowment funds are subject to it without the allowance granted to associations (€2,144), except for salaries linked to the organization of six charity events per year.

      --------------------------------------------------------------------------------

      V. Case Law and Audit Examples

      The administration relies on concrete cases to illustrate how the rules apply:

      Carantec sailing school: Reclassified as for-profit because its catchment area (tourists from all over France) and its prices were comparable to those of commercial sailing schools in the region.

      "Piou-Piou" ruling (2022): A children's ski association had privileged relationships with ESF instructors (members of the association), because it provided them with a direct economic outlet.

      Defense of memory (Maréchal Pétain case): Patronage is refused if the eligible activity (e.g., a museum) is ancillary to the association's main purpose, which itself does not meet the statutory criteria.

      VI. Ancillary For-Profit Activities and Sectorization

      A non-profit association may carry out ancillary commercial activities.

      Tax franchise: Up to a threshold of €90,011 (the figure cited for 2023/2024), this revenue is not taxed as long as the non-profit activity remains predominant.

      Beyond the threshold: The association must sectorize its activities. It pays commercial taxes on the for-profit sector from the first euro.

      Predominance criterion: The administration looks not only at revenue but also at how resources are used (volunteer time, use of premises, salaries) to determine whether the non-profit activity remains dominant.
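      The franchise logic of section VI can be sketched as a simple decision rule. This is an illustrative sketch only, not tax guidance: the function name and return labels are hypothetical, the threshold is the figure quoted in this summary, and treating revenue exactly equal to the threshold as covered is an assumption. The predominance test itself (volunteer time, premises, salaries) is reduced here to a single boolean supplied by the caller.

```python
# Threshold as quoted in this summary for 2023/2024 (not independently verified).
FRANCHISE_THRESHOLD_EUR = 90_011


def commercial_tax_treatment(lucrative_revenue_eur, nonprofit_predominant,
                             threshold_eur=FRANCHISE_THRESHOLD_EUR):
    """Return a hypothetical label for the treatment of ancillary commercial revenue."""
    if not nonprofit_predominant:
        # The summary only describes the case where the non-profit
        # activity remains predominant.
        return "out_of_scope"
    if lucrative_revenue_eur <= threshold_eur:
        # Assumption: "up to the threshold" is read as inclusive.
        return "franchise"
    # Above the threshold: sectorize; the for-profit sector is taxed from the first euro.
    return "sectorize"
```

      For example, under these assumptions `commercial_tax_treatment(50_000, True)` falls under the franchise, while `commercial_tax_treatment(120_000, True)` calls for sectorization.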

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Drosophila larval type II neuroblasts generate diverse types of neurons by sequentially expressing different temporal identity genes during development. Previous studies have shown that the transition from early temporal identity genes (such as Chinmo and Imp) to late temporal identity genes (such as Syp and Broad) depends on the activation of the expression of EcR by Seven-up (Svp) and progression through the G1/S transition of the cell cycle. In this study, Chaya and Syed examined whether the expression of Syp and EcR is regulated by cell cycle and cytokinesis by knocking down CDK1 or Pav, respectively, throughout development or at specific developmental stages. They find that knocking down CDK1 or Pav either in all type II neuroblasts throughout development or in single-type neuroblast clones after larval hatching consistently leads to failure to activate late temporal identity genes Syp and EcR. To determine whether the failure of the activation of Syp and EcR is due to impaired Svp expression, they also examined Svp expression using a Svp-lacZ reporter line. They find that Svp is expressed normally in CDK1 RNAi neuroblasts. Further, knocking down CDK1 or Pav after Svp activation still leads to loss of Syp and EcR expression. Finally, they also extended their analysis to type I neuroblasts. They find that knocking down CDK1 or Pav, either at 0 hours or at 42 hours after larval hatching, also results in loss of Syp and EcR expression in type I neuroblasts. Based on these findings, the authors conclude that the cell cycle and cytokinesis are required for the transition from early to late temporal identity genes in both types of neuroblasts. These findings add mechanistic details to our understanding of the temporal patterning of Drosophila larval neuroblasts.

      Strengths:

      The data presented in the paper are solid and largely support their conclusion. Images are of high quality. The manuscript is well-written and clear.

      We appreciate the reviewer’s detailed summary and recognition of the study’s strengths.

      Weaknesses:

      The quantifications of the expression of temporal identity genes and the interpretation of some of the data could be more rigorous.

      (1) Expression of temporal identity genes may not be just positive or negative. Therefore, it would be more rigorous to quantify the expression of Imp, Syp, and EcR based on the staining intensity rather than simply counting the number of neuroblasts that are positive for these genes, which can be very subjective. Or the authors should define clearly what qualifies as "positive" (e.g., a staining intensity at least 2x background).

      We thank the reviewer for this helpful suggestion. In the new version, we have now clarified how positive expression was defined and added more details of our quantification strategy to the Methods section (page 11, lines 380-388; lines 426-434 in tracked changes file). Fluorescence intensity for each neuroblast was normalized to the mean intensity of neighboring wild-type neuroblasts imaged in the same field. A neuroblast was considered positive for a given factor when its normalized nuclear intensity was at least 2× the local background. This scoring criterion was applied uniformly across all genotypes and time points. All quantifications were performed on the raw LSM files in Fiji prior to assembling the figure panels.
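      The scoring criterion described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Fiji workflow: the function name and arguments are hypothetical, and the local background is assumed to be normalized to the same wild-type reference as the nuclear signal before the 2× comparison.

```python
def score_neuroblast(nuclear_intensity, background_intensity,
                     wt_neighbor_intensities, fold=2.0):
    """Return (normalized nuclear intensity, positive?) for one neuroblast.

    Intensities are normalized to the mean of neighboring wild-type
    neuroblasts imaged in the same field; a cell is scored positive when
    its normalized nuclear intensity is at least `fold` times the
    (equally normalized) local background.
    """
    if not wt_neighbor_intensities:
        raise ValueError("at least one wild-type neighbor is required")
    reference = sum(wt_neighbor_intensities) / len(wt_neighbor_intensities)
    normalized_nuclear = nuclear_intensity / reference
    normalized_background = background_intensity / reference
    return normalized_nuclear, normalized_nuclear >= fold * normalized_background
```

      Because signal and background are divided by the same reference, the positive/negative call in this sketch depends only on their ratio; the normalization matters for comparing intensities across fields.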

      (2) The finding that inhibiting cytokinesis without affecting nuclear divisions by knocking down Pav leads to the loss of expression of Syp and EcR does not support their conclusion that nuclear division is also essential for the early-late gene expression switch in type II NSCs (at the bottom of the left column on page 5). No experiments were done in this study to specifically block nuclear division. This conclusion should be revised.

      We blocked both cell cycle progression and cytokinesis, and both these manipulations affected temporal gene transitions, suggesting that both cell cycle and cytokinesis are essential. To our knowledge, no mechanism/tool exists that selectively blocks nuclear division while leaving cell cycle progression intact. We have added more clarification on page 4, line 123 onwards (lines 126 onwards in tracked changes file).

      (3) Knocking down CDK1 in single random neuroblast clones does not make the CDK1 knockdown neuroblast develop in the same environment (except still in the same brain) as wild-type neuroblast lineages. It does not help address the concern whether "type 2 NSCS with cell cycle arrest failed to undergo normal temporal progression is indirectly due to a lack of feedback signaling from their progeny", as discussed (from the bottom of the right column on page 9 to the top of the left column on page 10). The CDK1 knockdown neuroblasts do not divide to produce progeny and thus do not receive a feedback signal from their progeny as wild-type neuroblasts do. Therefore, it cannot be ruled out that the loss of Syp and EcR expression in CDK1 knockdown neuroblasts is due to the lack of the feedback signal from their progeny. This part of the discussion needs clarification.

      We thank the reviewer for raising this critical point. We agree and have clarified our interpretations and the limitations of our study in the revised text on page 8, lines 278-282 (lines 296-300 in tracked changes file).

      (4) In Figure 2I, there is a clear EcR staining signal in the clone, which contradicts the quantification data in Figure 2J that EcR is absent in Pav RNAi neuroblasts. The authors should verify that the image and quantification data are consistent and correct.

      When cytokinesis is blocked using pav-RNAi, the neuroblasts become extremely large and multinucleated. In some large pav RNAi clones, we observed a weak EcR signal near the cell membrane. However, more importantly, none of the nuclear compartments showed detectable EcR staining, where EcR is typically localized. We selected a representative nuclear image for the figure panel. To clarify this observation, we have now added an explanatory note to the discussion section on page 8, lines 283-291 (lines 301-309 in tracked changes file).

      Reviewer #2 (Public review):

      Summary:

      Neural stem cells produce a wide variety of neurons during development. The regulatory mechanisms of neural diversity are based on the spatial and temporal patterning of neural stem cells. Although the molecular basis of spatial patterning is well-understood, the temporal patterning mechanism remains unclear. In this manuscript, the authors focused on the roles of cell cycle progression and cytokinesis in temporal patterning and found that both are involved in this process.

      Strengths:

      They conducted RNAi-mediated disruption on cell cycle progression and cytokinesis. As they expected, both disruptions affected temporal patterning in NSCs.

      We appreciate the reviewer’s positive assessment of our experimental results.

      Weaknesses:

      Although the authors showed clear results, they needed to provide additional data to support their conclusion sufficiently.

      For example, they need to identify type II NSCs using molecular markers (Ase/Dpn). The authors are encouraged to provide a more detailed explanation of each experiment. The current version of the manuscript is difficult for non-expert readers to understand.

      Thanks for your feedback. We have now included a detailed description of how we identify type II NSCs in both wild-type and mutant clones. We have also added a representative Asense staining to clearly distinguish type 1 (Ase$^{+}$) from type 2 (Ase$^{-}$) NSCs (see Figure S1). We have also added a resources table explaining the genotypes associated with each figure, which was omitted due to an error in the previous version of the manuscript.

      Reviewer #3 (Public review):

      Summary:

      The manuscript by Chaya and Syed focuses on understanding the link between cell cycle and temporal patterning in central brain type II neural stem cells (NSCs). To investigate this, the authors perturb the progression of the cell cycle by delaying the entry into M phase and preventing cytokinesis. Their results convincingly show that temporal factor expression requires progression of the cell cycle in both Type 1 and Type 2 NSCs in the Drosophila central brain. Overall, this study establishes an important link between the two timing mechanisms of neurogenesis.

      Strengths:

      The authors provide solid experimental evidence for the coupling of cell cycle and temporal factor progression in Type 2 NSCs. The quantified phenotype shows an all-or-none effect of cell cycle block on the emergence of subsequent temporal factors in the NSCs, strongly suggesting that both nuclear division and cytokinesis are required for temporal progression. The authors also extend this phenotype to Type 1 NSCs in the central brain, providing a generalizable characterization of the relationship between cell cycle and temporal patterning.

      We thank the reviewer for recognizing the robustness of our data linking the cell cycle to temporal progression.

      Weaknesses:

      One major weakness of the study is that the authors do not explore the mechanistic relationship between the cell cycle and temporal factor expression. Although their results are quite convincing, they do not provide an explanation as to why Cdk1 depletion affects Syp and EcR expression but not the onset of svp. This result suggests that at least a part of the temporal cascade in NSCs is cell-cycle independent, which isn't addressed or sufficiently discussed.

      Thank you for bringing up this important point. We are equally interested in uncovering the mechanism by which the cell cycle regulates temporal gene transitions; however, such mechanistic exploration is beyond the scope of the present study. Interestingly, while the temporal switching factor Svp is expressed independently of the cell cycle, the subsequent temporal transitions are not. We have expanded our discussion on this intriguing finding (page 9, line 307-315; lines 345-355 in tracked changes file). Specifically, we propose that svp activation marks a cell-cycle–independent phase, whereas EcR/Syp induction likely depends on cell-cycle–coupled mechanisms, such as mitosis-dependent chromatin remodeling or daughter-cell feedback. Although further dissection of this mechanism lies beyond the current study, our findings establish a foundation for future work aimed at identifying how developmental timekeeping is molecularly coupled to cell-cycle progression.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors): 

      (1) Figure 1 C and D, it would be better to put a question mark to indicate that these are hypotheses to be tested. 

      We appreciate this suggestion and have added question marks in Figure 1C and 1D to clearly indicate that these panels represent hypotheses under investigation.

      (2) Figure 2A-I, Figure 4A-I, Figure 5A-I and K-S, in addition to enlarged views of single type II neuroblasts, it would be more convincing to include zoomed-out images of the entire larval brain or at least a portion of the brain to include neighboring wild-type type II neuroblasts as internal controls. Also, it would be ideal to show EcR staining from the same neuroblasts as IMP and Syp staining. 

      We thank the reviewer for this valuable input. In our imaging setup, the number of available antibody channels was limited to four (anti-Ase, anti-GFP, anti-Syp, and anti-Imp). Adding EcR to the same sample was therefore not technically possible, so we performed EcR staining separately.

      (3) The authors cited "Syed et al., 2024" (in the middle of the right column on page 5), but this reference is missing in the "References" section and should be added. 

      The missing citation has been added to the reference section.  

      (4) It would be better to include Ase staining in the relevant figure to indicate neuroblast identity as type I or type II. 

      We agree and now include representative Ase staining for both type 1 and type 2 NSC clones in Figure S1, along with corresponding text updates that describe these markers.

      Reviewer #2 (Recommendations for the authors): 

      Major comments 

      (1) The present conclusion relies on the results using Cdk1 RNAi and pav RNAi. It is still possible that Cdk1 and Pav are involved in the regulation of temporal patterning independent of the regulation of cell cycle or cytokinesis, respectively. To avoid this possibility, the authors need to inhibit cell cycle progression or cytokinesis in another alternative manner. 

      We thank the reviewer for raising this important point. We observe consistent phenotypes across several independent manipulations that slow or block the cell cycle. In addition, earlier studies using orthogonal approaches that delay G1/S (Dacapo/Rbf) or impair mitochondrial OxPhos (which lengthens G1/S; van den Ameele & Brand, 2019) produce similar temporal delays. While we cannot exclude additional, gene-specific effects of Cdk1 or Pav, these concordant phenotypes strongly support the interpretation that altered cell-cycle progression, rather than a specific role of a single gene, is the primary cause of the defect, and they make the cell-cycle disruption model the most parsimonious one. We have clarified this reasoning in the discussion section on pages 8-9, lines 293-305 (lines 311-343 in tracked changes file).

      (2) To reach the present conclusion, the authors need to address the effects of acceleration of cell cycle progression or cytokinesis on temporal patterning. 

      We thank the reviewer for this insightful suggestion. To our knowledge, there are currently no established genetic tools that can specifically accelerate cell-cycle progression in Drosophila neuroblasts. However, our results demonstrate that blocking the cell cycle impairs the transition from early to late temporal gene expression. These findings suggest that proper cell-cycle progression is essential for the transition from early to late temporal identity in neuroblasts.

      Minor comments 

      (3) P3L2 (right), ... we blocked the NSC cell cycle...

      How did they do it? 

      Which fly lines were used?

      Why did they use the line? 

      These details are now included in the Materials and Methods and the Resource Table (pages 11-13). We used Wor-Gal4, Ase-Gal80 to drive UAS-Cdk1RNAi and UAS-pavRNAi in type 2 NSCs.

      (4) P5L1(left), ... we used the flip-out approach...

      Why did they conduct it? 

      Probably, the authors have reasons other than "to further ensure." 

      We have clarified in the text on page 4, lines 137-139, that the flip-out approach was used to generate random single-cell clones, enabling quantitative analysis of type 2 NSCs within an otherwise wild-type brain. 

      (5) P5L8(left), ... type 2 hits were confirmed by lack of the type 1 Asense...  The authors must examine Deadpan (Dpn) expression as well. Because there are a lot of Asense (Ase) negative cells in the brain (neurons, glial cell, and neuroepithelial cells). 

      Type II NSCs can be identified as Dpn+/Ase- cells.

      We agree that Dpn is a helpful marker. However, we reliably distinguished type II NSCs by their lack of Ase and larger cell size relative to surrounding neurons and glia, which are smaller in size and located deeper within the clone. These differences, together with established lineage patterns, allow unambiguous identification of type 2 NSCs across all genotypes. We have now added representative type I and type 2 NSC clones to the supplemental figure S1 (E-G’) with Asense stains to demonstrate how we differentiate type I from type II NSCs. 

      (6) P5L32(left), To do this, we induced... 

      This sentence should be made more concise.

      Please rephrase it. 

      The sentence has been rewritten for clarity and concision.

      (7)  P5L42(left), ...lack of EcR/Syp expression (Figure 2).  However, EcR expression is still present (Figure 2I). 

      In some large pavRNAi clones, a weak EcR signal can be observed near the cell membrane; however, none of the nuclear compartments—where EcR is typically localized—show detectable staining. We selected a representative nuclear image for the figure and addressed this observation on page 8, lines 283-291 (lines 301-309 in tracked changes file).

      (8) P7L29(left), ......had persistent Imp expression...

      Imp expression is faint compared to that in Figure 2G.

      The differences between Figures 2G and 3G should be discussed. 

      We thank the reviewer for this comment. We have added a note in the Methods section clarifying that brightness and contrast were adjusted per panel for optimal visualization; thus, apparent differences in signal intensity do not reflect biological variation. Fluorescence intensity for each neuroblast was normalized to the mean intensity of neighboring wild-type neuroblasts imaged in the same field. A neuroblast was considered Imp-positive when its normalized nuclear intensity was at least 2× the local background. This scoring criterion was applied uniformly across all genotypes and time points. All quantifications were performed on the raw LSM files in Fiji prior to assembling the figure panels.
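The scoring rule described in this response can be sketched as follows. This is only an illustrative sketch, not the authors' Fiji workflow: the function names, the example intensities, and the assumption that the background is normalized by the same wild-type reference as the nuclear signal are all hypothetical.

```python
from statistics import mean


def normalized_intensity(raw, wild_type_neighbors):
    """Divide a raw intensity by the mean of neighboring wild-type neuroblasts."""
    return raw / mean(wild_type_neighbors)


def is_imp_positive(nuclear, background, wild_type_neighbors, fold=2.0):
    """Score a neuroblast as Imp-positive when its normalized nuclear
    intensity is at least `fold` times the normalized local background.

    Assumption: the background is normalized by the same reference as the
    nuclear signal, so the comparison is made on a common scale.
    """
    norm_nuclear = normalized_intensity(nuclear, wild_type_neighbors)
    norm_background = normalized_intensity(background, wild_type_neighbors)
    return norm_nuclear >= fold * norm_background


# Hypothetical raw intensities from one imaging field (arbitrary units).
neighbors = [200.0, 200.0]
bright_cell = is_imp_positive(300.0, 100.0, neighbors)
dim_cell = is_imp_positive(150.0, 100.0, neighbors)
```

Because the threshold is applied to normalized values from the same field, per-panel brightness and contrast adjustments made for display would not change the outcome of the scoring.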

      (9) P8 (Figure 5)

      The Imp expression is faint compared to that in Figure 5Q.

      The difference between Figure 5G and 5Q should be discussed further. 

      As mentioned above, we have clarified our image processing approach in the Methods section to explain any differences in signal appearance between these figures.

      (10) P10 Materials and Methods

      The authors did not mention the fly lines used. This is very important for the readers. 

      We thank the reviewer for bringing this oversight to our attention. The Resource Table was inadvertently omitted from the initial submission. The complete list of fly lines and reagents used in this study is now provided in the updated Resource Table.

      Reviewer #3 (Recommendations for the authors): 

      Major points 

      (1) The authors mention that the heat-shock induction at 42ALH is well after the svp temporal window and therefore the cell cycle block independently affects Syp and EcR expression. However, Figure 3 shows svp-LacZ expression at 48ALH. If svp expression is indeed transient in Type 2 NSCs, then this must be validated using an immunostaining of the svp-LacZ line with svp antibody. This is crucial as the authors claim that cell cycle block doesn't affect svp expression and is required independently. 

      We thank the reviewer for bringing this important issue to our attention. As noted, Svp protein is expressed transiently and stochastically in type 2 NSCs (Syed et al., 2017), making direct antibody quantification challenging upon cell cycle block. Consistent with previous work (Syed et al., 2017), we used the svp-LacZ reporter line to visualize stabilized Svp expression, which reliably captures Svp expression in type 2 NSCs (Syed et al., 2017 https://doi.org/10.7554/eLife.26287, and Dhilon et al., 2024 https://doi.org/10.1242/dev.202504).

      (2) The authors have successfully slowed down the cell cycle and showed that it affects temporal progression. However, a converse experiment where the cell cycle is sped up in NSCs would be an important test for the direct coupling of temporal factor expression and cell cycle, wherein the expectation would be the precocious expression of late temporal factors in faster cycle NSCs. 

      We agree that such an experiment would be ideal. However, as noted above (Reviewer #2 comment 2), to our knowledge, no suitable tools currently exist to accelerate neuroblast cell-cycle progression without pleiotropic effects.

      Minor point 

      The authors must include Ray and Li (https://doi.org/10.7554/eLife.75879) in the references when describing that "...cell cycle has been shown to influence temporal patterning in some systems,...".  

      We thank the reviewer for this helpful suggestion. The cited reference (Ray and Li, eLife, 2022) has now been included and appropriately referenced in the revised manuscript.

    1. Understanding Counterwill: An Analysis of Instinctive Opposition in Children

      Executive Summary

      This document offers an in-depth analysis of the concept of "counterwill," a phenomenon often mistaken for defiance or rudeness in the context of child-rearing and child development.

      Contrary to popular views that prize immediate obedience, developmental research shows that counterwill is an instinctive, healthy, and necessary reaction.

      It protects the individual against unsafe outside influences and forms the foundation of self-assertion and critical thinking in adulthood.

      The document stresses that interventions based on pressure, ultimatums, or punishment are counterproductive, because they fuel resistance rather than fostering cooperation.

      The key to harmonious collaboration lies in intentionally reactivating the attachment bond.

      By prioritizing emotional connection, humor, and creativity, adults can turn a confrontational dynamic into natural buy-in, allowing the child to develop without sacrificing personal integrity.

      --------------------------------------------------------------------------------

      1. Definition and Origins of Counterwill

      Counterwill differs from mere "opposition" in its structural and instinctive nature within human development.

      A self-determined being: Humans are, by nature, beings with a will of their own. Counterwill emerges when the adult's will comes into direct conflict with the child's.

      Opposition vs. Counterwill: Whereas the term "opposition" is often used pejoratively in everyday language to describe a lack of respect, "counterwill" more precisely describes the biological and psychological process of resisting an external instruction perceived as intrusive.

      The myth of the "well-behaved" child: The traditional model prizes unquestioning obedience.

      Yet total, immediate obedience resembles the workings of a robot or a puppet more than those of a developing human being.

      2. Developmental and Protective Value

      Far from being a behavioral flaw, counterwill serves vital functions for the individual.

      Protection and Survival

      Instinctive resistance: Humans are wired to resist directives from people with whom they have no solid attachment bond.

      Physical safety: This resistance is an essential protective mechanism (for example, refusing to follow a stranger in the street).

      In such cases, the child exercises counterwill to preserve their integrity.

      Self-Assertion and Critical Thinking

      Preparation for adulthood: Self-assertion does not begin at 18 or 22.

      It is cultivated from childhood. An adult who can negotiate a salary or set boundaries in a relationship was once a child who was able to exercise counterwill.

      Development of judgment: The ability to question, to argue, and not to take everything at face value is the foundation of critical thinking.

      Without counterwill, the child becomes an adolescent and then an adult who is vulnerable to the influence of others.

      3. Causes of Everyday Resistance

      The analysis identifies several factors that exacerbate counterwill in day-to-day interactions:

      | Factor | Description |
      | --- | --- |
      | Brain immaturity | A child's brain often processes only one piece of information at a time. A child absorbed in play is not ignoring the adult out of contempt, but because of a neurological inability to switch their will instantly. |
      | External pressure | The use of raw authority, threats, punishments, or ultimatums increases counterwill instead of eliciting collaboration. |
      | Relational disconnection | Giving an instruction from a distance, or without first establishing eye contact or emotional contact, creates a gap that triggers resistance. |

      4. Collaboration Strategies: From Pressure to Connection

      To reduce counterwill, the adult should seek to "increase the child's will" to collaborate through relational levers.

      The "Bubble" and "Velcro" Concepts

      The attachment bubble: The adult should invite the child into their "bubble" of safety. When the child is connected to the adult, the child naturally tends to follow the adult's lead.

      The Velcro effect: Rather than acting like a "ping-pong ball" (giving an order and leaving again), the adult should become "Velcro": move physically closer, take an interest in the child's activity, and establish a bond before making a request.

      Effective Intervention Levers

      Connection before instruction: Take a few seconds to greet the child, compliment them, or express pleasure at seeing them again.

      Creativity and humor: Use play to get around resistance (e.g., making a toy talk to invite the child to wash their hands). Creativity is presented as a superior alternative to pure authority.

      Empathy: Recognize that the child's will is legitimate, even when it differs from our own. The aim is not to give in on everything, but to impose structure while respecting the child's developmental stage.

      5. Systemic Perspectives: Adolescence and the School Setting

      The dynamics of counterwill extend beyond early childhood and affect every social sphere.

      Adolescence: This is a period of intense counterwill.

      Interventions based on disconnection and unrealistic expectations of submission only make the situation worse.

      School setting: The children with the greatest relational needs are often the ones who resist the most.

      The system unfortunately tends to exclude or punish them (color-coded systems, withdrawal of privileges), which further breaks the attachment bond and reinforces their oppositional behavior.

      Adult life: Counterwill persists in adults.

      An employee will respond with resistance to a supervisor who imposes a directive without regard for the employee's current work or without basic courtesy.

      Conclusion

      Counterwill is not a behavioral problem to be eradicated, but a signal of a need for connection or self-assertion.

      By changing perspective, moving from managing opposition to cultivating attachment, educators and parents foster the development of individuals who are autonomous, critical, and able to respect their own limits while collaborating with the social structure.

      Understanding this mechanism makes it possible to move from an upbringing based on force to one based on relationship.

    1. Reviewer #2 (Public review):

      Summary:

      This work addresses the question whether artificial deep neural network models of the brain could be improved by incorporating top-down feedback, inspired by the architecture of neocortex.

      In line with known biological features of cortical top-down feedback, the authors model such feedback connections with both a typical driving effect and a purely modulatory effect on the activation of units in the network.

      To assess the functional impact of these top-down connections, they compare different architectures of feedforward and feedback connections in a model that mimics the ventral visual and auditory pathways in cortex on an audiovisual integration task.

      Notably, one architecture is inspired by human anatomical data, where higher visual and auditory layers possess modulatory top-down connections to all lower-level layers of the same modality, and visual areas provide feedforward input to auditory layers, whereas auditory areas provide modulatory feedback to visual areas.

      First, the authors find that this brain-like architecture imparts the models with a light visual bias similar to what is seen in human data, which is the opposite in a reversed architecture, where auditory areas provide feedforward drive to the visual areas.

      Second, they find that, in their model, modulatory feedback should be complemented by a driving component to enable effective audiovisual integration, similar to what is observed in neural data.

      Overall, the study shows some possible functional implications when adding feedback connections in a deep artificial neural network that mimic some functional aspects of visual perception in humans.

      Strengths:

      The study contains innovative ideas, such as incorporating an anatomically inspired architecture into a deep ANN, and comparing its impact on a relevant task to alternative architectures.

      Moreover, the simplicity of the model allows one to draw conclusions on how features of the architecture and functional aspects of the top-down feedback affect the performance of the network.

      This could be a helpful resource for future studies of the impact of top-down connections in deep artificial neural network models of neocortex.

      Weaknesses:

      Some claims not yet supported.

      The problem is that results are phrased quite generally in the abstract and discussion, while the actual results shown in the paper are very specific to certain implementations of top-down feedback and architectures. This could lead to misunderstanding and requires some revisions of the claims in the abstract and discussion (see below).

      "Altogether our findings demonstrate that modulatory top-down feedback is a computationally relevant feature of biological brain..."

      This claim is not supported, since no performance increase is demonstrated for modulatory feedback. So far, only the second half of the sentence is supported: "...and that incorporating it into ANNs affects their behavior and constrains the solutions it's likely to discover."

      "This bias does not impair performance on the audiovisual tasks."

      This is only true for the composite top-down feedback that combines driving and modulatory effects, whereas modulatory feedback alone can impair the performance (e.g., in the visual tasks VS1 and VS2). The fact that modulatory feedback alone is insufficient in ANNs to enable effective cross-modal integration and requires some driving component is actually very interesting, but it is not stressed enough in the abstract. This is hinted at in the following sentence, but should be made more explicit:

      "The results further suggest that different configurations of top-down feedback make otherwise identically connected models functionally distinct from each other, and from traditional feedforward and laterally recurrent models."

      "Here we develop a deep neural network model that captures the core functional properties of top-down feedback in the neocortex" -> this is too strong, take out "the", because very likely there are other important properties that are not yet incorporated.

      "Altogether, our results demonstrate that the distinction between feedforward and feedback inputs has clear computational implications, and that ANN models of the brain should therefore consider top-down feedback as an important biological feature."

      This claim is still not substantiated by evidence provided in the paper. First, the wording is a bit imprecise, because mechanistically, the relevant distinction is not really feedforward versus feedback (a purely feedforward model is not considered at all in the paper), but modulatory versus driving. Moreover, the second part of the sentence is problematic: The results imply that, computationally/functionally, driving connections are doing the job, while modulatory feedback does not really seem to improve performance (best case, it does not do any harm). It is true that it is a feature that is inspired by biology, but I don't see why the results imply that (modulatory) top-down feedback should be considered in ANN models of the brain. This would require showing that such models either improve performance or improve the ability to fit neural data, both of which are beyond the scope of the paper.

      The same argument holds for the following sentence, which is not supported by the results of the paper:

      "More broadly, our work supports the conclusion that both the cellular neurophysiology and structure of feed-back inputs have critical functional implications that need to be considered by computational models of brain function."

      Additional supplementary material required

      Although the second version checked the influence of processing time, this was not done for the most important figure of the paper, Figure 4. A central claim in the abstract, "This bias does not impair performance on the audiovisual tasks", relies on this figure, because only with composite feedback is the performance comparable between the "drive-only" and "brain-like" models. Thus, the supplementary Figure 3 should also include the composite networks and the drive-only network to check the robustness of the claim with respect to process time. This robustness analysis should then also be mentioned in the text. For example, it should be mentioned whether results in these networks are robust or not with respect to process time, whether there are differences between network architectures or types of feedback in general, etc.

      Moreover, the current analysis for networks with modulatory feedback is a bit confusing. Why is the performance so low for the reverse model for a process time of 3 and 10? This is a very strong effect that warrants explanation. More details should be added in the caption as well. For example, are the models separately trained for the output after 3 and 10 processing steps for the comparison, or just evaluated at these times? Not training these networks separately might explain the low performance for some networks, so ideally networks are trained for each choice of processing steps.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Here, the authors aim to investigate the potential improvements of ANNs when used to explain brain data using top-down feedback connections found in the neocortex. To do so, they use a retinotopic and tonotopic organization to model each subregion of the ventral visual (V1, V2, V4, and IT) and ventral auditory (A1, Belt, A4) regions using Convolutional Gated Recurrent Units. The top-down feedback connections are inspired by the apical tree of pyramidal neurons, modeled either with a multiplicative effect (change of gain of the activation function) or a composite effect (change of gain and threshold of the activation function).
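The difference between the two feedback types summarized above can be illustrated with a minimal sketch for a single unit. The functional forms here (a sigmoid-derived gain, an additive threshold shift, a ReLU activation) and all names are assumptions made for illustration; they are not taken from the authors' code base.

```python
import math


def relu(x):
    """Rectified linear activation."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function, used here to keep the gain bounded."""
    return 1.0 / (1.0 + math.exp(-x))


def multiplicative_feedback(drive, feedback):
    """Purely modulatory feedback: only the gain of the activation changes.

    The gain lies in (0, 2) and equals 1 when the feedback is zero.
    """
    gain = 2.0 * sigmoid(feedback)
    return relu(gain * drive)


def composite_feedback(drive, feedback, threshold_weight=0.5):
    """Composite feedback: changes both the gain and the threshold.

    `threshold_weight` is a hypothetical parameter that scales how strongly
    feedback shifts the activation threshold.
    """
    gain = 2.0 * sigmoid(feedback)
    shift = threshold_weight * feedback
    return relu(gain * drive + shift)


# With no feedforward drive, modulatory feedback leaves the unit silent,
# whereas composite feedback can still activate it.
silent = multiplicative_feedback(0.0, 2.0)
driven = composite_feedback(0.0, 2.0)
```

Under these assumptions, a purely multiplicative signal can scale existing activity but cannot create it, which is one way to read the reviewers' observation that modulatory feedback alone needs a driving component.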

      To assess the functional impact of the top-down connections, the authors compare three architectures: a brain-like architecture derived directly from brain data analysis, a reversed architecture where all feedforward connections become feedback connections and vice versa, and a random connectivity architecture. More specifically, in the brain-like model the visual regions provide feedforward input to all auditory areas, whereas auditory areas provide feedback to visual regions.
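The relationship between the brain-like and reversed architectures can be sketched as an operation on a labeled connectivity map. The handful of connections below is a hypothetical toy subset chosen for illustration, and the dictionary layout is an assumption, not the authors' data structure.

```python
# Hypothetical toy connectivity: (source, target) -> connection type.
BRAIN_LIKE = {
    ("V1", "V2"): "feedforward",
    ("V2", "V1"): "feedback",
    ("V4", "A1"): "feedforward",  # visual areas feed forward to auditory areas
    ("A1", "V4"): "feedback",     # auditory areas feed back to visual areas
}


def reverse_architecture(connections):
    """Swap the type of every connection while keeping the same edges.

    Feedforward connections become feedback and vice versa, so the number
    of connections is unchanged.
    """
    swap = {"feedforward": "feedback", "feedback": "feedforward"}
    return {edge: swap[kind] for edge, kind in connections.items()}


REVERSED = reverse_architecture(BRAIN_LIKE)
```

Because only the labels change, the two variants have the same set of edges, which is consistent with the authors' later remark that switching the direction of connectivity leaves the number of connections unchanged.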

      First, the authors found that top-down feedback influences audiovisual processing and that the brain-like model exhibits a visual bias in multimodal visual and auditory tasks. Second, they discovered that in the brain-like model, the composite integration of top-down feedback, similar to that found in the neocortex, leads to an inductive bias toward visual stimuli, which is not observed in the feedforward-only model. Furthermore, the authors found that the brain-like model learns to utilize relevant stimuli more quickly while ignoring distractors. Finally, by analyzing the activations of all hidden layers (brain regions), they found that the feedforward and feedback connectivity of a region could determine its functional specializations during the given tasks.

      Strengths:

      The study introduces a novel methodology for designing connectivity between regions in deep learning models. The authors also employ several tasks based on audiovisual stimuli to support their conclusions. Additionally, the model utilizes backpropagation of error as a learning algorithm, making it applicable across a range of tasks, from various supervised learning scenarios to reinforcement learning agents. Conversely, the presented framework offers a valuable tool for studying top-down feedback connections in cortical models. Thus, it is a very nice study that also can give inspiration to other fields (machine learning) to start exploring new architectures.

      We thank the reviewer for their accurate summary of our work and their kind assessment of its strengths.

      Weaknesses:

      Although the study explores some novel ideas on how to study the feedback connections of the neocortex, the data presented here are not complete in order to propose a concrete theory of the role of top-down feedback inputs in such models of the brain.

      (1) The gap in the literature that the paper tries to fill is in the ability of DL algorithms to predict behavior: "However, there are still significant gaps in most deep neural networks' ability to predict behavior, particularly when presented with ambiguous, challenging stimuli." and "[...] to accurately model the brain."

      It is unclear to me how the presented work addresses this gap, as the only facts provided are derived from a simple categorization task that could also be solved by the feedforward-only model (see Figures 4 and 5). In my opinion, this statement is somewhat far-fetched, and there is insufficient data throughout the manuscript to support this claim.

      We can see now that the way the introduction was initially written led to some confusion about our goal in this study. Our goal here was not to demonstrate that top-down feedback can enable superior matches to human behaviour. Rather, our goal was to determine if top-down feedback had any real implications for processing ambiguous stimuli. The sentence that the reviewer has highlighted was intended as an explanation for why top-down feedback, and its impact on ambiguous stimuli, might be something one would want to examine for deep neural networks. But, here, we simply wanted to (1) provide an overview of the code base we have created, (2) demonstrate that top-down feedback does impact the processing of ambiguous stimuli.

      We agree with the reviewer that if our goal was to improve our ability to predict behaviour, then there was a big gap in the evidence we provided here. But, this was not our goal, and we believe that the data we provide here does convincingly show that top-down feedback has an impact on processing of ambiguous stimuli. We have updated the text in the introduction to make our goals more clear for the reader and avoid this misunderstanding of what we were trying to accomplish here. Specifically, the end of the introduction is changed to:

      “To study the effect of top-down feedback on such tasks, we built a freely available code base for creating deep neural networks with an algorithmic approximation of top-down feedback. Specifically, top-down feedback was designed to modulate ongoing activity in recurrent, convolutional neural networks. We explored different architectural configurations of connectivity, including a configuration based on the human brain, where all visual areas send feedforward inputs to, and receive top-down feedback from, the auditory areas. The human brain-based model performed well on all audiovisual tasks, but displayed a unique and persistent visual bias compared to models with only driving connectivity and models with different hierarchies. This qualitatively matches the reported visual bias of humans engaged in audio-visual tasks. Our results confirm that distinct configurations of feedforward/feedback connectivity have an important functional impact on a model's behavior. Therefore, top-down feedback captures behaviors and perceptual preferences that do not manifest reliably in feedforward-only networks. Further experiments are needed to clarify whether top-down feedback helps an ANN fit better to neural data, but the results show that top-down feedback affects the processing of stimuli and is thus a relevant feature that should be considered for deep ANN models in computational neuroscience more broadly.”

      (2) It is not clear what the advantages are between the brain-like model and a feedforward-only model in terms of performance in solving the task. Given Figures 4 and 5, it is evident that the feedforward-only model reaches almost the same performance as the brain-like model (when the latter uses the modulatory feedback with the composite function) on almost all tasks tested. The speed of learning is nearly the same: for some tested tasks the brain-like model learns faster, while for others it learns slower. Thus, it is hard to attribute a functional implication to the feedback connections given the presented figures and therefore the strong claims in the Discussion should be rephrased or toned down.

      Again, we believe that there has been a misunderstanding regarding the goals of this study, as we are not trying to claim here that there are performance advantages conferred by top-down feedback in this case. Indeed, we share the reviewer’s assessment that the feedforward only model seems to be capable of solving this task well. To reiterate: our goal here was to demonstrate that top-down feedback alters the computations in the network and, thus, has distinct effects on behaviour that need to be considered by researchers who use deep networks to model the brain. But we make no claims of “superiority” of the brain-like model.

      In-line with this, we’re not completely sure which claims in the discussion the reviewer is referring to. We note that we were quite careful in our claims. For example, in the first section of the discussion we say:

      “Altogether, our results demonstrate that the distinction between feedforward and feedback inputs has clear computational implications, and that ANN models of the brain should therefore consider top-down feedback as an important biological feature.”

      And later on:

      “In summary, our study shows that modulatory top-down feedback and the architectural diversity enabled by it can have important functional implications for computational models of the brain. We believe that future work examining brain function with deep neural networks should therefore consider incorporating top-down modulatory feedback into model architectures when appropriate.”

      If we have missed a claim in the discussion that implies superiority of the brain-like model in terms of task performance we would be happy to change it.

      (3) The Methods section lacks sufficient detail. There is no explanation provided for the choice of hyperparameters nor for the structure of the networks (number of trainable parameters, number of nodes per layer, etc). Clarifying the rationale behind these decisions would enhance understanding. Moreover, since the authors draw conclusions based on the performance of the networks on specific tasks, it is unclear whether the comparisons are fair, particularly concerning the number of trainable parameters. Furthermore, it is not clear if the visual bias observed in the brain-like model is an emerging property of the network or has been created because of the asymmetries in the visual vs. auditory pathway (size of the layer, number of layers, etc).

      We thank the reviewer for raising this issue, and want to provide some clarifications: First, the number of trainable parameters is roughly equal across models, since we were only switching the direction of connectivity (top-down versus bottom-up), not the number of connections. We confirmed that the biggest difference in size is between models with composite and multiplicative feedback; models with composite feedback have roughly 1K more parameters, and all models are within the 280K parameter range. We now state this in the methods.

      Second, because superior performance was not the goal of this study, as stated above, we conducted limited hyperparameter tuning. Given the reviewer’s comment, we wondered whether this may have impacted our results. Therefore, we explored different hyperparameters for the model during the multimodal auditory tasks, which show the clearest example of the visual dominance in the brainlike model (Figure 3).

      We explored different hidden state sizes, learning rates and processing times, and examined whether the core results were different. We found that extremely high learning rates (0.1) destabilize all models and that some models perform poorly under different processing times. But overall, the core results are evident across all hyperparameters where the models learn, i.e., the different behaviors of models with different connectivities and the visual dominance observed in the brainlike model. We now provide these results in supplementary figures (Fig. S2, showing larger models trained with different learning rates, and Fig. S3, which shows the effect of processing time on AS task performance).

      Reviewer #2 (Public review):

      Summary:

      This work addresses the question of whether artificial deep neural network models of the brain could be improved by incorporating top-down feedback, inspired by the architecture of the neocortex.

      In line with known biological features of cortical top-down feedback, the authors model such feedback connections with both a typical driving effect and a purely modulatory effect on the activation of units in the network.

      To assess the functional impact of these top-down connections, they compare different architectures of feedforward and feedback connections in a model that mimics the ventral visual and auditory pathways in the cortex on an audiovisual integration task.

      Notably, one architecture is inspired by human anatomical data, where higher visual and auditory layers possess modulatory top-down connections to all lower-level layers of the same modality, and visual areas provide feedforward input to auditory layers, whereas auditory areas provide modulatory feedback to visual areas.

      First, the authors find that this brain-like architecture imparts the models with a light visual bias similar to what is seen in human data, which is the opposite in a reversed architecture, where auditory areas provide a feedforward drive to the visual areas.

      Second, they find that, in their model, modulatory feedback should be complemented by a driving component to enable effective audiovisual integration, similar to what is observed in neural data.

      Last, they find that the brain-like architecture with modulatory feedback learns a bit faster in some audiovisual switching tasks compared to a feedforward-only model.

      Overall, the study shows some possible functional implications when adding feedback connections in a deep artificial neural network that mimics some functional aspects of visual perception in humans.

      Strengths:

      The study contains innovative ideas, such as incorporating an anatomically inspired architecture into a deep ANN, and comparing its impact on a relevant task to alternative architectures.

      Moreover, the simplicity of the model makes it possible to draw conclusions on how features of the architecture and functional aspects of the top-down feedback affect the performance of the network.

      This could be a helpful resource for future studies of the impact of top-down connections in deep artificial neural network models of the neocortex.

      We thank the reviewer for their summary and their recognition of the innovative components and helpful resources therein.

      Weaknesses:

      Overall, the study appears to be a bit premature, as several parts need to be worked out more to support the claims of the paper and to increase its impact.

      First, the functional implication of modulatory feedback is not really clear. The "only feedforward" model (is a drive-only model meant?) attains the same performance as the composite model (with modulatory feedback) on virtually all tasks tested, it just takes a bit longer to learn for some tasks, but then is also faster at others. It even reproduces the visual bias on the audiovisual switching task. Therefore, the claims "Altogether, our results demonstrate that the distinction between feedforward and feedback inputs has clear computational implications, and that ANN models of the brain should therefore consider top-down feedback as an important biological feature." and "More broadly, our work supports the conclusion that both the cellular neurophysiology and structure of feed-back inputs have critical functional implications that need to be considered by computational models of brain function" are not sufficiently supported by the results of the study. Moreover, the latter points would require showing that this model describes neural data better, e.g., by comparing representations in the model with and without top-down feedback to recorded neural activity.

      To emphasize again our specific claims, we believe that our data shows that top-down feedback has functional implications for deep neural network behaviour, not increased performance or neural alignment. Indeed, our results demonstrate that top-down feedback alters the behaviour of the networks, as shown by the differences in responses to various combinations of ambiguous stimuli. We agree with the reviewer that if our goal was to claim either superior performance on these tasks, or better fit to neural data, we would need to actually provide data supporting that claim.

      Given the comments from the reviewer, we have tried to provide more clarity in the introduction and discussion regarding our claims. In particular, we now highlight that we are not trying to demonstrate that the models with top-down feedback exhibit superior performance or better fit to neural data.

      As one final note, yes, the reviewer understood correctly that the “only feedforward” model is a model with only driving inputs. We have renamed the feedforward-only models to drive-only models and added additional emphasis in the text to ensure that the distinction is clear for all readers.

      Second, the analyses are not supported by supplementary material, hence it is difficult to evaluate parts of the claims. For example, it would be helpful to investigate the impact of the process time after which the output is taken for evaluation of the model. This is especially important because in recurrent and feedback models the convergence should be checked, and if the network does not converge, then it should be discussed why, and at which point in time the network is evaluated.

      This is an excellent point, and we thank the reviewer for raising it. We allowed the network to process the stimuli for seven time-steps, which was enough for information from any one region to be transmitted to any other. We found in some initial investigations that if we shortened the processing time some seeds would fail to solve the task. But, based on the reviewer’s comment, we have now also run additional tests with longer processing times for the auditory tasks where we see the clearest visual bias (Figure 3). We find that different process times do not change the behavioral biases observed in our models, but may introduce difficulties ignoring visual stimuli for some models. Thus, while process time is an important hyperparameter for optimal performance of the model, the central claim of the paper remains. We include this new data in a supplementary figure S3.
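      The statement that seven time-steps suffice for information from any one region to reach any other can be read as a claim about the connectivity graph: the longest shortest path between two regions should not exceed the number of processing steps. The sketch below checks this with a breadth-first search. It assumes that a signal crosses one connection per time-step, and the region names and connections in `toy_graph` are hypothetical placeholders, not the connectivity used in the manuscript.

```python
from collections import deque


def longest_shortest_path(graph):
    """Return the largest shortest-path length (in hops) between any ordered
    pair of distinct nodes, or None if some node cannot reach another."""
    worst = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        if len(dist) < len(graph):
            return None  # at least one region is unreachable from `source`
        worst = max(worst, max(dist.values()))
    return worst


# Hypothetical connectivity: two three-region pathways joined at the top.
toy_graph = {
    "V1": ["V2"],
    "V2": ["V1", "V3"],
    "V3": ["V2", "A3"],
    "A1": ["A2"],
    "A2": ["A1", "A3"],
    "A3": ["A2", "V3"],
}

PROCESS_STEPS = 7  # number of time-steps quoted in the response
steps_needed = longest_shortest_path(toy_graph)
enough_time = steps_needed is not None and steps_needed <= PROCESS_STEPS
```

      For this toy graph the check passes; for a different connectivity the same function would show whether a shorter processing time leaves some pair of regions unable to exchange information.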

      Third, the descriptions of the models in the methods are hard to understand, i.e., parameters are not described and equations are explained by referring to multiple other studies. Since the implications of the results heavily rely on the model, a more detailed description of the model seems necessary.

      We agree with the reviewer that the methods could have been more thorough. Therefore, we have greatly expanded the methods section. We hope the model details are now more clear.

      Lastly, the discussion and testable predictions are not very well worked out and need more details. For example, the point "This represents another testable prediction flowing from our study, which could be studied in humans by examining the optical flow (Pines et al., 2023) between auditory and visual regions during an audiovisual task" needs to be made more precise to be useful as a prediction. What did the model predict in terms of "optic flow", how can a modulatory effect be distinguished from a simple driving effect, etc.?

      We see that the original wording of this prediction was ambiguous, and we thank the reviewer for pointing this out. In the study highlighted (Pines et al., 2023) the authors use an analysis technique for measuring information flow between brain regions, which is related to analysis of optical flow in images, but applied to fMRI scans. This is confusing given the current study, though. Therefore, we have changed this sentence to make clear that we are speaking of information flow here.

      Reviewer #3 (Public review):

      Summary:

      This study investigates the computational role of top-down feedback in artificial neural networks (ANNs), a feature that is prevalent in the brain but largely absent in standard ANN architectures. The authors construct hierarchical recurrent ANN models that incorporate key properties of top-down feedback in the neocortex. Using these models in an audiovisual integration task, they find that hierarchical structures introduce a mild visual bias, akin to that observed in human perception, not always compromising task performance.

      Strengths:

      The study investigates a relevant and current topic of considering top-down feedback in deep neural networks. In designing their brain-like model, they use neurophysiological data, such as externopyramidisation and hierarchical connectivity. Their brain-like model exhibits a visual bias that qualitatively matches human perception.

      We thank the reviewer for their summary and evaluation of our paper’s strengths.

      Weaknesses:

      While the model is brain-inspired, it has limited bioplausibility. The model assumes a simplified and fixed hierarchy. In the brain with additional neuromodulation, the hierarchy could be more flexible and more task-dependent.

      We agree, there are still many facets of top-down feedback that we have not captured here, and the modulation of hierarchy is an interesting example. We have added some consideration of this point to the limitations section of the discussion.

      While the brain-like model showed an advantage in ignoring distracting auditory inputs, it struggled when visual information had to be ignored. This suggests that its rigid bias toward visual processing could make it less adaptive in tasks requiring flexible multimodal integration. It hence does not necessarily constitute an improvement over existing ANNs. It is unclear whether this aspect of the model also matches human data. In general, there is no direct comparison to human data. The study does not evaluate whether the top-down feedback architecture scales well to more complex problems or larger datasets. The model is not well enough specified in the methods and some definitions are missing.

      We agree with the reviewer that we have not demonstrated anything like superior performance (since the brain-like network is quite rigid, as noted) nor have we shown better match to human data with the brain-like network. This was not our intended claim. Rather, we demonstrated here simply that top-down feedback impacts behavior of the networks in response to ambiguous stimuli. We have now added statements to the introduction and discussion to make our specific claims (which are supported by our data, we believe) clear.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      I believe that the work is very nice but not so mature at this stage. Below, you can find some comments that eventually could improve your manuscript.

      (1) Intro, last sentence: "Therefore, top-down feedback is a relevant feature that should be considered for deep ANN models in computational neuroscience more broadly." I don't understand what the authors refer to with this sentence. There are numerous models (deep ANNs) that have been used to model the neural activity and are much simpler than the one proposed here which contains very complex models and connectivity. Although I do agree that the top-down connections are very important there is no data to support their importance for modeling the brain.

      Respectfully, we disagree with the reviewer that we don’t provide data to demonstrate the importance of top-down feedback for modelling. Indeed, we provided a great deal of data to show that top-down feedback in the networks has real functional implications for behaviour, e.g., it can induce a human-like visual bias. Thus, top-down feedback is a factor that one should care about when modelling the brain. But, we agree with the reviewer that more demonstration of the utility of using top-down feedback for achieving better fits to neural data would be an important next step. 

      (2) I suggest adding some extra supplementary simulations where, for example, the number of data for visual and auditory pathways is equal in size (i.e., the same number of examples), the number of layers is identical (3 per pathway), and also the number of parameters. Doing this would help strengthen the claims presented in the paper.

      In fact, all of the hyperparameters the reviewer mentions here were identical for the different networks, so the experiments the reviewer is requesting here were already part of the paper. We now clarify this in the text.

      (3) Results: I suggest adding Tables with quantifications of the presented results. For example, best performance, epochs to converge, etc. As it is now, it is very hard to follow the evidence shown in Figures.

      This is a good suggestion. We have now added this table to the start of the supplemental figures.

      (4) Figure 2e, 3e: Although VS3, and AS3 have been used only for testing, the plot shows alignments with respect to training epochs. The authors should clarify in the Methods if they tested the network with all intermediate weights during VS1/VS2 or AS1/AS2 training.

      Testing scenarios in this context meant that the model was never shown the scenario/task during training, but the models were indeed evaluated on the VS3 and AS3 after each training epoch. We have added clarifications to the figure legends.

      (5) Methods: It would be beneficial to discuss how specific hyperparameters were selected based on prior research, empirical testing, or theoretical considerations. Also, it is not clear how the alignment (visual or audio) is calculated. Do the authors use the examples that have been classified correctly for both stimuli or do they exclude those from the analysis (maybe I have missed it).

      As noted above, because superior performance was not the goal of this study, we conducted limited hyperparameter tuning. But we have extended the results with additional hyperparameter tuning in a supplementary figure, and describe the hyperparameter choices more thoroughly in the methods. As well, all data includes all model responses, regardless of whether they were correct or not. We now clarify this in the methods.
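      To make the point about including all responses concrete, the sketch below computes a visual and an auditory alignment score over every trial, without first filtering for correct answers. The definition used here (the fraction of responses that match each modality's label) is an assumption made for illustration; the manuscript's methods give the authors' actual calculation, and the function and variable names are hypothetical.

```python
def alignment(predictions, visual_labels, auditory_labels):
    """Fraction of responses matching each modality's label, over all trials.

    No trial is excluded: responses that match neither label still count
    in the denominator.
    """
    if not (len(predictions) == len(visual_labels) == len(auditory_labels)):
        raise ValueError("all inputs must have the same length")
    if not predictions:
        raise ValueError("need at least one trial")
    n = len(predictions)
    visual = sum(p == v for p, v in zip(predictions, visual_labels)) / n
    auditory = sum(p == a for p, a in zip(predictions, auditory_labels)) / n
    return visual, auditory


# Hypothetical responses on four trials with conflicting audiovisual labels.
preds = [3, 3, 7, 1]
visual = [3, 3, 7, 2]
auditory = [5, 3, 1, 1]
visual_alignment, auditory_alignment = alignment(preds, visual, auditory)
```

      Because the two labels can agree on some trials, the two scores need not sum to one under this definition.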

      (6) Code: The code repository lacks straightforward examples demonstrating how to utilize the modeling approach. Given that it is referred to as a "framework", one would expect it to facilitate easy integration into various models and tasks. Including detailed instructions or clear examples would significantly improve usability and help users effectively apply the proposed methodology.

      We agree with the reviewer that this would be beneficial. We have revised the README of the codebase to explain the model and its usage more clearly and included an interactive Jupyter notebook with example training on MNIST.

      Some minor comments are given below. Generally speaking, the Figures need to be more carefully checked for consistent labels, colors, etc.

      (1) Page 4, 1st paragraph - grammar correction: "a larger infragranular layer" or "larger infragranular layers"

      Thank you for catching this, we have fixed the text.

      (2) Page 4, 2nd para - rephrase: "In three additional control ANNs" → "In the third additional control ANN"

      In fact, we did mean three additional control ANNs, each one representing a different randomized connectivity profile. We now clarify this in the text and provide the connectivity of the two other random graphs in the supplemental figures.

      (3) Page 4, VAE acronym needs to be defined before its first use

      The variational autoencoder is introduced by its full name in the text now.

      (4) Page 4: the reference to Fig. 2c should be Fig. 2b; Fig. 2d should be Fig. 2c; Fig. 2b should be Fig. 2d; "VS4; Fig. 2b, bottom" should be "VS4; Fig. 2f"; and Fig. 2f should be Fig. 2g. Please double-check the figure references in the text, as they are currently very confusing for the reader.

      We have now fixed this, thank you for catching it.

      (5) Page 5, 1st para: "Altogether, our results demonstrated both" → "Altogether, our results demonstrated that both"

      This has been updated.

      (6) Figure 2: In the e and g panels the x label is missing.

      This was actually because the x-axes were the same across the panels, but we see how this was unclear, so we have updated the figure.

      (7) Figure 3: There is no panel g (the title is missing); In panels b, c, e, and g the y label is missing, and in panels e and g the x label is missing. Also, the Feedforward model is shown in panel g but it is introduced later in the text. Please remove it from Figure 3. Also in legend: "AV Reverse graph" → "Reverse graph". Also, "Accuracy" and "Alignment" should be presented as percentages (as in Figure 2).

      This has been corrected.

      (8) Figure 4; x labels are missing.

      As with point (6), this was actually because the x-axes were the same across the panels, but we see how this was unclear, so we have updated the figure.

      (9) Page 7; I can’t find the cited Figure S1.

      Apologies, we have added the supplemental figure (now as S4). It shows the results of models with multiplicative feedback on the task in Fig 5 (as opposed to models with composite feedback shown in the main figure).

      Reviewer #2 (Recommendations for the authors):

      (1) Discussion Section 3.1 is only a literature review, and does not really add any value.

      Respectfully, we think it is important to relate our work to other computational work on the role of top-down feedback, and to make clear what our specific contribution is. But, we have updated the text to try to place additional emphasis on our study’s contribution, so that this section is more than just a literature review.

      “Our study adds to this previous work by incorporating modulatory top-down feedback into deep, convolutional, recurrent networks that can be matched to real brain anatomy. Importantly, using this framework we could demonstrate that the specific architecture of top-down feedback in a neural network has important computational implications, endowing networks with different inductive biases.”

      (2) Including ipython notebooks and some examples would be great to make it easier to use the code.

      We now provide a demo of how to use the code base in a jupyter notebook.

      (3) The description of the model is hard to comprehend. Please name and describe all parameters. Also, a figure would be great to understand the different model equations.

      We have added definitions of all model terms and parameters.

      (4) The terminology is not really clear to me. For example "The results further suggest that different configurations of top-down feedback make otherwise identically connected models functionally distinct from each other and from traditional feedforward only recurrent models." The feedforward and only recurrent seem to contradict each other. Would driving and modulatory perhaps be better terms here? I also saw in the code that you differentiate between three types of inputs: modulatory, threshold offset and basal (like feedforward). How about you only classify connections based on these three types? I was also confused about the feedforward only model, because I was unsure whether it still has feedback connections but with "basal" quality, or whether feedback connections between modalities and higher-to-lower level layers were omitted altogether.

      We take the reviewer’s point here. To clarify this, we have updated the text to refer to “driving only” rather than “feedforward only”, to make it obvious that what we change in these models is simply whether the connection has any modulatory impact on the activity. 
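      The distinction between a driving and a modulatory connection can be illustrated with a single toy unit. In the sketch below, a driving input is added to the unit's pre-activation, whereas a modulatory input only rescales the response to the existing drive. This is a simplified illustration under stated assumptions, not the model equations from the manuscript: the ReLU nonlinearity, the sigmoid-shaped gain and the function names are all hypothetical, and the composite and threshold-offset variants mentioned in the reviews are not modeled.

```python
import math


def relu(x):
    return max(0.0, x)


def unit_output(drive, feedback, mode):
    """Toy response of one unit to feedforward `drive` plus `feedback`.

    mode="driving":    feedback is summed with the drive before the nonlinearity.
    mode="modulatory": feedback sets a gain in (0, 2) applied to the driven
                       response; the gain is exactly 1 when feedback is 0.
    """
    if mode == "driving":
        return relu(drive + feedback)
    if mode == "modulatory":
        gain = 2.0 / (1.0 + math.exp(-feedback))
        return relu(drive) * gain
    raise ValueError(f"unknown mode: {mode}")


# With no feedforward drive, only the driving connection can activate the unit.
silent_driving = unit_output(0.0, 2.0, "driving")
silent_modulatory = unit_output(0.0, 2.0, "modulatory")
```

      In this toy setting, switching a connection from modulatory to driving changes what the unit can do without changing which regions are connected, which is the sense of "driving only" in the response above.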

      (5) "incorporating it into ANNs can affect their behavior and help determine the solutions that the network can discover." -> Do you mean constrain? Overall, I did not really get this point.

      Yes, we mean that it constrains the solutions that the network is likely to discover.

      (6) "ignore the auditory inputs when they visual inputs were unambiguous" -> the not they

      This has been fixed. Thank you for catching it.

      (7) xlabel in Figure 4 is missing.

      This has been fixed, thank you for catching it.

      Reviewer #3 (Recommendations for the authors):

      Major:

      (1) How alignment is computed is not defined. In addition to a proper definition in the methods section, it would be nice to briefly define it when it first appears in the results section.

      We’ve added an explicit definition of how alignment is calculated in the methods and emphasized the calculation when it is first explained in the results.

      (2) A connectivity matrix for the feedforward-only model is missing and could be added.

      We have added this to Figure 1.

      (3) The connectivity matrix for each random model should also be shown.

      We’ve shown each of the random model configurations in the new supplemental figure S1.

      (4) Initial parameters are not defined, such as W, b etc. A table with all model parameters would be great.

      We have added a table to the methods listing all of the parameters.

      (5) Would be nice to show the t-sne plots (not just the NH score) for each model and each task in the appendix.

      We can provide these figures on request. They massively increase the file size of the paper PDF, as there are 49 of them for each task and each model, 980 in total. An example t-SNE plot is provided in Figure 6.

      Minor:

      (1) Page 4:

      "we refer to this as Visual-dominant Stimulus case 1, or VS1; Fig. 1a, top)." This should be Fig. 2a.

      (2) "In stimulus condition VS1, all of the models were able to learn to use the auditory clues to disambiguate the images (Fig. 2c)."

      This should be Fig. 2b.

      (3) "In comparison, in VS2, we found that the brainlike model learned to ignore distracting audio inputs quickly and consistently compared to the random models, and a bit more rapidly than the auditory information (Fig 2d)."

      This should be Fig. 2c.

      (4) "VS3; Fig. 2b, top"

      This should be Fig. 2d

      (5) "while all other models had to learn to do so further along in training (Fig. 2e)."

      It is not stated explicitly, but this suggests that the image-aligned target was considered correct, and that weight updates were happening.

      (6) "VS4; Fig. 2b, bottom"

      This should be Fig. 2f

      (7) "adept at learning (Fig. 2f)."

      This should be Fig. 2g

      (8) Figure 3:b,c,e y-labels are missing

      3f: both x and y labels are missing

      (9) Figure labeling in the text is not consistent (Fig. 1A versus Fig. 2a)

      (10) Doubled "the" in "This shows that the inductive bias towards vision in the brainlike model depended on the presence of the multiplicative component of the the feedback"

      (11) Page 9 Figure 6: The caption says b shows the latent spaces for the VS2 task, whereas the main text refers to 6b as showing the latent space for the AS2 task. Please correct which task it is.

      (12) Methods 4.1 page 13

      "which is derived from the feedback input (h_{l−1})"

      This should be h_{l+1}

      (13) It is not defined which aspects of the model r_l, u_l, u and c refer to.

      Even though this is based on a previous model, the methods section should completely describe the model.

      Equations 1,2,3: the notation [x;y] is unclear and should be defined.

      Equation 5: u should probably be u_l.

      (14) Page 14 typo: externopyrmidisation.

      (15) It is confusing to use different names for the same thing: the all-feedforward model, the all feedforward network, the feedforward network, and the feedforward-only model are probably all the same? Consistent naming would help here.

      Thank you for the detailed comments! We’ve fixed the minor errors and renamed the feedforward models to drive-only models.

    1. Reviewer #1 (Public review):

      Summary:

      This study reports the effects of psilocin on iPSC-derived human cortical neurons.

      Strengths:

      The characterization was comprehensive, involving immunohistochemistry of various markers, 5-HT2A receptors, BDNF, and TrkB, transcriptomics analyses, morphological determination, electrophysiology, and finally synaptic protein measurements. The results are in close agreement with prior work (PMID 29898390) on rat cultured cortical neurons. Nevertheless, there is value in confirming those earlier findings and furthermore to demonstrate the effects in human neurons, which are important for translation. The genetic, proteomics, and cell structure analyses used in this paper are its major strength. The study supports the value of using iPSC-derived human cortical neurons for drug development involving psychedelics-related compounds.

      Weaknesses:

      (1) Line 140: 5-HT2A receptor expression was found via immunocytochemistry to reside in the somatodendritic and axonal compartments. However, prior work from ex vivo tissue using electron microscopy has found predominantly 5-HT2A receptor expression in the somatodendritic compartment (PMID: 12535944). Was this antibody validated to be 5-HT2A receptor-specific? Can the authors reason why the discrepancy may arise, and if the axonal expression is specific to the cultured neurons?

      (2) Line 143: It would be helpful to specify the dose of psilocin tested, and describe how this dose was chosen.

      (3) Figure 1: The interpretation is that the differential internalization in the axonal and somatodendritic compartments is time-dependent. However, given that only one dose is tested, it is also possible that this reflects dose dependence, with the longer time exposure leading to higher dose exposure, so these variables are related. That is, if a higher dose is given, internalization may also be observed after 10 minutes in the dendritic compartment.

      (4) Figure 3 & 4: What is the 'control' here? A more appropriate control for the 24 hours after psilocin application would be 24 hours after vehicle application. Here the authors are looking at before and after, but the factor of time elapsed and perturbation via application is not controlled for.

      (5) The sample size was not clearly described. In the figure legend, N = the number of neurites is provided, but it is unclear how many cells have been analyzed, and then how many of those cells belong to the same culture. These are important sample size information that should be provided. Relatedly, statistical analyses should consider that the neurites from the same cells are not independent. If the neurites indeed come from the same cells, then the sample size is much smaller and a statistical analysis considering the nested nature of the data should be used.

      Comments on revisions:

      The authors performed substantial experiments to check the validity of the HTR2A antibody for the revision. Briefly, they found that western blot shows a single band, abolished by a blocking peptide, in neural progenitors and iPSC-derived neurons, suggesting positive results. However, they also detected immunofluorescence signals in HEK293 and HeLa cells, which do not express 5-HT2A receptors, as scRNAseq analysis of these cells shows complete absence of the transcript. Therefore the antibody has epitope-selective binding but also has some non-specific binding, precluding its use. The authors rightfully removed the data related to the antibody in the revised manuscript. The account is repeated here for anyone who may find the information helpful. Overall, the additional results added rigor to the study.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Comment 1: 5-HT2A Antibody Specificity

      Was this antibody validated to be 5-HT2A receptor-specific? Can the authors reason why the discrepancy may arise, and if the axonal expression is specific to the cultured neurons?

      We performed extensive validation of the anti-5-HT2A receptor antibody (Alomone #ASR-033), which is summarized in the accompanying Author response images:

      Positive findings (Author response image 1c-e, Author response image 2a): (1) Western blot showed a single band at the expected molecular weight (~50 kDa) in neural progenitors and iPSC-derived neurons. (2) The blocking peptide (#BLP-SR033) abolished Western blot bands and markedly reduced immunofluorescence signals in neurons, confirming epitope-specific binding.

      Negative findings (Author response image 1a-b, Author response image 2a-b, Author response image 3): (1) We detected positive immunofluorescence signals in HEK293 and HeLa cells (Author response image 1a-b), which do not express 5-HT2AR. (2) Western blot also showed bands in HEK293 and HeLa cells (Author response image 2a-b). (3) Single-cell RNA-seq analysis of HEK293T cells confirmed complete absence of HTR2A expression (Author response image 3a). (4) qPCR showed no detectable HTR2A transcripts in iPSCs or HeLa cells (Ct > 36), while neural progenitors and neurons showed clear expression (Author response image 3b). (5) siRNA knockdown experiments failed to produce a corresponding decrease in immunofluorescence or Western blot signals, despite reduced HTR2A transcript levels (data not shown).

      BLAST analysis: Protein BLAST analysis of the 13-amino acid immunogenic peptide sequence identified the human 5-HT2A receptor as the top hit (9/13 amino acids overlap). However, shorter sequence similarities were also found with other proteins, including APPBP1 (6/9 amino acids), Immunoglobulin Heavy Chain (6/7 amino acids), and Interleukin-31 receptor (6/8 amino acids). While these partial homologies do not provide a definitive mechanistic explanation for the observed off-target binding, they illustrate that the epitope sequence is not entirely unique to the 5-HT2A receptor.
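      The overlap counts quoted above can be put on a common scale with a little arithmetic. The sketch below converts each "matches/window" pair into a percent identity within the aligned window and into the share of the full 13-residue peptide that is matched. Reading each pair as "matching residues / aligned window length" is an assumption, and the helper names are hypothetical.

```python
EPITOPE_LENGTH = 13  # length of the immunogenic peptide, as stated above

# (matching residues, aligned window length), copied from the text above
hits = {
    "5-HT2A receptor": (9, 13),
    "APPBP1": (6, 9),
    "Immunoglobulin Heavy Chain": (6, 7),
    "Interleukin-31 receptor": (6, 8),
}


def percent_identity(matches, window):
    """Percent of positions within the aligned window that match."""
    return 100.0 * matches / window


def percent_of_epitope(matches, epitope_length=EPITOPE_LENGTH):
    """Percent of the full peptide accounted for by matching residues."""
    return 100.0 * matches / epitope_length


summary = {
    name: (round(percent_identity(m, w), 1), round(percent_of_epitope(m), 1))
    for name, (m, w) in hits.items()
}
```

      Under this reading, the shorter hits can have a higher identity within their window than the receptor itself, while covering less than half of the peptide, which is consistent with the cautious interpretation given in the response.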

      Conclusion: While our validation confirmed epitope-specific binding (blocking peptide effective in neurons), the antibody clearly detects something in cells that demonstrably lack HTR2A gene expression. This indicates off-target binding to other proteins sharing the epitope sequence. We have therefore removed all antibody-based 5-HT2A receptor experiments from the revised manuscript. This includes the receptor internalization data from Figure 1. The remaining findings (BDNF upregulation, gene expression changes, morphological effects, electrophysiology) are supported by independent methods including pharmacological blockade with ketanserin.

      Comment 2: Psilocin Dose Selection

      It would be helpful to specify the dose of psilocin tested, and describe how this dose was chosen.

      We used 10 µM psilocin based on: (1) The seminal study by Ly et al. (2018), which demonstrated neuroplasticity effects at this concentration in rat cortical neurons. (2) Our own dose-response experiments (Figure S2B) showing maximal BDNF increase at 10 µM compared to lower concentrations (10 nM, 100 nM, 1 µM). We have clarified this in the revised Methods section.

      Comment 3: Dose vs. Time Dependence

      Given that only one dose is tested, it is also possible that this reflects dose dependence, with the longer time exposure leading to higher dose exposure.

      We agree that dose dependence cannot be excluded with our current experimental design. This point is now moot as we have removed the 5-HT2A receptor internalization experiments from the manuscript. Future studies in our group will address dose-dependent effects on other readouts.

      Comment 4: Control Conditions

      What is the 'control' here? A more appropriate control would be 24 hours after vehicle application.

      The control condition is indeed a vehicle (DMSO) control collected at the same time point as the experimental condition (i.e., 24 hrs post-treatment). We have clarified this in the revised figure legends and Methods section to avoid confusion.

      Comment 5: Sample Size Description

      The sample size was not clearly described. Statistical analyses should consider that neurites from the same cells are not independent.

      We have expanded the sample size descriptions in the figure legends. Analyses were performed using 5-10 microscope images per condition, with 15 ROIs per image, across at least two independent differentiations from two genetic backgrounds. Regarding independence: each neurite segment exists within a distinct microenvironment and can be considered an independent measurement unit, consistent with established practices in the field (Paul et al., 2021, CNS Neurosci Ther). We acknowledge this increases statistical power and have noted this in the Methods.

      Reviewer #2:

      Comment 1: 5-HT2A Antibody Validation

      Without validation (using for example knockdown techniques to decrease expression of 5HT2A), the experiments using this antibody should be excluded from the manuscript.

      We agree with this assessment. As detailed in our response to Reviewer 1 (Comment 1) and documented in the Response to Reviewer Figure, our extensive validation attempts—including siRNA knockdown—could not conclusively demonstrate antibody specificity. We have removed all antibody-based 5-HT2A receptor experiments from the revised manuscript.

      Comment 2: Serotonin in Cell Media

      Did the authors evaluate whether 5-HT is present in the cell media?

      The cell culture media used in our experiments does not contain serotonin. We have explicitly stated this in the revised Methods section.

      Comment 3: Statistical Analysis of Figure S1F

      Some of the datasets are not statistically analyzed, such as Figure S1F.

      Figure S1F related to the 5-HT2A receptor experiments and has been removed from the revised manuscript along with the associated data.

      Comment 4: Translational Validity of Prolonged Exposure

      The authors continuously exposed cells to psilocin for hours or days. Since this is not the model of what occurs in vivo, the findings lack translational validity.

      We acknowledge this limitation. Most experiments (BDNF, gene expression, branching) were conducted 24–48 hrs after a brief 10-minute exposure, which better reflects the in vivo situation. Prolonged exposures (96 hrs) were used specifically for synaptogenesis experiments based on literature showing that repeated LSD administration enhances spine density (Inserra et al., 2022; De Gregorio et al., 2022). Our in vitro system lacks metabolizing enzymes and glial cells, which may introduce temporal biases. We have added a discussion of these limitations in the revised manuscript.

      Comment 5: Ketanserin Effect on BDNF

      In Figure 2E, ketanserin by itself seems to reduce BDNF density. How do the authors conclude that ketanserin blocks psi-induced effects?

      We identified that one cell line (Ctrl 1) with inherently higher BDNF density was inadvertently excluded from the ketanserin-only condition. After removing Ctrl 1 from all conditions and reanalyzing, the difference between Ctrl and Ket alone is no longer significant. The significant difference between Psi+Ket and Ket alone demonstrates that psilocin exerts effects that ketanserin can block, consistent with 5-HT2A receptor mediation. The revised figure and statistical analysis are included in the updated manuscript.

      Comment 6: mCherry Localization

      mCherry (Fig 4A) seems to be retained in the nucleus.

      The CamKII promoter drives expression of cytoplasmic mCherry, which fills the entire neuron including soma, dendrites, and axons. The apparent nuclear signal reflects mCherry accumulation in the soma, which surrounds the nucleus. The images clearly show mCherry extending into neurites, which was essential for our Sholl analysis of neuronal complexity.

      Comment 7: Reference 36

      Reference 36 is a review article that does not mention psilocin.

      Our statement refers broadly to serotonergic psychedelics increasing neurotrophic factors. Reference 36 (Colaço et al., 2020) examines ayahuasca, which contains the serotonergic psychedelic DMT. We have revised the text to clarify this point.

      Summary of Major Revisions

      (1) Removed all 5-HT2A receptor antibody-based experiments from Figure 1 and supplementary figures due to inconclusive specificity validation. An Author response image documenting our validation attempts is provided.

      (2) Clarified control conditions (vehicle controls at matched time points) in figure legends.

      (3) Expanded sample size descriptions in Methods and figure legends.

      (4) Re-analyzed ketanserin experiments with consistent cell line inclusion.

      (5) Added discussion of translational limitations.

      (6) Added new Figure S5 summarizing proposed signaling pathways.

      (7) Expanded discussion on the relevance of iPSC-derived neurons for drug development.

      Author response image 1.

      Immunostaining for 5-HT2A receptor across cell types and peptide-blocking control. (a) HEK293 cells display a positive immunofluorescent signal despite not endogenously expressing 5-HT2AR, indicating nonspecific antibody reactivity. (b) HeLa cells also exhibit a positive signal despite lacking endogenous 5-HT2AR expression, further demonstrating nonspecific antibody binding in non-expressing cell types. (c) Neural progenitor cells show clear positive 5-HT2AR staining. (d) iPSC-derived neurons exhibit robust and well-defined 5-HT2AR staining. (e) Application of the Alomone 5-HT2AR blocking peptide (#BLP-SR033) markedly reduces neuronal signal intensity, supporting epitope-specific binding.

      Author response image 2.

      Western blot analysis of 5-HT2A receptor abundance and peptide-blocking control. (a-b) In line with the immunofluorescence a single band is detected in iPSCs, HEK cells, neural progenitors, iPSC-derived neurons and (b) HeLa cells. (a) Preincubation of the primary antibody with the corresponding blocking peptide abolishes this band across all samples, consistent with specific binding of the antibody to its intended epitope.

      Author response image 3.

      Lack of detectable 5-HT2AR expression in HEK and HeLa cells. (a) Analysis of a human-only HEK293T single-cell RNA-seq dataset (10x Genomics; https://www.10xgenomics.com/datasets/293-t-cells-1-standard-1-1-0, accessed 2025-11-25) shows no meaningful HTR2A expression, whereas other genes such as GAPDH, TP53, MYC, and ACTB are robustly detected. Consistently, evaluation of a “Barnyard” dataset, an equal mixture of human HEK293T and mouse NIH3T3 cells (10x Genomics; https://www.10xgenomics.com/datasets/20-k-1-1mixture-of-human-hek-293-t-and-mouse-nih-3-t-3-cells-3-ht-v-3-1-3-1-high-6-1-0, accessed 2025-11-25), reveals only ~4 of ~10,000 droplets with minimal HTR2A signal, confirming the absence of meaningful expression. (b) qPCR analysis further demonstrates no detectable HTR2A transcripts in iPSCs or HeLa cells (Ct > 36), while neural progenitors and iPSC-derived cortical neurons show expression when normalized to housekeeping genes GAPDH and TBP.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents findings on the adaptation mechanisms of Saccharomyces cerevisiae under extreme stress conditions. The authors try to generalize this to adaptation to stress tolerance. A major finding is that S. cerevisiae evolves a quiescence-like state with high trehalose to adapt to freeze-thaw tolerance independent of their genetic background. The manuscript is comprehensive, and each of the conclusions is well supported by careful experiments.

      Strengths:

      This is excellent interdisciplinary work.

      I have commented on the response of the authors, in-line, below. This is to maintain the conversation thread with the authors.

      Comment 1:

      Earlier papers have shown that loss of ribosomal proteins, that slow growth, leads to better stress tolerance in S. cerevisiae. Given this, isn't it expected that any adaptation that slows down growth would, overall, increase stress tolerance? Even for other systems, it has been shown that slowing down growth (by spore formation in yeast or bacteria/or dauer formation in C. elegans) is an effective strategy to combat stress and hence is a likely route to adaptation. The authors stress this as one of the primary findings. I would like the authors to explain their position, detailing how their findings are unexpected in the context of the literature.

      Response:

      We agree that the link between slower growth and higher stress tolerance has been well studied. What is distinctive here is that repeated, near-lethal freeze-thaw selected not only for a tolerant/quiescent-like state but also for a shorter lag on re-entry. In this regime of freeze-thaw-regrowth, cells that are tolerant but slow to restart would be outcompeted by naive fast growers. Our quiescence-based selection simulations reproduce exactly this constraint. We have added this explanation to the Results to make clear that the novelty is the co-evolution of a tolerant, trehalose-rich state together with rapid regrowth under an alternating regime.

      Comment to Response: I get the point. I believe that the outcome is highly dependent on how selection pressure is administered. So, generalizing this over all stresses (as done in the abstract) may not be accurate.

      Comment 2:

      Convergent evolution of traits: I find the results unsurprising. When selecting for a trait, if there is a major mode to adapt to that stress, most of the strains would adapt to that mode, independent of the route. According to me, finding out this major route was the objective of many of the previous reports on adaptive evolution. The surprising part in the previous papers (on adaptive evolution of bacteria or yeast) was the resampling of genes that acquired mutations in multiple replicates of an evolution experiment, providing a handle to understand the major genetic route or the molecular mechanism that guides the adaptation (for example in this case it would be - what guides the over-accumulation of trehalose). I fail to understand why the authors find the results surprising, and I would be happy to understand that from the authors. I may have missed something important.

      Response:

      Our surprise was precisely that we did not see the classical pattern of "phenotypic convergence + repeated mutations in the same locus/module." All independently evolved lines converged on a trehalose-rich, mechanically reinforced, quiescence-like phenotype, but population sequencing across lines did not reveal a single repeatedly hit gene or small shared pathway, even when we increased selection stringency (1-3 freeze-thaw cycles per round). We have now stated in the manuscript that this decoupling (strong phenotypic convergence, non-overlapping genetic routes) is the central inference: selection is acting on a physiologically defined state that multiple genotypes can reach.

      Comment to Response: You indeed saw a case of phenotypic convergence. The trehalose-rich, mechanically reinforced, quiescence-like states are the phenotypes that have converged. This is what prevented lysis. The same locus need not be mutated over and over again: if the trehalose pathway is controlled by many processes (it is, and many are still unknown, as I point out in the next comment), many different mutations at different loci can result in the same regulation! I do not see the decoupling between phenotypic convergence and genetic mutations as surprising or novel; molecular and cellular biology is replete with examples where deletion (mutation) of hundreds of different genes can have the same phenotypic outcome (yeast deletion library screening, indirect effects, etc.). If this was a specific question unsolved in evolutionary biology, then the matter is different.

      A minor point: Here I would also like to point out that the three phenotypes you measure may be linked to each other, so their apparently independent evolution may just reflect a cause-effect relationship. For example, trehalose accumulation may drive the other two. This has not been deconvoluted in this manuscript.

      Comment 3:

      Adaptive evolution would work on phenotype, as all of selective evolution is supposed to. So, given that one of the phenotypes well-known in literature to allow freeze-tolerance is trehalose accumulation, I think it is not surprising that this trait is selected. For me, this is not a case of "non-genetic" adaptation as the authors point out: it is likely because perturbation of many genes can individually result in the same outcome - up-regulation of trehalose accumulation. Thereby, although the adaptation is genetic, it is not homogeneous across the evolving lines - the end result is. Do the authors check that the trait is actually a non-genetic adaptation, i.e., if they regrow the cells for a few generations without the stress, the cells fall back to being similarly only partially fit to freeze-thaw cycles? Additionally, the inability to identify a network that is conserved in the sequencing does not mean that there is no regulatory pathway. A large number of cryptic pathways may exist to alter cellular metabolic states.

      This is a point in continuation of point #2, and I would like to understand what I have missed.

      Response:

      We agree, and we have removed the wording "non-genetic adaptation." The evolved populations retain high survival even after regrowth for ≥25 generations without freeze-thaw, so the adaptation is clearly genetically maintained. What our data show is that there is no single genetic route to the shared phenotype; different mutations can all drive cells into the same trehalose-rich, quiescence-like, mechanochemically reinforced state. We now describe this as "genetic diversification with phenotypic convergence."

      Comment to Response: While the last term does explain what is going on, isn't it an outcome that is routine in cell biology (as pointed out in my previous comment to your response)? I apologize for not understanding the punchline that is provided in the last few sentences of the abstract.

      Comment 4:

      To propose the convergent nature, it would be important to check for independently evolved lines and most probably more than 2 lines. It is not clear from their results section if they have multiple lines that have evolved independently.

      Response:

      We indeed evolved four independent lines and maintained two independent controls. We have added this information at the start of the Results so that the level of replication is immediately clear.

      Comment to Response: Previous large-scale studies have performed hundreds of sequencing runs to oversample the pathway and identify reproducible loci. With pooled sequencing (as mentioned below) and only four evolved lines, I am not sure that your study has the power to conclude whether the loci are repeatedly sampled or not! If there were 10 gene LOFs that control trehalose levels (which you can find from the published deletion screening experiment), then each of the four experiments is likely to go through one of these routes; how likely is it that you would identify the same route in two pools? It is unlikely, and therefore, sequencing of 4 pools cannot tell you if the mutation path is repeatedly sampled or not.

      Comment 5:

      For the genomic studies, it is not clear if the authors sequenced a pool or a single colony from the evolved strains. This is an important point, since an average sequence will miss out on many mutations and only focus on the mutations inherited from a common ancestral cell. It is also not clear from the section.

      Response:

      We sequenced population samples from the evolved lines. Our specific question was whether independently evolved lines would show the same high-frequency genetic solution, as is often seen in parallel evolution. Pool sequencing may under-sample rare/private variants, but it is appropriate for detecting such shared, high-frequency routes - and we do not find any. We have clarified this rationale in the Methods/Results.

      Comment to Response: Please provide the average sequencing depth of each sequencing run. It is essential to understand the power of this study in identifying mutations. What coverage was used, expressed as a multiple of genome size (×)?
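
      The depth question above comes down to simple arithmetic. As a minimal sketch, mean coverage and the lowest allele frequency a pooled sample could be expected to reveal might be estimated as follows. The read count, read length, and minimum number of supporting reads are hypothetical placeholders, not values from the study, and the genome size is an assumed approximate figure of about 12.1 Mb for the S. cerevisiae nuclear genome.

```python
# Approximate S. cerevisiae nuclear genome size in base pairs (assumed ~12.1 Mb).
YEAST_GENOME_BP = 12_100_000


def mean_coverage(n_reads: int, read_length_bp: int,
                  genome_size_bp: int = YEAST_GENOME_BP) -> float:
    """Mean depth (x) = total sequenced bases / genome size."""
    return n_reads * read_length_bp / genome_size_bp


def min_detectable_frequency(depth: float, min_alt_reads: int) -> float:
    """Lowest allele frequency expected to give at least `min_alt_reads`
    supporting reads at a site covered `depth` times.

    This is an expectation only: it ignores sampling noise, sequencing
    error, and uneven coverage, so real detection limits are higher.
    """
    return min(1.0, min_alt_reads / depth)


if __name__ == "__main__":
    # Hypothetical run: 10 million reads of 150 bp (placeholder values).
    depth = mean_coverage(n_reads=10_000_000, read_length_bp=150)
    print(f"mean coverage: {depth:.1f}x")
    # Hypothetical calling rule: require at least 5 variant-supporting reads.
    print(f"lowest detectable frequency: {min_detectable_frequency(depth, 5):.3f}")
```

      Under these assumptions, halving the depth doubles the lowest frequency at which a variant in the pool would be expected to be seen, which is why the reported depth bears directly on the power of pooled sequencing.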

    2. Author response:

      The following is the authors’ response to the original reviews.

      We thank the editor and the reviewers for the detailed and constructive comments. In revising the manuscript we have: (i) clarified what is new relative to prior stress tolerance work, (ii) made explicit that we observe phenotypic convergence without a shared genetic route, (iii) stated upfront that we evolved four independent lines plus two controls, and (iv) corrected figure legends, statistics, and the missing citations. Below we respond point-by-point.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript presents findings on the adaptation mechanisms of Saccharomyces cerevisiae under extreme stress conditions. The authors try to generalize this to adaptation to stress tolerance. A major finding is that S. cerevisiae evolves a quiescence-like state with high trehalose to adapt to freeze-thaw tolerance independent of their genetic background. The manuscript is comprehensive, and each of the conclusions is well supported by careful experiments.

      Strengths:

      This is excellent interdisciplinary work.

      Weaknesses:

      I have questions regarding the overall novelty of the proposal, which I would like the authors to explain.

      (1) Earlier papers have shown that loss of ribosomal proteins, that slow growth, leads to better stress tolerance in S. cerevisiae. Given this, isn’t it expected that any adaptation that slows down growth would, overall, increase stress tolerance? Even for other systems, it has been shown that slowing down growth (by spore formation in yeast or bacteria/or dauer formation in C. elegans) is an effective strategy to combat stress and hence is a likely route to adaptation. The authors stress this as one of the primary findings. I would like the authors to explain their position, detailing how their findings are unexpected in the context of the literature.

      We agree that the link between slower growth and higher stress tolerance has been well studied. What is distinctive here is that repeated, near-lethal freeze–thaw selected not only for a tolerant/quiescent-like state but also for a shorter lag on re-entry. In this regime of freeze–thaw–regrowth, cells that are tolerant but slow to restart would be outcompeted by naive fast growers. Our quiescence-based selection simulations reproduce exactly this constraint. We have added this explanation to the Results to make clear that the novelty is the co-evolution of a tolerant, trehalose-rich state together with rapid regrowth under an alternating regime.

      (2) Convergent evolution of traits: I find the results unsurprising. When selecting for a trait, if there is a major mode to adapt to that stress, most of the strains would adapt to that mode, independent of the route. According to me, finding out this major route was the objective of many of the previous reports on adaptive evolution. The surprising part in the previous papers (on adaptive evolution of bacteria or yeast) was the resampling of genes that acquired mutations in multiple replicates of an evolution experiment, providing a handle to understand the major genetic route or the molecular mechanism that guides the adaptation (for example in this case it would be - what guides the over-accumulation of trehalose). I fail to understand why the authors find the results surprising, and I would be happy to understand that from the authors. I may have missed something important.

      Our surprise was precisely that we did not see the classical pattern of “phenotypic convergence + repeated mutations in the same locus/module.” All independently evolved lines converged on a trehalose-rich, mechanically reinforced, quiescence-like phenotype, but population sequencing across lines did not reveal a single repeatedly hit gene or small shared pathway, even when we increased selection stringency (1–3 freeze–thaw cycles per round). We have now stated in the manuscript that this decoupling (strong phenotypic convergence, non-overlapping genetic routes) is the central inference: selection is acting on a physiologically defined state that multiple genotypes can reach.

      (3) Adaptive evolution would work on phenotype, as all of selective evolution is supposed to. So, given that one of the phenotypes well-known in literature to allow freeze-tolerance is trehalose accumulation, I think it is not surprising that this trait is selected. For me, this is not a case of “non-genetic” adaptation as the authors point out: it is likely because perturbation of many genes can individually result in the same outcome - up-regulation of trehalose accumulation. Thereby, although the adaptation is genetic, it is not homogeneous across the evolving lines - the end result is. Do the authors check that the trait is actually a non-genetic adaptation, i.e., if they regrow the cells for a few generations without the stress, the cells fall back to being similarly only partially fit to freeze-thaw cycles? Additionally, the inability to identify a network that is conserved in the sequencing does not mean that there is no regulatory pathway. A large number of cryptic pathways may exist to alter cellular metabolic states.

      This is a point in continuation of point #2, and I would like to understand what I have missed.

      We agree, and we have removed the wording “non-genetic adaptation.” The evolved populations retain high survival even after regrowth for ≥25 generations without freeze–thaw, so the adaptation is clearly genetically maintained. What our data show is that there is no single genetic route to the shared phenotype; different mutations can all drive cells into the same trehalose-rich, quiescence-like, mechanochemically reinforced state. We now describe this as “genetic diversification with phenotypic convergence.”

      (4) To propose the convergent nature, it would be important to check for independently evolved lines and most probably more than 2 lines. It is not clear from their results section if they have multiple lines that have evolved independently.

      We indeed evolved four independent lines and maintained two independent controls. We have added this information at the start of the Results so that the level of replication is immediately clear.

      (5) For the genomic studies, it is not clear if the authors sequenced a pool or a single colony from the evolved strains. This is an important point, since an average sequence will miss out on many mutations and only focus on the mutations inherited from a common ancestral cell. It is also not clear from the section.

      We sequenced population samples from the evolved lines. Our specific question was whether independently evolved lines would show the same high-frequency genetic solution, as is often seen in parallel evolution. Pool sequencing may under-sample rare/private variants, but it is appropriate for detecting such shared, high-frequency routes — and we do not find any. We have clarified this rationale in the Methods/Results.

      Reviewer #2 (Public review):

      Summary:

      The authors used experimental evolution, repeatedly subjecting Saccharomyces cerevisiae populations to rapid liquid-nitrogen freeze-thaw cycles while tracking survival, cellular biophysics, metabolite levels, and whole-genome sequence changes. Within 25 cycles, viability rose from ~2 % to ~70 % in all independent lines, demonstrating rapid and highly convergent adaptation despite distinct starting genotypes. Evolved cells accumulated about threefold more intracellular trehalose, adopted a quiescence-like phenotype (smaller, denser, non-budding cells), showed cytoplasmic stiffening and reduced membrane damage, and re-entered growth with a shorter lag, traits that together protected them from ice-induced injury. Whole-genome sequencing indicated that multiple genetic routes can yield the same mechano-chemical survival strategy. A population model in which trehalose controls quiescence entry, growth rate, lag, and freeze-thaw survival reproduced the empirical dynamics, implicating physiological state transitions rather than specific mutations as the primary adaptive driver. The study therefore concludes that extreme-stress tolerance can evolve quickly through a convergent, trehalose-rich quiescence-like state that reinforces membrane integrity and cytoplasmic structure.

      Strengths:

      The strengths of the paper are the experimental design, data presentation and interpretation, and that it is well-written.

      (1) While the phenotyping is thorough, a few more growth curves would be quite revealing to determine the extent of cross-stress protection. For example, comparing growth rates under YPD vs. YPEG (EtOH/glycerol), and measuring growth at 37 °C or in the presence of 0.8 M KCl.

      We thank the referee for the interesting suggestions. However, growth rates alone may be difficult to interpret since WT strains also show different growth rates under these conditions. Therefore, comparing the relative fitness or survival of the evolved strains versus the WT under these stresses would be more informative. In the present study we limited growth/survival measurements to what was needed to parameterize the adaptation model in YPD under the freeze–thaw regime. We have now added a statement in the Discussion that, given the shared trehalose/mechanical mechanism, such cross-stress assays are an expected and straightforward follow-up.

      (2) Is GEMS integrated prior to evolution? Are the evolved cells transformable?

      Yes. GEMs were integrated prior to evolution, because the non-integrated evolved population showed low transformation efficiency, likely due to altered cell-wall properties.

      (3) From the table, it looks like strains either have mutations in Ras1/2 or Vac8. Given the known requirements of Ras/PKA signaling for the G1/S checkpoint (to make sure there are enough nutrients for S phase), this seems like a pathway worth mentioning and referencing. Regarding Vac8, its emerging roles in NVJ and autophagy suggest another nutrient checkpoint, perhaps through TORC1. The common theme is rewired metabolism, which is probably influencing the carbon shuttling to trehalose synthesis.

      We appreciate the reviewer’s suggestion to consider pathways like Ras/PKA (linked to Ras1/2) and autophagy/TORC1 (linked to Vac8) as potential upstream modulators. While these pathways are involved in nutrient sensing and metabolic regulation, we choose not to emphasize them specifically. This is because (i) some evolved lines lack Ras1/2 or Vac8 variants, and (ii) none of the variants lies directly in trehalose synthesis/degradation pathways. Furthermore, direct links to trehalose accumulation are not well established for these specific variants in this context, and pathways like Ras are global regulators with broad effects. Together with the strongly convergent phenotype, this supports our main inference that multiple genetic/metabolic routes can feed into the same trehalose-rich, mechanochemically reinforced, quiescence-like state. We have added a note in the discussion regarding metabolic rewiring and trehalose.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Generally, the results sections should have more details. The figures should be corrected, and the legends should be checked for correctness. The manuscript seems to have been assembled in haste?

      We have expanded the relevant Results subsections with one-sentence motivations (why each measurement was performed) and we have corrected the figure legends for ordering and consistency.

      Figure 3: It will be good to have the correct p-values on the figure itself. P-values are typically less than 1, unless there is some special method (here the values presented are , etc). Please explain how the P-values were obtained in the figure legend itself.

      Figure 3 now shows the actual p-values. The legend specifies the details and the sample sizes used.

      Figure 5: It is not clear what the error bars show in 5B, E (different evolved population/ clones/ cells?). All the figure legends are mixed up, please correct them. It is difficult to follow the paper.

      Figure 5 legends now state clearly what the error bars represent (biological replicates) and which panels are from single-cell measurements. We have checked the panel lettering and legend order for consistency with the flow of the main text.

      Reviewer #3 (Recommendations for the authors):

      Overall, the paper is outstanding, well-written, and insightful.

      A point to address is that there are missing citations on lines 60, 91.

      We have added the missing citations at both locations. We apologize for the omission, which was due to a compilation error. This error has been fixed, and the bibliography has been corrected (now containing 74 references).

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Authors state, "we identified ETF dehydrogenase (ETFDH) as one of the most dispensable metabolic genes in neoplasia." Surely there are thousands of genes that are dispensable for neoplasia. Perhaps the authors can revise this sentence and similar sentiments in the text.

      We agree with the reviewer and have corrected the text accordingly. Specifically, we rephrased the sentence: “Surprisingly, we observed that in contrast to muscle, ETFDH is one of the most non-essential metabolic genes in cancer cells.” to “Surprisingly, we observed that in contrast to muscle, ETFDH is a non-essential gene in acute lymphoblastic leukemia NALM-6 cells”

      Authors state, "These findings show that ETFDH loss elevates glutamine utilization in the CAC to support mitochondrial metabolism." While elevated glutamine-to-CAC flux is consistent with this statement, the authors have not measured the effect of restoring glutamine utilization to baseline on mitochondrial metabolism. Thus, the causality implied by the authors can only be inferred based on the data presented. Indeed, the increased glutamine consumption may be linked to the increase in ROS, as glutamate efflux via system xCT is a major determinant of glutamine catabolism in vitro.

      Indeed. We changed the statement "These findings show that ETFDH loss elevates glutamine utilization in the CAC to support mitochondrial metabolism." to "Collectively, these data demonstrate that ETF insufficiency in cancer cells remodels mitochondrial metabolism and increases the glutamine consumption and anaplerosis."

      Authors state that the mechanism described is an example of "retrograde signaling". However, the mechanism seems to be related to a reduction in BCAA catabolism, suggesting that the observed effects may be a consequence of altered metabolic flux rather than a direct signaling pathway. The data presented do not delineate whether the observed effects stem from disrupted mitochondrial communication or from shifts in nutrient availability and metabolic regulation.

      Notwithstanding that the term “retrograde” was used to refer to signaling from mitochondria to mTORC1, rather than from mTORC1 to mitochondria [1], we have removed the term “retrograde signaling” throughout the manuscript.

      The authors should discuss which amino acids that are ETFDH substrates might affect mTORC1 activity or consider whether other ETFDH substrates might also affect mTORC1 in their discussion. Along these lines, the authors might consider discussing why amino acids that are not ETFDH substrates are increased upon ETFDH loss.

      Based on the literature, we expect that branched-chain amino acids that are ETFDH substrates (e.g., leucine) are likely to play a major role in activating mTORC1 upon ETFDH abrogation. As expected, the aforementioned amino acids are among those that are the most highly upregulated in ETFDH-deficient cells (Fig 3A). We have, however, never formally tested the role of branched-chain amino acids in activating mTORC1 in the context of ETFDH disruption. The increase in amino acids that are not metabolized via ETFDH is likely to stem from global metabolic rewiring of ETFDH-deficient cells and observed alterations in amino acid uptake (e.g., glutamine; Fig 2F). We discuss this in the revised version of the paper as follows:

      “Several metabolites can be sensed via signaling partners upstream of mTORC1, including leucine, arginine, methionine/SAM, and threonine [2]. Branched-chain amino acids (leucine, isoleucine, and valine), which are among the highest upregulated metabolites in ETFDH-deficient cells (Fig 3A), serve as ETFDH substrates and have been described to display strong activation capabilities towards mTORC1 in the literature [3,4]. Glutamine can also activate mTORC1 through the Arf family of GTPases [5]. Indeed, glutamine can supplement the non-essential amino acid (NEAA) pool through transamination [6] and amino acid uptake [7]. Accordingly, the maintenance of NEAA that are non-ETFDH substrates may be supported by the global metabolic rewiring fueled by enhanced glutamine metabolism in ETFDH-deficient cells. Deciphering the mechanisms leading to accumulation of specific amino acids and their role in ETFDH-dependent mTORC1 modulation is warranted.”

      Reviewer #2 (Public review):

      The authors would strengthen the paper considerably by adding back catalytically inactive ETFDH to show that the activity of this enzyme is responsible for the increased growth phenotypes and changes in labeling that they observe.

      Following the Reviewers’ suggestions, we performed these experiments. We took advantage of the Y304A/G306E ETFDH mutant, which impairs electron transfer from ETF and cannot substitute for wild-type (WT) gene function in ETFDH-deficient myoblasts [8]. We expressed WT ETFDH and the Y304A/G306E ETFDH mutant in ETFDH KO HCT116 colorectal cancer cells and confirmed that they are expressed at comparable levels (Supplementary Figure 6C). Re-expression of WT ETFDH decreased proliferation, suppressed mTORC1 signaling, and increased 4E-BP1 levels relative to control (vector-infected) ETFDH KO EV HCT116 cells (Supplementary Figure 6D). In contrast, proliferation rates, mTORC1 signaling, and 4E-BP1 levels remained largely unchanged upon expression of the Y304A/G306E ETFDH mutant in ETFDH KO HCT116 cells (Supplementary Figure 6D). Similarly, re-expression of WT ETFDH disrupted the bioenergetic phenotype associated with ETFDH loss, in contrast to re-expression of the Y304A/G306E ETFDH mutant, which yielded bioenergetic profiles similar to those of the ETFDH KO control (Supplementary Figure 6E-F). Collectively, these findings argue that ETFDH activity is required for its tumor-suppressive effects.

      If nucleotide pool and labeling data are available, or can be obtained readily, this would significantly strengthen the tracing data already obtained.

      We followed the Reviewer’s suggestion and measured nucleotide levels. This revealed that loss of ETFDH results in an increase in steady-state nucleotide pools (Supplementary Figure 2K), consistent with increased aspartate labelling and accelerated tumor growth.

      References

      (1) Morita, M. et al. mTORC1 controls mitochondrial activity and biogenesis through 4EBP-dependent translational regulation. Cell Metab 18, 698-711 (2013). https://doi.org/10.1016/j.cmet.2013.10.001

      (2) Valenstein, M. L. et al. Structural basis for the dynamic regulation of mTORC1 by amino acids. Nature 646, 493-500 (2025). https://doi.org/10.1038/s41586-025-09428-7

      (3) Appuhamy, J. A., Knoebel, N. A., Nayananjalie, W. A., Escobar, J., & Hanigan, M. D. Isoleucine and leucine independently regulate mTOR signaling and protein synthesis in MAC-T cells and bovine mammary tissue slices. J Nutr 142, 484-491 (2012). https://doi.org/10.3945/jn.111.152595

      (4) Herningtyas, E. H. et al. Branched-chain amino acids and arginine suppress MaFbx/atrogin-1 mRNA expression via mTOR pathway in C2C12 cell line. Biochim Biophys Acta 1780, 1115-1120 (2008). https://doi.org/10.1016/j.bbagen.2008.06.004

      (5) Jewell, J. L. et al. Metabolism. Differential regulation of mTORC1 by leucine and glutamine. Science 347, 194-198 (2015). https://doi.org/10.1126/science.1259472

      (6) Tan, H. W. S., Sim, A. Y. L. & Long, Y. C. Glutamine metabolism regulates autophagy-dependent mTORC1 reactivation during amino acid starvation. Nat Commun 8, 338 (2017). https://doi.org/10.1038/s41467-017-00369-y

      (7) Chen, R. et al. The general amino acid control pathway regulates mTOR and autophagy during serum/glutamine starvation. J Cell Biol 206, 173-182 (2014). https://doi.org/10.1083/jcb.201403009

      (8) Herrero Martin, J. C. et al. An ETFDH-driven metabolon supports OXPHOS efficiency in skeletal muscle by regulating coenzyme Q homeostasis. Nat Metab 6, 209-225 (2024). https://doi.org/10.1038/s42255-023-00956-y

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public reviews:

      Reviewer #1 (Public review):

      Summary:

      Schafer et al. tested whether the hippocampus tracks social interactions as sequences of neural states within an abstract social space defined by dimensions of affiliation and power, using a task in which participants engaged in narrative-based social interactions. The findings of this study revealed that individual social relationships are represented by unique sequences of hippocampal activity patterns. These neural trajectories corresponded to the history of trial-to-trial affiliation and power dynamics between participants and each character, suggesting an extended role of the hippocampus in encoding sequences of events beyond spatial relationships.

      The current version has limited information on details in decoding and clustering analyses which can be improved in the future revision.

      Strengths:

      (1) Robust Analysis: The research combined representational similarity analysis with manifold analyses, enhancing the robustness of the findings and the interpretation of the hippocampus's role in social cognition.

      (2) Replicability: The study included two independent samples, which strengthens the generalizability and reliability of the results.

      Weaknesses:

      I appreciate the authors for utilizing contemporary machine-learning techniques to analyze neuroimaging data and examine the intricacies of human cognition. However, the manuscript would benefit from a more detailed explanation of the rationale behind the selection of each method and a thorough description of the validation procedures. Such clarifications are essential to understand the true impact of the research. Moreover, refining these areas will broaden the manuscript's accessibility to a diverse audience.

      We thank the reviewer for these comments and have addressed them in various ways.

      First, we removed the spline-based decoding and spectral clustering analyses. As we detail in our response to the recommendations, these approaches were complex and raised legitimate interpretational concerns, making it unclear how they supported our core claims. The revised manuscript now focuses on a set of representational similarity analyses to show representations consistent with social dimension similarity (affiliation vs. power decision trials) and social location similarity (trajectory/map-like coding based on participant choices).

      Second, we expanded the Methods and Results to more clearly explain the analyses, the questions they address, and associated controls and robustness tests. The dimension similarity analysis tests whether hippocampal patterns differentiate affiliation and power decisions in a way consistent with an abstract dimension representation. The location similarity RSAs test whether within-character neural pattern distances scale with Euclidean distance in social space (relationship-specific trajectories), and whether pattern distances across all characters scale with location distances when distances are globally standardized, consistent with a shared map-like coordinate system.

      Third, we emphasize new controls. For the dimension similarity RSA, we test for potential confounds such as word count, text sentiment, and reaction time differences between affiliation and power trials. For the location similarity RSA, we control for temporal distance between trials and show (in the Supplement) that the reported effects cannot be explained by temporal autocorrelation in the fMRI data or by the relationship between temporal distance and behavioral location distance.

      We believe that these changes address the reviewer’s request for clearer rationale and validation.

      Reviewer #2 (Public review):

      Summary:

      Using an innovative task design and analysis approach, the authors set out to show that the activity patterns in the hippocampus related to the development of social relationships with multiple partners in a virtual game. While I found the paper highly interesting (and would be thrilled if the claims made in the paper turned out to be true), I found many of the analyses presented either unconvincing or slightly unconnected to the claims that they were supposed to support. I very much hope the authors can alleviate these concerns in a revision of the paper.

      Strengths & Weaknesses:

      (1) The innovative task design and analyses, and the two independent samples of participants are clear strengths of the paper.

      We thank the reviewer for this comment.

      (2) The RSA analysis is not what I expected after I read the abstract and title of the result section "The hippocampus represents abstract dimensions of affiliation and power". To me, the title suggests that the hippocampus has voxel patterns, which could be read out by a downstream area to infer the affiliation and power value, independent of the exact identity of the character in the current trial. The presented RSA analysis however presents something entirely different - namely that the affiliation trials and power trials elicit different activity patterns in the area indicated in Figure 3. What is the meaning of this analysis? It is not clear to me what is being "decoded" here and alternative explanations have not been considered. How do affiliation and power trials differ in terms of the length of sentences, complexity of the statements, and reaction time? Can the subsequent decision be decoded from these areas? I hope in the revision the authors can test these ideas - and also explain how the current RSA analysis relates to a representation of the "dimensions of affiliation and power".

      We agree that this analysis needed to be better justified and explained. We have revised the text to clarify that by “represents the interaction decision trials along abstract social dimensions” we mean that hippocampal multivoxel patterns differentiate affiliation and power decisions in a way consistent with the conceptual framework of underlying latent dimensions. The analysis tests one simple prediction of this view – that on average these trial types are separable in the neural patterns. We have added details to the Methods, showing how the affiliation and power trials do not differ in word count or in sentiment, but do differ in their semantics, as assessed by a Large Language Model, as we expect from our task assumptions. Thanks to the reviewer’s comment, we also tested for and found a reaction time difference between affiliation and power trials, that we now control for.

      (3) Overall, I found that the paper was missing some more fundamental and simpler RSA analyses that would provide a necessary backdrop for the more complicated analyses that followed. Can you decode character identity from the regions in question? If you trained a simple decoder for power and affiliation values (using the LLE, but without consideration of the sequential position as used in the spline analysis), could you predict left-out trials? Are affiliation and power represented in a way that is consistent across participants - i.e. could you train a model that predicts affiliation and power from N-1 subjects and then predict the Nth subject? Even if the answer to these questions is "no", I believe that they are important to report for the reader to get a full understanding of the nature of the neural representations in these areas. If the claim is that the hippocampus represents an "abstract" relationship space, then I think it is important to show that these representations hold across relationships. Otherwise, the claim needs to be adjusted to say that it is a representation of a relationship-specific trajectory, but not an abstract social space.

      We appreciate this comment and agree on the value of clear, conceptually simple analyses. To address this concern, we have simplified our main analysis significantly by removing the spline-based analysis and substituting it with a multiple regression representational similarity analysis approach. We test whether within-character neural pattern distances scale with distance in social space (relationship-specific trajectories), and whether pattern distances across all characters scale with location distances when distances are globally standardized. We find evidence for both, consistent with a shared map-like coordinate system.

      We agree that decoding character identity and an across-participant decoding approach could be informative. However, our current task is not well designed for such analyses and as such would complicate the paper. Although we agree that these questions are interesting, they would test questions that are outside the scope of this paper. 

      (4) To determine that the location of a specific character can be decoded from the hippocampal activity patterns, the authors use a sequential analysis in a low-dimensional space (using local linear embedding). In essence, each trial is decoded by finding the pair of two temporally sequential trials that is closest to this pattern, and then interpolating the power/affiliation values linearly between these two points. The obvious problem with this analysis is that fMRI patterns will have temporal autocorrelation and the power and affiliation values have temporal autocorrelation. Successful decoding could just reflect this smoothness in both time series. The authors present a series of control analyses, but I found most of them to not be incisive or convincing and I believe that they (and their explanation of their rationale) need to be improved. For example, the circular shifting of the patterns preserves some of the autocorrelation of the time series - but not entirely. In the shifted patterns, the first and last items are considered to be neighboring and used in the evaluation, which alone could explain the poor performance. The simplest way that I can see is to also connect the first and last item in a circular fashion, even when evaluating the veridical ordering. The only really convincing control condition I found was the generation of new sequences for every character by shuffling the sequence of choices and re-creating new artificial trajectories with the same start and endpoint. This analysis performs much better than chance (circular shuffling), suggesting to me that a lot of the observed decoding accuracy is indeed simply caused by the temporal smoothness of both time series.

      We thank the reviewer for emphasizing this important concern; we agree that we did not sufficiently address this in the initial submission. This concern is one main reason we removed the spline-based analysis and now use regression-based representational similarity analyses in its place. In the revision, we report autocorrelation-related analyses in the supplement, and via controls and additional analysis show that temporal distance (or its square) cannot explain the location-like effects. This substantially improves our ability to interpret the findings.

      (5) Overall, I found the analysis of the brain-behavior correlation presented in Figure 5 unconvincing. First, the correlation is mostly driven by one individual with a large network size and a 6.5 cluster. I suspect that the exclusion of this individual would lead to the correlation losing significance. Secondly, the neural measure used for this analysis (determining the number of optimal clusters that maximize the overlap between neural clustering and behavioral clustering) is new, non-validated, and disconnected from all the analyses that had been reported previously. The authors need to forgive me for saying so, but at this point of the paper, would it not be much more obvious to use the decoding accuracy for power and affiliation from the main model used in the paper thus far? Does this correlate? Another obvious candidate would be the decoding accuracy for character identity or the size of the region that encodes affiliation and power. Given the plethora of candidate neural measures, I would appreciate if the authors reported the other neural measures that were tried (and that did not correlate). One way to address this would have been to select the method on the initial sample and then test it on the validation sample - unfortunately, the measure was not pre-registered before the validation sample was collected. It seems that the correlation was only found and reported on the validation sample?

      We agree that this analysis was too complicated and underconstrained, and thus not convincing. We think that removing this cluster-based analysis is the most conservative response to the reviewer’s concerns and have removed it from the revised paper.

      Recommendations to the authors:

      Reviewer #1 (Recommendations for the authors):

      The manuscript's description of the shuffling analysis performed during decoding is currently ambiguous, particularly concerning the control variables. This ambiguity is present only in the Figure 4 legends and requires a more detailed explanation within the methods section. It is essential to clarify whether the permutation process was conducted within each character's data set or across multiple characters' data sets. If permutations were confined to within-character data, the conclusion would be that the hippocampus encodes context-specific information rather than providing a two-dimensional common space.

      We thank the reviewer for this comment. We have now removed the spline analysis due to these and other problems and have replaced it with representational similarity analyses that are both more rigorous and easier to interpret. We think these analyses allow us to make the claim that the characters are represented in a common space. 

      In the methods, we explain the analyses (page 23-24, lines 475-500):

      “We also expected the hippocampus to represent the different characters’ changing social locations, which are implicit in the participant’s choices. We used multiple regression searchlight RSA to test whether hippocampal pattern dissimilarity increases with social location distance, based on participant-specific trial-wise beta images where boxcar regressors spanned each trial’s reaction time.”

      “We ran two complementary regression analyses to address two related questions. First, we asked whether the hippocampus represents how a specific relationship changes over time. For this analysis, for each participant and each searchlight, we computed character-specific (i.e., only for same character trial pairs) correlation distances between trial-wise beta patterns and Euclidean distances between the social location behavioral coordinates. Distances were z-scored within character trial pairs to isolate character-specific changes. The second analysis asked whether there is a common map-like representation, where all trials, regardless of relationship, are represented in a shared coordinate system. Here, we included all trial pairs and z-scored the distances globally. For both regression analyses, we included control distances to control for possible confounds. To account for generic time-related changes, we controlled for absolute scan-time difference, as this correlated with location distance across participants (see Temporal autocorrelation of hippocampal beta patterns in the supplement). Although the square of this temporal distance did not explain any additional variance in behavioral distances, we ran a robustness analysis including both temporal distance and its square and saw qualitatively the same clusters with similar effect sizes. As such, we report the main analysis only. We included binary dimension difference (0 = trial pairs of different dimension, 1 = trial pairs of the same dimension), to ensure effects could not be explained by dimension-related effects. In the group-level model, we controlled for sample and the average reaction time between affiliation and power decisions.”
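The regression logic quoted above can be illustrated with a minimal sketch for a single participant and a single searchlight. This is not the authors' code: the function names (`location_rsa`, `zscore_by_group`, `ols`), the input layout (one voxel pattern, one 2D social location, one onset, one character label, and one dimension label per trial), and the choice to z-score the neural, location, and temporal distances alike are all assumptions made for illustration. It uses only the Python standard library.

```python
import math
from itertools import combinations


def correlation_distance(a, b):
    """Return 1 - Pearson r between two equal-length voxel patterns."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(va * vb)


def zscore_by_group(values, groups):
    """Z-score values separately within each group label."""
    out = [0.0] * len(values)
    for g in set(groups):
        idx = [k for k, label in enumerate(groups) if label == g]
        mean = sum(values[k] for k in idx) / len(idx)
        sd = math.sqrt(sum((values[k] - mean) ** 2 for k in idx) / len(idx))
        for k in idx:
            out[k] = (values[k] - mean) / sd
    return out


def ols(y, predictors):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, b1, b2, ...] in the order of `predictors`.
    """
    n = len(y)
    X = [[1.0] + [p[r] for p in predictors] for r in range(n)]
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def location_rsa(patterns, locations, onsets, characters, dimensions,
                 within_character=True):
    """Regress neural pattern distance on social location distance.

    within_character=True keeps only same-character trial pairs and
    z-scores within character; False uses all pairs and z-scores globally.
    """
    n = len(patterns)
    pairs = [(i, j) for i, j in combinations(range(n), 2)
             if not within_character or characters[i] == characters[j]]
    neural = [correlation_distance(patterns[i], patterns[j]) for i, j in pairs]
    social = [math.dist(locations[i], locations[j]) for i, j in pairs]
    temporal = [abs(onsets[i] - onsets[j]) for i, j in pairs]
    # 1 = same dimension (affiliation or power), 0 = different dimension.
    same_dim = [1.0 if dimensions[i] == dimensions[j] else 0.0
                for i, j in pairs]
    groups = [characters[i] if within_character else "all" for i, _ in pairs]
    # Assumption: all continuous distances are standardized the same way.
    neural = zscore_by_group(neural, groups)
    social = zscore_by_group(social, groups)
    temporal = zscore_by_group(temporal, groups)
    b = ols(neural, [social, temporal, same_dim])
    return {"intercept": b[0], "location": b[1],
            "time": b[2], "same_dimension": b[3]}
```

In this sketch the `location` coefficient plays the role of the per-searchlight effect of interest; calling the function with `within_character=False` corresponds to the globally standardized, map-like variant. A real pipeline would typically use an array library and a searchlight loop rather than plain lists.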

      In the results, we describe the results and our interpretation (pages 11-12, lines 185-208):

      “We have shown that the left hippocampus represents the affiliation and power trials differently, consistent with an abstract dimensional representation. Does it also represent the changing social coordinates of each character? To test this, we used a multiple-regression RSA searchlight to test whether left hippocampus patterns represent the characters’ changing social locations across interactions (see Figure 3). We restricted the distances to those from trial pairs from the same character and standardized the distances within character (see Figure 3BD). We controlled for temporal distance to ensure the effect was not explainable by the time between trials, and for whether the trials shared the same underlying dimension (affiliation or power; see Location similarity searchlight analyses for more details). At the group level, we controlled for sample and the average reaction time difference between affiliation and power trials. Using the same testing logic as the dimensionality similarity analysis, we first tested our hypothesis in the bilateral hippocampus and found widespread effects in both the left (peak voxel MNI x/y/z = -35/-22/-15, cluster extent = 1470 voxels) and right (peak voxel MNI x/y/z = 37/-19/-14, cluster extent = 1953 voxels) hemispheres. The whole-brain searchlight analysis revealed additional clusters in the left putamen (-27/-3/14, cluster extent = 131 voxels) and left posterior cingulate cortex (-10/-28/41, cluster extent = 304 voxels).”

      “We then asked a second, complementary question: does the hippocampus represent all interactions, across characters, within a shared map? To test for this map-like structure, we repeated the analysis but now included all trial pairs, z-scoring distances globally rather than within character (Figure 3E-F). The remainder of the procedure followed the same logic as the preceding analysis. The hippocampus analysis revealed an extensive right hippocampal cluster (27/27/-14, cluster extent = 1667 voxels). The whole-brain analysis did not show any significant clusters.”

      We also describe the results in the discussion (page 12, lines 220-226): 

      “Then, we show that the hippocampus tracks the changing social locations (affiliation and power coordinates), above and beyond the effects of dimension or time; the hippocampus seemed to reflect both the changing within-character locations, tracking their locations over time, and locations across characters, as if in a shared map. Thus, these results suggest that the hippocampus does not just encode static character-related representations but rather tracks relationship changes in terms of underlying affiliation and power.”

      The manuscript's description of the decoding analysis is unclear regarding the variability of the decoded positions. The authors appear to decode the position of a character along a spline, which raises the question of whether this position correlates with time, since characters are more likely to be located further from the center in later trials. There is a concern that the decoded position may not solely reflect the hippocampal encoding of spatial location, but could also be influenced by an inherent temporal association. Given that a character's position at time t is likely to be similar to its positions at t−1 and t+1, it is crucial that the authors clearly articulate their approach to separating spatial representation from temporal autocorrelation. While this issue may have been addressed in the construction of the test set, the manuscript does not seem to adequately explain how such biases were mitigated in the training set.

      We agree that temporal confounding needs to be better accounted for, as our claims depend on space-like signals being separable from time-like ones. We address this in several ways in the revised manuscript.

      First, we emphasize that this is a narrative-based task, where temporal structure is relevant. As such, our analyses aim to demonstrate that effects go beyond simple temporal confounds, like trial order or time elapsed.

      Despite the temporal structure to the task, the decisions for the same character are spaced in time, and interleaved with other characters’ decisions, reducing the chance that a simple temporal confound could explain trajectory-related effects. We now describe the task better in the revised methods (page 16, lines 314-318):

      “All six characters’ decision trials are interleaved with one another and with narrative slides. On average, after a decision trial for a given character, participants view ~11 narrative slides and complete ~3 decisions for other characters before returning to that same character, such that each character’s choices are separated by an average of ~20 seconds (range 12 seconds to 10 min).”

      To address temporal autocorrelation in the fMRI time series, we used SPM’s FAST algorithm. Briefly, FAST models temporal autocorrelation as a weighted combination of candidate correlation functions, using the best estimate to remove autocorrelated signal.

      We also now report the temporal autocorrelation profile of the hippocampal beta series in the supplement, including (pages 29-31, lines 593-656):

      “The Social Navigation Task is a narrative-based task, where the relationships with characters evolve over time; trial pairs that are close in time may have more similar fMRI patterns for reasons unrelated to social mapping (e.g., slow drift). It is important to account for the role of time in our analyses, to ensure effects go beyond simple temporal confounds, like the time between decision trials. To aid in this, we quantified how fMRI signals change over time using a pattern autocorrelation function across decision trial lags. We defined the left and right hippocampus and the left and right intracalcarine cortex using the Harvard-Oxford atlas and thresholded them at 50% probability. We chose intracalcarine cortex as an early visual control region that largely corresponds to primary visual cortex (V1), as it is likely to be driven by the visually presented narrative. We used the same trial-wise beta images as in the location similarity RSA (boxcar regressors spanning each decision trial’s reaction time). For each participant and region-of-interest (ROI), we extracted the decision trial-by-voxel beta matrix and quantified three kinds of temporal dependence: beta autocorrelation, multivoxel pattern correlation and multivoxel pattern correlation after regressing out temporal distance.”

      “To estimate the temporal autocorrelation of the trial-wise beta values, we treated each voxel’s beta values as a time series across trials and measured how much a voxel’s response on one trial correlated (Pearson) with its response on previous trials. We averaged these voxelwise autocorrelations within each ROI. At one trial apart (lag 1), both the hippocampus and V1 showed small positive autocorrelations, indicating modest trial-to-trial carryover in response amplitude (see Supplemental figure 1); by three trials apart, the autocorrelation was approximately 0.”
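The voxelwise lag autocorrelation described in this passage can be sketched as follows. This is an illustrative reading, not the authors' implementation: the function names and the trials-by-voxels list layout are assumptions, and only the Python standard library is used.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def voxelwise_autocorrelation(betas, lag):
    """Mean over voxels of the lagged autocorrelation of trial-wise betas.

    betas: list of trials, each a list of voxel beta values (trials x voxels).
    lag:   positive number of trials separating the two copies of the series.
    """
    n_voxels = len(betas[0])
    rs = []
    for v in range(n_voxels):
        series = [trial[v] for trial in betas]
        # Correlate each trial's beta with the beta `lag` trials earlier.
        rs.append(pearson(series[lag:], series[:-lag]))
    return sum(rs) / len(rs)
```

Evaluating `voxelwise_autocorrelation` for lags 1, 2, 3, and so on within an ROI gives an autocorrelation profile of the kind the passage describes.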

      “Because our representational similarity analyses depend on trial-by-trial pattern similarity, we also estimated how multivoxel patterns were autocorrelated over time. For each lag, we computed the Pearson correlation between each trial’s voxelwise pattern and the pattern from the trial that many trials earlier, then averaged those correlations to obtain a single autocorrelation value for that lag. At one trial apart, both regions showed positive autocorrelation, with V1 having greater autocorrelation than the hippocampus; pattern correlations for trials 3 or 4 trials apart were reduced across participants, settling into low but positive values. Then, for each participant and ROI, we regressed out the effect of absolute trial onset differences from all pairwise pattern correlations, to mirror the effects of controlling for these temporal distances in regressions. After removing this temporal distance component, the short lag pattern autocorrelation dropped substantially in both regions. The similarity in autocorrelation profiles between the two regions suggests that significant similarity effects in the hippocampus are unlikely to be driven by generic temporal autocorrelation.”
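One way to read the pattern-level procedure in this passage is sketched below: compute all pairwise pattern correlations, remove the part that is linearly explained by absolute onset difference, and then average raw and residual correlations by trial lag. The function name, the input layout, and the use of a simple one-predictor regression with an intercept are assumptions for illustration; the source does not specify these details.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def pattern_autocorrelation(patterns, onsets, max_lag):
    """Return (raw, adjusted) dicts mapping lag -> mean pattern correlation.

    patterns: one voxel pattern per decision trial, in presentation order.
    onsets:   trial onset times, same order.
    adjusted: same averages after regressing pairwise correlations on
              absolute onset difference and keeping the residuals.
    """
    n = len(patterns)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    r = [pearson(patterns[i], patterns[j]) for i, j in pairs]
    dt = [abs(onsets[j] - onsets[i]) for i, j in pairs]

    # Simple linear regression r ~ intercept + slope * dt.
    m_r, m_dt = sum(r) / len(r), sum(dt) / len(dt)
    sxy = sum((d - m_dt) * (c - m_r) for d, c in zip(dt, r))
    sxx = sum((d - m_dt) ** 2 for d in dt)
    slope = sxy / sxx
    intercept = m_r - slope * m_dt
    resid = [c - (intercept + slope * d) for c, d in zip(r, dt)]

    raw, adjusted = {}, {}
    for lag in range(1, max_lag + 1):
        idx = [k for k, (i, j) in enumerate(pairs) if j - i == lag]
        raw[lag] = sum(r[k] for k in idx) / len(idx)
        adjusted[lag] = sum(resid[k] for k in idx) / len(idx)
    return raw, adjusted
```

Comparing `raw` and `adjusted` across lags, and across ROIs such as hippocampus and a visual control region, mirrors the comparison the passage reports.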

      “Relationship between behavioral location distance and temporal distance”

      “We also quantified how temporal distances between trials relate to their behavioral location distances, participant by participant. Our dimension similarity analysis controls for temporal distance between trials by design (see Social dimension similarity searchlight analysis), but our location similarity analysis does not. To decide on covariates to include in the analysis, we tested whether temporal distances can explain behavioral location distances. For each participant, we computed the correlations between trial pairs’ Euclidean distances in social locations and their linear temporal distances (“linear”) and the temporal distances squared (“quadratic”), to test for nonlinear effects. We then summarized the correlations using one-sample t-tests. The linear relationship was statistically significant (t<sub>49</sub> = 12.24, p < 0.001), whereas the quadratic relationship was not (t<sub>49</sub> = -0.55, p = 0.586). Similarly, in participant-specific regressions with both linear and quadratic temporal distances, the linear effect was significant (t<sub>49</sub> = 5.69, p < 0.001) whereas the quadratic effect was not (t<sub>49</sub> = 0.20, p = 0.84). Based on this, we included linear temporal distances as a covariate in our location similarity analyses (see Location similarity searchlight analyses), and verified that adding a quadratic temporal distance covariate does not alter the results. Thus, the reported location-related pattern similarity effects go beyond what can be explained by temporal distance alone.”
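The two-stage summary in this passage (a per-participant correlation, then a group-level one-sample t-test) can be sketched as below. The function names and input layout are hypothetical, and the sketch correlates raw distances without any transformation, which is an assumption; the source does not say whether correlations were transformed before testing. Only the t statistic and degrees of freedom are computed here; a p-value would additionally require the t distribution from a statistics library.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def distance_correlations(locations, onsets):
    """Per-participant correlations of location distance with time.

    Returns (r_linear, r_quadratic): correlations of pairwise Euclidean
    location distance with absolute onset difference and with its square.
    """
    pairs = list(combinations(range(len(locations)), 2))
    loc_dist = [math.dist(locations[i], locations[j]) for i, j in pairs]
    linear = [abs(onsets[i] - onsets[j]) for i, j in pairs]
    quadratic = [d ** 2 for d in linear]
    return pearson(loc_dist, linear), pearson(loc_dist, quadratic)


def one_sample_t(values, popmean=0.0):
    """One-sample t statistic and degrees of freedom against popmean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return (mean - popmean) / (sd / math.sqrt(n)), n - 1
```

At the group level, one would collect `r_linear` (or `r_quadratic`) from every participant into a list and pass it to `one_sample_t`; with 50 participants this gives 49 degrees of freedom, matching the subscripts reported in the quoted text.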

      How was the free parameter of spectral clustering determined, if there is one?

      The interpretation of the number of hippocampal activity clusters is ambiguous. It is suggested that this number could fluctuate due to unique activity patterns or the fit to behaviorally defined trajectories. A lower number of clusters might indicate either a noisier or less distinct representation, raising the question of the necessity and interpretability of such a complex analysis. This concern is compounded by the potential sensitivity of the clustering to the variance in Euclidean distances of each trial's position relative to the center. If a character's position is consistently near the center, this could artificially reduce the perceived number of clusters. Furthermore, the manuscript should address whether there is any correlation between the number of clusters and behavioral performance. Specifically, what are the implications if participants are able to perform the task adequately with a smaller number of distinct hippocampal representation states?

      The rationale for conducting both cluster analysis and position decoding as separate analyses remains unclear. While cluster analysis can corroborate the findings of position decoding, it is not apparent why the authors chose to include trials across characters for cluster analysis but not for decoding analysis. An explanation of the reasoning behind this methodological divergence would help in understanding the distinct contributions of each analysis to the study's findings.

      The paper by Cohen et al. (1997), which provides the questionnaire for measuring the social network index, is not cited in the references. Upon reviewing the questionnaire that the author may have used, it appears that the term "social network size" does not refer to the actual size but to a score or index derived from the questionnaire responses. It may be more appropriate to replace the term "size" with a different term to more accurately reflect this distinction.

      Thank you for seeking these clarifications. Given the complexity of this analysis, we have decided to drop it to focus instead on our dimension and location representational similarity analysis results.

      Reviewer #2 (Recommendations for the authors):

      How did the participants' decisions on previous trials influence the future trials that the subjects saw? If the different participants were faced with different decision trials, then how did you compare their decisions? If two participants made the same decisions, would they have seen exactly the same sequence of trials (see point X on how the trial sequence was randomized)?

      All participants experience the same narrative, with the same decisions (i.e., the same available options); their choices (i.e., the options they select) are what implicitly shape each character’s affiliation and power locations, and thus each character’s trajectory. In other words, the narrative is fixed; what changes is the social coordinates assigned to each trial’s outcome depending on the participant’s choice of how to interact from the two narrative options. This means that we can meaningfully compare participants' neural patterns, given that every participant received the same text and images throughout.

      We have now added details on the narrative structure, replacing more ambiguous statements with a clearer description (page 16, lines 309-318):

      “The sequence of trials, including both narrative and decision trials, was fixed across participants; all that differs are the choices that the participants make. Narrative trials varied in duration, depending on the content (range 2-10 seconds), but were identical across participants. Decision trials always lasted 12 seconds, with two options presented until the participant made a choice, after which a blank screen was presented for the remainder of the duration. All six characters’ decision trials are interleaved with one another, and with the narrative slides. On average, after a decision trial for a given character, participants view ~11 narrative slides and complete ~3 decisions for other characters before returning to another decision with the same character, such that each character’s choices are separated by an average of ~20 seconds (ranging from 12 seconds to 10 min).”

      Figure 2B: I assume that "count" is "count of participants"? It would be good to indicate this on the axis/caption.

      Thank you for noting this. We have now removed this figure to improve the clarity of our figures. 

      "We have shown that the hippocampus represents the interaction decision trials along abstract social dimensions, but does it track each relationship's unique sequence of abstract social coordinates?" Please clarify what you mean by "represents the interaction decision trials".

      By “represents the interaction decision trials along abstract social dimensions”, we mean that when the participant makes a choice during the social interactions the hippocampal patterns represent the current social dimension of the choice (affiliation vs power). In other words, the hippocampal BOLD patterns differentiate affiliation and power decisions, consistent with our hypothesis of abstract social dimension representation in the hippocampus. We have clarified this (page 11, lines 185-187):

      “We have shown that the left hippocampus represents the affiliation and power trials differently, consistent with an abstract dimensional representation.”

      Page 8: "Hippocampal sequences are ordered like trajectories": It is not entirely clear to me what is meant by the split midpoint. Is this the midpoint of the piece-wise linear interpolation between two points, or simply the mean of all piecewise splines from one character? If the latter, is the null model the same as simply predicting the mean affiliation and power value for this character? If yes, please clarify and simplify this for the reader.

      Page 8: "Hippocampal sequences track relationship-specific paths". First, I was misled by the "relationship-specific". I first understood this to mean that you wanted to test whether two relationships (i.e. the identity of the partner) had different representations in Hippocampus, even if the power/affiliation trajectories are the same. I suggest changing the title of this section.

      The analysis in this section also breaks any temporal autocorrelation of measured patterns - so I am not sure if this is a strong analysis that should be interpreted at all. This analysis seems to not address the claim and conclusion that is drawn from it. I assume that the random trajectories have different choices and different affiliation/power values than the true trajectories. So the fact that the true trajectories can be better decoded simply shows that either choices or affiliation and power (or both) are represented in the neural code - but not necessarily anything beyond this.

      Page 9: "Neural trajectories reflect social locations, not just choices". The motivation of this analysis is not clear to me. As I understand this analysis, both social location and choices are changed from the real trajectories. How can it then show that it reflects social locations, not just the choices?

      Figure 4 caption: "on the -based approximation" Is there a missing "point"-[based] here?

      We agree with the reviewer that this analysis is hard to interpret and does not adequately address concerns regarding temporal autocorrelation, and as such we have removed it from the manuscript. We describe the new results that include controlling for temporal distance between trials (pages 11-12, lines 185-208):

      “We have shown that the left hippocampus represents the affiliation and power trials differently, consistent with an abstract dimensional representation. Does it also represent the changing social coordinates of each character? To test this, we used a multiple-regression RSA searchlight to test whether left hippocampus patterns represent the characters’ changing social locations across interactions (see Figure 3). We restricted the distances to those from trial pairs from the same character and standardized the distances within character (see Figure 3BD). We controlled for temporal distance to ensure the effect was not explainable by the time between trials, and for whether the trials shared the same underlying dimension (affiliation or power; see Location similarity searchlight analyses for more details). At the group level, we controlled for sample and the average reaction time difference between affiliation and power trials. Using the same testing logic as the dimensionality similarity analysis, we first tested our hypothesis in the bilateral hippocampus and found widespread effects in both the left (peak voxel MNI x/y/z = -35/-22/-15, cluster extent = 1470 voxels) and right (peak voxel MNI x/y/z = 37/-19/-14, cluster extent = 1953 voxels) hemispheres. The whole-brain searchlight analysis revealed additional clusters in the left putamen (-27/-3/14, cluster extent = 131 voxels) and left posterior cingulate cortex (-10/-28/41, cluster extent = 304 voxels).”

      “We then asked a second, complementary question: does the hippocampus represent all interactions, across characters, within a shared map? To test for this map-like structure, we repeated the analysis but now included all trial pairs, z-scoring distances globally rather than within character (Figure 3E-F). The remainder of the procedure followed the same logic as the preceding analysis. The hippocampus analysis revealed an extensive right hippocampal cluster (27/27/-14, cluster extent = 1667 voxels). The whole-brain analysis did not show any significant clusters.”

      We emphasize that the results are robust to the inclusion of temporal distance squared, in the methods (pages 23-24, lines 493-496):

      “Although the square of this temporal distance did not explain any additional variance in behavioral distances, we ran a robustness analysis including both temporal distance and its square and saw qualitatively the same clusters with similar effect sizes.”

      Page 8, last paragraph: The text sounds like you have already shown that you can decode character identity from the patterns, but I do not believe you have at this point. I think this would be an interesting addition to the paper, though.

      This section has been removed, and we have been careful to not imply this in the current version of the manuscript. While we agree a character identity decoding would enrich our argument, we do not believe our task is well-suited to capture a character identity effect. Each character only has 12 decision trials, and these trials are partially clustered in time - this is one problem of temporal autocorrelation that we thank the reviewers for pushing us to consider in more detail. Dimension and location patterns, on the other hand, are more natural to analyze in our task, especially in representational similarity analyses that test whether the relevant differences scale with neural distances.

      Page 14ff: Why is "Analysis section" not part of "Materials and Methods"? I believe adding the analysis after a careful description of the methods would improve the clarity of this section.

      We agree with the reviewer and have now consolidated these two sections.

      Two or three examples of Affiliation and Power decision trials should be provided, so the reader can form a more thorough understanding of how these dimensions were operationalized. For the RSA analysis, it is important to consider other differences between these two types of trials.

      We agree that adding examples will clarify the operationalization of these dimensions. We now include example affiliation and power trials in a table (page 17-18).

      We thank the reviewer for noting the need to rule out alternative hypotheses; we have added several such tests. Affiliation and power trials were not different in word count (page 17, lines 329-332):

      “To ensure that any observed neural or behavioral differences were not confounded by trivial features of the text, we tested for differences between the affiliation and power trials (where the two options are concatenated). There were no differences in word count (affiliation average = 26.6, power average = 25.6; t-test p = 0.56).”

      They were also not different in their sentiment, as assessed by a Large Language Model (LLM) analysis (page 17, lines 332-335): 

      “The text’s sentiment also did not differ between these trial types (t-test p = 0.72), as quantified by comparing sentiment compound scores (from most negative, −1, to most positive, +1), using a Large Language Model (LLM) specialized for sentiment analysis [26].”

      The affiliation and power trials were different in terms of semantic content, consistent with our assumptions (page 17, lines 337-347):

      “Our framework assumes that affiliation and power trials differ in their semantic content–that is, in the conceptual meaning of the text, beyond word count or sentiment. To test this assumption, we used an LLM-based semantic embedding analysis. Each decision trial was embedded into a semantic vector. We then measured the cosine similarity between pairs of trials and calculated the difference between average within-dimension similarity (affiliation-affiliation and power-power comparisons) and average between-dimension similarity (affiliation-power comparisons) and assessed its statistical significance with permutation testing (1,000 shuffles of trial labels). As expected, decision trials of the same dimension were more similar to each other than trials of different dimensions, across multiple LLMs (OpenAI’s text-embedding-3-small [27]: similarity difference = 0.041, p < 0.001; all-MiniLM-L12-v2 [28]: similarity difference = 0.032, p < 0.001).”
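      The within-versus-between comparison quoted above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the one-sided test, and the +1 correction in the p-value are assumptions made for illustration, and the embeddings are taken as plain lists of numbers from any embedding model.

```python
import math
import random
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def within_minus_between(sims, labels):
    """Mean similarity of same-label pairs minus mean similarity of different-label pairs."""
    within, between = [], []
    for (i, j), s in sims.items():
        (within if labels[i] == labels[j] else between).append(s)
    return sum(within) / len(within) - sum(between) / len(between)


def permutation_test(embeddings, labels, n_shuffles=1000, seed=0):
    """Return (observed difference, one-sided permutation p-value)."""
    # Pairwise similarities do not depend on the labels, so compute them once.
    sims = {(i, j): cosine(embeddings[i], embeddings[j])
            for i, j in combinations(range(len(embeddings)), 2)}
    observed = within_minus_between(sims, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break the link between trials and dimension labels
        if within_minus_between(sims, shuffled) >= observed:
            exceed += 1
    # Assumption: the +1 correction keeps the estimated p-value above zero.
    return observed, (exceed + 1) / (n_shuffles + 1)
```

      Shuffling only the labels leaves the similarity structure of the text intact, so the null distribution reflects what the difference would look like if dimension membership carried no semantic information.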

      The affiliation and power trials were different in average reaction time. To control for this difference in the dimension RSA analysis, we added each participant’s absolute value reaction time difference between the trial types as a covariate. The results were nearly identical to what they were before. We updated the text to reflect this new control (page 23, lines 471-474):

      “However, there was a significant difference in the average reaction time between affiliation and power decisions across participants (t<sub>49</sub> = 6.92, p < 0.001; affiliation mean = 4.92 seconds (s), power mean = 4.51 s), so we controlled for this in the group-level analysis.”

      The exact implementation and timing of the behavioral tasks should be described better. How many narrative trials were intermixed with the decision trials? Which characters were they assigned to? How was the sequence of trials determined? Was it fixed across participants, or randomized?

      We agree that additional details are helpful. In the Methods, we now describe this with more detail (page 16, lines 301-318):

      “There are two types of trials: “narrative” trials where background information is provided or characters talk or take actions (a total of 154 trials), and “decision” trials where the participant makes decisions in one-on-one interactions with a character that can change the relationship with that character (a total of 63 trials). On each decision, participants used a button response box to select between the two options. The assignment of choice directions (+/-1 arbitrary unit on the current dimension) to the options (1 or 2, assigned to the index and middle fingers) was counterbalanced.”

      “The sequence of trials, including both narrative and decision trials, was fixed across participants; all that differs are the choices that the participants make. Narrative trials varied in duration, depending on the content (range 2-10 seconds), but were identical across participants. Decision trials always lasted 12 seconds, with two options presented until the participant made a choice, after which a blank screen was presented for the remainder of the duration. All six characters’ decision trials are interleaved with one another, and with the narrative slides. On average, after a decision trial for a given character, participants view ~11 narrative slides and complete ~3 decisions for other characters before returning to another decision with the same character, such that each character’s choices are separated by an average of ~20 seconds (ranging from 12 seconds to 10 min).”

      What is the exact timing of trials during fMRI acquisition - i.e. how long were the trials, what was the ITI, were there long phases of rest to determine the resting baseline? These are all important factors that will determine the covariance between regressors and should be reported carefully. Ideally, I would like to see the trial-by-trial temporal auto-correlation structure across beta-weights to be reported.

      We thank the reviewer for asking for this clarification. We have added the following text to clarify the trial timing (page 16, lines 314-318):

      “All six characters’ decision trials are interleaved with one another and with narrative slides. On average, after a decision trial for a given character, participants view ~11 narrative slides and complete ~3 decisions for other characters before returning to that same character, such that each character’s choices are separated by an average of ~20 seconds (range 12 seconds to 10 min).”

      We now describe the temporal autocorrelation patterns in the supplement, including how we decided on how to control for temporal distance in representational similarity analyses (pages 29-31, lines 593-656):

      “The Social Navigation Task is a narrative-based task, where the relationships with characters evolve over time; trial pairs that are close in time may have more similar fMRI patterns for reasons unrelated to social mapping (e.g., slow drift). It is important to account for the role of time in our analyses, to ensure effects go beyond simple temporal confounds, like the time between decision trials. To aid in this, we quantified how fMRI signals change over time using a pattern autocorrelation function across decision trial lags. We defined the left and right hippocampus and the left and right intracalcarine cortex using the Harvard-Oxford atlas and thresholded them at 50% probability. We chose intracalcarine cortex as an early visual control region that largely corresponds to primary visual cortex (V1), as it is likely to be driven by the visually presented narrative. We used the same trial-wise beta images as in the location similarity RSA (boxcar regressors spanning each decision trial’s reaction time). For each participant and region-of-interest (ROI), we extracted the decision trial-by-voxel beta matrix and quantified three kinds of temporal dependence: beta autocorrelation, multivoxel pattern correlation and multivoxel pattern correlation after regressing out temporal distance.”

      “To estimate the temporal autocorrelation of the trial-wise beta values, we treated each voxel’s beta values as a time series across trials and measured how much a voxel’s response on one trial correlated (Pearson) with its response on previous trials. We averaged these voxelwise autocorrelations within each ROI. At one trial apart (lag 1), both the hippocampus and V1 showed small positive autocorrelations, indicating modest trial-to-trial carryover in response amplitude (see Supplemental figure 1); by three trials apart, the autocorrelation was approximately 0.”

      “Because our representational similarity analyses depend on trial-by-trial pattern similarity, we also estimated how multivoxel patterns were autocorrelated over time. For each lag, we computed the Pearson correlation between each trial’s voxelwise pattern and the pattern from the trial that many trials earlier, then averaged those correlations to obtain a single autocorrelation value for that lag. At one trial apart, both regions showed positive autocorrelation, with V1 having greater autocorrelation than the hippocampus; pattern correlations between trials 3 or 4 trials apart reduced across participants, settling into low but positive values. Then, for each participant and ROI, we regressed out the effect of absolute trial onset differences from all pairwise pattern correlations, to mirror the effects of controlling for these temporal distances in regressions. After removing this temporal distance component, the short lag pattern autocorrelation dropped substantially in both regions. The similarity in autocorrelation profiles between the two regions suggests that significant similarity effects in the hippocampus are unlikely to be driven by generic temporal autocorrelation.”
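      As an illustration of the two steps described in the paragraph above (lagged pattern autocorrelation, then removal of the temporal-distance component), here is a minimal sketch. It is not the authors' pipeline: the function names are hypothetical, correlations are averaged without a Fisher transform, and temporal distance is removed with a single simple linear regression; all of these are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pattern_autocorrelation(patterns, lag):
    """Mean correlation between each trial's voxel pattern and the pattern `lag` trials earlier.

    `patterns` is a list of trials, each a list of voxel beta values.
    """
    rs = [pearson(patterns[t], patterns[t - lag]) for t in range(lag, len(patterns))]
    return sum(rs) / len(rs)


def residualize_on_time(patterns, onsets):
    """Regress |onset difference| out of all pairwise pattern correlations.

    Returns a dict {(i, j): residual correlation} for every trial pair i < j.
    """
    n = len(patterns)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    r = [pearson(patterns[i], patterns[j]) for i, j in pairs]
    dt = [abs(onsets[i] - onsets[j]) for i, j in pairs]
    mean_r, mean_dt = sum(r) / len(r), sum(dt) / len(dt)
    # Ordinary least-squares slope of correlation on temporal distance.
    slope = (sum((d - mean_dt) * (v - mean_r) for d, v in zip(dt, r))
             / sum((d - mean_dt) ** 2 for d in dt))
    return {pair: v - (mean_r + slope * (d - mean_dt))
            for pair, v, d in zip(pairs, r, dt)}


def lagged_mean(residuals, n_trials, lag):
    """Average the residual correlations of all trial pairs exactly `lag` trials apart."""
    vals = [residuals[(t - lag, t)] for t in range(lag, n_trials)]
    return sum(vals) / len(vals)
```

      Comparing `pattern_autocorrelation(patterns, 1)` with `lagged_mean(residualize_on_time(patterns, onsets), n_trials, 1)` gives the before-and-after contrast the paragraph describes for short lags.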

      “Relationship between behavioral location distance and temporal distance”

      “We also quantified how temporal distances between trials relate to their behavioral location distances, participant by participant. Our dimension similarity analysis controls for temporal distance between trials by design (see Social dimension similarity searchlight analysis), but our location similarity analysis does not. To decide on covariates to include in the analysis, we tested whether temporal distances can explain behavioral location distances. For each participant, we computed the correlations between trial pairs’ Euclidean distances in social locations and their linear temporal distances (“linear”) and the temporal distances squared (“quadratic”), to test for nonlinear effects. We then summarized the correlations using one-sample t-tests. The linear relationship was statistically significant (t<sub>49</sub> = 12.24, p < 0.001), whereas the quadratic relationship was not (t<sub>49</sub> = -0.55, p = 0.586). Similarly, in participant-specific regressions with both linear and quadratic temporal distances, the linear effect was significant (t<sub>49</sub> = 5.69, p < 0.001) whereas the quadratic effect was not (t<sub>49</sub> = 0.20, p = 0.84). Based on this, we included linear temporal distances as a covariate in our location similarity analyses (see Location similarity searchlight analyses), and verified that adding a quadratic temporal distance covariate does not alter the results. Thus, the reported location-related pattern similarity effects go beyond what can be explained by temporal distance alone.”
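      The correlation-then-t-test procedure in the paragraph above can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: the function names are hypothetical, each participant's locations are taken as (affiliation, power) coordinate pairs, and raw (untransformed) correlations are passed to the one-sample t-test.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def distance_correlations(locations, onsets):
    """Correlate location distances with linear and squared temporal distances.

    `locations` holds one (affiliation, power) pair per decision trial;
    `onsets` holds the matching trial onset times.
    Returns (r_linear, r_quadratic) for one participant.
    """
    loc_dist, lin, quad = [], [], []
    n = len(onsets)
    for i in range(n):
        for j in range(i + 1, n):
            loc_dist.append(math.dist(locations[i], locations[j]))  # Euclidean distance
            dt = abs(onsets[i] - onsets[j])
            lin.append(dt)
            quad.append(dt ** 2)
    return pearson(loc_dist, lin), pearson(loc_dist, quad)


def one_sample_t(values):
    """One-sample t statistic against zero, with its degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # unbiased sample variance
    return mean / math.sqrt(var / n), n - 1
```

      Running `distance_correlations` once per participant and passing the resulting list of linear (or quadratic) correlations to `one_sample_t` yields a group-level statistic of the same form as the quoted t<sub>49</sub> values.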

    1. Briefing: The National Education Ministry's Roadmap for Children's Rights and Well-being

      Summary

      This document summarizes the strategic priorities and key figures presented by Édouard Geffray, Minister of National Education, during his hearing before the delegation for children's rights.

      School is defined here by two cardinal functions: to instruct and to protect. The ministry's priorities rest on three main pillars: students' mental health, the fight against school bullying, and securing the educational paths of the most vulnerable children (those with disabilities or in protective care).

      The minister describes an alarming state of mental health among young people, exacerbated by digital use, and proposes systemic measures: rollout of the "Phare" program, a ban on mobile phones in lycées (upper secondary schools), and the creation of a "protected schooling" framework.

      Despite a drastic demographic decline (one million fewer students by 2029), the ministry says it intends to maintain its recruitment trajectory for medical and social-care staff in order to respond to the surge in detection and referral needs.

      --------------------------------------------------------------------------------

      I. Mental Health and the Fight against School Bullying: A Matter of Absolute Safety

      The minister ranks mental health among his three top priorities, citing sharply rising indicators of psychological distress.

      Current situation and key figures

      Risk of depression: 14% of collège (lower secondary) students and 15% of lycée students are at significant risk.

      Suicidal thoughts: 24% of lycée students report having had suicidal thoughts in the past 12 months.

      Bullying: About 5% of students (one student per class on average) are victims of bullying each year.

      Emergency care: An 80% increase in emergency room visits for suicidal intent or suicide attempts since the COVID-19 crisis.

      Response strategies

      De-anonymized questionnaires: The annual bullying questionnaire (completed from CE2 through Terminale) now lets students give their identity at the end of the form so that the teaching team can contact them.

      Staff training: The goal is to train two "sentinel" staff members per school to identify and refer students. The current average is 1.6 trained staff members.

      "Coupe-file" (fast-track) scheme: A mechanism is being finalized with the Ministry of Health to guarantee school nurses and doctors rapid appointments with Medical-Psychological Centers (Centres Médico-Psychologiques, CMP) or community practitioners, avoiding waiting times of 3 to 6 months.

      Enforcement measures: The law of 2 March 2022 makes bullying a criminal offense. Prosecutors' offices have recorded 10,000 cases since 2022. The decree of 16 August 2023 now allows a student who commits bullying or intentional violence to be moved to another school.

      --------------------------------------------------------------------------------

      II. Child Protection and "Protected Schooling"

      School stands out as the leading source of reports of concern (informations préoccupantes, IP) and Article 40 reports in France.

      Reports: The number of reports of concern issued by schools rose from 50,000 to 80,000 in two years. A national guide to standardize alerts is being published.

      "Protected Schooling" circular: To be published shortly, it aims to guarantee educational continuity for children in the care of child welfare services (Aide Sociale à l'Enfance, ASE), 70% of whom currently leave the system without a diploma. It provides for:

      ◦ Individual follow-up by departmental services (DASEN).

      ◦ Specific academic support to prevent disruptions caused by changes of residential home or foster family.

      ◦ Strengthened support for educational guidance and self-esteem.

      --------------------------------------------------------------------------------

      III. Inclusive Schooling and Changes in Support

      The minister distinguishes "unaccompanied" students (who have an educational placement but are waiting for human assistance) from students "without a solution" (excluded from the system for lack of a suitable facility).

      From compensation to accessibility: The ministry wants to move away from a model based solely on systematic human assistance (AESH) in favor of pedagogical and material accessibility. The aim is to avoid "outsourcing" disability within the classroom.

      School Support Hubs (Pôles d'Appui à la Scolarité, PAS): Deployed to bring medical and social-care services directly into schools and to smooth transitions between mainstream settings and specialized facilities.

      Needs: 42,000 students were reportedly still awaiting support after the Toussaint (All Saints') break, despite the creation of 1,200 additional AESH positions for 2026.

      --------------------------------------------------------------------------------

      IV. Digital Technology and Education in Emotional Life (EVARS)

      Regulating screens

      The minister advocates a strict ban on mobile phones in lycées (planned for 2026), justified by cognitive and public-health concerns:

      Scientific correlation: Students' psychological deterioration is proportional to screen consumption (the risk of anxiety and depressive disorders rises from 30% to 60% for heavy users).

      Awareness before content: The minister wants education about digital risks to come first, ahead of mass exposure to violent or false content.

      Education in emotional, relational and sexual life (EVARS)

      Requirement: The three annual sessions are presented as "non-negotiable", in public schools and in private schools under state contract alike.

      Findings: 15% of girls and 12% of boys in collège report having experienced some form of sexual violence.

      Rollout: As of 31 December, 66% of primary schools and 48% of public collèges had held at least one session.

      Teacher training: The ministry acknowledges the need to protect staff who, sometimes being former victims themselves, could be traumatized by delivering these lessons.

      --------------------------------------------------------------------------------

      V. Institutional Steering and Demographic Challenges

      Managing human resources

      The education system faces an unprecedented demographic decline:

      Data: A loss of one million students between 2019 and 2029 in primary education. A cohort of 200,000 students "disappears" every four years.

      Adjustments: The minister justifies cuts to teaching posts (4,000 planned) by this decline, while seeking to gradually increase medical and social-care staffing (300 to 500 posts per year) to offset the surge in mental health needs.

      Priority education (REP/REP+)

      The minister acknowledges that the current map, frozen since 2015, is obsolete. However, he refuses to revise it before 2027, for two reasons:

      1. Technical: The consultation process with local authorities and unions takes 15 to 18 months.

      2. Democratic: He considers that this debate belongs to the next presidential election and refuses to "lock in" a map that would be imposed on the future government.

      Creation of a children's rights defender

      A deputy to the National Education ombudsperson (médiatrice) will be specifically responsible for child protection. The deputy's mission will be to handle disputes that fall between school and extracurricular (périscolaire) settings, so as to ensure "door-to-door" safety, and to produce an annual report dedicated to these issues.

      --------------------------------------------------------------------------------

      VI. Summary Table: Mental Health and Well-being Figures

      | Indicator | Statistic |
      | --- | --- |
      | Students who are victims of bullying | 5% (stable from CE2 to Terminale) |
      | Lycée students with suicidal thoughts | 24% |
      | Emergency room visits (suicide) | +80% since COVID |
      | Reports of concern (schools) | 80,000 / year (up by 30,000) |
      | Leaving ASE care without a diploma | 70% |
      | EVARS coverage (primary schools) | 66% (as of 31/12) |
      | Students awaiting an AESH | 42,000 (Toussaint 2025) |

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript reports a cryo-EM structure of TMAO demethylase from Paracoccus sp. This is an important enzyme in the metabolism of trimethylamine oxide (TMAO) and trimethylamine (TMA) in human gut microbiota, so new information about this enzyme would certainly be of interest.

      Strengths:

      The cryo-EM structure for this enzyme is new and provides new insights into the function of the different protein domains, as well as revealing a channel for formaldehyde between the two domains.

      Weaknesses:

      (1) The proposed catalytic mechanism in this manuscript does not make sense. Previous mechanistic studies on the Methylocella silvestris TMAO demethylase (FEBS Journal 2016, 283, 3979-3993, reference 7) reported that, as well as a Zn2+ cofactor, there was a dependence upon non-heme Fe2+, and proposed a catalytic mechanism involving deoxygenation to form TMA and an iron(IV)-oxo species, followed by oxidative demethylation to form DMA and formaldehyde.

      In this work, the authors do not mention the previously proposed mechanism, but instead say that elemental analysis "excluded iron". This is alarming, since the previous work has a key role for non-heme iron in the mechanism. The elemental analysis here gives a Zn content of about 0.5 mol/mol protein (and no Fe), whereas the Methylocella TMAO demethylase was reported to contain 0.97 mol Zn/mol protein, and 0.35-0.38 mol Fe/mol protein. It does, therefore, appear that their enzyme is depleted in Zn, and the absence of Fe impacts the mechanism, as explained below.

      The proposed catalytic mechanism in this manuscript, I am sorry to say, does not make sense to me, for several reasons:

      (i) Demethylation to form formaldehyde is not a hydrolytic process; it is an oxidative process (normally accomplished by either cytochrome P450 or non-heme iron-dependent oxygenase). The authors propose that a zinc (II) hydroxide attacks the methyl group, which is unprecedented, and even if it were possible, would generate methanol, not formaldehyde.

      (ii) The amine oxide is then proposed to deoxygenate, with hydroxide appearing on the Zn - unfortunately, amine oxide deoxygenation is a reductive process, for which a reducing agent is needed, and Zn2+ is not a redox-active metal ion;

      (iii) The authors say "forming a tetrahedral intermediate, as described for metalloproteinase", but zinc metalloproteases attack an amide carbonyl to form an oxyanion intermediate, whereas in this mechanism, there is no carbonyl to attack, so this statement is just wrong.

      So on several counts, the proposed mechanism cannot be correct. Some redox cofactor is needed in order to carry out amine oxide deoxygenation, and Zn2+ cannot fulfil that role. Fe2+ could do, which is why the previously proposed mechanism involving an iron(IV)-oxo intermediate is feasible. But the authors claim that their enzyme has no Fe. If so, then there must be some other redox cofactor present. Therefore, the authors need to re-analyse their enzyme carefully and look either for Fe or for some other redox-active metal ion, and then provide convincing experimental evidence for a feasible catalytic mechanism. As it stands, the proposed catalytic mechanism is unacceptable.

      (2) Given the metal content reported here, it is important to be able to compare the specific activity of the enzyme reported here with earlier preparations. The authors do quote a Vmax of 16.52 µM/min/mg; however, these are incorrect units for Vmax, they should be µmol/min/mg. There is a further inconsistency between the text saying µM/min/mg and the Figure saying µM/min/µg.

      (3) The consumption of formaldehyde to form methylene-THF is potentially interesting, but the authors say "HCHO levels decreased in the presence of THF", which could potentially be due to enzyme inhibition by THF. Is there evidence that this is a time-dependent and protein-dependent reaction? Also in Figure 1C, HCHO reduction (%) is not very helpful, because we don't know what concentration of formaldehyde is formed under these conditions; it would be better to quote in units of concentration, rather than %.

      (4) Has this particular TMAO demethylase been reported before? It's not clear which Paracoccus strain the enzyme is from; the Experimental Section just says "Paracoccus sp.", which is not very precise. There has been published work on the Paracoccus PS1 enzyme; is that the strain used? Details about the strain are needed, and the accession for the protein sequence.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Thach et al. report on the structure and function of trimethylamine N-oxide demethylase (TDM). They identify a novel complex assembly composed of multiple TDM monomers and obtain high-resolution structural information for the catalytic site, including an analysis of its metal composition, which leads them to propose a mechanism for the catalytic reaction.

      In addition, the authors describe a novel substrate channel within the TDM complex that connects the N-terminal Zn<sup>2+</sup>-dependent TMAO demethylation domain with the C-terminal tetrahydrofolate (THF)-binding domain. This continuous intramolecular tunnel appears highly optimized for shuttling formaldehyde (HCHO), based on its negative electrostatic properties and restricted width. The authors propose that this channel facilitates the safe transfer of HCHO, enabling its efficient conversion to methylenetetrahydrofolate (MTHF) at the C-terminal domain as a microbial detoxification strategy.

      Strengths:

      The authors provide convincing high-resolution cryo-EM structural evidence (up to 2 Å) revealing an intriguing complex composed of two full monomers and two half-domains. They further present evidence for the metal ion bound at the active site and articulate a plausible hypothesis for the catalytic cycle. Substantial effort is devoted to optimizing and characterizing enzyme activity, including detailed kinetic analyses across a range of pH values, temperatures, and substrate concentrations. Furthermore, the authors validate their structural insights through functional analysis of active-site point mutants.

      In addition, the authors identify a continuous channel for formaldehyde (HCHO) passage within the structure and support this interpretation through molecular dynamics simulations. These analyses suggest an exciting mechanism of specific, dynamic, and gated channeling of HCHO. This finding is particularly appealing, as it implies the existence of a unique, completely enclosed conduit that may be of broad interest, including potential applications in bioengineering.

      Weaknesses:

      Although the idea of an enclosed channel for HCHO is compelling, the experimental evidence supporting enzymatic assistance in the reaction of HCHO with THF is less convincing. The linear regression analysis shown in Figure 1C demonstrates a THF concentration-dependent decrease in HCHO, but the concentrations used for THF greatly exceed its reported KD (enzyme concentration used in this assay is not reported). It has previously been shown that HCHO and THF can couple spontaneously in a non-enzymatic manner, raising the possibility that the observed effect does not require enzymatic channeling. An additional control that can rule out this possibility would help to strengthen the evidence. For example, mutating the THF binding site to prevent THF binding to the protein complex could clarify whether the observed decrease in HCHO depends on enzyme-mediated proximity effects. A mutation which would specifically disable channeling could be even more convincing (maybe at the narrowest bottleneck).

      We agree with the reviewer that HCHO and THF can react spontaneously in a non-enzymatic manner, and our experiments were not intended to demonstrate enzymatic channeling. The linear regression analysis in Figure 1C was designed solely to confirm that HCHO reacts with THF under our assay conditions. Accordingly, THF was titrated over a broad concentration range starting from zero, and the observed THF concentration–dependent decrease in HCHO reflects this chemical reactivity.

      We do not interpret these data as evidence that the enzyme catalyzes or is required for the HCHO–THF coupling reaction. Instead, the structural observation of an enclosed channel is presented as a separate finding. We have clarified this point in the revised text to avoid overinterpretation of the biochemical data (page 2, line 16).

      Another concern is that the observed decrease in HCHO could alternatively arise from a reduced production of HCHO due to a negative allosteric effect of THF binding on the active site. From this perspective, the interpretation would be more convincing if a clear coupled effect could be demonstrated, specifically, that removal of the product (HCHO) from the reaction equilibrium leads to an increase in the catalytic efficiency of the demethylation reaction.

      We agree that, in principle, a decrease in detectable HCHO could also arise from an indirect effect of THF binding on enzyme activity. However, in our study the experiment was not designed to assess catalytic coupling or allosteric regulation. The assay in question monitors HCHO levels under defined conditions and does not distinguish between changes in HCHO production and downstream consumption.

      Additionally, we do not interpret the observed decrease in HCHO as evidence that THF binding enhances catalytic efficiency, or that removal of HCHO shifts the reaction equilibrium. Instead, the data are presented to establish that HCHO can react with THF under the assay conditions. Any potential allosteric effects of THF on the demethylation reaction, or kinetic coupling between HCHO removal and catalysis, are beyond the scope of the current study, and are not claimed.

      While the enzyme kinetics appear to have been performed thoroughly, the description of the kinetic assays in the Methods section is very brief. Important details such as reaction buffer composition, cofactor identity and concentration (Zn<sup>2+</sup>), enzyme concentration, defined temperature, and precise pH are not clearly stated. Moreover, a detailed methodological description could not be found in the cited reference (6), if I am not mistaken.

      Thank you for the suggestion. We have added reference [24] to the methodological description on page 8. The Methods section has been revised accordingly on page 8 under “TDM Activity Assay,” without altering the Zn<sup>2+</sup> concentration.

      The composition of the complex is intriguing but raises some questions. Based on SDS-PAGE analysis, the purified protein appears to be predominantly full-length TDM, and size-exclusion chromatography suggests an apparent molecular weight below 100 kDa. However, the cryo-EM structure reveals a substantially larger complex composed of two full-length monomers and two half-domains.

      We appreciate the reviewer’s careful analysis of the apparent discrepancy between the biochemical characterization and the cryo-EM structure. This issue is addressed in Figure S1, which may have been overlooked.

      As shown in Figure S1, the stability of TDM is highly dependent on protein and salt conditions. At 150 mM NaCl, SEC reveals a dominant peak eluting between 10.5 and 12 mL, corresponding to an estimated molecular weight of ~170–305 kDa (blue dot, Author response image 1). This fraction was explicitly selected for cryo-EM analysis and yields the larger complex observed in the reconstruction. At lower (50 mM) or higher (>150 mM) NaCl concentrations, the protein either aggregates or elutes near the void volume (~8 mL).

      SDS–PAGE analysis detects full-length TDM together with smaller fragments (~40–50 kDa and ~22–25 kDa). The apparent predominance of full-length protein on SDS–PAGE likely reflects its greater staining intensity per molecule and/or a higher population, rather than the absence of truncated species.

      Author response image 1.

      Given the lack of clear evidence for proteolytic fragments on the SDS-PAGE gel, it is unclear how the observed stoichiometry arises. This raises the possibility of higher-order assemblies or alternative oligomeric states. Did the authors attempt to pick or analyze larger particles during cryo-EM processing? Additional biophysical characterization of particle size distribution - for example, using interferometric scattering microscopy (iSCAT) - could help clarify the oligomeric state of the complex in solution.

      Cryo-EM data were collected exclusively from the size-exclusion chromatography fraction eluting between 10.5 and 12 mL. This fraction was selected to isolate the dominant assembly in solution. Extensive 2D and 3D particle classification did not reveal distinct classes corresponding to smaller species or higher-order oligomeric assemblies. Instead, the vast majority of particles converged to a single, well-defined structure consistent with the 2 full-length + 2 half-domain stoichiometry.

      A minor subpopulation (~2%) exhibited increased flexibility in the N-terminal region of the two full-length subunits, but these particles did not form a separate oligomeric class, indicating conformational heterogeneity rather than alternative assembly states (Author response image 2). Together, these data support the 2+2½ architecture as the predominant and stable complex under the conditions used for cryo-EM. Additional techniques, such as iSCAT, would provide complementary information, but are not required to support the conclusions drawn from the SEC and cryo-EM analyses presented here.

      Author response image 2.

      The authors mention strict symmetry in the complex, yet C2 symmetry was enforced during refinement. While this is reasonable as an initial approach, it would strengthen the structural interpretation to relax the symmetry to C1 using the C2-refined map as a reference. This could reveal subtle asymmetries or domain-specific differences without sacrificing the overall quality of the reconstruction.

      We thank the reviewer for this thoughtful suggestion. In standard cryo-EM data processing, symmetry is typically not imposed initially to minimize potential model bias; accordingly, we first performed C1 refinement before applying C2 symmetry. The resulting C1 reconstructions revealed no detectable asymmetry or domain-specific differences relative to the C2 map. In addition, relaxing the symmetry consistently reduced overall resolution, indicating lower alignment accuracy and further supporting the presence of a predominantly symmetric assembly.

      In this context, the proposed catalytic role of Zn<sup>2+</sup> raises additional questions. Why is a 2:1 enzyme-to-metal stoichiometry observed, and how does this reconcile with previous reports? This point warrants discussion. Does this imply asymmetric catalysis within the complex? Would the stoichiometry change under Zn<sup>2+</sup>-saturating conditions, as no Zn<sup>2+</sup> appears to be added to the buffers? It would be helpful to clarify whether Zn<sup>2+</sup> occupancy is equivalent in both active sites when symmetry is not imposed, or whether partial occupancy is observed.

      The observed ~2:1 enzyme-to-Zn<sup>2+</sup> stoichiometry likely reflects the composition of the 2 full-length + 2 half-domain (2+2½) complex. In this assembly, only the core domains that are fully present in the complex contribute to metal binding. The truncated or half-domains lack the Zn<sup>2+</sup> binding domain. As a result, only two metal-binding sites are occupied per assembled complex, consistent with the measured stoichiometry.

      We note that Zn<sup>2+</sup> was not deliberately added to the buffers, so occupancy may not reflect full saturation. Based on our cryo-EM and biochemical data, both metal-binding sites in the full-length subunits appear to be occupied to an equivalent extent, and no clear evidence of asymmetric catalysis is observed under these current experimental conditions. Full Zn<sup>2+</sup> saturation could potentially increase occupancy, but was not explored in these experiments.
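      The counting behind the ~2:1 figure can be made explicit. The sketch below is a minimal illustration, assuming one Zn<sup>2+</sup> site per full-length subunit and none per half-domain, as stated in the response. The names and the "full-length equivalent" normalization (which treats a half-domain as exactly half a chain) are illustrative assumptions, not values from the manuscript.

```python
from fractions import Fraction

# Composition of the assembly as described in the response:
# 2 full-length subunits + 2 half-domains.
FULL_LENGTH_SUBUNITS = 2
HALF_DOMAINS = 2

# Taken from the response: only full-length subunits carry the Zn site.
ZN_SITES_PER_FULL_LENGTH = 1
ZN_SITES_PER_HALF_DOMAIN = 0


def total_zn() -> int:
    """Number of Zn ions in one fully occupied assembly."""
    return (FULL_LENGTH_SUBUNITS * ZN_SITES_PER_FULL_LENGTH
            + HALF_DOMAINS * ZN_SITES_PER_HALF_DOMAIN)


def zn_per_chain() -> Fraction:
    """Zn ions per polypeptide chain, counting every chain equally."""
    chains = FULL_LENGTH_SUBUNITS + HALF_DOMAINS
    return Fraction(total_zn(), chains)


def zn_per_full_length_equivalent() -> Fraction:
    """Zn ions per full-length equivalent.

    Hypothetical normalization: each half-domain counts as 1/2 of a chain.
    """
    equivalents = FULL_LENGTH_SUBUNITS + Fraction(HALF_DOMAINS, 2)
    return Fraction(total_zn()) / equivalents


if __name__ == "__main__":
    print("Zn per chain:", zn_per_chain())
    print("Zn per full-length equivalent:", zn_per_full_length_equivalent())
```

      Counting each chain equally gives one Zn per two chains, in line with the ~2:1 ratio. Under the alternative normalization the value would differ, so the convention used for "per protein" matters when comparing with other preparations.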

      The divalent ion Zn<sup>2+</sup> is suggested to activate water for the catalytic reaction. I am not sure if there is a need for a water molecule to explain this catalytic mechanism. Can you please elaborate on this more? As one aspect, it might be helpful to explain in more detail how Zn-OH and D220 are recovered in the last step before a new water molecule comes in.

      Thank you for your suggestion. We revised the text on page 2 as below.

      Based on our structural and biochemical data, we propose a structurally informed working model for TMAO turnover by TDM (Scheme 1). In this model, Zn<sup>2+</sup> plays a non-redox role by polarizing the O–H bond of the bound hydroxyl, thereby lowering its pK<sub>a</sub>. The D220 carboxylate functions as a general base, abstracting the proton to generate a hydroxide nucleophile. This hydroxide then attacks the electrophilic N-methyl carbon of TMAO, forming a tetrahedral carbinolamine (hemiaminal) intermediate. Subsequent heterolytic cleavage of the C–N bond leads to the release of HCHO. D220 then switches roles to act as a general acid, donating a proton to the departing nitrogen, which facilitates product release and regenerates the active site. This sequence allows a new water molecule to rebind Zn<sup>2+</sup>, enabling subsequent catalytic turnovers. This proposed pathway is consistent with prior mechanistic studies, in which water addition to the azomethine carbon of a cationic Schiff base generates a carbinolamine intermediate, followed by a rate-limiting breakdown to yield an amino alcohol and a carbonyl compound, in the published case, an aldehyde (Pihlaja et al., J. Chem. Soc. Perkin Trans. 2, 1983, 8, 1223–1226).

      Overall, the authors were successful in advancing our structural and functional understanding of the TDM complex. They suggest an interesting oligomeric complex composition which should be investigated with additional biophysical techniques.

      Additionally, they provide an intriguing hypothesis for a new type of substrate channeling. Additional kinetic experiments focusing on HCHO and THF turnover by enzymatic proximity effects would strengthen this potentially fundamental finding. If this channeling mechanism can be supported by stronger experimental evidence, it would substantially advance our understanding and knowledge of biologic conduits and enable future efforts in the design of artificial cascade catalysis systems with high conversion rate and efficiency, as well as detoxification pathways.

      Reviewer #2 (Public review):

      Summary:

      The manuscript reports a cryo-EM structure of TMAO demethylase from Paracoccus sp. This is an important enzyme in the metabolism of trimethylamine oxide (TMAO) and trimethylamine (TMA) in human gut microbiota, so new information about this enzyme would certainly be of interest.

      Strengths:

      The cryo-EM structure for this enzyme is new and provides new insights into the function of the different protein domains, and a channel for formaldehyde between the two domains.

      Weaknesses:

      (1) The proposed catalytic mechanism in this manuscript does not make sense. Previous mechanistic studies on the Methylocella silvestris TMAO demethylase (FEBS Journal 2016, 283, 3979-3993, reference 7) reported that, as well as a Zn<sup>2+</sup> cofactor, there was a dependence upon non-heme Fe<sup>2+</sup>, and proposed a catalytic mechanism involving deoxygenation to form TMA and an iron(IV)-oxo species, followed by oxidative demethylation to form DMA and formaldehyde.

      In this work, the authors do not mention the previously proposed mechanism, but instead say that elemental analysis "excluded iron". This is alarming, since the previous work has a key role for non-heme iron in the mechanism. The elemental analysis here gives a Zn content of about 0.5 mol/mol protein (and no Fe), whereas the Methylocella TMAO demethylase was reported to contain 0.97 mol Zn/mol protein, and 0.35-0.38 mol Fe/mol protein. It does, therefore, appear that their enzyme is depleted in Zn, and the absence of Fe impacts the mechanism, as explained below.

      The proposed catalytic mechanism in this manuscript, I am sorry to say, does not make sense to me, for several reasons:

      (i) Demethylation to form formaldehyde is not a hydrolytic process; it is an oxidative process (normally accomplished by either cytochrome P450 or non-heme iron-dependent oxygenase). The authors propose that a zinc (II) hydroxide attacks the methyl group, which is unprecedented, and even if it were possible, would generate methanol, not formaldehyde.

      (ii) The amine oxide is then proposed to deoxygenate, with hydroxide appearing on the Zn - unfortunately, amine oxide deoxygenation is a reductive process, for which a reducing agent is needed, and Zn<sup>2+</sup> is not a redox-active metal ion;

      (iii) The authors say "forming a tetrahedral intermediate, as described for metalloproteinase", but zinc metalloproteases attack an amide carbonyl to form an oxyanion intermediate, whereas in this mechanism, there is no carbonyl to attack, so this statement is just wrong.

      So on several counts, the proposed mechanism cannot be correct. Some redox cofactor is needed in order to carry out amine oxide deoxygenation, and Zn<sup>2+</sup> cannot fulfil that role. Fe<sup>2+</sup> could do, which is why the previously proposed mechanism involving an iron(IV)-oxo intermediate is feasible. But the authors claim that their enzyme has no Fe. If so, then there must be some other redox cofactor present. Therefore, the authors need to re-analyse their enzyme carefully and look either for Fe or for some other redox-active metal ion, and then provide convincing experimental evidence for a feasible catalytic mechanism. As it stands, the proposed catalytic mechanism is unacceptable.

      We thank the reviewer for the detailed and thoughtful mechanistic critique. We fully agree that Zn<sup>2+</sup> is not redox-active, and cannot directly mediate oxidative demethylation or amine oxide deoxygenation. We acknowledge that the oxidative step required for the conversion of TMAO to HCHO is not explicitly resolved in the present study. Accordingly, we have revised the manuscript to remove any implication of Zn<sup>2+</sup>-mediated redox chemistry, and have eliminated the previously imprecise analogy to zinc metalloproteases.

      We recognize and now discuss prior biochemical work on TMAO demethylase from Methylocella silvestris (MsTDM), which proposed an iron-dependent oxidative mechanism (Zhu et al., FEBS 2016, 3979–3993). That study reported approximately one Zn<sup>2+</sup> and one non-heme Fe<sup>2+</sup> per active enzyme, implicated iron in catalysis through homology modeling and mutagenesis, and used crossover experiments suggesting a trimethylamine-like intermediate and oxygen transfer from TMAO, consistent with an Fe-dependent redox process. However, that system lacked experimental structural information, and did not define discrete metal-binding sites.

      In contrast,

      (1) Our high-resolution cryo-EM structures and metal analyses of TDM consistently reveal only a single, well-defined Zn<sup>2+</sup>-binding site, with no structural evidence for an additional iron-binding site as in the previous report (Zhu et al., FEBS 2016, 3979–3993).

      (2) To investigate the potential involvement of iron, we expressed TDM in LB medium supplemented with Fe(NH<sub>4</sub>)<sub>2</sub>(SO<sub>4</sub>)<sub>2</sub> and determined its cryo-EM structure. This structure is identical to the original one, and no EM density corresponding to a second iron ion was observed. Moreover, the previously proposed Fe<sup>2+</sup>-binding residues are spatially distant (Figure S6).

      (3) ICP-MS analysis detected only zinc; iron was undetectable (Figure S5).

      (4) The kinetic parameters of our TDM, measured in the absence of iron, are comparable to those reported for MsTDM (Figure 1A). We propose that the differences in Km and Vmax arise from differences in the overall sequences of the two enzymes. Please also see our comment at the end regarding a newly published paper on MsTDM.

      While we cannot comment on the MsTDM results, our ‘experimental’ results do not support the presence of an iron-binding site. Our data indicate that this chemistry is unlikely to be mediated by a canonical non-heme iron center as proposed for MsTDM. We therefore revised our model as a structural framework that rationalizes substrate binding, metal coordination, and product stabilization, while clearly delineating the limits of mechanistic inference supported by the current data.

      Scheme 1 and the proposed mechanism section were revised on page 4, and Figure S6 was added.

      (2) Given the metal content reported here, it is important to be able to compare the specific activity of the enzyme reported here with earlier preparations. The authors do quote a Vmax of 16.52 µM/min/mg; however, these are incorrect units for Vmax, they should be µmol/min/mg. There is a further inconsistency between the text saying µM/min/mg and the Figure saying µM/min/µg.

      Thank you for the correction. We converted the V<sub>max</sub> units to nmol/min/mg and revised the text on page 2, where we also compare our value with that of the previous report on the TDM enzyme. See also the note on a newly published manuscript and its comparison.
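      The unit issue raised here comes down to the difference between a concentration rate (µM/min) and an amount rate (nmol/min): converting one to the other requires the assay volume, and a specific activity additionally requires the enzyme mass. The sketch below illustrates that conversion. The function name and all numerical inputs are hypothetical, since the assay volume and enzyme amount are not given in this exchange.

```python
def specific_activity_nmol_per_min_per_mg(rate_uM_per_min: float,
                                          assay_volume_uL: float,
                                          enzyme_mg: float) -> float:
    """Convert a concentration rate into a specific activity.

    1 µM equals 1 pmol/µL, so (µM/min) x (µL) gives pmol/min.
    Dividing by 1000 gives nmol/min; dividing by the enzyme mass
    in mg gives nmol/min/mg.
    """
    if assay_volume_uL <= 0 or enzyme_mg <= 0:
        raise ValueError("assay volume and enzyme mass must be positive")
    pmol_per_min = rate_uM_per_min * assay_volume_uL
    nmol_per_min = pmol_per_min / 1000.0
    return nmol_per_min / enzyme_mg


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not from the manuscript):
    # 10 µM/min observed in a 200 µL assay containing 0.01 mg enzyme.
    value = specific_activity_nmol_per_min_per_mg(10.0, 200.0, 0.01)
    print(f"{value:.1f} nmol/min/mg")
```

      Because the volume enters the conversion, a value quoted as µM/min/mg cannot be compared across studies unless the assay volume is also reported.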

      (3) The consumption of formaldehyde to form methylene-THF is potentially interesting, but the authors say "HCHO levels decreased in the presence of THF", which could potentially be due to enzyme inhibition by THF. Is there evidence that this is a time-dependent and protein-dependent reaction? Also in Figure 1C, HCHO reduction (%) is not very helpful, because we don't know what concentration of formaldehyde is formed under these conditions; it would be better to quote in units of concentration, rather than %.

      We appreciate this important point. We have revised Figure 1C to present HCHO levels in absolute concentration units. While the current data demonstrate reduced detectable HCHO in the presence of THF, we agree that distinguishing between HCHO consumption and potential THF-mediated enzyme inhibition would require dedicated time-course and protein-dependence experiments. We have therefore revised the description to avoid overinterpretation and limit our conclusions to the observed changes in HCHO concentration (page 2, lines 18-19).

      (4) Has this particular TMAO demethylase been reported before? It's not clear which Paracoccus strain the enzyme is from; the Experimental Section just says "Paracoccus sp.", which is not very precise. There has been published work on the Paracoccus PS1 enzyme; is that the strain used? Details about the strain are needed, and the accession for the protein sequence.

      Thank you for this comment. We now indicate that the enzyme is derived from Paracoccus sp. DMF and provide the accession number for the protein sequence (WP_263566861) in the Experimental Section (page 8, line 4).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The ITC experiment requires a ligand-into-buffer titration as an additional control. Also, maybe I misunderstood the molar ratio or the concentrations you used, but if you indeed added a total of 4.75 μL of 20 μM THF into 250 μL of 5 μM TDM, it is not clear to me how this leads to a final molar ratio of 3.

      We thank the reviewer for this suggestion. A ligand-into-buffer control ITC experiment was performed and is now included in Figure S8C, which shows no appreciable signal.

      Regarding the molar ratio, this was our mistake. The experiment used 2.45 μL injections of 80 μM THF into 250 μL of 5 μM TDM. This corresponds to a final ligand concentration of ~12.8 μM, giving a ligand-to-protein molar ratio of ~2.6. We revised the text on page 9, ITC section.
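      The arithmetic behind both the reviewer's concern and the corrected values can be checked with a short calculation. The sketch below is a simplified illustration: it treats the molar ratio as moles of ligand injected divided by moles of protein in the cell and ignores the volume displacement that occurs in a real ITC cell. The function name is hypothetical, and the number of injections is not stated in the response, so only the per-injection increment is computed.

```python
def molar_ratio(syringe_conc_uM: float, injected_volume_uL: float,
                cell_conc_uM: float, cell_volume_uL: float) -> float:
    """Ligand-to-protein molar ratio after injecting a given total volume.

    1 µM equals 1 pmol/µL, so concentration (µM) x volume (µL) gives pmol.
    Simplification: no correction for displaced cell volume.
    """
    ligand_pmol = syringe_conc_uM * injected_volume_uL
    protein_pmol = cell_conc_uM * cell_volume_uL
    return ligand_pmol / protein_pmol


if __name__ == "__main__":
    # Values as originally described (queried by the reviewer):
    # 4.75 µL of 20 µM THF into 250 µL of 5 µM TDM.
    original = molar_ratio(20.0, 4.75, 5.0, 250.0)

    # Corrected values: each 2.45 µL injection of 80 µM THF.
    per_injection = molar_ratio(80.0, 2.45, 5.0, 250.0)

    # Ratio implied by the quoted final concentrations (12.8 µM / 5 µM).
    quoted_final = 12.8 / 5.0

    print(f"original description: {original:.3f}")
    print(f"corrected, per injection: {per_injection:.4f}")
    print(f"from quoted final concentrations: {quoted_final:.2f}")
```

      With the originally described volumes and concentrations the ratio stays far below 1, which is the discrepancy the reviewer pointed out. The corrected syringe concentration and injection volume raise the ratio by a fixed increment per injection.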

      (2) Characterization/quality check of all mutant enzymes should be performed by NanoDSF, CD spectroscopy or similar techniques to confirm that proteins are properly folded and fit for kinetic testing.

      We appreciate the reviewer’s suggestion. All mutant proteins, including D220A, D367A, and F327A, were purified with yields similar to the wild-type enzyme. Additionally, cryo-EM maps of the mutants show well-defined density and overall structural integrity consistent with the wild-type. These findings indicate that the introduced mutations do not significantly affect protein folding, supporting their use for kinetic analysis. While NanoDSF might reveal differences in thermal stability due to mutations, it does not provide structural information. Our conclusions are not based on minor differences in thermostability. Our cryo-EM structures of the mutants offer much more reliable structural data than CD spectroscopy.

      (3) Best practice would suggest overlapping pH ranges between different buffer systems in the pH-dependence experiments to rule out buffer-specific effects independent of pH.

      We thank the reviewer for this helpful suggestion. We agree that overlapping pH ranges between different buffer systems can be valuable for excluding buffer-specific effects. In this study, the pH-dependence experiments were intended to provide a qualitative assessment of pH sensitivity rather than a detailed analysis of buffer-independent pKa values. While we cannot fully exclude minor buffer-specific contributions, the overall trends observed were reproducible and sufficient to support the conclusions drawn. We have added a clarifying statement to the revised manuscript to reflect this consideration, page 2, line 12.

      (4) "Structural comparison revealed high similarity to a THF-binding protein, with superposition onto a T protein.": It would be nice to show this as an additional figure, as resolution and occupancy for THF are low.

      We thank the reviewer for this suggestion. To address this point, we have revised Figure S6 by adding an additional panel (panel C, now Figure S7C) showing the structural superposition of TDM with the THF-binding T protein. This comparison is included to better illustrate the structural similarity, despite the limited resolution and partial occupancy of THF density in our map.

      (5) Editing could have been done more thoroughly. Some spelling mistakes, e.g. "RESEULTS", "redius", "complec"; kinetic rate constants should be written in italic (not uniform between text and figures); Prism version is missing; Vmax of 16.52 µM/min/mg - doublecheck units; Figure S1B: The "arrow on the right" might have gone missing.

      We corrected the spelling on page 2 (~line 10), page 5 (~line 34), and page 6 (~line 40). The Prism version was added. The arrow was added to Figure S1B. The Vmax unit was corrected to nmol/min/mg.

      Reviewer #2 (Recommendations for the authors):

      (1) The authors must re-examine the metal content of their purified enzyme, looking in particular for Fe or another redox-active metal ion, which could be involved in a reasonable catalytic mechanism.

      We thank the reviewer for this suggestion and have carefully re-examined the metal content of TDM. Elemental analyses by EDX and ICP-MS consistently detected Zn<sup>2+</sup> in purified TDM (Zn:protein ≈ 1:2), whereas Fe was below the detection limit across multiple independent preparations (Fig. S5A,B). To assess whether iron could be incorporated or play a functional role, we expressed TDM in E. coli grown in LB medium supplemented with Fe(NH<sub>4</sub>)<sub>2</sub>(SO<sub>4</sub>)<sub>2</sub> and performed activity assays in the presence of exogenous Fe<sup>2+</sup>. Neither condition resulted in enhanced enzymatic activity.

      Consistent with these biochemical data, all cryo-EM structures reveal a single, well-defined metal-binding site coordinated by three conserved cysteine residues and occupied by Zn<sup>2+</sup>, with no evidence for an additional iron species or other redox-active metal site.

      (2) The specific activity of the enzyme should be quoted in the same units as other literature papers, so that the enzyme activity can be compared. It could be, for example, that the content of Fe (or other redox-active metal) is low, and that could then give rise to a low specific activity.

      Thank you for the suggestion. We now quote the enzyme activity in the same units as the previous report and have revised the text on page 2.

      Since the submission of our paper, a new report on MsTDM has been published (Cappa et al., Protein Science 33(11), e70364) that further supports our findings. First, the kinetic parameters reported using ITC (Vmax = 0.309 μmol/s, approximately 240 nmol/min/mg; Km = 0.866 mM) are comparable to those we observed (156 nmol/min/mg and 1.33 mM, respectively) in the absence of exogenous iron. Second, the optimal pH for enzymatic activity is similar to that observed for our paraTDM. Third, the reported two-state unfolding behavior is consistent with our cryo-EM structural observations, in which the more dynamic subunits appear to destabilize prior to unfolding of the core domains. Based on these findings, we now propose that Zn<sup>2+</sup> functions primarily as an organizational cofactor at the core catalytic domain (revised Scheme 1).
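      To see what "comparable" means in practice, the two parameter sets quoted above can be placed side by side under the Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]). The sketch below is illustrative only: it assumes both enzymes follow simple Michaelis-Menten kinetics, uses the Vmax (nmol/min/mg) and Km (mM) values quoted in the response, and evaluates them at substrate concentrations chosen arbitrarily for the example.

```python
def mm_rate(vmax: float, km: float, s: float) -> float:
    """Michaelis-Menten rate for substrate concentration s (same units as km)."""
    return vmax * s / (km + s)


# Parameters as quoted in the response (Vmax in nmol/min/mg, Km in mM).
MS_TDM = {"vmax": 240.0, "km": 0.866}
PARA_TDM = {"vmax": 156.0, "km": 1.33}

if __name__ == "__main__":
    # Substrate concentrations (mM) are arbitrary illustrative choices.
    for s in (0.5, 1.0, 5.0):
        v_ms = mm_rate(s=s, **MS_TDM)
        v_para = mm_rate(s=s, **PARA_TDM)
        print(f"[S] = {s} mM: MsTDM {v_ms:.1f}, paraTDM {v_para:.1f} nmol/min/mg")

    # Vmax/Km is a common single-number summary at low substrate concentration.
    for name, p in (("MsTDM", MS_TDM), ("paraTDM", PARA_TDM)):
        print(f"{name}: Vmax/Km = {p['vmax'] / p['km']:.1f}")
```

      Under these assumptions the two enzymes stay within the same order of magnitude across the range, with the MsTDM parameters giving the higher rate at each concentration.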

    1. Reviewer #1 (Public review):

      This study by Radziun and colleagues investigates the effects of using a hand-augmentation device on mental body representations. The authors use a proprioceptive localisation task to measure metric representations of finger length before and after participants wear the device, and then before and after they learn to use the device, which extends the lengths of the fingers by 10 cm. The authors find changes between different time points, which they interpret as evidence for three distinct forms of plasticity: one related to simply wearing the device, one related to learning to use it, and an aftereffect after taking the device off. A control experiment with a similar device, which does not lengthen the fingers, showed the first and third of these forms of plasticity, but not the second.

      This study takes an interesting approach to a timely and theoretically significant issue. The study appears to be appropriately designed and conducted. There are, however, some points which require clarification.

      (1) The nature of the localization task is unclear. On its face, the task appears to involve localization of each landmark within the 2-dimensional surface of the touchscreen. However, the regression analysis presupposes that localization is made in a 1-dimensional space. Figure S2 shows that three lines are presented on the screen above the index, middle, and ring fingers, which I imagine the participant is meant to use as a guide. But it is at least conceivable that the perceived location or orientation of the finger might not correspond exactly to these lines. While the method can deal gracefully with proximal-distal translations of the fingers (i.e., with the intercept parameter of the regression), it isn't clear how the participant is supposed to respond if their proprioceptive perception of finger location is translated left-right or rotated relative to the lines on the screen. I also worry that presenting a long, thin line to represent each finger on the screen may not be a neutral method and may prime participants to represent the finger as long and thin.

      (2) The task used here fits within a wider family of tasks in the literature using localization judgments of multiple landmarks to map body representations. I feel that some discussion of this broader set of tasks and their use to measure body representation and plasticity is notably absent from the paper. It is also striking to me that some of the present authors have themselves recently criticized the use of landmark localization methods as a measure of represented body size and shape (Peviani et al, 2024, Current Biology). It is therefore surprising to see them use this task here as a measure of represented finger length without commenting on this issue.

      (3) 18 participants strikes me as a relatively small sample size for this type of study. It weakens the manuscript that the authors do not provide any justification, or even comment on, the sample size. This is especially true as participants are excluded from the entire sample, and from specific analyses, on rather post-hoc grounds.

      (4) I have some concerns about the interpretation of contraction in stage 2. The authors claim that wearing the finger extender produces "a contraction", i.e., an "under-representation" (page 12). But in both experiments, regression slopes in stage 2 were not significantly different from 1 (i.e., 0.98 [SE: 0.07] in Exp 1a and 1.04 [SE: 0.09] in Exp 1b). So how can that be interpreted as "under-representation"?

      (5) I also have concerns about the interpretation of the stretch that is claimed to occur following training. In Exp 1a, regression slopes in stage 3 are on average 1.15. That is LESS than in the pretest at stage 1 (mean: 1.16). The idea of stretch only comes about because of the lower slopes in stage 2, which the authors have interpreted as reflecting contraction. So what the authors call stretch and a 2nd form of plasticity could just be the contraction from stage 2 wearing off or dissipating, since perceived finger length in stage 3 just appears to return to the baseline level seen in stage 1. While the authors describe their results in terms of three distinct forms of plasticity, these are not in fact statistically independent. The dip in regression slopes in stage 2 is interpreted as evidence for two distinct plasticity effects, which I do not find convincing.

      (6) The distinction between plasticity at stage 3 (which appears specific to augmentation) and plasticity at stage 4 (which does not appear specific, as it also occurs in Experiment 1b) feels strained. This feels like a very subtle distinction, and the theoretical significance of it is not convincingly developed.

      (7) The reporting of statistics is not always consistent. For example, 95%CIs are presented for regression slopes in stages 1, 3, and 4, but not for stage 2. Statistics are performed on regression slopes, except for one t-test on page 7 comparing lengths in cm. Estimates of effect size would be nice additions to statistical tests.

      (8) Minor point: On page 4, the authors write, "These included sorting colored blocks, stacking a Jenga tower, and sorting pegs into holes; the latter task required fine-grained manipulation and was used as our outcome measure of motor learning." This suggests that peg sorting was the outcome measure, but in Figure 1D, Jenga is presented as the outcome measure.
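
      The arithmetic behind point (4) can be checked from the slopes and standard errors quoted there. The sketch below is only an illustration under a stated assumption: it uses a normal approximation for the ratio (slope - 1) / SE, which is not necessarily the test the authors ran, and the function name `slope_vs_one` is hypothetical.

```python
import math

def slope_vs_one(slope, se):
    """Return the z ratio and a two-sided normal p-value for H0: slope == 1.

    Assumption: a normal approximation; the original analysis may have
    used a t distribution with its own degrees of freedom.
    """
    z = (slope - 1.0) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of the standard normal
    return z, p

# Stage 2 slopes and standard errors as quoted in point (4).
results = {}
for label, slope, se in [("Exp 1a", 0.98, 0.07), ("Exp 1b", 1.04, 0.09)]:
    results[label] = slope_vs_one(slope, se)
    z, p = results[label]
    print(f"{label}: z = {z:.2f}, p = {p:.2f}")
```

      In both experiments the slope lies well within one standard error of 1, which is the basis of the reviewer's question.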

    1. Briefing Note: Child Protection Priorities and Juvenile Justice

      Executive Summary

      This document summarizes the strategic directions and reforms undertaken by the Ministry of Justice to strengthen child protection and modernize juvenile justice.

      Key points include:

      Urgency and Speed: Reduction of the time to judgment (from 18 months to 8.7 months in four years) and creation of a provisional protection order allowing the public prosecutor to rule within 72 hours.

      Overhaul of Placement: Closure of the public Closed Educational Centers (Centres Éducatifs Fermés, CEF) in favor of the Unités de Placement de la Jeunesse et de l'Éducation (UJPE), with an emphasis on educational continuity (52 weeks a year).

      Massive Staffing Increase: Creation of 1,600 positions at the Ministry of Justice, including 50 new juvenile court judges' offices within two years and 70 positions at the Protection Judiciaire de la Jeunesse (PJJ), the judicial youth protection service.

      Legislative Changes: Support for removing the statute of limitations on sexual crimes against minors and for mandatory legal counsel for the child, and an intention to reform the "excuse de minorité" (sentence mitigation on grounds of minority) for the most serious crimes.

      Protection against Modern Scourges: Combating the prostitution of minors (6 in 10 prostituted persons are minors), a ban on mobile phones in placement centers, and regulation of nitrous oxide.

      --------------------------------------------------------------------------------

      1. Strengthening the Protection of Child Victims

      Judicial Urgency and Protective Measures

      The emphasis is on the need for a justice system that adapts to the child's pace.

      Provisional protection order: A new mechanism allows the public prosecutor to act within 72 hours to protect a minor immediately, with no-contact orders and provisional allocation of the home to the protective parent.

      The case must then be referred to the judge within 8 days, and the judge has 15 days to rule.

      Law of 18 March 2024: Provides for the automatic withdrawal of parental authority from parents convicted of a crime or of sexual violence against their child, and extends the suspension of the exercise of that authority to the point of formal indictment (mise en examen).

      Support and Rights of Minors

      Lawyer for the child: Support for the mandatory presence of a lawyer in educational assistance (assistance éducative) proceedings.

      A pilot with the bar associations is planned before the measure is generalized through legislation.

      Pediatric Reception Units (Unités d'Accueil Pédiatrique, UAPED): Rollout under way across the country to improve the gathering of children's testimony and the care of victims.

      Court assistance dogs: An increase from 10 to about thirty dogs at present, with a target of 100 dogs (one per département) within one to two years, to calm children during proceedings.

      --------------------------------------------------------------------------------

      2. Reform of Juvenile Criminal Justice

      Balancing Sanction and Education

      Ministerial doctrine rejects any opposition between these two concepts.

      Sanction as an educational act: "Sanction is part of education. Sanction on its own is not an end in itself [...] and an education without any prohibitions leads to a free-for-all."

      Effectiveness of the Code de la Justice Pénale des Mineurs (CJPM): The time between the offense and the sanction has been halved in four years (8.7 months in 2024 versus 18 months in 2020).

      Transformation of Placement Facilities

      The assessment of the Closed Educational Centers (CEF) is severe: high cost (30 to 50% more), a runaway rate identical to that of conventional centers, and educational neglect (only 5 to 10 hours of classes per week).

      Creation of the UJPE: These new units merge the former group homes (foyers) and the CEF to guarantee a path of educational rebuilding.

      Recruitment of technical teachers: Reopening of a competitive examination for 40 teachers reporting directly to the Ministry of Justice, in order to provide 26 hours of classes per week, 52 weeks a year, including during school holidays.

      Health and Addictions: Recruitment of 60 nurses to make up for shortfalls in psychiatric care and addiction treatment in placement centers.

      --------------------------------------------------------------------------------

      3. Resources and Organization of the Justice System

      Increase in Staffing

      The Justice budget allows an unprecedented increase in human resources:

      Judiciary: Creation of 50 additional juvenile court judges' offices within two years (notably in Bobigny, Cambrai, and Alès).

      Currently, some offices handle between 400 and 500 cases.

      PJJ: Re-creation of 70 positions, strengthening staffing where it had been declining for 20 years (e.g., Marseille, Île-de-France).

      Community-Based Supervision (milieu ouvert): Reassignment of 150 educators to community-based supervision to bring the workload down to about 23 cases per officer (versus 25 previously).

      Unity of Command

      The current system is considered too fragmented (several ministries involved, responsibilities shared with the départements for the Aide Sociale à l'Enfance, ASE).

      A desire for better coordination, or even a single line of responsibility, is expressed.

      --------------------------------------------------------------------------------

      4. Societal Issues and New Threats

      Sexual Violence and Removal of the Statute of Limitations

      End of the limitation period: Favorable opinion on removing the statute of limitations for sexual crimes against minors, as well as for crimes of blood (premeditated murder).

      Prostitution of minors: An alarming finding shows that 60% of prostituted persons in France are minors.

      Dedicated units within the PJJ have been operational for three months to combat this scourge and pimping networks.

      Digital Safety and Addictions

      Phone ban: The new circular on educational and criminal policy bans mobile phones in the bedrooms of placement centers to protect minors from digital predation (traffickers, pimps).

      Nitrous oxide: Support for criminalizing transport and online purchase (outside the medical setting), as poisonings tripled between 2020 and 2023.

      Debates on Criminal Responsibility

      Excuse de minorité: A position in favor of ending the automatic mitigation of sentences for the most serious crimes (premeditated murder, torture) committed by minors aged 13 to 15.

      This would require a constitutional change while preserving the specialized adjudication of minors.

      --------------------------------------------------------------------------------

      5. Key Data and Statistics

      | Indicator | Source figure |
      | --- | --- |
      | Average time to judgment (2020) | 18 months |
      | Average time to judgment (2024) | 8.7 months |
      | Cases per juvenile court judge's office | 400 to 500 (average) |
      | Share of minors among prostituted persons | 60% |
      | Number of minors under the ASE | 400,000 (of whom 200,000 in placement) |
      | Class hours in CEF | < 10 h/week (versus 26 h in conventional settings) |
      | Placements with trusted third parties | < 9% (19,000 young people) |

      --------------------------------------------------------------------------------

      Notable Quotes

      "A child does not live at the pace of an administrative file or a court file. [...] Four months is a lifetime for a minor."

      "We should, to a large extent, be ashamed of how we treat some of these children, particularly in child welfare (aide sociale à l'enfance)."

      "Placement must protect, not make a child even more vulnerable."

      "Sanction is part of education. [...] An education without ever any prohibition leads to a free-for-all."

    1. Risk

      Step 1: Prepare for Assessment. Before you start, you need a plan: align the assessment with the organization's goals. (Slide 2 explains this in detail.)

      Step 2: Conduct Assessment (The Core). This is the "Execution" phase. Memorize the sequence inside the gray box, as it is often a quiz question:

      • Identify Threats: Who or what wants to attack us?
      • Identify Vulnerabilities: Where are we weak?
      • Determine Likelihood: What are the odds of this happening?
      • Determine Impact: If it happens, how badly will it hurt?
      • Determine Risk: Combine Likelihood and Impact to get a Risk score. Formula to remember: Risk = Likelihood × Impact

      Step 3: Communicate Results. Notice that the arrows go both ways. You don't just report at the end; you talk to stakeholders during the process to ensure the facts are correct.

      Step 4: Maintain Assessment. Risk assessment is not a one-time event. You must monitor and update it over time as technology changes.
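
      The Step 2 formula, Risk = Likelihood × Impact, can be tried out on a small risk register. This is a minimal sketch, not part of the slides: the 1-5 scales, the function name `risk_score`, and the three example entries are all assumptions made for illustration.

```python
def risk_score(likelihood, impact):
    """Combine likelihood and impact into one score (Risk = Likelihood x Impact).

    Assumption: both inputs are rated on a hypothetical 1-5 ordinal scale.
    """
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact

# Invented threat / vulnerability pairs: (description, likelihood, impact).
register = [
    ("Power outage / no backup generator", 1, 4),
    ("Phishing email / untrained staff", 4, 3),
    ("Ransomware / unpatched server", 2, 5),
]

# Rank the entries so the highest risk is handled first.
ranked = sorted(register, key=lambda row: risk_score(row[1], row[2]), reverse=True)
for description, likelihood, impact in ranked:
    print(f"{description}: {risk_score(likelihood, impact)}")
```

      Re-running the ranking whenever a likelihood or impact rating changes is one simple way to act on Step 4.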

    Annotators

    1. What value is printed when the following statement executes?

      Find the largest multiple of 4 that does not exceed 18: 4 × 4 = 16. Then 18 - 16 = 2. So 2 is the remainder and the answer to 18 % 4.
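
      The remainder in 18 % 4 can be checked step by step. A minimal sketch (the variable names are illustrative):

```python
dividend, divisor = 18, 4

quotient = dividend // divisor             # floor division: how many whole 4s fit in 18
remainder = dividend - quotient * divisor  # what is left over after removing them

print(quotient, remainder)         # the two pieces computed by hand
print(dividend % divisor)          # the % operator gives the remainder directly
print(divmod(dividend, divisor))   # divmod returns (quotient, remainder) together
```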

    1. What time is each person's Spanish Language class? Do they have the class at the same time or at a different time?

      What time is each person's Spanish Language class? Do they have the class at the same time or at a different time? 9:00 AM - 11:00 AM. 9:00 AM - 11:00 AM. 12:00 PM - 2:00 PM.

    1. Challenges and Controversies in Parent Education: Analysis and Perspectives

      Executive Summary

      This document summarizes sociologist Claude Martin's analysis of how parent-education practices and policies have evolved in France.

      The initial finding reveals an alarming "scissors effect": an explosion of psychological distress among young people coinciding with a collapse in the supply of care and human support.

      The analysis highlights a major paradigm shift: the move from social determinism (collective and structural) to parental determinism (individual and behavioral).

      This shift has fostered the emergence of a market for parenting advice and of a "positive parenting" movement that, while advocating kindness, imposes new demands for performance and happiness.

      The document also explores the political uses of neuroscience and current controversies over child-rearing methods, concluding with the paradox of the "double bind" facing modern parents.

      --------------------------------------------------------------------------------

      1. The Current Situation: Young People in Distress

      The current situation of children in France is marked by a notable deterioration in mental health, a phenomenon that predates the COVID-19 pandemic but was worsened by it.

      The scissors effect

      The Haut Conseil de la famille, de l'enfance et de l'âge (HCFEA) warns of two concurrent phenomena:

      Explosion in demand: A massive rise in expressions of psychological distress among children and adolescents.

      Collapse in supply: A drastic reduction in care resources (talk therapies, reception facilities) and a crisis in child and adolescent psychiatry.

      The medication response

      For lack of sufficient support structures, the response has shifted toward prescribing psychotropic drugs, with dramatic increases between 2014 and 2021:

      | Type of medication | Increase in prescriptions (2014-2021) |
      | --- | --- |
      | Antidepressants | +63% |
      | Psychostimulants | +78% |
      | Hypnotics and sedatives | +155% |
      | Antipsychotics | +50% |

      The phenomenon of social withdrawal

      The document identifies the emergence in France of social withdrawal (of the hikikomori type), mainly affecting boys of upper-secondary age (15-17).

      This refusal to enter the race for academic success is sometimes analyzed, controversially, through the lens of parental influence (notably mothers deemed excessive or intrusive).

      --------------------------------------------------------------------------------

      2. The Shift in Determinisms

      The sociological approach has changed radically between the 1960s and today.

      From the collective to the individual

      Social determinism (1960s-70s): A child's success or failure was seen as the result of belonging to a social class and of the reproduction of inequalities. It was a matter of collective and political struggle.

      Parental determinism (today): The focus has shifted to the individual behavior of parents.

      The child's difficulties (mental health, academic failure, antisocial behavior) are now attributed to a deficit in "parenting skills".

      The psychologization of public problems

      This individualistic view places greater responsibility on parents, often generating a sense of guilt.

      Authors such as Frank Furedi (Paranoid Parenting) and others speak of "narcissistic parenting", pointing to adults' lack of confidence in the future, which is said to undermine their ability to raise children.

      --------------------------------------------------------------------------------

      3. Historical Evolution of the Control of the Parental Role

      Parent education is not a new concept, but it has gone through several distinct phases:

      1. Late 19th - early 20th century (Hygienism and Protection): The fight against infant mortality and protection against the "dangerous classes".

      The aim then was to teach mothers basic care and to limit absolute paternal power.

      2. The postwar period (The advice market): Emergence of best-selling manuals (Benjamin Spock, Laurence Pernoud, Françoise Dolto).

      This powerful economic sector thrives on parents' anxiety: the more advice they consume, the more disoriented they feel, which fuels further consumption.

      3. 1990s (The invention of "parentalité"): The term parenting (centered on the act and the behavior rather than on status) is translated into French as "parentalité".

      This becomes a fully fledged segment of public policy, aiming to "support" parents but in reality treating them as targets of intervention.

      --------------------------------------------------------------------------------

      4. Neuroscience and "Neuro-parenting"

      The use of neuroscience in parent education has drawn significant criticism, particularly regarding the overinterpretation of scientific data.

      The myth of the first three years: A scientistic fascination with brain imaging led to the idea of a unique "window of opportunity" during the first three years of life.

      This deterministic view presents the baby as a "little computer" whose wiring depends entirely on parental stimuli.

      The stigmatized adolescent: In contrast to the "gold mine" view of the infant brain, the adolescent brain is often presented in public policy as "badly built" or intrinsically problematic, justifying urgent interventions.

      --------------------------------------------------------------------------------

      5. Positive Parenting: Between Kindness and Injunction

      "Positive parenting" has become a dominant current, driven by active lobbying of public authorities.

      The "Time Out" controversy

      A controversy currently pits two views against each other:

      Advocates of structure: Recommend simple methods such as the "Time Out" (sending the child to their room) to manage tantrums.

      Radical advocates of kindness: Equate the "Time Out" with "ordinary educational violence", drawing a continuum between such practices and serious abuses such as infanticide.

      The injunction to be happy

      Modern parenting imposes a "norm under the skin": mothers must not only act well, they must be "authentically happy". A fake smile is perceived as dangerous for the child, creating unbearable psychological pressure on parents.

      --------------------------------------------------------------------------------

      6. Conclusion: The Paradox of the Parental Mission

      The document concludes with the impasse of today's parental "double bind":

      On the one hand: Parents who "do not do enough" are labeled irresponsible or absent.

      On the other: "Helicopter parents" (intensive parenting) are criticized for fostering problematic dependence in the child.

      Claude Martin's analysis suggests that parenting policy should once again become a form of collective and generational support rather than a focus on individual behaviors.

      Child-rearing is a historically situated improvisation; parental models cannot be invariants disconnected from the social context and the limits of each generation.

    1. Summary Brief: User Involvement in Coordinated-Practice Structures

      Summary

      This document summarizes the lessons of the regional webinar on the "User involvement" indicator for Maisons de Santé Pluriprofessionnelles (MSP, multi-professional health centers) and Centres de Santé (CdS, health centers).

      Initially centered on patient satisfaction, this indicator has evolved into a broad lever for transforming the health system, encouraging structures to move from a logic of care "for" the patient to one of care "with" the patient.

      Although optional, this indicator is considered a structuring objective for coordinated practice, and part of the funding from the Assurance Maladie under the Accord Cadre Interprofessionnel (ACI) depends on it.

      In 2024, more than 70% of structures reached level 1 of this indicator, demonstrating growing maturity.

      Moving to level 2, which involves co-decision and a lasting partnership, remains the major challenge for primary care teams.

      1. Strategic Framework and Stakes of the Indicator

      User involvement is no longer seen as an isolated objective but as a cross-cutting approach aimed at improving the effectiveness of care and the match between the health services offered and the real needs of local areas.

      Objectives of the approach

      Improve the quality of care: By integrating the patient's lived expertise (illness, disability).

      Strengthen health democracy: Give users a legitimate voice in co-constructing health actions.

      Evolve the health project: Use user feedback to keep the structure's project evolving.

      Quality of working life (QVT): Partnership is identified as a lever for improving professionals' day-to-day work.

      Funding and Supporting Evidence

      Funding from the Assurance Maladie is conditional on providing convincing supporting documents.

      This requirement is presented not as suspicion but as a guarantee of transparency in the management of public funds.

      New development: Ongoing negotiations suggest the model may change to remove the levels of complexity while retaining the assessment of satisfaction and co-decision.

      Momentum: To be paid, a structure must demonstrate progress in, or a revision of, its tools from one year to the next.

      2. The Philosophy of Partnership in Health

      The move to partnership rests on a paradigm shift, often called the "Montreal model".

      | Model | Approach | Position of the user |
      | --- | --- | --- |
      | Paternalistic | For the patient | Object of care, passive. |
      | Patient-centered | For the patient | At the center of concerns, but excluded from team decisions. |
      | Partnership | With the patient | Member of the team, with recognition of their experiential knowledge. |

      The Continuum of Engagement

      Involvement unfolds in four progressive stages:

      1. Information: Dissemination of public health data or of information on how the structure operates.

      2. Consultation: Collection of opinions (satisfaction questionnaires, suggestion boxes).

      3. Collaboration: Joint work on one-off projects (creating a poster, a themed evening).

      4. Partnership: Co-construction, co-decision, and co-implementation over the long term.

      3. Levels of Attainment and Required Supporting Documents

      The indicator is structured in two cumulative levels for the award of payment.

      Level 1: Information and Consultation

      Actions: Putting in place tools to assess satisfaction and collect needs.

      Supporting documents: Copies of the questionnaires, a summary of results, and an action plan arising from user feedback.

      Annual progression: If the structure remains at level 1, it must prove that the tool has been revised or analyzed again.

      Level 2: Collaboration and Partnership

      Actions: Lasting integration of users into governance or working groups.

      Supporting documents: Designation of a user lead, minutes of co-construction meetings, and a description of the user's actual contribution to decisions.

      Example of the dynamic: "If the structure remains at level 2 the following year, it must evaluate what was done the previous year as part of the collaboration."

      4. The Actors in the Partnership

      The diversity of profiles makes it possible to adapt involvement to the needs of the health project.

      The User: A patient, a person receiving support, or a family caregiver.

      The Partner / Expert Patient: An individual who has developed skills as a result of their illness and can take part in therapeutic patient education (Éducation Thérapeutique du Patient, ETP) or in research.

      The Users' Representative (Représentant des Usagers, RU): A member of an accredited association, trained in the health system and sitting on official bodies.

      The Engaged Citizen: A neighborhood resident who wishes to contribute to the life of the local structure.

      The Health Mediator: Facilitates contact in waiting rooms or at reception.

      Key figure (BVA survey, 2021): 80% of residents of Occitanie want groupings of health professionals to be developed, and 47% say they are ready to get involved with these teams.

      5. Concrete Examples and Resources

      The webinar highlighted successful initiatives illustrating how the indicator can be implemented:

      Therapeutic education (ETP): One MSP brought in an expert patient to rebuild its diabetes program from scratch, significantly increasing patient satisfaction.

      Support groups: In Haute-Garonne, a partner patient and a psychologist co-facilitate a monthly support group on cancer.

      Governance: Although SISA (Sociétés Interprofessionnelles de Soins Ambulatoires) are legally restricted to professionals, user committees can be set up to influence strategic decisions.

      Communication: Use of newsletters, waiting-room notice boards, or "ambassador" videos in which patients explain the structure's care services to their peers.

      Available Resources

      COPS (Centre Opérationnel du Partenariat en Santé): A scheme funded by the ARS Occitanie offering practical fact sheets, a directory of partner patients, and peer mentoring.

      France Assos Santé: Offers free training for users who wish to get involved.

      Haute Autorité de Santé (HAS): Guide on user engagement in primary care structures.

      6. Points of Caution and Obstacles

      Legal and financial status: There is not yet any recognized "profession" status for partner patients. Remuneration remains complicated (micro-enterprise status or volunteering with expense reimbursement).

      Recruitment: It is advisable to recruit a partner patient "like a colleague", on the basis of their skills, interpersonal qualities, and values shared with the team.

      Representativeness: It is illusory to seek perfect statistical representativeness. The aim is to combine a diversity of views and skills.

      Support: Given the absence of a rigid legal framework, structures are encouraged to seek support from third-party facilitators to secure their projects.

    1. from phasic import Graph

      ModuleNotFoundError                       Traceback (most recent call last)
      Cell In[1], line 1
      ----> 1 from phasic.utils import draw_coalescent_tree
            2 import matplotlib.pyplot as plt
            4 draw_coalescent_tree('GGTTTGGGA')

      ModuleNotFoundError: No module named 'phasic'
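
      A `ModuleNotFoundError` like this one means the interpreter running the notebook cell cannot find the package on its import path. One way to confirm that before importing is sketched below. It uses only the standard library; the helper name `is_importable` is hypothetical, and how to install `phasic` is not covered here because the traceback does not say.

```python
import importlib.util

def is_importable(module_name):
    """Return True if a top-level module can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None

# Check the two top-level packages the failing cell depends on.
for name in ("phasic", "matplotlib"):
    status = "available" if is_importable(name) else "missing from this environment"
    print(f"{name}: {status}")
```

      If a package shows as missing even though it was installed, the notebook kernel may be using a different Python environment from the one where the installation ran.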

    1. Attention to Vulnerabilities: An Ethical and Pedagogical Priority

      Executive Summary

      This summary document examines the critical role of attention to vulnerabilities in schools, positioning this approach not only as an ethical obligation but also as a decisive factor in teaching effectiveness.

      The analysis stresses that the teacher-student relationship is inherently asymmetrical, placing the student in a position of exposure to risks ranging from emotional injury to dropping out of school.

      The key points addressed include:

      The redefinition of vulnerability: It is no longer seen as a permanent state of the person but as a situation (temporary or lasting) affecting up to half of the school population over a year.

      The impact of basic needs: Satisfying the needs for competence, autonomy, and affiliation is essential to relational security.

      Combating "Ordinary Pedagogical Violence": Identifying and eliminating micro-violence (verbal, behavioral) is imperative.

      The move to active kindness: Adopting targeted professional practices, such as positive feedback and kind but demanding expectations, correlates directly with student success.

      --------------------------------------------------------------------------------

      1. The Nature of the Pedagogical Relationship: A Fundamental Asymmetry

      The educational relationship is defined by a structural asymmetry. The teacher controls the skills, status, learning objectives, space, and time, while the student is in a position of dependence and is less aware of what is at stake.

      Vulnerability as a Situation

      The term vulnerability (from the Latin vulnus, wound) refers to a fragility that exposes the student to the risk of injury to their rights, their dignity, or, more often, their basic needs.

      Conceptual shift: Current research favors the notion of "situations of vulnerability" over that of "vulnerable persons".

      Types of situations:

      Lasting: Students with disabilities or special educational needs (roughly 470,000 to 800,000 students, including neurodevelopmental profiles, gifted students, and non-native speakers).

      Temporary: Students going through crises that are family-related (separation), economic (a parent's job loss), emotional, or linked to migration.

      It is estimated that nearly 50% of students go through such phases each year.

      --------------------------------------------------------------------------------

      2. Mapping Basic Needs in the School Setting

      To ensure an ethical relationship, the teacher must respond to a multidimensional set of needs.

      | Category of Need | Key Components |
      | --- | --- |
      | Basic needs (Deci & Ryan) | Competence, autonomy, relatedness. |
      | Security and trust | Relational security, self-confidence, trust in the adult and in the institution. |
      | Socialization and fairness | Belonging to the group, need for justice, respect, and consideration. |
      | Support | Need for help, need for time, need for dialogue with the adult. |

      --------------------------------------------------------------------------------

      3. Professional Practices and Levers for Success

      Research, notably John Hattie's meta-analyses, shows that relational factors have an above-average impact on academic success (effect sizes above 0.7, against a benchmark of 0.4 for a meaningful effect).

      Major Lever: Feedback

      Positive feedback is a fundamental lever for meeting the student's need for esteem and security. It should be built into the critical teaching moments:

      • Welcoming students.

      • Starting an activity.

      • Assessment phases (announcement, correction, follow-up).

      • Handling obstacles and mistakes (putting them in perspective).

      Communication and Posture

      Communication has three dimensions:

      1. Verbal: The words used.

      2. Non-verbal: Gestures, facial expressions, position in space.

      3. Paraverbal: Tone, volume, and pace of the voice (crucial to how the student perceives the teacher's satisfaction).

      --------------------------------------------------------------------------------

      4. Ordinary Pedagogical Violence (VPO)

      VPO ("Violence Pédagogique Ordinaire") covers micro-violence that is often unconscious but harmful, and that is now prohibited by the French law of 10 July 2019.

      Manifestations: Shouting, mockery, intimidation, stigmatization, social discrimination, excessive comparisons, or contradictory demands.

      Consequences: Stress, distress, antisocial behavior, and aggression. These behaviors add a further vulnerability to the one already present, creating a vicious circle of failure.

      --------------------------------------------------------------------------------

      5. Toward an Ethics of Active Benevolence

      Ethics is defined here as a mental disposition aimed at seeking the fairest behavior toward the student.

      Distinguishing Between Forms of Benevolence

      Moving from a passive to an active stance is necessary:

      Passive (or minimal) benevolence: Limiting oneself to not hurting the student, and leaving them to face their difficulties alone for lack of time or resources.

      Active benevolence: Characterized by a quality of presence, close support, appropriate expectations, and a genuine interest in the student as a person beyond their results.

      The 5 Modes of Expression (according to Gwénola Reto)

      1. Taking an interest in the student: Encouraging their thinking and accepting their disagreements.

      2. Taking needs into account: Identifying cognitive and basic needs.

      3. Caring about their well-being: Looking after their interest and motivation.

      4. Valuing the person: Separating the individual from their normative results during assessments.

      5. Showing compassion: Being sensitive to the difficulties the student encounters.

      --------------------------------------------------------------------------------

      Conclusion

      Attention to vulnerabilities should not be seen as a lowering of expectations, but as a benevolent form of high expectations.

      By securing the relational framework and responding to psycho-affective needs, the teacher makes academic demands acceptable and productive, ensuring that the student stays "in the game of success".

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study by Howe and colleagues investigates the role of the posterolateral cortical amygdala (plCoA) in mediating innate responses to odors, specifically attraction and aversion. By combining optogenetic stimulation, single-cell RNA sequencing, and spatial analysis, the authors identify a topographically organized circuit within plCoA that governs these behaviors. They show that specific glutamatergic neurons in the anterior and posterior regions of plCoA are responsible for driving attraction and avoidance, respectively, and that these neurons project to distinct downstream regions, including the medial amygdala and nucleus accumbens, to control these responses.

      Strengths:

      The major strength of the study is the thoroughness of the experimental approach, which combines advanced techniques in neural manipulation and mapping with high-resolution molecular profiling. The identification of a topographically organized circuit in plCoA and the connection between molecularly defined populations and distinct behaviors is a notable contribution to understanding the neural basis of innate motivational responses. Additionally, the use of functional manipulations adds depth to the findings, offering valuable insights into the functionality of specific neuronal populations.

      Weaknesses:

      There are some weaknesses in the study's methods and interpretation. The lack of clarity regarding the behavior of the mice during head-fixed imaging experiments raises the possibility that restricted behavior could explain the absence of valence encoding at the population level.

      We agree with the idea that head-fixation may alter the state of the animal and the neural encoding of odor. To address this, we have added further analysis of walking behavior during the imaging sessions (Figure S2). Overall, we could not identify any clear patterns in locomotor behavior that are odor-specific. Moreover, when neural activity was sorted depending on the behavioral state (walking, pausing or fleeing) we didn’t observe any apparent patterns in odor-evoked neural activity. This is now discussed in the Results and Limitations sections of the manuscript.

      Furthermore, while the authors employ chemogenetic inhibition of specific pathways, the rationale for this choice over optogenetic inhibition is not fully addressed, and this could potentially affect the interpretation of the results.

      The rationale was logistical. First, inhibition over a timescale of minutes is problematic because of heat generation during prolonged optical stimulation. Second, our behavioral apparatus has a narrow height between the ceiling and floor, making tethering difficult. This is now explained in the Results section. The trade-off of using chemogenetics is that we are silencing neurons and not specific projections. However, because we find that NAc- and MeA-projecting neurons have little shared collateralization, we believe the conclusion of divergent pathways still stands. This is now discussed in the Limitations section.

      Additionally, the choice of the mplCoA for manipulation, rather than the more directly implicated anterior and posterior subregions, is not well-explained, which could undermine the conclusions drawn about the topographic organization of plCoA.

      We targeted the middle region of plCoA because it contains a mixture of cell types found in both the anterior and posterior plCoA, allowing us to test the hypothesis that cell types, not intra-plCoA location, elicit different responses. Had we targeted the anterior or posterior regions, we would expect to simply recapitulate the result from activation of random cells in each region. As a result, we think stimulation in the middle plCoA is a better test for the contribution of cell types. We have now clarified this in the text.

      Despite these concerns, the work provides significant insights into the neural circuits underlying innate behaviors and opens new avenues for further research. The findings are particularly relevant for understanding the neural basis of motivational behaviors in response to sensory stimuli, and the methods used could be valuable for researchers studying similar circuits in other brain regions. If the authors address the methodological issues raised, this work could have a substantial impact on the field, contributing to both basic neuroscience and translational research on the neural control of behavior.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by the Root laboratory and colleagues describes how the posterolateral cortical amygdala (plCoA) generates valenced behaviors. Using a suite of methods, the authors demonstrate that valence encoding is mediated by several factors, including spatial localization of neurons within the plCoA, glutamatergic markers, and projection. The manuscript shows convincingly that multiple features (spatial, genetic, and projection) contribute to overall population encoding of valence. Overall, the authors conduct many challenging experiments, each of which contains the relevant controls, and the results are interpreted within the framework of their experiments.

      Strengths:

      - For a first submission the manuscript is well constructed, containing many data sets that are clearly presented in spite of the abundance of experimental results.

      - The authors should be commended for their rigorous anatomical characterizations and posthoc analysis. In the field of circuit neuroscience, this is rarely done so carefully, and when it is, often new insights are gleaned as is the case in the current manuscript.

      - The combination of molecular markers, behavioral readouts and projection mapping together substantially strengthen the results.

      - The focus on this relatively understudied brain region in the context of valence is well appreciated, exciting and novel.

      Weaknesses:

      - Interpretation of the calcium imaging data is very limited and requires additional analysis; behavioral responses specific to odors should be considered. If there are neural responses, behavioral epochs and responses to those neuronal responses should be displayed and analyzed.

      We have now considered this, see response above.

      - The effect of odor habituation is not considered.

      We considered this, but we did not find any apparent differences in valence encoding as measured by the proportion of neurons with significant valence scores across trials (see Figure 1J).

      - Optogenetic data in the two subregions rely on very careful viral spread and fiber placement. The current anatomy results should be clear about the spread of virus in the A-P and D-V axes, providing coordinates for this, to assure readers that the specificity of each sub-zone is real.

      We were careful to exclude animals for improper targeting. The spread of virus is detailed in Figures S3, S8 & S9.

      - The choice of behavioral assays across the two regions doesn't seem balanced and would benefit from more congruency.

      The 4-quadrant assay was chosen because this study builds on our prior experiments that demonstrate a role for the plCoA in innate behavior. It is noteworthy that the responses to odor seen in this assay are generally in agreement with other olfactory behavioral assays, so one wouldn’t predict a different result. Moreover, the approach and avoidance responses measured in this assay are precisely the behaviors we wish to understand. We did examine other non-olfactory behavioral readouts (Figures S3, S8), and didn’t observe any effect of manipulation of these pathways.

      - Rationale for some of the choices of photo-stimulation experiment parameters isn't well defined.

      The parameters for photo-stimulation were based on those used in our past work (Root et al., 2014). We used a gradient of frequency from 1-10 Hz based on the idea that odor likely exists in a gradient and this was meant to mimic a potential gradient, though we don’t know if it exists. The range in stimulation frequencies appears to align with the actual rate of firing of plCoA neurons (Iurilli et al., 2017).

      Reviewer #3 (Public review):

      Summary:

      Combining electrophysiological recording, circuit tracing, single cell RNAseq, and optogenetic and chemogenetic manipulation, Howe and colleagues have identified a graded division between anterior and posterior plCoA and determined the molecular characteristics that distinguish the neurons in this part of the amygdala. They demonstrate that the expression of slc17a6 is mostly restricted to the anterior plCoA whereas slc17a7 is more broadly expressed. Through both anterograde and retrograde tracing experiments, they demonstrate that the anterior plCoA neurons preferentially projected to the MEA whereas those in the posterior plCoA preferentially innervated the nucleus accumbens. Interestingly, optogenetic activation of the aplCoA drives avoidance in a spatial preference assay whereas activating the pplCoA leads to preference. The data support a model that spatially segregated and molecularly defined populations of neurons and their projection targets carry valence specific information for the odors. The discoveries represent a conceptual advance in understanding plCoA function and innate valence coding in the olfactory system.

      Strengths:

      The strongest evidence supporting the model comes from single cell RNASeq, genetically facilitated anterograde and retrograde circuit tracing, and optogenetic stimulation. The evidence clearly demonstrates two molecularly defined cell populations with differential projection targets. Stimulating the two populations produced opposite behavioral responses.

      Weaknesses:

      There are a couple of inconsistencies that may be addressed by additional experiments and careful interpretation of the data.

      Stimulating aplCoA or slc17a6 neurons results in spatial avoidance, and stimulating pplCoA or slc17a7 neurons drives approach behaviors. On the other hand, the authors and others in the field also show that there is no apparent spatial bias in odor-driven responses associated with odor valence. This discrepancy may be addressed better. A possibility is that odor-evoked responses are recorded from populations outside of those defined by slc17a6/a7. This may be addressed by marking activated cells and identifying their molecular markers. A second possibility is that optogenetic stimulation activates a broad set of neurons and does not recapitulate the sparseness of odor responses. It is not known whether sparse activation by optogenetic stimulation can still drive approach or avoidance behaviors.

      We agree that marking specific genetically defined or projection-defined neurons could help to clarify whether some neurons have more selective valence responses. However, we are not able to perform these experiments at the moment. We have included new data demonstrating that sparser optogenetic activation evokes behaviors similar in magnitude to those evoked by broader activation (see Figure S4).

      The authors show that inhibiting slc17a7 neurons blocks approach behaviors toward 2-PE. Consistent with this result, inhibiting NAc projection neurons also inhibits approach responses. However, inhibiting aplCoA or slc17a6 neurons does not reduce the aversive response to TMT, but blocking MEA projection neurons does. The latter two pieces of evidence are not consistent with each other. One possibility is that the MEA-projecting neurons may not be expressing slc17a6. It is not clear from the retrograde labeling experiments what percentage of MEA- and NAC-projecting neurons express slc17a6 and slc17a7. It is possible that neurons expressing neither VGluT1 nor VGluT2 could drive aversive or appetitive responses. This possibility may also explain why silencing slc17a6 neurons does not block avoidance.

      We have now performed RNAscope staining on retrograde tracing to better define this relationship. Although the VGluT2 and VGluT1 neurons have biased projections to the MeA and NAc, respectively, there is some nuance detailed in Figure S10. Generally, MeA-projecting neurons are predominantly VGluT2+, whereas about 20% of NAc-projecting neurons express both. Some (less than 35%) retrogradely labeled neurons were not detected as VGluT1 or VGluT2 positive, suggesting that other populations could also contribute. We agree that the discrepancy between MeA-projection and VGluT2 silencing is likely due to incomplete targeting of the MeA-projecting population with the VGluT2-cre line. This is included in the Discussion section.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Main:

      (1) For the head-fixed imaging experiments, what is the behavior of the mice during odor exposure? Could the weak reliability of individual neurons be due to a lack of approach or avoidance behavior? Could restricted behavior also explain the lack of valence encoding at the population level?

      We agree that this is a limitation of head-fixed recordings. In the revised manuscript we did attempt to characterize their behavioral response, and look for correlations in odor representation. Although we did find different patterns of odor-evoked walking behavior, these patterns were not reliable or specific to particular odors (Figure S2). For example, one might expect aversive odors to pause walking or elicit a fast fleeing-like response, but we did not observe any apparent differences for locomotion between odors as all odors evoked a mixture of responses (Figure S2A-D, text lines 208-232). We then examined responses to odor depending on the behavioral state (walking, pausing or fleeing) and didn’t observe any apparent patterns in odor responses (Figure S2E,F). Lastly, we acknowledge in the text that the lack of valence encoding may be an artifact of head-fixation (see lines 849-857).

      (2) For the optogenetic manipulations of Vglut1 and Vglut2 neurons, why was the injection and fiber targeted to the medial portion of the plCoA, if the hypothesis was that these glutamatergic neuron populations in different regions (anterior or posterior) are responsible for approach and avoidance? 

      We targeted the middle region of plCoA because it contains a mixture of cell types found in both the anterior and posterior plCoA, allowing us to test the hypothesis that cell types, not intra-plCoA location, elicit different responses. Had we targeted the anterior or posterior regions, we would expect to simply recapitulate the result from activation of random cells in each region. As a result, we think stimulation in the middle plCoA is a better test for the contribution of cell types. We have clarified this in the text (Lines 417-419).

      Could this explain the lack of necessity with the DREADD experiments? 

      For the loss-of-function experiments, a larger volume of virus was injected to cover a larger area, and we did confirm targeting of the appropriate areas. However, it is always possible that the lack of necessity is due to incomplete silencing.

      Further, why was an optogenetic inhibition approach not utilized? 

      Although optogenetic inhibition could have plausibly been used instead, we chose chemogenetic inhibition for two reasons. First, for minutes-long periods of inhibition, optical illumination poses the risk of introducing heat-related effects (Owen et al., 2019). In fact, we first tried optical inhibition but controls exhibited unusually large variance. Second, it is more feasible in our assay, which has a narrow height between the floor and lid that complicates tethering to an optic fiber. Past experiments overcame this with a motorized fiber retraction system (Root et al., 2014), but this is highly variable with user-dependent effects, so we found chemogenetics to be a more practical strategy. We have added a sentence to explain the rationale (see lines 561-563).

      (3) The specific subregion of the nucleus accumbens that was targeted should be named, as distinct parts of the nucleus accumbens can have very different functions. 

      We attempted to define specific subregions of the nucleus accumbens and found that the plCoA projection is not specific to the shell or core, or to the anterior or posterior part; rather, it broadly innervates the entire structure. We have added a note about this in the manuscript (see lines 470-471). Given that we did not find notable subregion-specific outputs within the NAc, targeting was directed to the middle region of NAc, with coordinates stated in the methods.

      (4) Why was an intersectional DREADD approach used to inhibit the projection pathways, as opposed to optogenetic inhibition? The DREADD approach could potentially affect all projection targets, and the authors might want to address how this could influence the interpretation of the results.

      This is partly addressed above in point 2. As for interpretation, we acknowledge that the intersectional approach silences the neurons projecting to a given target and not the specific projection, and we have been careful with the wording. Although this may complicate the conclusion, we did map the collaterals for NAc- and MeA-projecting neurons and find that neurons do not appreciably project to both targets and have minimal projections to other targets. We have now taken care to state that we silence the neurons projecting to a structure, not the projection itself, and we acknowledge this caveat. However, since the MeA- and NAc-projecting neurons appear to be distinct from each other (largely not collateralizing to each other), the conclusion that these divergent pathways are required still stands. We have added discussion of this in the Limitations section (see lines 859-863).

      Minor:

      (1) Line 402 needs a reference.

      We have added the missing reference (now line 441).

      (2) The Supplemental Figure labeling in the main text should be checked carefully.

      Thank you for pointing this out. We have fixed the prior errors.

      (3) Panel letter D is missing from Figure 2.

      This has been fixed.

      Reviewer #2 (Recommendations for the authors):

      Major Concerns, additional experiments:

      - In the calcium imaging experiments mice were presented with the same odor many times. Overall responses to odor presentations were quite variable and appear to habituate dramatically (Figure S1F). The general conclusion from these experiments is a lack of consistent valence-specific responses of individual neurons, but I wonder if this conclusion is slightly premature. A few potential explanatory factors may need additional attention. First, despite recording video of the mouse's face during experiments, no behavioral response to any odor is described. Is it possible these odors when presented in head-fixed conditions do not have the same valence?

      Yes, we agree that this is a possibility. We have added a discussion in the Limitations section (see lines 849-857). We have also added additional behavioral analysis discussed below.

      On trials with neural responses are there behavioral responses that could be quantified? 

      We have now added data in which we attempt to characterize their behavioral response, to look for correlations in odor representation (see lines 208-228). Although we did observe different patterns of odor-evoked walking behavior, these patterns were not reliable or specific to particular odors (Figure S2). One might expect aversive odors to pause walking or elicit a fast fleeing-like response, but we did not observe any apparent differences for locomotion between odors (Figure S2A-D). Next, we examined responses to odor depending on the behavioral state (walking, pausing or fleeing) and didn’t observe any meaningful differences in odor responses (Figure S2E,F). Lastly, we acknowledge that the odor representation may be different in freely moving animals that exhibit dynamic responses to odor (see lines 849-857).

      - Habituation seems to play a prominent role in the neural signals, is there a larger contribution of valence if you look only at the first delivery (or some subset of the 20 presentations) of an odor type for a given trial? 

      Indeed, we considered this, but we did not find any apparent differences in valence encoding as measured by the proportion of neurons with significant valence scores across trials (see Figure 1J).

      - Is it reasonable to exclude valence encoding as a possibility when largely neurons were unresponsive to the positive valence odors (2PE and peanut) chosen when looking at the average cluster response (Figure 1F)? 

      It is true that we see fewer neurons responding to the appetitive odors (Figure 1H) and smaller average responses within the cluster, but some neurons do respond robustly. If these were valence responses, we would predict that neural responses should be similarly selective, but we do not observe any such selectivity. The sparseness of responses to appetitive odors does cause the average cluster analysis (Figure 1F) to show muted responses to these odors, consistent with the decreased responsivity to appetitive odors. Moreover, single neuron response analysis reveals that a given neuron is not more likely to respond to appetitive or aversive odors with any selectivity greater than chance. For these reasons, we think it is reasonable to conclude an absence of valence responses, which is consistent with the conclusion from another report (Iurilli et al., 2017).

      - The preference and aversion assay with 4 corners is an interesting set-up and provides a lot of data for this particular manuscript, but it would be helpful to test additional behaviors to determine whether these circuits are more conserved. As it stands the current manuscript relies on very broad claims using a single behavioral readout. Some attempt to use head-fixed approaches with more defined odor delivery timelines and/or additional valenced behavioral readouts is warranted.

      We appreciate the suggestion, but are not able to perform these experiments at the moment. The 4-quadrant assay was chosen because it builds on our prior experiments that demonstrate a role for the plCoA in innate behavior. It is noteworthy that the responses to odor seen in this assay are generally in agreement with other olfactory behavioral assays, so one wouldn’t predict a different result. The approach and avoidance responses measured in this assay are precisely the behaviors we wish to understand. Moreover, we did examine other non-olfactory behavioral readouts (Figures S3, S8), and didn’t observe any effect of manipulation of these pathways. Lastly, we have tried to define parameters for head-fixed behavior that would permit correlation of neural responses with behavior, including longer stimulations and closed-loop locomotion control of odor concentration, but were unsuccessful at establishing parameters that generated reliable behavioral responses. We acknowledge that one limitation of the study is the limited behavioral tests with two odors and whether the circuits are more broadly necessary for other odors.

      Minor comments:

      • Please define PID in the Results when it is first introduced.

      Done (see line 154)

      • Line 412 Figure S5C-N should be Figure S6C-N.

      Fixed. Now Figure S8C-N due to additional figures (see line 451).

      • Throughout the Discussion it would be helpful if the authors referred to specific Figure panels that support their statements (e.g. lines 654-656 "[...] which is supported by other findings presented here showing that both VGluT2+ and VGluT1+ neurons project to MeA, while the projection to NAc is almost entirely composed of VGluT1+ neurons").

      Thank you for the suggestion. We have added figure references in the discussion.

      • Line 778 "producing" should be "produce".

      Corrected (see line 840)

      • The figures are very busy, especially all the manipulations. The authors are commended for including each data point, but they might consider a more subtle design (translucent lines only for each animal, and one mean dot for the SEM), just to reduce the overall clutter of an already overwhelming figure set. But this is ultimately left to the authors to resolve and style to their liking. 

      Thank you for the suggestion. We have tried some different styles but like the original best.

      Reviewer #3 (Recommendations for the authors):

      If within reach, I suggest that the authors determine the percentage of neurons retrogradely labeled from NAc or MEA that express VGluT1 and VGluT2.

      We have done this for the middle region of plCoA, which has the greatest mixture of cell types (see Figure S10, lines 504-517). We find that the MeA-projecting neurons are mostly VGluT2+ with a minority that express both VGluT1 and VGluT2. NAc-projecting neurons are primarily VGluT1+ with about 20% expressing VGluT2 as well.

      It would also be nice to sparsely label aplCoA and pplCoA using ChR2 to see if sparse activation drives approach or avoidance.

      We agree that it would be useful to vary the sparseness of the ChR2 expression, to see if it produces similar results. We examined this using sparsely labeled odor ensembles, as previously done (Root et al., 2014). Briefly, we used the Arc-CreER mouse to label TMT-responsive neurons with a cre-dependent ChR2 AAV vector targeted to the anterior or posterior regions, while previously we had broadly targeted the entirety of plCoA. We had established that this labeling method captures about half of the active cells detected by Arc expression, which is on the order of hundreds of neurons rather than the thousands labeled by broad cre-independent expression. Remarkably, we get effects similar in magnitude that are not significantly different from those with broader activation of the anterior or posterior domains (see new Figure S4, lines 267-288). It still remains possible that there is a threshold number of neurons that are necessary to elicit behavior, but that is beyond the scope of the current study. However, these data indicate that the effect of activating anterior and posterior domains is not an artifact of broad stimulation.

    1. Reviewer #1 (Public review):

      Summary:

      This study set out to investigate potential pharmacological drug-drug interactions between the two most common antimalarial classes, the artemisinins and quinolines. There is strong rationale for this aim, because drugs from these classes are already widely-used in Artemisinin Combination Therapies (ACTs) in the clinic, and drug combinations are an important consideration in the development of new medicines. Furthermore, whilst there is ample literature proposing many diverse mechanisms of action and resistance for the artemisinins and quinolines, it is generally accepted that the mechanisms for both classes involve heme metabolism in the parasite, and that artemisinin activity is dependent on activation by reduced heme. The study was designed to measure drug-drug interactions associated with a short pulse exposure (4 h) that is reminiscent of the short duration of artemisinin exposure obtained after in vivo dosing. Clear antagonism was observed between dihydroartemisinin (DHA) and chloroquine, which became even more extensive in chloroquine-resistant parasites. Antagonism was also observed in this assay for the more clinically-relevant ACT partner drugs piperaquine and amodiaquine, but not for other ACT partners mefloquine and lumefantrine, which don't share the 4-aminoquinoline structure or mode of action. Interestingly, chloroquine induced an artemisinin resistance phenotype in the standard in vitro Ring-stage Survival Assay, whereas this effect was not as extensive for piperaquine.

      The authors also utilised a heme-reactive probe to demonstrate that the 4-aminoquinolines can inhibit heme-mediated activation of the probe within parasites, which suggests that the mechanism of antagonism involves the inactivation of heme, rendering it unable to activate the artemisinins. Measurement of protein ubiquitination showed reduced DHA-induced protein damage in the presence of chloroquine, which is also consistent with decreased heme-mediated activation, and/or with decreased DHA activity more generally.

      Overall, the study clearly demonstrates a mechanistic antagonism between DHA and 4-aminoquinoline antimalarials in vitro. It is interesting that this combination is successfully used to treat millions of malaria cases every year, which may raise questions about the clinical relevance of this finding. However, the conclusions in this paper are supported by multiple lines of evidence and the data is clearly and transparently presented, leaving no doubt that DHA activity is compromised by the presence of chloroquine in vitro. It is perhaps fortunate that the clinical dosing regimens of 4-aminoquinoline-based ACTs have been sufficient to maintain clinical efficacy despite the non-optimal combination. Nevertheless, optimisation of antimalarial combinations and dosing regimens is becoming more important in the current era of increasing resistance to artemisinins and 4-aminoquinolines. Therefore, these findings should be considered when proposing new treatment regimens (including Triple-ACTs) and the assays described in this study should be performed on new drug combinations that are proposed for new or existing antimalarial medicines.

      Strengths:

      This manuscript is clearly written and the data presented is clear and complete. The key conclusions are supported by multiple lines of evidence, and most findings are replicated with multiple drugs within a class, and across multiple parasite strains, thus providing more confidence in the generalisability of these findings across the 4-aminoquinoline and peroxide drug classes.

      A key strength of this study was the focus on short pulse exposures to DHA (4 h in trophs and 3 h in rings), which is relevant to the in vivo exposure of artemisinins. Artemisinin resistance has had a significant impact on treatment outcomes in South-East Asia, and is now emerging in Africa, but is not detected using a 'standard' 48 or 72 h in vitro growth inhibition assay. It is only in the RSA (a short pulse of 3-6 h treatment of early ring stage parasites) that the resistance phenotype can be detected in vitro. Therefore, assays based on this short pulse exposure provide the most relevant approach to determine whether drug-drug interactions are likely to have a clinically-relevant impact on DHA activity. These assays clearly showed antagonism between DHA and 4-aminoquinolines (chloroquine, piperaquine, amodiaquine and ferroquine) in trophozoite stages. Interestingly, whilst chloroquine clearly induced an artemisinin-resistant phenotype in the RSA, piperaquine only had a minor impact on the early ring stage activity of DHA, which may be fortunate considering that piperaquine is a currently recommended DHA partner drug in ACTs, whereas chloroquine is not.

      The evaluation of additional drug combinations at the end of this paper is a valuable addition, which increases the potential impact of this work. The finding of antagonism between piperaquine and OZ439 in trophozoites is consistent with the general interactions observed between peroxides and 4-aminoquinolines, and it may be interesting to see whether piperaquine impacts the ring-stage activity of OZ439.

      The evaluation of reactive heme in parasites using a fluorescent sensor, combined with the measurement of K48-linked ubiquitin, further supports the findings of this study, providing independent read-outs for the chloroquine-induced antagonism.

      The in-depth discussion of the interpretation and implications of the results is an additional strength of this manuscript. Whilst the discussion section is rather lengthy, there are important caveats to the interpretation of some of these results, and clear relevance to the future management of malaria that require these detailed explanations.

      Overall, this is a high quality manuscript describing an important study that has implications for the selection of antimalarial combinations for new and existing malaria medicines.

      Weaknesses:

      This study is an in vitro study of parasite cultures, and therefore caution should be taken when applying these findings to decisions about clinical combinations. The drug concentrations and exposure durations in these assays are intended to represent clinically relevant exposures, although it is recognised that the in vitro system is somewhat simplified and there may be additional factors that influence in vivo activity. This limitation is reasonably well acknowledged in the manuscript.

      It is also important to recognise that the majority of the key findings regarding antagonism are based on trophozoite-stage parasites, and one must show caution when generalising these findings to other stages or scenarios. For example, piperaquine showed clear antagonism in trophozoite stages, but minimal impact in ring stages under these assay conditions.

      A key limitation is the interpretation of the mechanistic studies that implicate heme-mediated artemisinin activation as the mechanism underpinning antagonism by chloroquine. This study did not directly measure the activation of artemisinins. The data obtained from the activation of the fluorescent probe are generally supportive of chloroquine suppressing the heme-mediated activation of artemisinins, and I think this is the most likely explanation, but there are significant caveats to consider. Primarily, the inconsistency between the fluorescence profile in the chemical reactions and the cell-based assay raises questions about the accuracy of this readout. In the chemical reaction, mefloquine and chloroquine showed identical inhibition of fluorescence, whereas piperaquine had minimal impact. By contrast, in the cell, chloroquine and piperaquine had similar impacts on fluorescence, but mefloquine had minimal impact. This inconsistency indicates that the cellular fluorescence based on this sensor does not give a simple direct readout of the reactivity of ferrous heme, and therefore, these results should be interpreted with caution. Indeed, the relationship between fluorescence and antagonism for the tested drugs is a correlation, not causation. There could be several reasons for the disconnect between the chemical and biological results, either via additional mechanisms that quench fluorescence, or the presence of biomolecules that alter the oxidation state or coordination chemistry of heme or other potential catalysts of this sensor. It is possible that another factor that influences the H-FluNox fluorescence in cells also influences the DHA activity in cells, leading to the correlation with activity. It should be noted that H-FluNox is not a chemical analogue of artemisinins. Its activation relies on Fenton-like chemistry, but with an N-O rather than an O-O bond, and it possesses very different steric and electronic substituents around the reactive centre, which are known to alter reactivity to different iron sources. Despite these limitations, the authors have provided reasonable justification for the use of this probe to directly visualise heme reactivity in cells, and the results are still informative.

      Another interesting finding that was not elaborated by the authors is the impact of chloroquine on the DHA dose-response curves from the ring stage assays. Detection of artemisinin resistance in the RSA generally focuses on the % survival at high DHA concentrations (700 nM), as there is minimal shift in the IC50 (see Fig 2). However, chloroquine clearly induces a shift in the IC50 (~5-fold), where the whole curve is shifted to the right, whereas the increase in % survival is relatively small. This different profile suggests that the mechanism of chloroquine-induced antagonism may be different to the mechanism of artemisinin resistance. Current evidence regarding the mechanism of artemisinin resistance generally points towards decreased heme-mediated drug activation due to a decrease in hemoglobin uptake, which should be analogous to the decrease in heme-mediated drug activation caused by chloroquine. However, these different dose response curves suggest different mechanisms are primarily responsible. Additional mechanisms have been proposed for artemisinin resistance, involving redox or heat stress responses, proteostatic responses, mitochondrial function, dormancy and PI3K signalling among others. Whilst the H-FluNox probe generally supports the idea that chloroquine suppresses heme-mediated DHA activation, it remains plausible that chloroquine could induce these, or other, cellular responses that suppress DHA activity.

      Impact:

      This study has important implications for the selection of drugs to form combinations for the treatment of malaria. The overall findings of antagonism between peroxide antimalarials and 4-aminoquinolines in the trophozoite stage are robust, and this carries across to the ring stage for chloroquine.

      The manuscript also provides a plausible mechanism to explain the antagonism, although future work will be required to further explore the details of this mechanism and to rule out alternative factors that may contribute.

      Overall, this is an important contribution to the field and provides a clear justification for the evaluation of potential drug combinations in relevant in vitro assays before clinical testing.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript by Rosenthal and Goldberg investigates interactions between artemisinins and their quinoline partner drugs currently used for treating uncomplicated Plasmodium falciparum malaria. The authors show that chloroquine (CQ), piperaquine, and amodiaquine antagonize dihydroartemisinin (DHA) activity, and in CQ-resistant parasites, the interaction is described as "superantagonism," linked to the pfcrt genotype. Mechanistically, application of the heme-reactive probe H-FluNox indicates that quinolines render cytosolic heme chemically inert, thereby reducing peroxide activation. The work is further extended to triple ACTs and ozonide-quinoline combinations, with implications for artemisinin-based combination therapy (ACT) design, including triple ACTs.

      Strengths:

      The manuscript is clearly written, methodologically careful, and addresses a clinically relevant question. The pulsing assay format more accurately models in vivo artemisinin exposure than conventional 72-hour assays, and the use of H-FluNox and Ac-H-FluNox probes provides mechanistic depth by distinguishing chemically active versus inert heme. These elements represent important refinements beyond prior studies, adding nuance to our understanding of artemisinin-quinoline interactions.

      Weaknesses:

      Several points warrant consideration. The novelty of the work is somewhat incremental, as antagonism between artemisinins and quinolines is well established. Multiple prior studies using standard fixed-ratio isobologram assays have shown that DHA exhibits indifferent or antagonistic interactions with chloroquine, piperaquine, and amodiaquine (e.g., Davis et al., 2006; Fivelman et al., 2007; Muangnoicharoen et al., 2009), with recent work highlighting the role of parasite genetic background, including pfcrt and pfmdr1, in modulating these interactions (Eastman et al., 2016). High-throughput drug screens likewise identify quinoline-artemisinin combinations as mostly antagonistic. The present manuscript adds refinement by applying pulsed-exposure assays and heme probes rather than establishing antagonism de novo.

      The dataset focuses on several parasite lines assayed in vitro, so claims about broad clinical implications should be tempered, and the discussion could more clearly address how in vitro antagonism may or may not translate to clinical outcomes. The conclusion that artemisinins are predominantly activated in the cytoplasm is intriguing but relies heavily on Ac-H-FluNox data, which may have limitations in accessing the digestive vacuole and should be acknowledged explicitly. The term "superantagonism" is striking but may appear rhetorical; clarifying its reproducibility across replicates and providing a mechanistic definition would strengthen the framing. Finally, some discussion points, such as questioning the clinical utility of DHA-PPQ, should be moderated to better align conclusions with the presented data while acknowledging the complexity of in vivo pharmacology and clinical outcomes.

      Despite these mild reservations, the data are interesting and of high quality and provide important new information for the field.

      Editor's Review of the Revision: The authors have provided a well-reasoned rebuttal to the comments of the three reviewers. Most of the changes were incorporated in their revised Discussion. Their data with the active heme probe H-FluNox are novel and the authors reveal interesting interactions between peroxide and 4-aminoquinoline-based antimalarials that open new avenues of research especially when considering antimalarial combinations that combine these chemical scaffolds. This study will be of broad interest to investigators studying and developing antimalarial drugs and combinations and the impact of Plasmodium falciparum resistance mechanisms. A minor recommendation would be that the authors state H-FluNox when referring to their small molecule probe in the abstract, so that it is captured in PubMed searches.

    3. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      We appreciate the positive assessment. We recognize that since all of the work in this manuscript was done in vitro, there are reasonable concerns about the translatability of these data to clinical settings. These results should not directly inform malaria policy, but we hope that these data bring new considerations to the approach for choosing strategic antimalarial combinations. We have modified the manuscript to clarify this distinction.

      Public Reviews

      Reviewer #1 (Public Review):

      We thank the reviewer for their thoughtful summary of this manuscript. It is important to note that DHA-PPQ did show antagonism in RSAs. In this modified RSA, 200 nM PPQ alone inhibited growth of PPQ-sensitive parasites by approximately 20%. If DHA and PPQ were additive, then we would expect that addition of 200 nM PPQ would shift the DHA dose response curve to the left and result in a lower DHA IC50. Please refer to Figure 4a and b as examples of additive relationships in dose-response assays. We observed no significant shift in IC50 values between DHA alone and DHA + PPQ. This suggests antagonism, albeit not to the extent seen with CQ. We have modified the manuscript to emphasize this point. As the reviewer pointed out, it is fortunate that despite being antagonistic, clinically used artemisinin-4-aminoquinoline combinations are effective, provided that parasites are sensitive to the 4-aminoquinoline. It is possible that superantagonism is required to observe a noticeable effect on treatment efficacy (Sutherland et al. 2003 and Kofoed et al. 2003), but that classical antagonism may still have silent consequences. For example, if PPQ blocks some DHA activation, this might result in DHA-PPQ acting more like a pseudo-monotherapy. However, as the reviewer pointed out, while our data suggest that DHA-PPQ and AS-ADQ are “non-optimal” combinations, the clinical consequences of these interactions are unclear. We have modified the manuscript to emphasize the latter point.
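
      The additive expectation described above can be illustrated with a minimal sketch. It assumes Loewe additivity and a Hill dose-response model with a slope of 1 for both drugs; neither assumption is stated in the response, and the DHA IC50 of 10 nM is a hypothetical placeholder rather than a value from the manuscript. Only the 200 nM PPQ dose and its roughly 20% inhibition come from the text.

```python
# Sketch only: Loewe additivity with Hill curves (slope 1) is an assumption,
# not the authors' stated model. Function names are hypothetical.

def ic50_from_point(dose, effect, hill=1.0):
    """Back-calculate a single-drug IC50 from one (dose, fractional inhibition) point."""
    return dose / (effect / (1.0 - effect)) ** (1.0 / hill)


def loewe_expected_ic50(ic50_a_alone, dose_b, ic50_b_alone):
    """Expected IC50 of drug A in the presence of a fixed dose of drug B.

    Loewe additivity at 50% effect: d_A / IC50_A + d_B / IC50_B = 1.
    """
    fraction_b = dose_b / ic50_b_alone
    if fraction_b >= 1.0:
        return 0.0  # drug B alone already reaches 50% inhibition
    return ic50_a_alone * (1.0 - fraction_b)


# 200 nM PPQ gives ~20% inhibition (from the text); slope 1 is assumed.
ppq_ic50 = ic50_from_point(200.0, 0.20)
# Hypothetical DHA IC50 of 10 nM, for illustration only.
expected_dha_ic50 = loewe_expected_ic50(10.0, 200.0, ppq_ic50)
print(ppq_ic50, expected_dha_ic50)
```

      Under these assumptions an additive pair would lower the DHA IC50 by about a quarter, so an unchanged IC50 would sit on the antagonistic side of the additive expectation. The size of the expected shift depends strongly on the assumed Hill slope.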

      While the Ac-H-FluNox and ubiquitin data point to a likely mechanism for DHA-quinoline antagonism, we agree that there are other possible mechanisms to explain this interaction. We have addressed this limitation in the discussion section. Though we tried to measure DHA activation in parasites directly, these attempts were unsuccessful. We acknowledge that the chemistry of DHA and Ac-H-FluNox activation is not identical and that caution should be taken when interpreting these data. Nevertheless, we believe that Ac-H-FluNox is the best currently available tool to measure “active heme” in live parasites and is the best available proxy to assess DHA activation in live parasites. These points are now addressed in the discussion section. Both in vitro and in-parasite studies point to a role for CQ in modulating heme, though an exact mechanism will require further examination. Similar to the reviewer, we were perplexed by the differences observed between in vitro and in-parasite assays with PPQ and MFQ. We proposed possible hypotheses to explain these discrepancies in the discussion section. Interestingly, our data correlate well with hemozoin inhibition assays in which all three antimalarials inhibit hemozoin formation in solution, but only CQ and PPQ inhibit hemozoin formation in parasites. In both assays, in-parasite experiments are likely to be more informative for mechanistic assessment.

      It remains unclear why K13 genotype influences RSA values, but not early ring DHA IC50 values. In K13<sup>WT</sup> parasites, both RSA values and DHA IC50 values were increased 3-5 fold upon addition of CQ. This suggests that CQ-mediated resistance is more robust than that conferred by K13 genotype. However, this does not necessarily suggest a different resistance mechanism. We acknowledge that in addition to modulating heme, it is possible that CQ may enhance DHA survival by promoting parasite stress responses. Future studies will be needed to test this alternative hypothesis. This limitation has been acknowledged in the manuscript. We have also addressed the reviewer’s point that other factors, including poor pharmacokinetic exposure, contributed to OZ439-PPQ treatment failure.

      Reviewer #2 (Public Review):

      We appreciate the positive feedback. We agree that there have been previous studies, many of which we cited, assessing interactions of these antimalarials. We also acknowledge that previous work, including our own, has shown that parasite genetics can alter drug-drug interactions. We have added the reviewer’s recommended citations to the list of references that we cited. Importantly, our work was unique not only for utilizing a pulsing format, but also for revealing a superantagonistic phenotype, assessing interactions in an RSA format, and investigating a mechanism to explain these interactions. We agree with the reviewer that implications from this in vitro work should be drawn cautiously, but hope that this work contributes another dimension to critical thinking about drug-drug interactions for future combination therapies. We have modified the manuscript to temper any unintended recommendations or implications.

      The reviewer notes that we conclude “artemisinins are predominantly activated in the cytoplasm”. We recognize that the site of artemisinin activation is contentious. We were very clear to state that our data combined with others suggest that artemisinins can be activated in the parasite cytoplasm. We did not state that this is the primary site of activation. We were clear to point out that technical limitations may prevent Ac-H-FluNox signal in the digestive vacuole, but determined that low pH alone could not explain the absence of a digestive vacuole signal.

      With regard to the “reproducibility” and “mechanistic definition” of superantagonism, we observed what we defined as a one-sided superantagonistic relationship for three different parasites (Dd2, Dd2 PfCRT<sup>Dd2</sup>, and Dd2 K13<sup>R539T</sup>) for a total of nine independent replicates. In the text, we define these isoboles as unique in that they had mean ΣFIC50 values > 2.4 and peak ΣFIC50 values > 4, with points extending upward instead of curving back to the axis. As further evidence of the reproducibility of this relationship, we show that CQ has a significant rescuing effect on parasite survival to DHA as assessed by RSAs and IC50 values in early rings.
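
      The numerical part of this definition can be sketched as follows. The sketch assumes the conventional definition of ΣFIC50 (the combination IC50 of each drug divided by its IC50 alone, summed over both drugs) and uses hypothetical IC50 values; the function names are illustrative, and the shape criterion (points extending upward rather than curving back to the axis) is not encoded.

```python
# Sketch only: function names and example IC50 values are hypothetical.
# The cutoffs (mean > 2.4, peak > 4) are the ones quoted in the response.
from statistics import mean


def sum_fic50(ic50_a_combo, ic50_a_alone, ic50_b_combo, ic50_b_alone):
    """Sum of fractional inhibitory concentrations for one fixed-ratio point."""
    return ic50_a_combo / ic50_a_alone + ic50_b_combo / ic50_b_alone


def summarize_isobole(points, ic50_a_alone, ic50_b_alone):
    """Return mean and peak sum-FIC50 over (ic50_a_combo, ic50_b_combo) points."""
    values = [sum_fic50(a, ic50_a_alone, b, ic50_b_alone) for a, b in points]
    return {"mean": mean(values), "peak": max(values)}


def meets_superantagonism_cutoffs(summary, mean_cutoff=2.4, peak_cutoff=4.0):
    """Check only the numerical cutoffs; isobole shape must be judged separately."""
    return summary["mean"] > mean_cutoff and summary["peak"] > peak_cutoff


# Hypothetical fixed-ratio points with single-drug IC50s of 10 and 100 (arbitrary units).
antagonistic = summarize_isobole([(15.0, 50.0), (20.0, 150.0), (25.0, 200.0)], 10.0, 100.0)
additive = summarize_isobole([(5.0, 50.0), (2.5, 75.0), (7.5, 25.0)], 10.0, 100.0)
print(antagonistic, meets_superantagonism_cutoffs(antagonistic))
print(additive, meets_superantagonism_cutoffs(additive))
```

      Keeping the cutoffs as parameters makes it straightforward to compare classifications under other thresholds used in the literature.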

      Reviewer #3 (Public Review):

      We thank the reviewer for their positive feedback. We acknowledge that no combinations tested in this manuscript were synergistic. However, two combinations, DHA-MFQ and DHA-LM, were additive, which provides context for interpreting antagonistic relationships. We have previously reported synergistic and additive isobolograms for peroxide-proteasome inhibitor combinations using this same pulsing format (Rosenthal and Ng 2021). These published results are now cited in the manuscript.

      We believe that these findings are specific to 4-aminoquinoline-peroxide combinations, and that these findings cannot be generalized to antimalarials with different mechanisms of action. Note that the aryl amino alcohols, MFQ and LM, were additive with DHA. Since the mechanisms of action of MFQ and LM are poorly understood, it is difficult to speculate on a mechanism underlying these interactions.

      We agree with the reviewer that while the heme probe may provide some mechanistic insight to explain DHA-quinoline interactions, there is much more to learn about CQ-heme chemistry, particularly within parasites.

      The focus of this manuscript was to add a new dimension to considerations about pairings for combination therapies. It is outside the scope of this manuscript to suggest alternative combinations. However, we agree that synergistic combinations would likely be more strategic clinically.

      An in vitro setup allows us to eliminate many confounding variables in order to directly assess the impact of partner drugs on DHA activity. However, we agree that in vivo conditions are considerably more complex, and we explicitly state this in the manuscript.

      We agree that in the future, modeling studies could provide insight into how antagonism may contribute to real-world efficacy. This is outside the scope of our studies.

      Recommendations for the Authors:

      Reviewer #1 (Recommendations for the Authors):

      The key weaknesses identified in this manuscript are described in the 'weaknesses' section of the public review. The major one is the inconsistency around the H-FluNox response in the chemical vs biological experiments. I can't think of a simple experiment to resolve this issue, but it is good that this data is openly provided in the manuscript. I believe there could be more discussion to clarify this limitation with the current study, and the conclusions, and particularly the title, should be softened regarding the mechanism of antagonism being based on heme reactivity.

      We have softened the title and conclusions to take into account the limitations of our studies.

      (1) Please double-check the definitions for isobologram interpretation. In most antimicrobial interaction studies, I see the threshold for antagonism at sumFIC50 of 1.5, or even 2. 1.25 is often interpreted as additive in many studies.

      We acknowledge that different studies use various cutoff values. Our interpretations for additive versus antagonistic versus superantagonistic were based not only on mean ΣFIC50 values, but also isobologram shape. For example, the flat isoboles for MFQ-DHA were clearly distinct from the curved isoboles of PPQ-DHA. It is unclear what cutoff value(s) would be most clinically relevant.

      (2) For the MFQ-PPQ interaction study, please make it clear that these drugs have very long half-lives (weeks), so the 4 h pulse assay isn't really relevant to their overall activity. It probably shows a slower onset of action, but there is plenty of drug remaining for many days in the clinical scenario, so perhaps the data from the traditional 48h assay is more relevant. The same consideration applies to OZ439, which may impact the interpretation of that data.

      We have now included the half-lives of these compounds in the discussion section. Our intent was to use a pulsing format to make these isobolograms comparable with the other assays. It is important to note that pulses can reveal stronger phenotypes that might be missed with traditional methods. Thus, while 48 h assays may better mimic in vivo conditions, they could also mask important phenotypes.

      Reviewer #3 (Recommendations for the Authors):

      I have included most of my concerns in the public review. Below are some additional specific points for consideration:

      (1) It is expected to include a synergistic combination as a control (e.g., artemisinin + lumefantrine) to contextualize the degree of antagonism observed. The experimental design should show some synergistic profiles in comparison. Adding a few experiments by including a synergistic control is needed.

      Both MFQ-DHA and LM-DHA combinations were additive, which provides context for antagonistic combinations. This is now stated in the results section pertaining to Figure 1. We have also included a reference to our previous publication in which we demonstrated that proteasome inhibitor-peroxide combinations are synergistic to additive using this same pulsing format.

      (2) Consider in vivo validation or pharmacokinetic/pharmacodynamic modeling to strengthen the translational relevance of the findings when it comes to doses and the IC50 correlations.

      We agree that this would be useful to do in future, but it is outside the scope of the current study.

      (3) It would be beneficial to include a discussion section on how the findings are generalizable to different Plasmodium falciparum genotypes (3D7, Dd2, MRA-1284) and their relevance.

      Findings were consistent across three parasite backgrounds depending on PfCRT genotype. This point has been included in the discussion section. The background of these parasites is also provided in Table 1.

      (4) Potential evaluation criteria to understand where certain combinations should be reconsidered can be included as a suggestion for the wider audience.

      Our in vitro studies suggest that pulsing isobolograms would be a useful assay to include when evaluating combination therapies. While we believe that synergistic combinations would be more strategic than antagonistic combinations, we cannot provide evaluation criteria or make recommendations for reconsidering currently used combinations.

      (5) Further elaborate on the mechanistic basis of heme inactivation by quinolines. If data are available, please include more data on the specificity of the process.

      Despite our best efforts, we were unable to evaluate quinoline-heme interactions in parasites. Even in vitro, this interaction has remained elusive for decades. We agree that this would be an important future step towards supporting a specific mechanism for quinoline-DHA antagonism.

    1. Reviewer #2 (Public review):

      Summary:

      In this study, the authors identify a previously uncharacterised regulator of mitochondrial function using a genetic screen and propose a role for this protein in supporting mitochondrial protein production. They provide evidence that the protein localises to mitochondria, interacts with components of the mitochondrial translation machinery, and is required for normal heart function in an animal model.

      Strengths:

      A major strength of the work is the use of multiple independent approaches to assess mitochondrial activity and protein production, which together provide support for the central conclusions. The in vivo data linking loss of this factor to impaired heart function are particularly compelling and elevate the relevance of the study beyond a purely cell-based context.

      Weaknesses:

      Given prior reports placing this protein outside mitochondria, its mitochondrial localisation would benefit from more rigorous and quantitative validation, and the proposed mechanism of the interaction with the mitochondrial translation machinery remains only partially explored. In addition, the physiological analysis is largely limited to the heart, leaving open questions about how broadly this pathway operates across tissues.

      Major comments:

      (1) Evidence for mitochondrial localization of EOLA1
      EOLA1 has previously been reported as a nuclear and cytosolic protein and is not annotated in MitoCarta 3.0, making rigorous validation of its mitochondrial localization particularly important. Although the authors provide several lines of evidence, interpretation is complicated by the use of different cell lines across localization, interaction, and functional experiments. Greater consistency in the cellular models used would strengthen the conclusions. The immunofluorescence analysis of tagged EOLA1 would also benefit from quantification across more cells and the inclusion of an additional mitochondrial marker (e.g., an outer membrane marker such as TOM20), as HSP60 staining can vary with mitochondrial state.

      (2) Normalization of OCR measurements
      Clarification of how Seahorse oxygen consumption rate measurements were normalized (e.g., cell number or protein content) would aid interpretation, particularly given potential effects of Eola1 loss on cell growth.

      (3) Linking interaction data to functional phenotypes
      Loss-of-function analyses are performed in mouse cell lines, whereas localization and interactome studies are conducted in human HEK293T cells. The absence of a human EOLA1 knockout model makes it difficult to directly connect the interaction data to the observed functional phenotypes. Additional validation or discussion of species conservation would improve clarity.

      (4) Mechanistic interpretation of the EOLA1-TUFM-12S rRNA interaction
      The identification of TUFM and 12S mt-rRNA as EOLA1 interactors is an interesting finding; however, the basis for prioritizing TUFM among the many mitochondrial proteins identified in the interactome is not fully explained. Providing enrichment statistics and functional categorization of mitochondrial interactors would increase transparency. In addition, the proposed role of the ASCH domain in RNA binding would be strengthened by structure-informed or mutational analysis of the conserved RNA-binding motif.

      (5) Interpretation of mitochondrial translation and protein abundance data<br /> Several assays supporting impaired mitochondrial translation would benefit from additional controls and quantification. The de novo mitochondrial translation assay (Fig. 3h) is not quantified, making it difficult to assess the magnitude and reproducibility of the effect. In addition, western blots showing reduced levels of mitochondrially encoded OXPHOS subunits (Figure 3g) lack a mitochondrial loading control (e.g., TOM20 or VDAC). Since loss of EOLA1 may affect mitochondrial mass, normalization to a mitochondrial marker is necessary. Relatedly, it would be informative to assess whether steady-state levels of mitoribosomal proteins (e.g., MRPS15, MRPL37) and nuclear-encoded OXPHOS subunits are altered upon Eola1 loss, both in knockout cell lines and in the knockout mouse.

      (6) Physiological scope of the in vivo analysis<br /> The cardiac phenotype observed in the whole-body Eola1 knockout mouse is compelling, but the focus on a single tissue limits interpretation of EOLA1's broader physiological role. Examination of additional high-energy-demand tissues would help clarify whether the observed effects are heart-specific or more general. In addition, the presence of residual EOLA1 protein bands in western blots (Figure 4a) and remaining Eola1 transcripts in qRT-PCR analyses (Extended Figure 4e) from knockout tissues should be addressed. The authors should clarify whether these signals reflect incomplete knockout, alternative isoforms, antibody cross-reactivity, or technical background.

      (7) Relationship to previously reported MT2A interaction<br /> Given prior reports of EOLA1 interaction with MT2A, a brief comment on whether MT2A was detected in the authors' co-immunoprecipitation experiments and how this relates to the proposed mitochondrial role would be useful.

    1. Summary Document: The FUSÉ Project, a Structural Approach to the Success of Vulnerable Students

      Executive Summary

      The FUSÉ project (Formation à l'utilisation de stratégies efficaces pour l'engagement, that is, training in the use of effective engagement strategies) is an innovative initiative implemented at École secondaire Carrefour (Centre de services scolaire des Draveurs) to counter early school dropout.

      The project targets highly vulnerable students in the first cycle of secondary school, particularly those who have mastered Grade 6 content but are failing several core subjects required for promotion (matières à sanction).

      Having observed that traditional grade repetition produced no positive results (33% of students left without a diploma), the administration put in place a rigorous structure that replaces the culture of grade repetition with intensive support based on self-determination and immediate success.

      After one year of application, the results are convincing: of the 39 targeted students, 20 moved on to Secondary 3, thereby avoiding less qualifying training pathways.

      The project relies on mobilizing complementary services, reorganizing schedules, and using a dedicated headquarters: the "Bistrado".

      --------------------------------------------------------------------------------

      1. Context and Problem

      1.1 A systemic failure

      École secondaire Carrefour, located in a disadvantaged urban area, serves about 2,000 students. Before FUSÉ was implemented, the school faced major challenges:

      High dropout rate: 33% of students in the regular stream left without a diploma, compared with a Quebec average of 24.6% for comparable settings.

      Early dropout: The typical dropout profile took shape as early as age 15, often after the student had repeated the first year of secondary school.

      Ineffectiveness of grade repetition: The data showed that students repeating Secondary 1 obtained lower results than on their first attempt, while developing greater behavioural and motivational problems.

      1.2 The urgency to act

      In March 2024, projections indicated that 42 of 200 students in the regular stream were failing at least three core subjects.

      Faced with staff pressure for mass grade repetition or a transfer to special education (adaptation scolaire), which the students' academic achievement did not justify, the administration chose to break with established practice.

      --------------------------------------------------------------------------------

      2. Foundations and Vision of the FUSÉ Project

      The project rests on a philosophy of "creating the possible" when traditional methods fail.

      2.1 Central objectives

      Maintain the school trajectory: Prevent students from being steered prematurely toward pathways such as the FMS (Formation menant à l'exercice d'un métier semi-spécialisé, training for a semi-skilled trade).

      Foster self-determination: Base the intervention on the fundamental needs of belonging, relationship, and competence.

      Reverse the effort: Ensure that the student becomes the main actor in his or her own success, rather than having the adults "work harder than the student".

      2.2 Theoretical framework and levers

      The project draws on existing models such as:

      • The Check & Connect approach (used in the second cycle under the name "Boussole éducative").

      • The self-determined intervention plan (Plan d'intervention autodéterminé), supported by training from the service centre's pedagogical consultant.

      • The use of evidence-based data to identify risk and protective factors.

      --------------------------------------------------------------------------------

      3. Operational Structure and Implementation

      FUSÉ's success rests on a "rock-solid" structure rather than on a mere culture change imposed on teaching staff.

      3.1 The "Bistrado": Headquarters

      Each morning the school's student bistro is turned into a centralized service centre for FUSÉ students. It is a reassuring place, away from the bustle of the classrooms, where the daily welcome takes place.

      3.2 The Key Worker (intervenant pivot)

      Each student is paired with a key worker (rehabilitation officer, remedial teacher, resource teacher, or addiction counsellor). This person:

      • Centralizes communications.

      • Provides a daily welcome (the "Soleils FUSÉ").

      • Tracks the student's personal goals.

      3.3 Data analysis and subgroups

      Students are grouped according to the nature of their needs while remaining integrated in their original profile or program (no self-contained classes):

      | Subgroup profile | Nature of difficulties |
      | --- | --- |
      | Behaviour | Disruptive behaviours. |
      | Motivation / Attendance | High absenteeism, disengagement. |
      | Learning | Serious academic gaps in French or mathematics. |

      --------------------------------------------------------------------------------

      4. The Student Experience and Engagement

      4.1 The commitment contract

      Participation is voluntary. The student must sign a commitment contract. If commitment lapses, the student can be withdrawn from the project, with the option of returning when he or she feels ready.

      4.2 The daily routine

      "Soleil" period (8:40 to 9:00): Welcome at the Bistrado, breakfast for students from disadvantaged backgrounds, and setting of daily or weekly goals (e.g., arriving on time, participating in class).

      Goal follow-up: Successes are recognized with "draw tickets" and certificates of recognition, which encourages friendly emulation.

      Differentiated schedule: For some students, subjects such as arts, English, or CCQ are temporarily lightened to free up periods for intensive catch-up in French and mathematics with resource teachers.

      --------------------------------------------------------------------------------

      5. Results and Impact

      5.1 Statistics for the first cohort (39 students)

      The results exceeded the administration's initial expectations:

      20 students entered regular Secondary 3.

      4 students were directed to the FMS.

      2 students were directed to Pre-DEP.

      1 student was directed to adult general education.

      8 students repeated Secondary 2 (but with better support).

      Only 4 students dropped out (2 of them during the school year).
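
      As a quick check on the cohort figures listed above, the following minimal sketch adds up the six outcome categories and expresses each as a share of the cohort. The dictionary keys are illustrative labels chosen here, not terms from the source; the counts are the ones reported in the list.

```python
# Outcome counts for the first FUSÉ cohort, as reported in the list above.
# The English labels are illustrative assumptions, not official category names.
outcomes = {
    "Regular Secondary 3": 20,
    "FMS": 4,
    "Pre-DEP": 2,
    "Adult general education": 1,
    "Repeated Secondary 2": 8,
    "Dropped out": 4,
}

# The six categories should account for the whole cohort.
total = sum(outcomes.values())

# Share of the cohort in each category, in percent, rounded to one decimal.
shares = {label: round(100 * count / total, 1) for label, count in outcomes.items()}

for label, share in shares.items():
    print(f"{label}: {outcomes[label]} students ({share}%)")
print(f"Total: {total} students")
```

      The categories sum to the 39 students of the cohort, and just over half of them moved on to regular Secondary 3.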

      5.2 Qualitative gains

      Stronger school-family relationship: Parents, often discouraged, regained hope thanks to communication focused on the positive.

      Organizational coherence: Staff now share a common language around self-determination.

      Social development: Participation in incentive activities (e.g., theatre outings) and student volunteering within the school.

      --------------------------------------------------------------------------------

      6. Evolution: FUSÉ 2.0 and Outlook

      Building on its success, the project is entering its second year with major adjustments:

      1. Multi-level teaching: Creation of French and mathematics groups for students with deep gaps (Grade 5 elementary level), while avoiding segregation.

      2. Expansion to Secondary 1: Early identification of at-risk students from the start of the school year to prevent failure.

      3. Systemic integration: Merging the FUSÉ approach into the school's overall "Boussole éducative" to ensure a smooth transition between the first and second cycles.

      4. Special education: Reflection on applying the FUSÉ approach to students in special education, aiming for steady progress rather than mere end-of-year success.

      The FUSÉ project demonstrates that by reallocating existing resources and rigorously structuring support, it is possible to radically change the trajectory of students whom the system once considered lost.

    1. It is an object, a manufactured product. Of course, the same is said of computers. Secondly, Plato's Socrates asserts, writing destroys memory. Those who use it will become forgetful, relying on an external resource for what they lack in internal resources. Writing weakens the mind. Nowadays, parents, and others besides them, fear that pocket calculators provide an external resource for what ought to be the internal resource of multiplication tables learned by heart. Calculators weaken the mind and relieve it of the work that keeps it in shape. Thirdly, a written text gives no answers. If one asks a person to explain his or her words, one can get an explanation; if one asks a text, one gets nothing in return except the same, often stupid, words that prompted the question in the first place.

      This passage develops the arguments behind Plato's statement that writing "is inhuman in pretending to establish outside the mind what in reality can exist only within it". The arguments supporting this statement are: 1. Writing is an object, a manufactured product; the same is said of computers. 2. Writing destroys memory. The example given is the calculator, an external resource whose invention caused memorized multiplication tables, an internal resource, to be forgotten. 3. A written text gives no answers: a person asked face to face to explain his or her words can do so, but a written text cannot, because the author is not in front of us. 4. The written word cannot defend itself as the natural spoken word can. The passage notes that real speech and thought take place in a context of give-and-take, whereas the written word is passive, that is, outside that context, unreal and artificial, just like computers.


    1. Reference Framework on Control Measures in Schools: Summary Note

      https://www.youtube.com/watch?v=D43t0L_G7-Y

      Executive summary

      This reference document, the result of a collaboration between the ministère de l'Éducation (MEQ) and the Fédération des centres de services scolaires du Québec (FCSSQ), sets out Quebec-wide guidelines on the use of control measures (restraint and seclusion) in educational institutions.

      The fundamental premise is that these measures should be considered only as a last resort, exclusively in emergency situations where the safety of the student or of others is imminently threatened.

      The framework favours a preventive and educational approach, structured around the multi-tiered system of supports (Système de soutien à paliers multiples, SSPM), aimed at minimizing the use of force or coercion.

      It clarifies legal and professional responsibilities, particularly since the regulatory changes of October 2023 authorizing certain professionals (psychologists and psychoeducators) to decide on the use of restraint measures.

      Implementation rests on a rigorous five-step process, including the development of specific protocols (school or student) and the application of post-incident procedures to ensure well-being and the ongoing reassessment of practices.

      1. Foundations and guiding principles

      The use of control measures is strictly governed by legal references (Charte des droits et libertés, Code civil, Loi sur l'instruction publique) and must respect the principles of the student's dignity, integrity, and safety.

      Fundamental principles of intervention:

      Last resort: Used only when preventive interventions and alternative measures have failed.

      Imminent danger: The threat must be characterized by its foreseeability, its immediacy, and the severity of its consequences.

      Minimal constraint: The measure must be the least restrictive possible and last as short a time as possible (ending as soon as the danger has passed).

      Respect and dignity: The intervention must be carried out with kindness and human warmth, under constant supervision.

      Mandatory follow-up: Every use must be followed by a post-incident review to assess effectiveness and adjust future interventions.

      2. Definitions of control measures

      The framework distinguishes several types of intervention to ensure a common understanding across the school network.

      | Type of measure | Description | Examples |
      | --- | --- | --- |
      | Physical restraint | Use of human force to immobilize or direct a student against his or her will. | Holding the arm of a student who resists, or holding a student who is hitting. |
      | Mechanical restraint | Use of equipment or material to limit movement. | Safety mittens, restraint vests in school transportation. |
      | Removal of equipment | Confiscation of a device that normally compensates for a disability. | Removing the brakes from a wheelchair or confiscating a walker. |
      | Seclusion | Confinement of the student in a place he or she cannot freely leave. | Holding the handle of a closed door or physically blocking access. |

      Note: The administration of chemical substances for control purposes requires a medical prescription and is not covered in this document.

      3. Operational framework: Planned vs. unplanned intervention

      The framework distinguishes two contexts of application, which directly affect professional responsibilities.

      | Characteristic | Unplanned intervention | Planned intervention |
      | --- | --- | --- |
      | Context | Unusual and unpredictable behaviour. | Known behaviour likely to recur. |
      | Management tool | School protocol (universal). | Student protocol (personalized, linked to the intervention plan). |
      | Decision (restraint) | Non-reserved activity (emergency). | Activity reserved for authorized professionals. |
      | Decision (seclusion) | Non-reserved activity. | Non-reserved activity (but regulated). |
      | Application | Non-reserved activity. | Non-reserved activity. |

      4. The five-step intervention process

      To ensure safety and respect for rights, a systematic structure is proposed:

      1. Developing the protocol: Preventive establishment of guidelines (school committee for the school protocol; school team and parents for the student protocol).

      2. Applying preventive and alternative interventions: Use of educational strategies to avoid a crisis (diversion, securing the environment).

      3. Assessing the danger: Rigorous analysis of the situation according to the criteria of foreseeability, immediacy, and severity.

      4. Applying the control measure: Implementation according to the protocol's guidelines and professional recommendations.

      5. Post-incident procedures: Debriefing of the event, establishment of the facts, support for witnesses (students and adults), and revision of the protocol.

      5. Prevention and school climate

      Prevention is the "first course of action". The document stresses the importance of the multi-tiered system of supports (SSPM):

      Tier 1 (Universal): Proactive support for all students (healthy climate, clear rules, positive relationships).

      Tier 2 (Targeted): Additional support for at-risk students (self-regulation, social skills).

      Tier 3 (Intensive): Individualized interventions for serious or persistent difficulties.

      The CSSMB's "3 x 3" model is cited as an example; it crosses the intensity of the intervention with the individual, school, and family spheres.

      6. Key roles and responsibilities

      The success of this framework rests on shared responsibility:

      School administration: Coordinates the development of protocols, ensures staff training, and looks after the physical and psychological well-being of everyone.

      Authorized professional staff (occupational therapists, nurses, physicians, physiotherapists, psychoeducators, psychologists): Carry out the clinical assessment, decide on the measure in a planned context, and issue recommendations.

      School workers: Contribute to the analysis of behaviours, apply the measures in accordance with the protocols, and inform the administration.

      Parents and students: Must be actively involved in developing the student protocol. Free and informed consent is required for any planned measure.

      Quotations and critical information

      "A control measure [...] is a last-resort intervention that should be carried out exclusively in an emergency, that is, when the safety of staff or students is threatened." (Bernard Drainville, Minister of Education)

      "The use of a control measure is not recommended in schools. [...] It must never be used as an educational or punitive measure, or to make supervising the student easier." (source document, Section 1.1)

      "The use of control measures is likely to cause physical and psychological injuries that can have long-term implications." (source document, Section 1)

    1. If "7 out of 10" means something very different to different people, that's a fundamental challenge for the WELLBY as a tool for comparing interventions.

      This is a bit too simple. Note that the WELLBY, as used in the simplest approaches, mainly requires differences to be comparable (and even linear) across individuals. Moving one person from 1 to 3 is valued equally to moving them from 4 to 6 or from 8 to 10, and gets twice the value in this measure as moving one person from 3 to 4 (that is, the same value as moving two people from 3 to 4).
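
      To make the linearity point concrete, here is a minimal sketch of the simplest additive accounting. It assumes that one WELLBY is a one-point change on the 0-10 life-satisfaction scale for one person for one year; the function name `wellby_gain` and the one-year duration are illustrative choices, not part of any standard tool.

```python
# Minimal sketch of linear, additive WELLBY accounting.
# Assumption: 1 WELLBY = a one-point change on the 0-10 scale,
# for one person, sustained for one year.

def wellby_gain(changes, years=1.0):
    """Total WELLBYs for a list of (before, after) scores, one pair per person."""
    return sum(after - before for before, after in changes) * years


# The same two-point move counts equally wherever it starts on the scale.
one_person_1_to_3 = wellby_gain([(1, 3)])
one_person_4_to_6 = wellby_gain([(4, 6)])
one_person_8_to_10 = wellby_gain([(8, 10)])

# One-point moves, for one person and for two people.
one_person_3_to_4 = wellby_gain([(3, 4)])
two_people_3_to_4 = wellby_gain([(3, 4), (3, 4)])

print(one_person_1_to_3, one_person_4_to_6, one_person_8_to_10)
print(one_person_3_to_4, two_people_3_to_4)
```

      Under these assumptions only the size of each change and the number of people matter, not where on the scale the change happens. So the measure needs reported differences to be comparable across people, rather than "7 out of 10" meaning the same level for everyone.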

    1. Partnership in Health: Summary of Three Field Experiences

      This summary document analyzes the presentations of three teams at a session organized by the Centre Opérationnel du Partenariat en Santé (COPS).

      It explores how partnership in health is put into practice across the primary care, hospital care, and medico-social sectors.

      Executive Summary

      Integrating patients and their relatives as active partners transforms care practices in a lasting way.

      The field reports highlight a fundamental transition: moving from a logic of "doing for" the patient to a logic of "doing with" the patient.

      Key takeaways:

      Diversity of models: Partnership adapts to different contexts, from the governance of territorial structures (CPTS) to the co-construction of specific hospital pathways or home support.

      Operational challenges: Recruiting patient partners, acculturating professionals, managing shared time, and securing long-term funding are the main obstacles.

      Mutual benefits: Partnership improves the relevance of care, reduces the isolation of families, and strengthens the sense of purpose for health professionals, thereby contributing to better quality of life at work (QVT).

      Methodological rigour: To avoid "tokenism" (participation for appearance's sake), a rigorous methodology and dedicated coordination are essential.

      --------------------------------------------------------------------------------

      1. Primary Care Experience: The CPTS du Grand Pic Saint-Loup

      The Communautés Professionnelles Territoriales de Santé (CPTS, territorial professional health communities) bring together self-employed health professionals to carry out public health actions. In this experience, partnership is seen as a confrontation of "pieces of reality".

      Levels of involvement

      Partnership within the CPTS operates at several levels:

      Consultation: Surveys of patients' experience in unscheduled care settings to understand their reasons for coming.

      Care pathways: Involvement of expert patients in multi-professional working groups (heart failure, diabetes, oral health).

      Governance: Creation of a specific college within the association, open to patients, elected officials, and residents, with advisory votes on the board of directors.

      Obstacles and levers identified

      | Category | Details |
      | --- | --- |
      | Obstacles | Semantic confusion (a multiplicity of terms: expert patient, tracer patient, coach); difficulty recruiting locally; lack of an administrative status (SIRET) for paying patients who do not belong to an association. |
      | Levers | Support from hospital specialists who are already acculturated; creation of meeting spaces outside medical practices (e.g., screening in a shopping centre). |

      --------------------------------------------------------------------------------

      2. Hospital Care Experience: Polyclinique Saint-Roch

      The "Au cœur des soins" project, run with the Tremplin association, concerns the care pathway of children with facial clefts. It rests on close collaboration between partner parents and caregivers.

      Objectives and methodology

      The ambition is to promote a partnership-based care relationship from the moment of diagnosis.

      1. Gathering experience: Listening to what the parents have lived through.

      2. Deepening: Precise identification of needs.

      3. Co-construction: Use of participatory tools that are new to the hospital setting.

      The secret ingredient: Dedicated coordination (accounting for two thirds of the project's funds) to organize the spaces for dialogue and guarantee the rigour of the process, thereby avoiding involving patients merely for show.

      Observed impacts

      For families: Recognition of their role as actors and a reduced sense of isolation.

      For professionals: A better understanding of real needs. A speech-language therapist reports: "I feel that I go faster, that I am more effective... the mental load is also really lower."

      Relationships: Exchanges become more horizontal, which also lets patients understand the constraints caregivers face.

      --------------------------------------------------------------------------------

      3. Medico-Social Sector Experience: Association AA

      The AA association (Aide et Soins à domicile, home help and care) committed to partnership following a major crisis: a damaging conflict between a care team and a family caregiver, which led to massive burnout (10 of 10 employees on sick leave).

      Evolution of the approach

      Initially, the association made the mistake of building the approach among professionals only. It had to "backpedal" in order to genuinely include family caregivers and supported persons in the working groups in 2025.

      Key testimonies collected:

      Ms. Isabelle (supported person): Stresses the importance of paying attention to the request: "Sometimes the employee acts as he or she wishes but not as the person wishes."

      Mr. Marc (family caregiver): Notes that caregivers must accept advice from third parties when they lack knowledge about the specific patient.

      Challenges specific to home care

      Fatigue: Users' participation is constrained by their state of health or their caregiving load.

      Paradigm shift: Abandoning the term "prise en charge" (taking charge of, judged passive) in favour of "prendre en soin" (caring for) or "accompagner" (supporting).

      Transparency: Accepting direct and sometimes harsh criticism of the quality of support.

      --------------------------------------------------------------------------------

      4. Cross-Cutting Analysis: Obstacles, Levers, and Outlook

      Comparing the three presentations brings out constants in how partnership in health is implemented.

      Summary of common obstacles

      1. Acculturation: The level of maturity regarding partnership is very uneven. Some professionals see it as a challenge to their authority, others as a waste of time.

      2. Recruitment: Finding the "right patient for the right pathway", available and ready to commit over the long term, remains complex.

      3. Timing: Aligning the schedules of self-employed professionals, hospital employees, and patients (who are often tired) is a permanent logistical challenge.

      4. Funding: Securing long-term resources to pay for coordination time and patients' expertise is crucial.

      Success factors

      Institutional will: Strong commitment from management and supervisors is indispensable for overcoming resistance.

      Experiential knowledge: Recognizing that knowledge drawn from living with illness complements scientific and clinical knowledge.

      Impact evaluation: Although difficult, measuring improvement in patient experience and quality of care is necessary to validate the approach over the long term.

      Conclusion on quality of life at work (QVT)

      One major observation emerges: partnership in health is a powerful lever for well-being at work.

      By improving the understanding of needs and reducing conflict, it restores meaning to professionals' work and lightens their mental load, despite the initial time investment required to set it up.

    1. Briefing: Outlook and Milestones for Partnership in Health

      Summary

      This summary document details the follow-up to the regional day devoted to partnership in health in Occitanie.

      It is organized around the coordinated action of three key entities: the Structure Régionale d'Appui (SRA), France Assos Santé Occitanie, and the Centre Opérationnel du Partenariat en Santé (COPS).

      The central objective is to turn the day's reflections into concrete actions through training, methodological support, and the provision of structuring resources.

      Highlights include integrating partnership into the evaluation of professional practices by 2026, strengthening the synergy between user representatives and patient partners, and deploying digital tools to help connect health projects across the region.

      --------------------------------------------------------------------------------

      1. Strategic Directions of the Structure Régionale d'Appui (SRA)

      The SRA reaffirms its ambition to act collectively to improve health pathways through eight priority themes, of which partnership is an integral part.

      Modes of intervention

      The SRA's action unfolds along several operational lines:

      Information and awareness: Organization of annual regional days.

      Training and teaching: Participation in university teaching and research.

      Methodological support: Support for evaluating practices and organizations in the field, without taking the place of local actors.

      Data production: Publication of research work in the health field.

      Outlook: Horizon 2026

      A major ambition was announced for 2026: placing the theme of partnership in health at the heart of the evaluation of professional practices (EPP). This subject, recognized as complex and fascinating, will be the focus of a call for expressions of interest aimed at actors wishing to take the reflection further.

      --------------------------------------------------------------------------------

      2. France Assos Santé Occitanie: Mobilization and Training

      As a union of accredited associations, France Assos Santé plays a unifying role at both the regional and national levels.

      Structure and representation

      | Level | Number of associations | Areas covered |
      | --- | --- | --- |
      | Regional | 70 associations | People who are ill, disability, consumers, environmental health, family associations, precarity. |
      | National | ~100 associations | Same as at the regional level. |

      Missions and resources for users

      Information and monitoring: Observation of the proper functioning of the health system and media interventions.

      Access to data: A website (regional and national) and an extranet dedicated to user representatives (RU), including practical fact sheets and guides.

      Reference guide: Co-construction with "Savoir Patient" of a guide on the facets of partner-user engagement (peer support, research, training of professionals).

      Training program

      The organization offers a structured pathway to support RU mandates:

      Volume: 41 days of training delivered last year in 7 departments.

      "RU et Patients Partenaires" training: A specific module aimed at improving collaboration and mutual understanding between these two types of engaged actors.

      Accessibility: Training is available in person and remotely through a dedicated catalogue.

      --------------------------------------------------------------------------------

      3. The Centre Opérationnel du Partenariat en Santé (COPS): Operational Support

      The COPS defines itself as a facilitator of partnership projects, working hands-on with structures and actors.

      Project support

      The COPS works in pairs (combining a project officer and a professional perspective) on request through a dedicated platform. Areas of support include:

      • The medico-social sector and primary care.

      • Co-construction of care pathways (e.g., hospital at home (HAD), oncology, mental health).

      • Strategic support and quality.

      Tools and collaborative platform

      The COPS participatory platform offers several freely accessible services:

      Directory and mapping: Tools for identifying patient partners or structures running projects, to encourage independent networking.

      Multimedia resources: "Copcasts" (podcasts), webinars, presentation materials, and guides (including the "Engager" fact sheet on involving patients).

      Training: A Qualiopi-certified e-learning offer, with "à la carte" formats for project teams.

      --------------------------------------------------------------------------------

      4. Milestones and Upcoming Events

      The institutional calendar sets out several key milestones to sustain the momentum of health partnership:

      | Date / Period | Event / Action | Theme |
      | --- | --- | --- |
      | December 9 | Webinar | Link between health partnership and quality of working life (Qualité de Vie au Travail, QVT). |
      | Coming soon | Departmental evening event | Visit to the Lot (outreach, or "aller vers", actions). |
      | Q1 2026 | Departmental evening event | Meeting in the Pyrénées-Orientales (PO). |
      | During 2026 | New formats | Practice-analysis groups (mixed: patients and professionals) and co-development workshops. |

      5. Ethical Synthesis

      The Espace de réflexion éthique Occitanie, represented by Professor Michel Clanet, serves as the "grand témoin" (keynote observer).

      Its role is to analyze the place of health partnership within the broader ethical approach, stressing that partner engagement is not merely an organizational arrangement but a deep reflection on the practice of care and on respect for all stakeholders.

    1. Safe to Learn and Thrive: Ending Violence in and through Education

      High-Level Summary

      Violence in educational settings is a global crisis of alarming scale, affecting about one billion children every year.

      Far from being isolated incidents, these forms of violence, whether physical, sexual, or psychological, form a continuum that obstructs the fundamental right to education and undermines the development of societies.

      The economic impact is colossal: an estimated US$11 trillion in lost future earnings worldwide.

      This document stresses the need to move from fragmented interventions to a holistic, systemic approach.

      Education should no longer be seen only as a place where violence occurs, but as the main lever for preventing it.

      To turn schools into lasting sanctuaries of safety, violence prevention and response must be built into the very core of education systems rather than treated as an add-on responsibility.

      --------------------------------------------------------------------------------

      I. The Current Situation: The Many Faces of Violence

      Violence in educational settings is a complex phenomenon that extends well beyond visible physical assault. It takes several interdependent forms:

      1. Types of violence against learners

      Physical violence: Includes fights, attacks, and corporal punishment. More than one-third of students were involved in a physical fight during the past year.

      Psychological violence: Humiliation, intimidation, insults, and social exclusion. For example, 42% of LGBTQ+ young people report having been ridiculed or threatened at school.

      Sexual violence: Harassment, unwanted touching, and forced sex. Up to 25% of adolescents experience sexual violence, 40% of which occurs on school premises.

      Bullying: Characterized by an imbalance of power, it affects 1 in 3 learners every month worldwide.

      Technology-facilitated violence: Cyberbullying and online exploitation extend the reach of abuse beyond the school walls.

      2. Institutional and structural violence

      Violence does not come only from individuals; it can be built into the system itself through:

      • Discriminatory policies (e.g., biased dress codes).

      • Inequitable teaching methods or a curriculum that excludes certain groups.

      • The normalization of violence as a disciplinary tool.

      3. Violence against education staff

      Staff are not spared. One survey finds that nearly 80% of teachers experienced some form of violence at school over the course of a school year, which harms their well-being and their teaching effectiveness.

      --------------------------------------------------------------------------------

      II. Analysis of the Drivers of Violence: An Intersectional Approach

      Violence is fueled by a complex interplay of factors at several levels. A learner's identity (gender, disability, race, sexual orientation) often determines the nature and intensity of the violence experienced.

      | Factor level | Examples of identified drivers |
      | --- | --- |
      | Individual | History of domestic violence, lack of awareness of rights. |
      | Interpersonal | Poor conflict management, absence of positive adult role models. |
      | Systemic | Lack of training in positive discipline, absence of reporting protocols. |
      | Community | Normalization of corporal punishment, influence of gangs or local conflicts. |
      | Societal | Socioeconomic inequalities, weak or nonexistent legal frameworks. |
      | Normative | Harmful gender norms (valuing male toughness and female submission). |

      The gender dimension (SRGBV)

      School-related gender-based violence (SRGBV) is pervasive. Girls are more exposed to sexual harassment and to forced early pregnancy, while boys experience more corporal punishment and physical violence, often in the name of rigid norms of masculinity.

      --------------------------------------------------------------------------------

      III. The Repercussions: Beyond the School Grounds

      The consequences of violence are deep and lasting, affecting not only the individual but society as a whole:

      Educational impact: Students who are victims are three times more likely to feel alienated and twice as likely to miss school. This leads to lower reading and numeracy results and, often, to dropping out.

      Mental health: Anxiety, depression, loss of self-esteem, and self-harming behavior.

      Physical health: Increased risk of HIV, sexually transmitted infections, and unplanned pregnancy (a major cause of dropout among adolescent girls).

      Economic cost: Violence hinders the development of human capital, leading to massive losses in lifetime earnings.

      --------------------------------------------------------------------------------

      IV. The Framework for Action: A Holistic Approach

      To end violence, UNESCO and its partners call for a radical transformation built on six fundamental pillars:

      1. Curriculum and learning: Integrate comprehensive sexuality education (CSE), social and emotional learning (SEL), and violence-prevention programs to change attitudes from an early age.

      2. School environment: Create safe physical spaces (separate toilets, lighting) and promote a culture of "positive discipline" that rules out all corporal punishment.

      3. Reporting mechanisms: Set up confidential, accessible, child-friendly systems (helplines, letter boxes, focal points).

      4. Policies and laws: Adopt national legislation that explicitly bans corporal punishment (as Peru did in 2015) and promote radical inclusion (as in Sierra Leone).

      5. Partnerships and mobilization: Work with teachers' unions, parents, community leaders, and technology companies.

      6. Data and evidence: Use diagnostic tools and digital surveys (e.g., the Ma’An system in Jordan) to guide interventions on the basis of evidence.

      --------------------------------------------------------------------------------

      V. Prospects for Lasting Change

      The success of this transformation rests on four cross-cutting principles:

      Learner-centeredness: Prioritize safety and the "do no harm" principle.

      Trauma sensitivity: Avoid re-traumatization when supporting victims.

      Adaptation to context: Recognize that solutions in conflict zones differ from those in stable urban areas.

      Transforming the teacher's role: Support teachers not only as protectors but also as individuals who themselves need protection and continuing training.

      Key quote

      "Since wars begin in the minds of men and women, it is in the minds of men and women that the defences of peace must be constructed." (UNESCO Constitution)

      In conclusion, ending violence in education is not only a moral and legal obligation; it is an essential condition for building a just, inclusive, and prosperous society.

      The time has come for collective, systematic action to make every school a true haven of peace.

    1. Child Protection in France: Analysis of the Crisis and Recommendations of the CESE

      Executive Summary

      France's child protection system is going through a deep, structural crisis that threatens its core missions.

      Although the legislative framework (the laws of 2007, 2016, and 2022) is regarded as one of the most advanced, placing the best interests and fundamental needs of the child at the heart of the system, an alarming gap persists between legal ambition and reality on the ground.

      The critical issues identified include a steady rise in needs (+49% in minors in placement over 20 years), a severe shortage of qualified professionals, and worrying disparities between territories.

      One of the most serious findings is that a significant share of the court decisions intended to protect children in danger are not carried out.

      The Conseil économique, social et environnemental (CESE, France's Economic, Social and Environmental Council) calls for national remobilization, strengthened interministerial governance under the authority of the Prime Minister, and guaranteed equal treatment for all minors, including unaccompanied minors (mineurs non accompagnés, MNA) and children with disabilities.

      --------------------------------------------------------------------------------

      I. A State of Structural and Statistical Crisis

      A. A worrying rise in demand for protection

      Data from the Observatoire national de la protection de l'enfance (ONPE) and the DREES reveal unprecedented pressure on the services of the Aide Sociale à l'Enfance (ASE, child welfare services):

      Key figures: As of December 31, 2022, 344,682 minors and young adults were receiving support.

      Trend: The number of young people placed in residential facilities rose by more than 50% between 2011 and 2022.

      Failed shift away from the courts: Despite the intention to favor administrative measures, 82% of cases in which minors are taken into care result from a court decision.

      B. The link between poverty and child protection

      There is a strong correlation between economic insecurity and child protection intervention. France has a child poverty rate of 20% (33rd out of 39 EU/OECD countries).

      Consequences: 2.9 million children live below the poverty line; 42,000 are homeless.

      Social cost: Traumatic events experienced in childhood cost France about €34.5 billion a year in health expenditure and reduce victims' life expectancy by 20 years.

      --------------------------------------------------------------------------------

      II. Failures of Governance and Funding

      A. National and territorial steering

      Current governance suffers from a lack of interministerial clarity and from major territorial disparities.

      Territorial inequalities: The rate of children in care ranges from 10 per 1,000 in French Guiana to 49 per 1,000 in Nièvre.

      Funding: Spending by the départements on the ASE reached €9.7 billion in 2023. Their resources (mainly the DMTO, droits de mutation à titre onéreux, i.e., property transfer duties) are volatile and disconnected from the growth in needs.

      Contractualization: The State's financial lever remains marginal (about €140 million through budget program 304) compared with the budgets of the départements.

      B. Non-enforcement of court decisions

      The system relies on understaffed judges (each judge follows 450 to 500 children, against an ideal of 325). Because of the shortage of places in facilities, some placement orders are not carried out, leaving children in danger in their family environment, or are "poorly carried out" in unsuitable facilities.

      --------------------------------------------------------------------------------

      III. Guaranteeing the Rights and Needs of the Child

      A. The Projet pour l'Enfant (PPE): An unmet obligation

      Introduced in 2007, the PPE is meant to be the "compass" of the child's pathway, ensuring stability and development. Yet it is still not in effect in many départements.

      Recommendation: Make the PPE a precondition for the allocation of State funding.

      B. Health care and disability

      Psychological and somatic conditions are more frequent among children in ASE care.

      Psychological emergency: The CESE asks that every protected child be presumed to be in a state of psychological emergency, to ease immediate access to care (centres médico-psycho-pédagogiques, CMPP).

      Disability: About 25% of children in placement have a disability, but only one-third receive suitable medico-social support.

      --------------------------------------------------------------------------------

      IV. Particularly Vulnerable Groups

      A. Unaccompanied minors (MNA): "Cut-rate" protection

      The CESE denounces an approach increasingly driven by migration policy rather than by child protection.

      Financial discrimination: The daily rate for an MNA is often €50 to €60, compared with €170 for other minors.

      Age assessment: Procedures are considered cursory and rely too often on bone tests whose scientific unreliability is well established.

      B. Young adults

      Leaving the system at 18 or 21 remains an abrupt break. An Insee study indicates that one-quarter of homeless people were once children in care.

      --------------------------------------------------------------------------------

      V. Professionals: A Major Crisis of Attractiveness

      The sector suffers from staff shortages in every category (educators, foster carers, school doctors).

      Foster carers (assistants familiaux): Their numbers fell by 9% in 6 years.

      School medicine: Fewer than 800 doctors for 12 million students, which hampers early detection.

      Working conditions: Atypical hours, low pay, and a sense of "fragmented work" discourage people from entering the profession.

      --------------------------------------------------------------------------------

      VI. Summary Table of the CESE's Key Recommendations

      | No. | Theme | Main measure |
      | --- | --- | --- |
      | 1 | Statistics | Task the GIP France Enfance Protégée with producing a comprehensive annual review of needs and of measures not carried out. |
      | 2 & 3 | State | Create a biennial interministerial strategy with financial equalization and incentives for the départements. |
      | 4 | Coordination | Roll out the Comités Départementaux pour la Protection de l'Enfance (CDPE) everywhere to break down silos between actors. |
      | 6 | MNA | Prohibit any difference in treatment between MNA and other minors (health, education). |
      | 8 | Training | Define a common training plan for all "sentinel" professionals (national education system, police, health). |
      | 9 | Placement | Diversify forms of care by increasing the number of small living units (fewer than 7 children). |
      | 10 | PPE | Make the Projet pour l'Enfant effective and mandatory for any funding. |
      | 11 | Health | Make rapid access to child psychiatry systematic (presumption of psychological emergency). |
      | 13 | Justice | Provide systematic assistance from a specialized lawyer for every protected child. |
      | 15 | Oversight | Create an independent national authority to inspect care facilities. |
      | 17 | Law | Create a Code de l'Enfance (Children's Code) bringing together all the rights, freedoms, and duties of children. |
      | 18 | Staffing | Publish the decrees on minimum staffing standards and set a maximum number of measures per social worker. |

      --------------------------------------------------------------------------------

      Conclusion

      Child protection can no longer be the adjustment variable for institutional failings.

      The CESE insists that the child must be the subject, not the object, of protection.

      Without massive investment in human resources and real coordination between the State and the départements, the Republic's promise to protect the most vulnerable cannot be kept.

    1. Summary of the Reflection on Ethics and Health Partnership

      Executive Summary

      This document summarizes the remarks made by Professor Michel Clanet at the "Redonner du sens" ("Restoring Meaning") day devoted to health partnership and ethics. The analysis highlights the pivotal role of the Espaces de Réflexion Éthique (ERE, regional ethics forums) in familiarizing professionals and citizens with bioethical issues.

      Key points include:

      Redefining the care relationship as an encounter between two vulnerabilities (caregiver and patient), aiming for a more horizontal relationship through partnership.

      The distinction between professional conscience and moral conscience, whose conflict gives rise to the "ethical dilemma".

      Institutionalizing ethical reflection through collegial dialogue, which is indispensable for informing clinical and institutional decisions.

      Extending ethics to health democracy, encompassing prevention, environmental health, and the fight against social inequalities.

      --------------------------------------------------------------------------------

      I. The Espaces de Réflexion Éthique (ERE): Framework and Missions

      The ERE are institutional bodies that grew out of a concept dating from 2004 and were officially created in 2012. They report to the Ministry of Health (DGOS) and are attached to university hospital centers (centres hospitaliers universitaires, CHU). In Occitanie, the main site is in Toulouse, with a supporting site in Montpellier.

      Main missions

      Their work is organized around two main axes:

      1. Care and support sector:

      ◦ Train professionals in, and familiarize them with, ethical reflection and bioethics.

      ◦ Meet the certification requirements of health care and medico-social facilities.

      ◦ Produce practical guides (e.g., La collégialité au domicile, La prise en charge de la vulnérabilité au domicile).

      2. Community and citizen sector:

      ◦ Act as the regional extension of the Comité Consultatif National d'Éthique (CCNE).

      ◦ Organize États Généraux de la Santé (next session in spring 2026) to gather citizens' views on topics such as artificial intelligence, end of life, medically assisted reproduction (PMA), or environmental health.

      --------------------------------------------------------------------------------

      II. Conceptual Foundations of the Care Relationship

      Ethics in health care rests on a distinction between technique and intentionality.

      The dual dimension of care (after Frédéric Worms)

      Every care practice has two inseparable elements:

      Caring for something: The practical, technical aspect, aimed at treating an isolated illness or form of suffering.

      Caring for someone: The relational, intentional dimension. Care is given out of regard for the other person; being able to care is not enough, one must also want to.

      The phenomenology of attention (after Jean-Philippe Pierron)

      The care relationship unfolds across three levels of attention:

      Paying attention (faire attention): Becoming aware of the other person's vulnerability.

      Being attentive (être attentif): Exercising one's technical and professional competence.

      Being considerate (être attentionné): Regarding the other person with solicitude and availability.

      --------------------------------------------------------------------------------

      III. Ethical Reflection in Practice

      Ethics is not only a normative framework; it is a commitment and a continual questioning of the legitimacy of action ("What must be done to do right?").

      The ethical dilemma

      The dilemma arises from a drift or a conflict between:

      • Professional conscience, sometimes held captive by technical knowledge and procedures.

      • Moral conscience, which refers back to fundamental values.

      Collegial dialogue

      To resolve a complex situation, ethics bodies favor collegial dialogue, whose characteristics are:

      No hierarchy: The word of a physician, a manager, or a director carries the same weight as that of other professionals.

      Multiple perspectives: All actors are heard, and the patient's voice is gathered as well.

      Illumination, not decision: The collegial meeting is not a decision-making body but a forum that informs the person responsible for the final decision.

      The cardinal principles of ethics

      Reflection rests on four fundamental pillars:

      1. Respect for autonomy (freedom of choice).

      2. Beneficence (acting for the good).

      3. Non-maleficence (avoiding harm).

      4. Justice and equity.

      --------------------------------------------------------------------------------

      IV. Ethics and Health Partnership: A Convergence

      Health partnership is presented as a major ethical lever for rebalancing the care relationship.

      | Key concept | Ethical impact |
      | --- | --- |
      | Horizontality | Countering the "pouvoir du sarrau" (the power of the white coat, i.e., medical power) to establish a more equal relationship. |
      | Mutual recognition | Acknowledging that vulnerability is shared between the patient (need for care) and the caregiver (technical and moral limits). |
      | Experiential knowledge | Recognition by the caregiver that the patient has distinct and varied knowledge of their own. |
      | Power to act | Strengthening the patient's autonomy and freedom of choice (empowerment). |

      Critical note: A question remains about equity of access to the status of "patient partner". There is a risk of recruitment bias, in which people with certain profiles might not feel legitimate in taking on this role.

      --------------------------------------------------------------------------------

      V. Macro Perspective: Health Democracy and Prevention

      Partnership must go beyond the individual care setting and take on a political and social dimension.

      Advocacy for partnership: More communication is needed to convince stakeholders who are still reluctant or unaware of the concept.

      Social inequalities in health: There is an urgent need to reach out to "invisible" and precarious populations in order to guarantee genuine equity in partnership.

      Prevention and citizenship: Health begins in childhood and includes maintaining well-being. Citizens must take an active part in prevention, particularly with regard to environmental determinants (e.g. occupational diseases linked to pesticides among farmers).

      Conclusion: Health partnership and health democracy are part of the same ethical struggle, which aims to involve citizens fundamentally in managing their health and their environment.

    1. Quality and Partnership in Health: Toward a New Paradigm of Care

      Summary

      This document summarizes reflections drawn from expert presentations on the link between the quality of health pathways and user engagement.

      The central finding is the need to move from a paternalistic view of care to a genuine co-leadership partnership.

      The analysis rests on a graduated provision of care (primary, territorial, tertiary) and a definition of quality built on five pillars: accessibility, relevance, user expectations, safety, and efficiency.

      Partnership is presented not as an end in itself but as a means of achieving optimal quality.

      This change of model relies on recognizing the patient's "experiential knowledge": the patient devotes an average of 6,250 hours per year to their own health, compared with only 5 to 10 hours spent with professionals.

      --------------------------------------------------------------------------------

      I. The Structure of Health Pathways and Quality

      Delivering a high-quality health pathway depends on a graduated organization and close cooperation between the different levels of care.

      A. A graduated provision of care

      The pathway is conceived as three levels of response to the user's needs:

      Primary care (local): Based on coordinated practice (local teams, pharmacies, laboratories, cross-sectional imaging) to meet immediate needs.

      Territorial reference teams: Run by public or private institutions for unscheduled care and emergencies.

      Tertiary care: Regional centers of expertise for rare or specific diseases.

      B. Performance levers

      Two notions are essential to keep this pathway running smoothly:

      1. Task delegation: Moving away from the dogma of an exclusively medical response in favor of new professions (advanced practice nurses, pathway coordinators).

      2. Cooperation: Territorial orchestration is needed, often led by the Regional Health Agencies (Agences Régionales de Santé, ARS).

      C. The five dimensions of quality

      Quality is not defined by a single aspect but by a balance among five fundamental factors:

      1. Accessibility: The system's ability to provide a timely response.

      2. Relevance: Conformity with scientific evidence and suitability of the response to the need.

      3. User expectations: Respect for the person's values and preferences.

      4. Safety: Ensuring that care and the responses provided are safe.

      5. Efficiency: Optimal use of national solidarity funds, which are by nature limited.

      --------------------------------------------------------------------------------

      II. The Continuum of Engagement and Partnership

      Drawing on the recommendations issued by the Haute Autorité de Santé (HAS) in September 2020, the analysis distinguishes four levels of user engagement.

      A. The four levels of engagement

      | Level | Leadership | Type of interaction |
      | --- | --- | --- |
      | Information | Professional | Handing out documents (e.g. welcome flyers). |
      | Consultation | Professional | Collecting satisfaction or experience data (surveys). |
      | Collaboration | Professional | Users review or comment on documents. |
      | Partnership | Co-leadership | Co-design, co-decision, and co-implementation. |

      B. Partnership as a strategic means

      Partnership with patients and their family caregivers is not an end in itself but a lever serving the safety and quality of pathways. The goal is to reach, in each situation, the highest possible level of engagement.

      --------------------------------------------------------------------------------

      III. Recognizing Experiential Knowledge

      The main argument for partnership lies in the gap between clinical time and time spent living with the illness.

      The figures: A person living with a health vulnerability spends between 5 and 10 hours per year with professionals. By contrast, that person devotes about 6,250 hours per year to taking care of their own health.

      "Vivrologie": This term refers to the expertise that comes from the experience of illness. It covers specific kinds of knowledge: managing one's intimate life, adapting treatments during vacations, staying in work.

      The Montreal Model: This model replaces paternalism with a vision in which the patient is a full member of the care team. The center of gravity is no longer the patient but the health project.
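
      To make the disparity quoted in this section concrete, the short Python sketch below computes the ratio between the two figures. The constants are the numbers reported in the text; the function name `self_care_ratio` is an illustrative choice and does not come from the source.

```python
# Figures quoted in the text (hours per year).
SELF_CARE_HOURS = 6250
CLINICAL_HOURS_RANGE = (5, 10)


def self_care_ratio(self_care_hours: float, clinical_hours: float) -> float:
    """Hours of self-managed care for each hour spent with professionals."""
    return self_care_hours / clinical_hours


fewest_clinical, most_clinical = CLINICAL_HOURS_RANGE
ratio_high = self_care_ratio(SELF_CARE_HOURS, fewest_clinical)
ratio_low = self_care_ratio(SELF_CARE_HOURS, most_clinical)

print(f"{ratio_low:.0f} to {ratio_high:.0f} hours of self-care per hour with professionals")
```

      On these figures, time spent managing one's own health exceeds time with professionals by a factor of several hundred, which is the order of magnitude the argument relies on.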

      --------------------------------------------------------------------------------

      IV. Dimensions and Types of Partnership

      Partnership must be considered systemically, with effects at three levels:

      1. Micro: The individual care relationship between patient and professional.

      2. Meso: The organization of care, teaching, and research.

      3. Macro: The shaping of public health policy.

      Profiles of patient partners

      There is no single profile of patient partner; rather, specific skills are needed depending on the area of involvement:

      Patient partner in care: Focused on their own health project.

      Patient partner as trainer: Takes part in training future professionals.

      Patient partner as researcher: Contributes to clinical or organizational research.

      Patient partner as resource: Brings their expertise to therapeutic education or to improving the quality of care.

      --------------------------------------------------------------------------------

      V. Conclusion: Toward a Shared Culture

      The Comité Régional d'Impulsion et d'Analyse du Partenariat en Santé (CRIAPS) defines partnership as joint action for overall well-being, based on the complementarity of different kinds of knowledge.

      A semantic and cultural shift is recommended: moving from "prise en charge" (literally, taking charge of the patient) to "prise en soins" (taking into care). This transition underlines that the patient is not a "burden" (charge) weighing on the system but a real solution for improving the effectiveness and relevance of health services.

      Partnership is, in short, the reciprocal meeting of two kinds of expertise: the professional's scientific expertise and the patient's experiential expertise.

    1. Adolescent Transgression, Socio-Educational Climate, and Educational Sanction: Summary of an Action Research Project

      Executive summary

      This document summarizes the results of a three-year action research project conducted by Valérie Benoit and Annik Skrivan (Haute école pédagogique du canton de Vaud, Switzerland) in a school of 500 students.

      The study questions the effectiveness of traditional punitive practices in response to adolescent transgression and proposes a paradigm shift toward educational sanctions.

      The main conclusions are as follows:

      • Conventional punitive practices (detention hours, copying lines) are perceived by students as useless, or even as encouraging repeat offenses.

      • The school climate deteriorates significantly as students get older (transition from cycle 2 to cycle 3), particularly with respect to the teacher-student relationship.

      • Students feel strongly unsafe in spaces outside the school walls (train stations, parking lots) and in transitional areas.

      • Setting up an Educational Sanction Space (Espace de Sanction Éducative, ESE) fosters accountability, the restoration of social bonds, and a better understanding of adolescents' fundamental needs.

      --------------------------------------------------------------------------------

      1. Theoretical and contextual framework of the research

      1.1. The context of schooling in Vaud

      The research is set within a legislative and structural framework specific to the canton of Vaud:

      Loi sur l'enseignement obligatoire (LEO): Organizes lower secondary education (secondaire 1) into two tracks (pre-gymnasium and general). The general track, in which students are placed at different competence levels depending on the subject, fragments class groups and increases relational complexity.

      Concept 360: An inclusive-school policy that integrates students with special educational needs, increasing the heterogeneity of classes.

      Post-pandemic impact: The COVID-19 crisis exposed latent problems, exacerbating mental health difficulties and creating an effect of anomie (loss of the meaning of norms) among young people.

      1.2. Understanding adolescence

      Adolescence is defined as a phase of transformation and "liminality":

      Rite of passage: In the absence of formalized rites in today's society, adolescents create their own trials, often through transgression.

      Fundamental needs: Beyond educational needs, adolescents have essential social needs: safety, trust, responsibility, autonomy, affection, and recognition.

      Expression through action: Adolescents favor action over words to express their emotions and build their identity (separation-individuation process).

      --------------------------------------------------------------------------------

      2. Analysis of the socio-educational climate and of perceptions

      The action research relies on the Questionnaire sur l'environnement socio-éducatif (QES), which reveals marked differences in perception among the groups involved.

      2.1. How perceptions change with age

      There is a negative correlation between age and perception of the school climate.

      Older students (12-16 years) perceive the environment much more negatively than younger students (10-12 years).

      | Dimension assessed | Perception in cycle 2 (ages 10-12) | Perception in cycle 3 (ages 12-16) |
      | --- | --- | --- |
      | Student-teacher relationships | Rather positive | Sharp drop / perceived as cold |
      | Educational support | Present | Perceived as insufficient |
      | Sense of participation | Moderate | Very low (students feel silenced) |
      | Behavior management | Acceptable | Perceived as unfair/punitive |

      2.2. The issue of safety

      Unlike the other dimensions, the sense of safety is higher among older students, because younger students fear aggression from their elders.

      Vulnerable areas: The places felt to be least safe are the train station, the parking lot, and the immediate neighborhood.

      Insidious violence: Students report verbal violence and aggressiveness from some teachers (shouting, humiliation, disparagement).

      2.3. Diverging priorities

      A disconnect is observed between the concerns of students and those of teachers:

      Student priorities: Physical and verbal violence, theft, and the quality of human relationships.

      Teacher priorities: Academic success, absenteeism, and dropping out.

      --------------------------------------------------------------------------------

      3. From punishment to educational sanction

      3.1. The ineffectiveness of punitive practices

      Student testimonies confirm that conventional punishments teach nothing:

      Max (15): "Detention doesn't do anything to me anymore... it's just punishing for the sake of punishing."

      Yann: "I walk out, and I don't tell myself I'm going to stop messing around."

      Punishment often produces a feeling of humiliation and disengagement, increasing the risk of dropping out.

      3.2. The Educational Sanction Space (ESE) model

      Inspired by the work of Éric Prairat and Élisabeth Maheu, the ESE is based on a "hyphen language" (langage trait-d'union) approach.

      The four constraints of an educational sanction:

      1. Restating the rule: Explaining the meaning of the rule for group cohesion.

      2. Putting the transgression into words: Identifying unmet needs and proposing alternative behaviors.

      3. Obligation to make amends: Restoring the social bond with those who were wronged.

      4. Individual sanction: Addressing the responsible individual in a private setting.

      The structuring principles:

      Meaning: Taking the time to explain the act.

      Objectivity: Focusing on the act committed, never on the young person's personality.

      Deprivation: The sanction must take place during free time (Wednesday afternoon) to mark the limit.

      --------------------------------------------------------------------------------

      4. Obstacles and professional outlook

      4.1. Resistance among teaching staff

      The research highlights identity tensions among teachers:

      View of the role: Some see themselves solely as "transmitters of knowledge" and reject the educational or relational dimension of their job.

      "Adult-centered" stance: A tendency to perceive transgression as a personal attack rather than as a symptom of adolescent development.

      Territorial narcissism: Difficulty collaborating and harmonizing classroom management practices across the school.

      4.2. Avenues for improvement

      To make this change of outlook last, several levers are identified:

      Preventive measures: Not waiting for behavior to blow up; working on student engagement and the quality of day-to-day relationships.

      Continuing education: Developing socio-emotional skills, professional ethics for teachers, and classroom management.

      Stance of educational authority: Moving from authoritarianism (submission) to an authority that contains and reassures without "breaking" the adolescent.

      Institutional patience: Results (fewer incidents of incivility) take time and require strong support from school leadership.

      --------------------------------------------------------------------------------

      Key quotations

      "An adolescent who doesn't transgress is an adolescent who worries me... that's a time bomb." (Annik Skrivan)

      "The sanction is a means of fostering a responsible subject by holding them accountable for the consequences of their actions." (Éric Prairat, quoted by Valérie Benoit)

      "Teachers are by nature extremely narcissistic and territorial... school is a place where we subdue and constrain, but we must not forget that we also socialize a great deal." (Annik Skrivan)

    1. Assessment in the School Context: Ethical Issues and Political Debates

      Analytical summary

      This summary document analyzes the complex issues surrounding assessment in schools, as presented by Camille Roelens.

      Assessment should not be seen as a mere technical tool but as a philosophical and political object that is central to a democratic society.

      The starting observation is paradoxical: although assessment is often judged obscure and unfair (the "dismal science" of docimology), it remains ubiquitous and unavoidable.

      The analysis shows that the modern school's mission is to produce autonomous individuals and to manage social stratification in a society where ranks of birth have disappeared.

      Assessment thus becomes the mechanism for creating "just inequalities". However, no model of educational justice, whether meritocratic, distributive, or based on guaranteed minimums, is perfect.

      The document stresses that the current challenge for schools is to regain their legitimacy through a redefined "benevolence" (bienveillance), aimed at guiding each student toward real autonomy rather than simply validating what has been learned.

      --------------------------------------------------------------------------------

      1. The Paradoxes of Assessment

      Assessment in schools rests on three fundamental observations, two critical and one pragmatic.

      Obscurity: It is often hard to determine precisely what is actually being assessed (actual competence, the ability to handle stress, or understanding of the instructions).

      Perceived injustice: The feeling that effort does not always translate into success leads assessment to be perceived as a "tragic" or inequitable ordeal.

      Ubiquity (the "2+1"): Despite these flaws, assessment is an "unavoidable theme". It is exercised in an informal ("wild"), constant way in every aspect of social life (judging a film or a restaurant, or choosing sports teammates).

      --------------------------------------------------------------------------------

      2. Critique of the Philosophy of Assessment

      According to the work of Danilo Martuccelli, assessment has become a genuine structuring philosophy of modern society, resting on eight major principles that are often open to challenge.

      Martuccelli's principles and critiques

      | Principle of the philosophy of assessment | Critique and limits |
      | --- | --- |
      | Everything can be measured and assessed. | Not all practices can be quantified equally without distorting reality. |
      | Everyone must be assessed and placed in competition. | Assessment is not equivalent across actors and stakes (e.g. competitive examination vs. ongoing monitoring). |
      | It ensures transparent management of power. | Assessment is not neutral information; it is an instrument of power. |
      | It ensures the best use of resources. | Assessment has a massive financial and human cost (inspections, competitive examinations). |
      | It increases effectiveness (carrot and stick). | It is a performative power that steers behavior insidiously. |
      | It motivates and involves the actors. | The impact is radically different depending on whether assessment targets an individual or a group. |
      | It legitimizes organizations (monopoly on degrees). | It feeds a crisis of legitimacy between theory and the reality on the ground. |
      | It embodies modern rationalization. | Assessment has become a non-rational "collective belief". |

      --------------------------------------------------------------------------------

      3. The School as a Cog of Democratic Modernity

      In a society of "equality of conditions" (Tocqueville), where birth no longer determines rank, social stratification has to be rebuilt. Two main levers perform this function: the market and the school.

      Producing the individual: The school's mission is to turn "dependent and vulnerable" children into "free, equal, and autonomous" individuals. Assessment serves to check whether this social demand is being met.

      Managing inequalities: Since not everyone can be a "soloist", the school must select. This task is described as Sisyphean: it is structurally unfair because it sometimes assesses knowledge not transmitted by the school (family cultural capital), yet it is indispensable to avoid arbitrariness or drawing lots.

      --------------------------------------------------------------------------------

      4. The Four Models of Educational Justice

      François Dubet and Marie Duru-Bellat identify four models of justice, each with its advantages and potential pitfalls.

      4.1. Equality of opportunity and merit

      Principle: The same tests for everyone, graded anonymously.

      Weakness: This model ignores the fact that school accounts for only a fraction of a child's life. It is "harsh on the losers" and tends to reproduce social backgrounds under the guise of merit.

      4.2. Distributive (and inclusive) justice

      Principle: "Giving more to those who have less" (e.g. priority education). Autonomy is seen as a supported capacity (scaffolding).

      Weakness: Risk of an obsession with pedagogical effectiveness and of stigmatization (the REP+ "label" effect). This model weighs heavily on teachers' sense of vocation, sometimes pushing them to exhaustion.

      4.3. Guaranteed minimums (inspired by John Rawls)

      Principle: Determining the rules of justice behind a "veil of ignorance". The least unjust system is the one that treats the weakest best (the principle of the common core, socle commun).

      Weakness: Often perceived as a "cultural minimum wage" (smic culturel) or as giving up on excellence.

      4.4. Spheres of justice and social effects (Michael Walzer)

      Principle: Inequalities in one sphere (schooling) should not contaminate the other spheres of life.

      Weakness: In France, the diploma weighs excessively on a person's overall social destiny. Assessment is dramatized because it puts students' "skin on the line".

      --------------------------------------------------------------------------------

      5. Toward Real Autonomy: Capabilities and Benevolence

      Education aims at autonomy (the capacity to act, choose, and think for oneself). However, autonomy in law is not autonomy in fact.

      The notion of capabilities (Amartya Sen): Autonomy depends on the connection between personal capacities and an enabling context. Assessing a student without taking their environment into account (e.g. a language barrier) is an assessment error.

      Benevolence as a lever of legitimacy: In a context of "decline of institutions", the school can no longer impose its legitimacy through status alone. Benevolence (bienveillance) must be understood in three senses:

      1. Bien veiller (watching well): Understanding the world and the uniqueness of each student.

      2. Bien veiller sur (watching over): Taking care of the relationship and of individuals (solicitude and tact).

      3. Bien veiller à (seeing to it): Concretely providing the means for autonomy.

      Conclusion

      School assessment lies at the heart of a "polytheism of judgments". There is no perfect solution, only a search for the "least bad" assessment.

      A just school cannot rest on a single principle but on a combination of intersecting principles.

      The ultimate challenge is to move from a function focused primarily on selection to one of transmitting tools for intellectual autonomy, while accepting that the school cannot, by itself, solve all of society's problems.

    1. Analysis of Emotional Experience in Schools: The "Special Moments" Approach

      Summary

      This summary document details the research conducted by Sophie Necker and her colleagues on capturing emotional states within the classroom.

      Based on a study conducted in 2021 in two CM2 classes (the final year of French primary school), the project relies on the "special moments box" (boîte à moments spéciaux).

      This method gives access to the subjectivity of students and teachers through the daily, voluntary writing of anonymous notes.

      The conclusions highlight the systemic dimension of emotions, in which individual experiences intertwine to form a collective emotional landscape.

      The major innovation of this research is the creation of the "Émoscope", a graphical map for visualizing the complexity of the interactions among triggers, subjective appraisals, and emotional expressions over the course of a school day.

      --------------------------------------------------------------------------------

      1. The Research Design: The Special Moments Box

      The research aims to access traces of emotions and the subjectivity of those involved in school life.

      Methodology and collection protocol

      Context: Study carried out in May 2021 in two CM2 classes in Lille (51 students and 2 teachers).

      The medium: Strips of paper (about 10 cm high) headed "billet moment spécial" ("special moment note").

      The instruction: "You experienced a special moment in class today. Can you write it down and put it in the box, please?"

      Characteristics of the collection:

      ◦ Voluntary, daily writing at the end of the day.

      ◦ Anonymity preserved to encourage freedom of expression.

      ◦ Duration of one month, for a total of 764 notes collected.

      The "special moment": Defined by its uniqueness and its significance for the individual, with no requirement that it be positive or negative.

      It draws on the concepts of "optimal moments" or "flow", but is broadened to any emotional intensity.

      --------------------------------------------------------------------------------

      2. Theoretical Foundations: A Systemic Approach

      The research treats lived experience as a scientific object in its own right.

      Emotional interdependence

      The classroom is viewed as a system of reciprocal and complex interactions:

      Mutual influence: The teacher's emotional states affect those of the students, and vice versa.

      Joint attention: Perception of the situation is shaped by the sharing of attention among those involved.

      Student-teacher relationship: This relationship influences the quality of school life, behavior, and attitudes toward learning.

      Definition of emotion

      Emotion is understood as a dynamic appraisal process:

      • It allows the individual to specify what a situation means to them.

      • The same situation can give rise to different appraisals depending on the individual or the context.

      The components of appraisal (according to Audrin):

      1. Physiological: Bodily reactions (e.g. shivers).

      2. Motor expression: Facial expressions, voice, posture.

      3. Motivational: Action tendency (approach or avoidance).

      4. Subjective feeling: Synthesis of the different dimensions.

      --------------------------------------------------------------------------------

      3. Analysis of Results: A Typology of Experiences

      Analysis of the notes reveals several dimensions of students' relationship to the school world.

      Relationship to self and to others

      Self-knowledge: The notes express likes or dislikes ("I hate dance").

      Sense of competence: Success or difficulty with a task generates salient emotions (pride, assessment-related stress).

      Presence of others: The other person can be a trigger (a classmate's presentation), a partner in the emotion, or the recipient of an action.

      The teacher is often mentioned indirectly, through their pedagogical and didactic choices.

      Continuity and rupture

      Comfort zone and continuity: Moments that reinforce the student's identity or form part of a comforting social and temporal unity.

      Rupture and irruption: Emotions linked to novelty, to discovering knowledge, to unusual activities, or to spatial irruptions (an outside speaker, a field trip).

      Emotional literacy and verbalization

      The study observes a gradation in students' ability to verbalize emotion:

      Level 1: Naming only the trigger (e.g. "The story").

      Level 2: Describing facts or actions.

      Level 3: Conveying the feeling or assigning a value (e.g. "I liked it").

      Level 4: Justifying the appraisal (e.g. "It's fascinating because...").

      --------------------------------------------------------------------------------

      4. The Émoscope: Mapping the Emotional Landscape

      The major innovation of the research is the creation of the Émoscope, a graphical representation tool.

      | Émoscope feature | Function |
      | --- | --- |
      | Structure | A wheel in which each segment represents an individual note. |
      | Color code | Identifies the triggering event (e.g. sport, class council, presentation). |
      | Pictograms | Indicate the nature of the relationship (self, others, rupture, continuity). |
      | Verbatim bubbles | Reproduce the exact words used to describe the emotion. |
      | Arrows | Symbolize the appraisal process and the components identified. |

      This tool makes it possible to move from the analysis of an individual note to an overall view of the classroom climate over a given unit of time (the day).
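
      One way to picture the move from a single note to a day-level view is as a small data model. The Python sketch below is only an illustration under stated assumptions: the `Note` schema, the `build_wheel` function, and the sample records are hypothetical and do not describe how the researchers actually built the Émoscope. Each field mirrors one feature from the table above.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Note:
    """One anonymous 'special moment' note (hypothetical schema)."""
    day: str                 # unit of time covered by one wheel
    trigger: str             # triggering event, shown by the color code
    relations: List[str]     # self / others / rupture / continuity, shown by pictograms
    verbatim: str            # exact words, shown in a verbatim bubble
    components: List[str] = field(default_factory=list)  # appraisal components, shown by arrows


def build_wheel(notes: List[Note], day: str) -> Dict[str, object]:
    """Gather one day's notes into a wheel with one segment per note."""
    segments = [n for n in notes if n.day == day]
    return {
        "day": day,
        "segments": segments,
        "trigger_counts": Counter(n.trigger for n in segments),
    }


# Invented sample records; the verbatims reuse examples quoted in this summary,
# but their pairing with days and triggers is made up for illustration.
notes = [
    Note("day 1", "sport", ["self"], "Je déteste la danse", ["subjective feeling"]),
    Note("day 1", "presentation", ["others"], "J'ai aimé"),
    Note("day 2", "story", ["continuity"], "L'histoire"),
]

wheel = build_wheel(notes, "day 1")
print(len(wheel["segments"]), dict(wheel["trigger_counts"]))
```

      Grouping by day keeps each individual note intact as a segment while the trigger counts give the aggregate view of the class for that day.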

      --------------------------------------------------------------------------------

      5. Outlook and Pedagogical Implications

      The research opens up avenues for teacher training and practice.

      For practitioners and researchers

      Analysis of practice: Using the Émoscope to compare experiences across teachers or pedagogical approaches.

      Methodological development: Considering digital formats (audio, video) to remove barriers related to writing skills.

      Longitudinal follow-up: Using notebooks of notes to track a student's emotional development over the long term.

      For training

      Awareness-raising: Helping future teachers understand the emotional system of the classroom.

      Learning indicator: Exploring students' emotions as markers of progress and emotional security.

      Conclusion of the study

      The special moments box shows that emotions, although subjective, can be captured and mapped.

      They are an essential gateway to understanding learning dynamics and well-being within the educational community.

    1. my participants had to choose in which communication contexts they did or did not wish to convey these different pieces of information

      Continuation of the explanation of the experiment: participants were instructed to choose in which conversational contexts they wished to share these different types of information.

      IV 1: conversational contexts, with 6 levels.

      IV 2: information types, with 4 levels.

      DV: choice to share a piece of information in one or more conversational contexts.
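
      As an illustration, the two independent variables noted here cross into 6 × 4 = 24 context-by-information cells, each receiving one share or do-not-share choice. The sketch below enumerates that layout in Python. The level labels (`context_1`, `info_1`, and so on) are placeholders, because the annotation gives only the number of levels, and the fully crossed design is itself an assumption.

```python
from itertools import product

# Placeholder labels: the annotation gives only the number of levels, not their names.
contexts = [f"context_{i}" for i in range(1, 7)]    # IV 1: 6 conversational contexts
information = [f"info_{j}" for j in range(1, 5)]    # IV 2: 4 information types

# Assumed fully crossed design: every information type is judged in every context.
cells = list(product(contexts, information))

# DV: one binary choice (share / do not share) per cell; None means "not yet answered".
responses = {cell: None for cell in cells}

print(len(cells))  # 6 * 4 = 24
```

      A dictionary keyed by (context, information) pairs keeps each participant's choices in one flat structure that is easy to tabulate later.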

    1. Investigation into the Périscolaire Sector and Private Schools: Security Gaps and Institutional Failures

      Executive Summary

      This summary highlights a crisis of trust and safety within the périscolaire system (supervised care outside class hours) and in schools in France.

      The investigation reveals that périscolaire time, which can amount to up to five hours a day for 5.5 million pupils, suffers from a glaring lack of supervision and official data.

      Despite a growing number of reports of sexual assault and abuse, the administrative bodies (town halls and the Éducation nationale) are accused of inertia, and even of having established a form of omertà to protect the institutions' image.

      Precarious recruitment, the absence of statistical monitoring of violence at ministerial level, and delays in administrative investigations create a vulnerable environment for children, particularly in nursery school.

      1. The Périscolaire Sector: A System under High Tension

      Périscolaire time concerns 90% of nursery and elementary school children.

      Although these activities take place within schools, they fall under the municipalities and not the Éducation nationale.

      Key Data on Supervision

      Hours: Up to 5 hours a day (morning reception, canteen, evening study).

      Population concerned: 5.5 million pupils.

      Perception of the job: Described as a "sub-profession" or a "dustbin profession" by some of those involved, reflecting a precariousness that affects the quality of recruitment.

      Funding: The State provides 75% of the funding of private schools under contract, but whistleblowers consider checks on educational or sexual violence there to be insufficient.

      Recruitment Failures

      The investigation highlights hiring processes that are sometimes hasty.

      In Rezé, an activity leader convicted of assaulting 12 minors had been recruited at age 51 with no prior experience of working with children, after a career in mass retail.

      The job interview was described as having gone "rather quickly".

      2. Overview of Violence and Statistical Invisibility

      A major finding of the investigation is the total absence of centralized data on violence in the périscolaire setting.

      Statistical void: The Ministry of Justice confirmed that it does not record specific data on violence committed by périscolaire activity leaders.

      Reality on the ground: By compiling regional press articles over 10 years, the investigation identified at least a hundred publicized cases across France (Marseille, Moselle, Courbevoie, Haute-Savoie, etc.).

      Types of offence:

      ◦ Sexual assault and rape of minors.

      ◦ Physical abuse (strangulation, violence in the canteen).

      ◦ Attempted corruption of minors.

      3. Analysis of Institutional Failures: Omertà and the Handling of Reports

      The investigation points to deficient administrative management that often puts protection of the institution ahead of children's safety.

      Identified Dysfunctions

      | Type of dysfunction | Description and consequences |
      | --- | --- |
      | Relocation of staff | The practice of moving a reported activity leader from one school to another rather than sanctioning or removing them. |
      | No administrative follow-up | In the case in the 15th arrondissement of Paris, two years after an administrative investigation was opened, families had received no debriefing. |
      | Parental warnings ignored | Parents had raised the alarm about suspicious behavior (activity leader alone with a child, door closed) as early as 2019, years before the arrest of the alleged offender. |
      | High-risk spaces | Despite a 2015 report recommending that isolated spaces (such as library corners) be prohibited, these places continued to be used without adequate supervision. |

      Notable Quotes about the Institution

      • "It was always: we protect the institution, we sort it out among ourselves, but nothing gets out."

      • "The sanctuary that shatters": an expression used by parents to describe the loss of trust in the school.

      • "You get the impression that everyone is complicit in this omertà."

      4. Psychological Impact and the Child's Voice

      Professor Thierry Bobet, a child psychiatrist, sheds crucial light on the difficulty of gathering victims' accounts, particularly between the ages of 3 and 6.

      Obstacles to Disclosure

      1. No mental representation: A nursery-school child has no notion of what adult sexuality is. They use phrases such as "someone bothered me".

      2. Confusion of authority: The activity leader represents an extension of parental authority, which makes reporting paradoxical for the child.

      3. Fragility of memory: Between the ages of 3 and 6, memory is not mature.

      A memory may be precise for six months and then become confused, hence the urgency of prompt care.

      Warning Signs Observed by Parents

      Regressions: Return to nappies, bed-wetting, asking for bottles.

      Behavioral problems: Violent outbursts when it is time to leave for school, night terrors, school phobia.

      Sexualized behavior: Play or mimicry inappropriate for the child's age (e.g., "vulgar" postures induced by the adult).

      5. Case Study: The Manipulation Process

      The investigation details recurring methods aimed at isolating children and establishing a climate of secrecy.

      Secrecy as a tool of control: "You don't say anything to the teacher, it's our secret."

      Subverted rituals: In one Parisian school, the activity leader used songs and games (e.g., "la culotte de mon grand-père") to get children to undress and to subject them to sexual touching under the guise of play.

      Profile of the offender: Often initially described as a "bit of a gruff grandpa" or as a much-appreciated person who "loves children", using this image to manipulate those around him and isolate the victims.

      Conclusion

      The Cash Investigation report shows that violence in the périscolaire sector is not a series of isolated incidents but the result of structural failures:

      • lack of resources in local authorities,
      • absence of rigorous State oversight of the funding of private schools,
      • a culture of secrecy within administrations.

      The urgent need is for statistical transparency and a thorough reform of reporting and supervision protocols to protect vulnerable groups.

    1. Overview of the Périscolaire Sector and Private Education: Investigation into Violence and Institutional Failures

      Executive Summary

      This summary document sets out the findings of an in-depth investigation into the safety and supervision of children in public périscolaire care (supervised care outside class hours) and in private schools under contract in France.

      Key points identified:

      Structural insecurity of périscolaire care: The sector suffers from a lack of official statistics on violence, from precarious recruitment with no verification of actual skills, and from supervision that is often understaffed.

      Culture of omertà in the private sector: Despite public funding of 75%, some private schools put protecting their institutional image ahead of reporting sexual or educational violence.

      Failure of the judicial response: 73% of complaints of sexual violence against minors are closed without further action, and the length of investigations (sometimes several years) undermines the reliability of the child's account.

      "Musical chairs" practices: Instead of being sanctioned, some activity leaders reported for inappropriate behavior are simply moved from one school to another.

      Urgent need for reform: Experts recommend greater professionalization, centralized reporting, and the adoption of specialized interview protocols (such as the "Niche" protocol).

      --------------------------------------------------------------------------------

      1. The Public Périscolaire Sector: A System under High Tension

      Périscolaire time concerns 5.5 million pupils in France. Although it takes place on school premises, it falls under the town halls and not the Éducation nationale.

      1.1. An undervalued and precarious profession

      The sector is described by interviewees as a "dustbin profession" or a "sub-profession".

      Working conditions: Imposed part-time work, fragmented schedules and poverty wages (between €600 and €700 net per month).

      "Rushed" recruitment: To fill gaps, town halls hire temporary staff with no experience at all.

      An undercover journalist was recruited within 6 days after an interview in which only her availability and her "kindness" were probed, with no test of her skills with children.

      1.2. Failures of staffing and supervision

      Chronic understaffing: The law requires one activity leader per 14 children under 6, but ratios of 1 to 23 or more are observed on the ground.

      Passive supervision: The investigation reveals activity leaders absorbed in their mobile phones during canteen or playground time, in breach of the activity leaders' charter.

      Verbal and physical violence: Scenes of systematic shouting, humiliation and intimidation ("shut your mouth", deprivation of food) were documented.
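
      To make the staffing ratio concrete, the arithmetic can be sketched as follows. It assumes the legal requirement is read as a ceiling rule (one activity leader per started group of 14 children under 6). The function name and that reading are illustrative assumptions, not taken from the investigation.

```python
import math

def required_leaders(children: int, max_per_leader: int = 14) -> int:
    """Smallest number of activity leaders such that none supervises more than max_per_leader children."""
    return math.ceil(children / max_per_leader)

# One leader observed with 23 children under 6:
print(required_leaders(23))      # leaders needed at the 1:14 legal ratio
print(required_leaders(23, 10))  # leaders needed at a stricter 1:10 ratio
```

      Under this reading, a group of 23 children needs two leaders at the legal ratio and three at 1 to 10.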

      --------------------------------------------------------------------------------

      2. Sexual Violence: From Ignored Warnings to Insufficient Sanctions

      In 10 years, in Paris alone, 128 activity leaders have been suspended on suspicion of sexual violence.

      2.1. The breakdown of reporting

      Several cases show that parents' warnings are not always passed on to management:

      École Baudin case (Paris): Parents had raised the alarm about sexual touching as early as September 2024.

      The information was not passed up the chain, and the activity leader remained in post until his arrest in April 2025 for assaulting five children.

      École Emerio case (Paris): A library activity leader, in post for 20 years, was placed under formal investigation. Yet parents had reported suspicious situations (closed doors, children on his lap) as early as 2019.

      2.2. Relocation of problem staff

      The investigation confirms a practice described as a "bad habit": moving an activity leader reported for abuse to another school within the same arrondissement, instead of dismissal or a firm disciplinary sanction.

      | Scenario | Measure observed | Impact |
      | --- | --- | --- |
      | Physical abuse (spanking/shaking) | Transfer to another nursery school | Risk of reoffending with a new group of children |
      | Inappropriate behavior | Transfer from a nursery school to an elementary school | No centralized follow-up file |

      --------------------------------------------------------------------------------

      3. Private Education under Contract: Between Omertà and Autonomy

      The State funds private education to the tune of 10.9 billion euros (2024), paying teachers' salaries in full.

      3.1. Protecting the institutional image

      In some Catholic schools, such as the Champagnat institution (Alsace), the priority seems to be to "wash dirty linen within the family".

      Pressure on victims: Recordings show members of religious orders urging victims of sexual assault to withdraw their complaint so as not to harm the school's reputation.

      Withholding of information: One school waited 9 months before reporting to the rectorat a teacher who was having a sexual relationship with a 15-year-old minor.

      3.2. Lack of State oversight

      The Secrétariat Général de l'Enseignement Catholique (SGEC) long resisted adoption of the "Faits Établissement" application, wishing to filter reports before they reached the ministry.

      This "shadow ministry" limits the State's visibility of the reality of violence in the private sector.

      --------------------------------------------------------------------------------

      4. Ideological Excesses and Abuse: The Case of the Institution "L'Espérance"

      This school in Vendée, under the authority of the Fraternité Saint-Pierre, illustrates the extreme failures in oversight of schools under contract.

      Ritual violence: The headmaster ran a system of "pacts" in which he received or gave slaps to pupils in front of the whole school depending on academic results.

      Climate of hatred: Former pupils testify to pervasive racist, homophobic and xenophobic remarks (swastikas on the walls, racist nicknames such as "Bamboula" or "Chang").

      Failure to follow the curriculum: Civic education classes are refused because they are deemed "republican", and are replaced with teaching on the monarchy or medieval scholasticism.

      Deficient supervision: The absence of adult supervisors at night, replaced by final-year pupils ("boarding captains"), encouraged humiliations (the pond ritual).

      --------------------------------------------------------------------------------

      5. The Response of the Justice System and Psychiatry

      5.1. The child's trauma and delayed disclosure

      Professor Thierry Bobet and Dr Louis Alvarez stress that:

      • A nursery-school child has no representation of adult sexuality; they will not speak of assault but of someone who "bothered" them.

      • Secrecy is often imposed by the offender by means of "games" or "secrets".

      • The memory of 3- to 6-year-olds is immature: if the interview is not immediate, memories become confused, which makes it more likely that cases are closed without further action.

      5.2. Statistics and Justice

      Conviction rate: Only 3% of complaints of rape of a minor lead to a conviction in France.

      The "Niche" protocol: Used in the Nordic countries (prosecution rate of 60%), this filmed, standardized interview protocol is still too little used in France (25% of cases compared with 90% in some countries).

      --------------------------------------------------------------------------------

      6. Inspiring Models and Possible Solutions

      6.1. The example of the commune of Lemont (Vosges)

      The municipality made the political choice of a "premium périscolaire":

      Supervision ratios: 1 activity leader per 10 children (better than the legal 1 per 14).

      Professionalization: Preparation and meeting time is paid.

      Stability: Contracts of up to 33 hours a week to retain staff.

      6.2. Experts' recommendations

      1. Centralization: Creation of a national register of reports that includes physical and psychological violence (not only sexual violence).

      2. Training: Make training in child protection and the Convention on the Rights of the Child mandatory for all supervisory staff.

      3. Transparency: Subject private schools to the same immediate reporting obligations ("Faits Établissement") as the public sector.

      4. Judicial priority: Create a "fast-track ticket" so that investigations involving minors are handled as an absolute priority in order to preserve the reliability of evidence.

    1. Summary Note: Violence at School and Effective Intervention Strategies

      Executive Summary

      This summary note analyzes remarks by Claire Baumont, Doctor of Psychopedagogy, on violence in schools.

      The central idea is that the perception of a widespread increase in school violence is not supported by evidence but is instead fueled by alarmist media coverage.

      Quebec's national monitoring (2013-2019) did not confirm such a rise and even noted slight improvements.

      Professor Baumont stresses the importance of the "establishment effect": the need for each school to base its interventions on facts observed locally, where staff have real power to act, rather than on national averages or outside narratives.

      The analysis also reveals that the most frequently reported forms of aggression are not always the ones expected.

      Humiliating behavior and contemptuous looks from adults toward pupils, as well as aggression between colleagues, rank among the most frequent (3rd or 4th position), well ahead of cyberbullying.

      The most effective intervention strategies have evolved, moving from ineffective punitive approaches to systemic approaches focused on school climate and, more recently, on developing the socio-emotional skills of pupils and staff.

      The key lies in strengthening relationships through everyday actions and in making school staff accountable as role models.

      1. Claire Baumont's Expertise

      The analysis is based on the views of Claire Baumont, a recognized expert in the field:

      Training and experience: A Doctor of Psychopedagogy, she has worked as a school psychologist and as a clinician with young people with serious adjustment problems.

      Academic career: Associate professor in the Département d'études sur l'enseignement et l'apprentissage at Université Laval.

      Leading research: She headed the Chaire de recherche sur le bien-être et la prévention de la violence à l'école (2012-2023) and the first national monitoring of violence in Quebec schools (2013-2019).

      Aim: Her research seeks to improve the quality of life of pupils and school staff.

      2. Myths and Realities: The Rise of School Violence

      A central theme of the discussion is the questioning of the perception that violence in schools is increasing.

      A persistent media narrative: Professor Baumont points out that the media have been reporting a "rise in violence" for nearly 40 years, often generalizing from isolated events and creating a climate of insecurity.

      Lack of empirical evidence: The national monitoring conducted between 2013 and 2019, using standardized tools, did not demonstrate an increase in violence.

      On the contrary, it revealed "slight improvements".

      Current situation: There is no recent national picture to confirm or refute a rise since 2019-2020.

      It is therefore crucial to remain critical of prevailing narratives.

      Volatility of local data: Monitoring of certain schools showed that the situation can change quickly.

      One school may see its rate of violence rise within a few years, while another may improve.

      This shows that national averages are not representative of the reality in each setting.

      3. The Key Concept: The Establishment Effect

      Faced with the uncertainty of national data and the influence of external factors, Professor Baumont puts forward the concept of the "establishment effect" (or "school effect").

      Definition: It means focusing on the components and interventions over which school staff have direct power to act within their own school.

      Principle of action: The first step is to adjust interventions on the basis of what is actually observed in the school, not on outside perceptions.

      Empowerment: This approach lets practitioners focus on concrete solutions and avoid being demoralized by factors beyond their control.

      It positions the practitioner as the "first decision-maker" over their actions with the resources they have.

      4. The Dimensions of School Violence

      Violence in schools is a complex, multifactorial phenomenon whose manifestations go beyond aggression between pupils.

      4.1. A Multifactorial Problem

      Violence is explained by an interaction of factors at several levels:

      Global: World conflicts and wars (one person in eight on the planet was reportedly living in a war situation in December 2024) contribute to a general feeling of insecurity.

      Societal: Cultural and religious differences can be sources of tension.

      Community: Neighborhood life and pupils' family situations influence their behavior at school.

      Institutional: The training of school staff plays a role.

      Despite these multiple factors, the establishment effect remains the most relevant lever for action for practitioners.

      4.2. Aggressive Behaviors: Beyond Pupils

      The analysis of types of violence reveals an often underestimated reality: the impact of adult behavior.

      Violence by adults toward pupils: According to 2024 data, humiliating behavior and contemptuous looks from adults rank 3rd or 4th among the forms of aggression most reported by pupils, especially in secondary school.

      These acts include shouting and humiliating punishments.

      Violence between adults: School staff also report being subjected to aggression by colleagues.

      Insults and exclusion from meetings likewise rank 3rd or 4th among the aggressive behaviors experienced by teachers.

      A surprising finding: These forms of relational and psychological violence are reported far more often than cyberbullying, which is often perceived as a major problem.

      The impact of these adult behaviors on school climate and teaching quality is considerable.

      5. Intervention Strategies: Evolution and Good Practice

      Approaches to preventing and managing violence have evolved over the past 50 years.

      | Stage of evolution | Main approach | Limits and findings |
      | --- | --- | --- |
      | Initial approaches | Programs targeting aggressors, based on punishment. | Ineffective. "We realized that punishments did not teach children good behavior." |
      | Development | Global, systemic approaches focused on improving school climate. | More effective, but can be supplemented. |
      | Recent approaches | Focus on pupils' well-being, then on that of both pupils and school staff. | Acting on the sources of distress to prevent violence. |
      | Current approach | Development of socio-emotional skills for all (pupils and staff). | Learning self-regulation, how to express disagreement, and interpersonal skills. Adult staff act as an essential role model. |

      The current model emphasizes the crucial role of adults.

      The relationship they build with young people, grounded in their own socio-emotional skills, is a decisive factor for a positive school climate.

      6. Final Recommendations for Effective Action

      To intervene constructively, Professor Baumont proposes a set of guiding principles:

      1. Base interventions on facts observed locally: Focus on the dynamics specific to one's own school for maximum impact ("establishment effect").

      2. Involve pupils and staff: Bringing the whole school community into decisions fosters a sense of belonging, engagement, mutual support and collaboration.

      3. Act with the resources available: Rather than waiting for government decisions or resources, it is essential to act proactively with the means at hand.

      "I am the first person who can decide what I do with what I have."

      4. Favor frequency over intensity: What matters most is not staging large one-off activities but making small, meaningful gestures every day.

      One must "know how to do it often" to strengthen relationships between adults and pupils over the long term.

    1. Jak usunąć MIKROPLASTIK i BPA z organizmu? Toksykolog dr hab. Aleksandra Rutkowska

      1. Understanding the "Toxic Cocktail" (Chemical Types)

      The expert emphasizes that we are exposed to a mixture of substances that act together. Key chemicals include:

      • Bisphenols (BPA, BPS, BPF, etc.): BPA (Bisphenol A) is a major endocrine disruptor used in hard plastics and can linings. Crucially, the expert warns against "BPA-Free" labels, noting they are often a form of greenwashing. Manufacturers frequently replace BPA with BPS (Bisphenol S) or BPF (Bisphenol F), which are structurally similar and potentially just as harmful [00:28:38].
      • Phthalates: Used to make plastics flexible (like PVC). Found in flooring, food wraps, and cosmetics, they interfere with reproductive and metabolic health [00:07:03].
      • PFAS ("Forever Chemicals"): Used in non-stick pan coatings. These do not break down easily and can stay in the human body for many years [00:14:37], [00:43:13].
      • Alkylphenols & Flame Retardants: Chemicals used in detergents and furniture that accumulate in household dust and disrupt thyroid function [00:08:10], [00:15:48].

      2. Health Impacts: The "Grandchild Method"

      • Hormonal Mimicry: These chemicals trick the body into treating them like natural hormones (mimicking estrogen). They block receptors and can "program" fat cells to store more fat, leading to obesity [00:08:43], [00:09:18].
      • Diseases: Long-term exposure is linked to Type II diabetes, infertility, and hormone-dependent cancers like breast and prostate cancer.
      • Inflammation: Microplastic particles act as foreign bodies, causing chronic internal inflammation—the root cause of most civilization diseases [00:02:42].

      3. Fish and Food Packaging: Best vs. Worst

      • The Danger of Cans: Canned fish is ranked as the worst source of bisphenols because the fat causes chemicals from the can's lining to leach into the food [00:27:00].
      • Trout (Pstrąg): The healthiest choice. It lives a short life in clean, moving water, accumulating minimal toxins [00:27:50].
      • Tuna & Flounder: Recommended to avoid. Tuna lives too long (accumulating chemicals), and Flounder lives at the bottom where pollutants settle [00:27:27], [00:27:33].
      • Recommendation: Buy fish in glass jars or fresh rather than in metal cans [00:27:10].

      4. Hidden Exposure Sources: Imports and Interior

      • Asian Imports & Clothing: Products from Asian platforms often bypass EU safety standards and contain higher toxic concentrations. Synthetic clothes from Asia are heavily impregnated with chemicals to survive weeks in transport containers. The expert strongly advises washing new clothes at least twice before the first wear to reduce skin absorption of these toxins [00:17:32].
      • Tea Bags: Certain bags containing plastic mesh or glue can release billions of microplastic particles into a single cup [00:00:00].
      • Home Interiors: The combination of underfloor heating + vinyl or laminate panels is highly toxic; heat "bakes" chemicals into the air you breathe [00:32:08], [00:37:12].

      5. Practical Detox and Prevention

      • Liver Support: The liver can clear most bisphenols in a week if you stop exposure. Warning: Avoid aggressive "juice cleanses" that cause rapid weight loss, as this floods the blood with toxins previously stored in your fat tissue [00:20:26], [00:30:02].
      • The "First Step" Strategy: Start by wet-dusting your home and creating a 5-minute draft (intensive ventilation) twice a day [00:48:11].
      • Kitchen Changes: Switch to glass or stainless steel for storage, stop cooking rice/grains in plastic bags (cook them loose), and use cast iron or stainless steel pans instead of non-stick [00:21:22], [00:41:48], [00:43:13].
    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-03280

      Corresponding author(s): Stephan Gruber

      1. General Statements [optional]

      First, we would like to thank the editor at Review Commons for the efficient handling of our manuscript. We also apologize for our delayed response.

      We are grateful to all three reviewers for their careful evaluation of our work and for their constructive feedback, which will provide a valuable basis for improving the figures and the text, as described below. We expect to be able to complete the revision following the plan described below quickly.

      We note that the reviewer reports (Rev. #1 and Rev. #3) made us realize that the manuscript text was misleading on the following point. Although we used the purified ATP hydrolysis–deficient Smc protein for sybody isolation, this does not restrict the selection to a specific conformation. As described in detail in Vazquez-Nunez et al. (Figure 5), this mutant displays the ATP-engaged conformation only in a smaller fraction of complexes (~25% in the presence of ATP and DNA), consistent with prior in vivo observations reported by Diebold-Durand et al. (Figure 5). Rather than limiting the selection to a particular configuration, our aim was to reduce the prevalence of the predominant rod state in order to broaden the range of conformations represented during sybody selection. Consistent with this interpretation, only a small number of isolated sybodies show strong conformation-specific binding in the presence or absence of ATP/DNA, as observed by ELISA (now included in the manuscript). We will revise the manuscript text accordingly to clarify this point.

      2. Description of the planned revisions



      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against the Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis-deficient mutant of Smc so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror the smc deletion phenotype, including defective growth in fast-growth conditions as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments: 1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in the ATP/DNA-bound state? And do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging and not essential for this study. Nonetheless, it would be informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.

      We agree with the reviewer that adding such data would be reassuring and that obtaining solid data using purified components is not easy even for a smaller selection of sybodies. We have data that show direct binding of Smc to sybodies by various methods including ELISA, pull-downs and by biophysical methods (GCI). Initially, we omitted these data from the manuscript as we are convinced that the mapping data obtained with chimeric SMC proteins is more definitive and relevant. During the revision we will incorporate the ELISA data showing direct binding and also indicating a lack of preference for a specific state of Smc.

      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out have not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?

      We have attempted to map the binding by structure prediction; however, so far, even the latest versions of AlphaFold are not able to clearly delineate the binding interface. Indeed, many ways of binding are possible, including disruption of ScpAB interaction. However, since the main binding site is located on the SMC coiled coils, the latter scenario would likely be an indirect consequence of altered coiled coil configuration, consistent with our current interpretation.

      3. Sybody expression in vivo: Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?

      We have tagged selected sybodies with gfp and performed live cell imaging. This showed that they are all roughly equally expressed and that they localize as foci in the cell presumably by binding to Smc complexes loaded onto the chromosome at ParB/parS sites. We will include this data in the revised version of the manuscript.

      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc: The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere with this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      As alluded to above, we think that our selection gave rise to sybodies that bind various, possibly multiple, Smc conformations. Consistent with this idea, the phenotypes are similar to those of the null mutant rather than the ATP-hydrolysis defective EQ mutant, which displays even more severe growth phenotypes. We will add the following notes to the text:

      “These conditions favour ATP-engaged particles alongside the typically predominant ATP-disengaged rod-shaped state (add Vazquez Nunez et al., 2021).”

      “ELISA data confirm that nearly all clones bind Smc-ScpAB; however, their binding shows little or no dependence on the presence of ATP or DNA.”

      Minor comments: 1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?

      It is rather common that sybodies (like antibodies and nanobodies) exhibit strong affinity differences between highly conserved proteins (> 90 % identity). The underlying reasons for such strong discrimination are i) location of less conserved residues primarily at the target protein surface and ii) the large interaction interface between sybody and target, which offers multiple vulnerabilities for disturbance, in particular through bulky side chains resulting in steric clashes. Another frequently observed phenomenon is sybody binding to a dominant epitope, which also often applies to nanobodies and antibodies. Good examples of this are the dominant epitopes on SARS-CoV-2 RBDs.

      2. Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?

      We suppose that this recovery represents suppressor mutants and/or (more likely) improved growth in the absence of functional Smc during nutrient limitation (see Gruber et al., 2013 and Wang et al., 2013). We will add this statement to the text.

      3. L41 - Sentence correction: Loop can be removed.

      Ah, yes, sorry for this confusing error. Thank you.

      4. L525 - bsuSmc 'E': extra E can be removed.

      To do. Thank you.

      5. References need to be properly formatted.

      To do. Thank you.

      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.

      To do.

      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      To be included.

      Reviewer #1 (Significance (Required)):

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins have been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function, and parallelly exhibited potential of the use of synthetic sybody/designed binders for inhibition of protein activity.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Review: "Single Domain Antibody Inhibitors Target the Coiled Coil Arms of the Bacillus subtilis SMC complex" by Ophélie Gosselin et al, Review Commons RC-2025-03280 Structural Maintenance of Chromosome proteins (SMCs), a family of proteins found in almost all organisms, are organizers of DNA. They accomplish this by a process known as loop extrusion, wherein double-stranded DNA is actively reeled in and extruded into loops. Although SMCs are known to have several DNA binding regions, the exact mechanism by which they facilitate loop extrusion is not understood but is believed to entail large conformational changes. There are currently several models for loop extrusion, including one wherein the coiled coil (CC) arms open, but there is a lack of insightful experimentation and analysis to confirm any of these models. The work presented aims to provide much-needed new tools to investigate these questions: conformation-selective sybodies (synthetic nanobodies) that are likely to alter the CC opening and closing reactions. The authors produced, isolated, and expressed sybodies that specifically bound to Bacillus subtilis Smc-ScpAB. Using chimeric Smc constructs, where the coiled coils were partly replaced with the corresponding sequences from Streptococcus pneumoniae, the authors revealed that the isolated sybodies all targeted the same 4N CC element of the Smc arms. This region is likely disrupted by the sybodies either by stopping the arms from opening (correctly) or forcing them to stay open (enough). Disrupting these functional elements is suggested to cause the Smc-dependent chromosome organization lethal phenotype, implying that arm opening and closing is a key regulatory feature of bacterial Smc-ScpAB. In summary, the authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. 
Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes. Some specific comments:

      Line 75: "likely stabilizing otherwise rare intermediates of the conformational cycle." - sorry, why is that being concluded? Why not stabilizing longer-lived conformations?

      We will clarify this statement!

      Line 89: Sorry, possibly our lack of understanding: why first ribosome and then phage display?

      Ribosome display offers to screen around 10^12 sybodies per selection round (technically unrestricted library size), while for phage display, the library size is restricted to around 10^9 sybodies due to the fact that production of a phage library requires transformation of the phagemid plasmid into E. coli, thereby introducing a diversity bottleneck. This is why the sybody platform starts off with ribosome display. It switches to phage display from round 2 onwards because the output of the initial round of ribosome display is around 10^6 sybodies, which can be easily transferred into the phage display format. Phage display is used to minimize selection biases. For more information, please consult the original sybody paper (PMID: 29792401).

      Line 100: Why was only lethality selected? Less severe phenotypes not clear enough?

      Yes, colony size is more difficult to score robustly, as the sizes of individual transformant colonies can vary quite widely. The number of isolated sybodies was at the limit of further analysis.

      Line 106: Could it be tested somehow if convex and concave library sybodies fold in Bs?

      We did not focus on the non-functional sybody candidates, and only sybodies of the loop library turned out to cause functional consequences at the cellular level. Notably, we will include gfp-imaging showing that non-lethal sybodies are expressed to similar levels as toxic sybodies. Given the identical scaffold of concave and loop sybodies (they only differ in their CDR3 length), we expect that the concave sybodies fold in the cytoplasm of B. subtilis. For the convex sybodies exhibiting a different scaffold, this will be tested.

      Line 125: Could Pxyl be repressed by glucose?

      To our knowledge and experience, repression by glucose (catabolite repression) does not work well in this context in B. subtilis.

      Line 131: The SMC replacement strain is a cool experiment and removes a lot of doubts!

      Thank you! (we agree 😊)

      Line 141: The mapping is good and looks reliable, but looks and feels like a tour de force? Of course, some cryo-EM would have been lovely (lines 228-229 understood, it has been tried!).

      Yes, we have made several attempts at structural biology. Unfortunately, Smc-ScpAB is not well suited for cryo-EM in our hands and crystallography with Smc fragments and sybodies did not yield well-diffracting crystals.

      Line 179: Mmmh. Do we not assume DNA binding on top of the dimerised heads to open the CC (clamp)?

      We will clarify the text here.

      Line 187: Having sybodies that presumably keep the CC together (closing) and some that do not allow them to come together correctly (opening) is really cool and probably important going forward.

      Thank you!

      Figure 1 Ai is not very colour-blind friendly.

      We are sorry for this oversight. We will try to make the color scheme more inclusive. Thank you for the notification.

      Optional: did the authors see any spontaneous mutations emerge that bypass the lethal phenotype of sybody expression?

      No, we did not observe spontaneous mutations suppressing the phenotype, possibly due to the limited number of cell generations observed. We tried to avoid suppressors by limiting growth, but this may indeed be a good future approach to further fine-map the binding sites and to obtain insights into the mechanism of inhibition.

      Optional: we think it would be nice to try some biochemical experiment with BMOE/cysteine-crosslinked B. subtilis Smc in the mid-region (4N or next to it) of the Smc coiled coils to try to further strengthen the story. Some of the authors are experts in this technique and strains might already exist?

      We have indeed tried to study the impact of sybody binding on Smc conformation by cysteine cross-linking. However, we were not convinced by the results and thus prefer not to draw any conclusions from them. We will add a corresponding note to the text.

      Reviewer #2 (Significance (Required)):

      The authors present a new method for trapping bacterial Smc's in certain conformations using synthetic antibodies. Using these antibodies, they have pinpointed the (previously suggested) 4N region of the coiled coils as an essential site for the opening and closing of the Smc coiled coil arms and that hindering these reactions blocks Smc-driven chromosomal organization. The work has important implications for how we might elucidate the mechanism of DNA loop extrusion by SMC complexes. Thank you!

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Gosselin et al. use the sybody technology to study effects of in vivo inhibition of the Bacillus subtilis SMC complex. Smc proteins are central DNA binding elements of several complexes that are vital for chromosome dynamics in almost all organisms. Sybodies are selected from three different libraries of the single domain antibodies, using the "transition state" mutant Smc. They identify 14 such mutant sybodies that are lethal when expressed in vivo, because they prevent proper function of Smc. The authors present evidence suggesting that all obtained sybodies bind to a coiled-coil region close to the Smc "neck", and thereby interfere with the Smc activity cycle, as evidenced by defective ATPase activity when Smc is bound to DNA. The study is well done and presented and shows that the strategy is very potent in finding a means to quickly turn off a protein's function in vivo, much quicker than depleting the protein.

      The authors also draw conclusions on the molecular mode of action of the SMC complex. They provide a number of suggestive experiments, but in my view mostly indirect evidence for such a mechanism.

      My main criticism is that the authors have used a single, catalytically trapped form of SMC. They speculate why they only obtain sybodies from one library, and then only identify sybodies that bind to a rather small part of the large Smc protein. While the approach is definitely valuable, it is biased towards sybodies that bind to Smc in a quite special way, it seems. Using wild type Smc would be interesting, to make more robust statements about the action of sybodies potentially binding to different parts of Smc.

      As explained above, we are quite confident that the Smc ATPase mutation did not bias the selection in an obvious way. The surprising bias towards coiled coil binding sites likely has other explanations, as the coiled coils likely form a preferred epitope recognized by sybodies.

      Line 105: Alternatively, the other libraries did not produce good binders or these sybodies were not stably expressed in B. subtilis. This could be tested using Western blotting - I am assuming sybody antibodies are commercially available. However, this test is not important for the overall study, it would just clarify a minor point.

      While there are antibody fragments available to augment the size of sybodies (PMID: 40108246), these recognize 3D-epitopes and are thus not suited for Western blotting. We did not follow up on the negative results much, but would like to point out again that there are several biases that likely emerge for the same reason (bias to library, bias to coiled coil binding site). If correct, then likely few other sybodies are effectively lethal in B. subtilis, with the exception of the ones isolated and characterized. We have added this notion to the manuscript. We have also tested the expression of non-lethal sybodies by gfp-tagging and imaging. These results will be included in the revision.

      Fig. 2B: it is odd to count Spo0J foci per cell, as it is clear from the images that several origins must be present within the fluorescent foci. I am fine with the "counting" method, as the images show there is a clear segregation defect when sybodies are expressed. I believe the authors should state, though, that this is not a replication block, but failure to segregate origins.

      We agree that this is an important point and will add a corresponding comment to the text.

      Testing binding sites of sybodies to the SMC complex is done in an indirect manner, by using chimeric Smc constructs. I am surprised that the authors have not used in vitro crosslinking: the authors can purify Smc, and mass spectrometry analyses would identify sites where sybodies are crosslinked to Smc. Again, I am fine with the indirect method, but the authors make quite concrete statements on binding based on non-inhibition of chimeric Smc; I can see alternative explanations why a chimera may not be targeted.

      We have made several attempts of testing direct binding with mixed outcomes and decided to not include those results in the light of the stronger and more relevant in vivo mapping. However, we will add ELISA results and briefly discuss grating coupled interferometry (GCI) data and pull-downs.

      Smc-disrupting sybodies affect the ATPase activity in one of two ways. Again, rather indirect experiments. This leads to the point "Revealing Smc arm dynamics through synthetic binders" in the discussion. The authors are quite careful in stating that their experiments are suggestive for a certain mode of action of Smc, which is warranted.

      In line 245, they state "More broadly, the study demonstrates how synthetic binders can trap, stabilize, or block transient conformations of active chromatin-associated machines, providing a powerful means to probe their mechanisms in living cells." This is of course a possible scenario for the use of sybodies, but the study does not really trap Smc in a transient conformation; at least this is not clearly shown.

      We agree and will carefully rephrase this statement. Thank you.

      Overall, it is an interesting study, with a well-presented novel technology, and a limited gain of knowledge on SMC proteins.

      We respectfully disagree with the last point, since our unique results highlight the importance of the Smc coiled coils, which are otherwise largely neglected in the SMC literature, likely (at least in part) due to the mild effect of single point mutations on coiled coil dynamics.

      Reviewer #3 (Significance (Required)):

      The work describes the generation and use of single-binder antibodies (sybodies) to interfere with the function of proteins in bacteria. Using this technology for the SMC complex, the authors demonstrate that they can obtain a significant number of binders that target a defined region in SMC and thereby interfere with the ATPase cycle.

      The study does not present a strong gain of knowledge of the mode of action of the SMC complex.

      As pointed out above, we respectfully disagree with this assertion.

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

      As pointed out above, there are a few minor points that we prefer not to address experimentally. In particular, we do not consider it necessary to determine the expression levels of sybodies which were non-inhibitory. We also wish to note that we attempted to obtain additional structural and biochemical data and to that end performed cryo-EM, crystallography and cysteine cross-linking experiments. Unfortunately, we did not obtain sybody complex structures, and the cross-linking data were not conclusive. We also wish to note that the first author has finished her PhD and left the lab, which limits our capacity to add additional experiments. However, as the reviewers also pointed out, the main conclusions are already well supported by the data.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Gosselin et al. develop a method to target protein activity using synthetic single-domain nanobodies (sybodies). They screen a library of sybodies using ribosome/phage display generated against the Bacillus Smc-ScpAB complex. Specifically, they use an ATP hydrolysis deficient mutant of SMC so as to identify sybodies that will potentially disrupt Smc-ScpAB activity. They next screen their library in vivo, using growth defects in rich media as a read-out for Smc activity perturbation. They identify 14 sybodies that mirror the smc deletion phenotype, including defective growth in fast-growth conditions as well as chromosome segregation defects. The authors use a clever approach by making chimeras between Bacillus and S. pneumoniae Smc to narrow down to specific regions within the Bacillus Smc coiled-coil that are likely targets of the sybodies. Using ATPase assays, they find that the sybodies either impede DNA-stimulated ATP hydrolysis or hyperactivate ATP hydrolysis (even in the absence of DNA). The authors propose that the sybodies may be locking Smc-ScpAB in the "closed" or "open" state via interaction with the specific coiled-coil region on Smc. I have a few comments that the authors should consider:

      Major comments:

      1. Lack of direct in vitro binding measurements: The authors do not provide measurements of sybody affinities, binding/unbinding kinetics, or stoichiometries with respect to Smc-ScpAB. Additionally, do the sybodies preferentially interact with Smc in the ATP/DNA-bound state? And do the sybodies affect the interaction of ScpAB with SMC? It is understandable that such measurements for 14 sybodies are challenging and not essential for this study. Nonetheless, it would be informative to have biochemical characterization of sybody interaction with the Smc-ScpAB complex for at least 1-2 candidate sybodies described here.
      2. Many modes of sybody binding to Smc are plausible: The authors provide an elaborate discussion of sybodies locking the Smc-ScpAB complex in open/closed states. However, in the absence of structural support, the mechanistic inferences may need to be tempered. For example, is it not also possible for the sybodies to bind the inner interface of the coiled-coil, resulting in steric hindrance to coiled-coil interactions? It is also possible that sybody interaction disrupts ScpAB interaction (as data ruling this possibility out have not been provided). Thus, other potential mechanisms would be worth considering/discussing. In this direction, did AlphaFold reveal any potential insights into putative binding locations?
      3. Sybody expression in vivo Have the authors estimated sybody expression in vivo? Are they all expressed to similar levels?
      4. Sybodies should phenocopy ATP hydrolysis mutant of Smc The sybodies were screened against an ATP hydrolysis deficient mutant of Smc, with the rationale that these sybodies would interfere this step of the Smc duty cycle. Does the expression of the sybodies in vivo phenocopy the ATP hydrolysis deficient mutant of Smc? Could the authors consider any phenotypic read-outs that can indicate whether the sybody action results in an smc-null effect or specifically an ATP hydrolysis deficient effect?

      Minor comments:

      1. It was surprising that no sybodies were found that could target both bacillus and spneu Smc. For example, sybodies targeting the head regions of Smc that might work in a more universal manner. Could the authors comment on the coverage of the sybodies across the protein structure?
      2. Growth curves (Fig. S3) show a large jump in recovery in growth under sybody induction conditions. Could the authors address this observation here and in the text?
      3. L41- Sentence correction: Loop can be removed.
      4. L525 - bsuSmc 'E' :extra E can be removed.
      5. References need to be properly formatted.
      6. The authors should add in figure legend for Fig 1i) details on representation of the purple region, and explain the grey strokes for orientation of the loop.
      7. How many cells were analysed in the cell biological assays? Legends should include this information.

      Significance

      Overall, this is an impressive study that uses an elegant strategy to find inhibitors of protein activity in vivo. The manuscript is clearly written and the experiments are logical and well-designed. The findings from the study will be significant to the broad field of genome biology, synthetic biology and also SMC biology. Specifically, the coiled coil domain of SMC proteins have been proposed to be of high functional value. The authors have elegantly identified key coiled-coil regions that may be important for function, and parallelly exhibited potential of the use of synthetic sybody/designed binders for inhibition of protein activity.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Response to Reviewers

      We thank the Reviewers for their appreciative comments (Reviewer 1: “first time that a well-established existing mathematical model of signaling response extended and applied to heterogeneous ligand mixtures”) and constructive suggestions for improvement. In this extensive revision, we have not only addressed the suggestions comprehensively but also extended our analysis of signaling antagonism to all doses and at the single-cell level using novel computational workflows. This resulted in the discovery of several mechanisms of antagonism and synergy that are dose-dependent, and dependent on the cell-specific state of the signaling network, thereby manifesting in only a subset of cells.

      In addressing the Reviewer comments, we have made substantial revisions to improve clarity, rigor, and biological interpretation. Below we briefly summarize the main concerns raised by Reviewers 1-3 and how we have addressed them.

      • We have rewritten the Methods section to clarify our approaches. We have also added the explanation of methodology and the rationale in the main text to improve readability and comprehensiveness (Addressing Reviewer #1 comments). This includes explaining and justifying the signaling codon approaches (Reviewer 1), our core-module parameter matching methodology and discussion (Reviewer #1, point 11, Reviewer #2, point 1), and the model schematic (Reviewer #1, point 5).
      • For one of our major conclusions – that macrophages may distinguish stimuli in the context of ligand mixtures – we have validated these results with experiments, which increases confidence in this conclusion (Reviewer #2, point 3, Reviewer #3, point 2).
      • We have updated the model for CpG-pIC competition using Michaelis–Menten kinetics without any additional parameters, rather than introducing new free parameters. This change removes parameter freedom for fitting combinatorial conditions, leading to a more constrained and mechanistically grounded model whose predictions align better with experimental data (Updated Figures 2 and S2; Reviewer #2, point 2).
      • We have addressed all other editorial and clarification-related concerns as well, as detailed in our point-by-point response below.

      In addition, we have extended the scope of the manuscript. We have extended our analysis of ligand combinations across a broad dose range, from non-responsive to saturated conditions. This led to several additional discoveries. For example, we show that ultrasensitive IKK activation can underlie synergistic combinations of ligands at low doses. In contrast, beyond the CpG-poly(I:C) antagonism, we identify that competition for CD14 uptake by LPS and Pam can generate antagonism between these ligands within specific dose ranges.
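
      For readers unfamiliar with why a Michaelis–Menten treatment of competition needs no extra free parameters, the textbook rate law for two substrates sharing one saturable carrier is sketched below. This is a generic illustration only, not the manuscript's actual model equations; the symbols $A$, $B$, $V_A$, $V_B$, $K_A$ and $K_B$ are placeholder names assumed here for the two ligands, their maximal rates and their half-saturation constants.

```latex
% Generic competing-substrate Michaelis-Menten rates (illustrative assumption,
% not taken from the manuscript). Each ligand's own K acts as the inhibition
% constant for the other ligand, so no new parameter is introduced.
\[
v_A = \frac{V_A\,[A]}{K_A\left(1 + \dfrac{[B]}{K_B}\right) + [A]},
\qquad
v_B = \frac{V_B\,[B]}{K_B\left(1 + \dfrac{[A]}{K_A}\right) + [B]}
\]
```

      Setting $[B] = 0$ recovers the single-ligand rate for $A$, so a model of this form is fully determined by the parameters already fitted to the single-ligand conditions.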

      Importantly, such antagonism or synergy is not evident in all cells in the population. It may also not be picked up by studies of the mean behavior. With our new computational workflow that allows for single-cell resolution we identify the conditions that must be met by the signaling network state, for antagonism or synergy to take place.

      Further, we examine the hypothesis that such signaling pathway interactions affect stimulus-response specificity in combinatorial stimulus conditions. By comparing models with and without this antagonism, we demonstrate that antagonistic interactions can improve stimulus-response specificity in complex ligand mixtures.

      These additional analyses provide a new mechanistic understanding of cellular information processing and elucidate how synergy and antagonism can mechanistically shape signaling fidelity in response to complex ligand mixtures.

      Point-by-Point Response

      Reviewer #1

      Evidence, reproducibility and clarity

      The authors extend an existing mathematical model of NFkB signalling under stimulation of various single receptors, to model that describes responses to stimulation of multiple receptors simultaneously. They compare this model to experimental data derived from live-cell imaging of mouse macrophages, and modify the model to account for potential antagonism between TLR3 and TLR9 response due to competition for endosomal transport. Using this framework they show that, despite distinguishability decreasing with increasing numbers of heterogenous stimuli, macrophages are still able in principle to distinguish these to a statistically significant degree. I congratulate the authors on an interesting approach that extends and validates an existing mathematical model, and also provides valuable information regarding macrophage response.

      Response: We thank the reviewer for this appreciative assessment and for the careful reading of our work. The constructive comments helped us substantially improve the rigor and clarity of the manuscript.

      In addition to revising the text for clarity, we have extended our analysis to systematically investigate dose-response behavior for each pairwise ligand combination. Using the experimentally validated model, we explored 10 ligand pairs across a range of doses from non-responsive to saturating. This allowed us to identify mechanistic regimes in which synergy and antagonism arise at the single-cell level. In particular, we found that low-dose synergy can be explained by ultrasensitive IKK activation (Figure 4 and corresponding supplementary figures), while antagonism can emerge from competition for shared components such as CD14 (Figure 5 and corresponding supplementary figures). We further show that antagonism can enhance condition distinguishability in ligand mixtures, thereby contributing to stimulus-response specificity (Figure 5 and corresponding supplementary figures).
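
      The link between ultrasensitivity and low-dose synergy can be illustrated with a toy calculation. The sketch below is not the NFκB model discussed in this response. It assumes, purely for illustration, that two ligands contribute additively to one upstream signal and that the downstream step follows a Hill function; the names `hill` and `synergy_ratio` and the parameter values are hypothetical.

```python
def hill(dose, k=1.0, n=3.0):
    """Fractional activation of a Hill-type step; n > 1 makes it ultrasensitive."""
    return dose**n / (k**n + dose**n)


def synergy_ratio(dose_a, dose_b, k=1.0, n=3.0):
    """Combined response divided by the sum of the two single-ligand responses.

    Assumption (illustrative only): both ligands add up to one upstream signal
    before passing through the same Hill step. A ratio above 1 indicates
    synergy, a ratio below 1 indicates a sub-additive response.
    """
    combined = hill(dose_a + dose_b, k, n)
    additive = hill(dose_a, k, n) + hill(dose_b, k, n)
    return combined / additive


if __name__ == "__main__":
    # Compare a low, an intermediate and a near-saturating dose of each ligand.
    for dose in (0.1, 0.5, 5.0):
        print(f"dose={dose}: ratio={synergy_ratio(dose, dose):.2f}")
```

      In this toy setting the ratio exceeds 1 only at low doses and only when the Hill coefficient is above 1; near saturation the combined response cannot exceed the ceiling, so the ratio falls below 1.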

      There are no major issues affecting the scientific conclusions of the paper, however the lack of detail surrounding the mathematical model and the 'signaling codons' that are used throughout the paper make it difficult to read. This is exacerbated by the fact that I was unable to find Ref 25 which apparently describes the model, however I was able to piece together the essential components from the description in Ref 8 and the supplementary material.

      Response: This comment helped us to improve the writing. We apologize that the key reference 25 was still not publicly available. It is now published in Nature Communications. In addition, we have added more details to clarify the mathematical model as well as the signaling codons, in results and in methods. Please see below for details.

      Lots of the minor comments below stem from this, however there are also a few other places that could benefit from some additional clarification and explanation.

      Significance: 1. '...it remains unclear complex...' -> '...it remains unclear whether complex...' Response: We have rewritten the Significance (now it is Synopsis).

      Introduction: 2. 'temporal dynamics of NFkB' - it would be good to be more concrete regarding the temporal dynamics of what aspect of this (expression, binding, conformation, etc), if possible. Response: It refers to the presence of NFκB in the nucleus, which represents active NFκB capable of activating gene expression. We have clarified this (Lines 59-61 in introduction paragraph 2): “Upon stimulation, NFκB translocates into the nucleus, … activating immune gene expression (10, 15–19).”

      'signaling codons' - the behaviour of these is key to the entire paper, so even if they are well described in the reference, it would be good to have a short description as early as possible so that the reader can get an idea in their mind what exactly is being discussed here. Later, it would be good to have concrete description of exactly what these capture.

      Response: We thank the reviewer for this comment. We have added one whole paragraph in the early introduction to describe the concept of Signaling Codons which allow quantitative characterization of NFkB stimulus-response-specific dynamics (Lines 60-67). We have also added more concrete description of Signaling Codons in the results as well as adding an illustration for the signaling codons (Lines 169-175, Figure S2B).

      'This challenge...population of macrophages' - this seems a bit out of place, and is a bit of a run on sentence, so I suggest moving this to the next paragraph and working it into the first sentence there '...regulatory mechanisms, and this challenge could be addressed with a model parameterised to account for heterogeneous...Early models ...', or something similar.

      Response: We thank the reviewer for this suggestion and have revised the text accordingly, which improves the logical flow (Lines 87-88).

      Ref 25: I can't find a paper with this title anywhere, so if it's an accepted preprint then it would be good to have this available as well. That said, I still think it would be difficult to grasp the work done in this paper without some description of the mathematical model here, at least schematically, if not the full set of ODEs. For example, there are numerous references to how this incorporates heterogeneous responses, the 'core module', etc, and the reader has no context of these if they aren't familiar with the structure of the model. Response: We apologize that Ref 25 was not on PubMed. Now it’s published, and we have updated the corresponding information. This comment also helped us to improve the writing by adding a description of the mathematical model in the Introduction (Lines 95-105), the results (Lines 129-141), and a detailed description of the model in the Methods (Simulation of heterogenous NFκB dynamical responses.)

      We have also added the schematic of the model topology in Figure S1 (adapted from previous publications Guo et al 2025, Adelaja et al 2021) to make sure the paper is self-contained.

      'A key challenge which is...' -> 'A key challenge is...' Response: We have revised the Introduction and removed this sentence.

      'With model simulation ...' -> a bit of a run on sentence, I suggest breaking after 'conditions'. Response: We have revised the introduction and removed this sentence.

      Results:

      1. This section would benefit from a more in-depth description of the model and experimental setup. In particular for the experiment, the reader never really knows what the workflow for this is, nor what the model ingests as input, and what the predictions are of. Response: This comment helped us to improve clarity by adding an in-depth description of the model and experimental setup. We have revised the Results as suggested (Lines 129-141). We have also appended the corresponding revision here for the reviewer’s reference.

      “This mechanistic model was trained on single-ligand response experimental datasets, capturing the single-ligand stimulus-response specificity of the population of macrophages while accounting for cellular heterogeneity. Specifically, quantitative NFκB dynamic trajectory data from hundreds of single macrophages responding to five single ligands (TNF, pIC, Pam, CpG, LPS) at 3-5 doses was obtained from live cell imaging experiments. The mathematical model (Figure S1) consists of a 52-dimensional system of ordinary differential equations, including 52 intracellular species, 101 reactions and 133 parameters, and is divided into five receptor modules, which respond to the corresponding ligands respectively, and the IKK-NFκB core module that contains the prominent IκBα negative feedback loop. By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupling with 52-dimensional NFκB ODE model), the parameter distributions for the single-cell population were inferred. Analyzing the resulting simulated NFκB trajectories with information-theoretic and machine learning classification analyses confirmed that the virtual cell model simulations reproduced key SRS performance characteristics of live macrophages.”

      '..mechanistic model was trained...' - trained in this study, or in the previous referenced study? Response: The mechanistic model was trained in a previous study (Guo et al 2025 Nature Comm), and we have clarified this in the revision (Lines 127 - 129).

      1. 'determined parameter distributions' - this is where it would be good to have more background on the model. What parameters are these, and what do they correspond to biologically? It would also be nice to see in the methods or supplementary material how this is done (maximum likelihood, etc). Response: This comment helps us to clarify the predetermined parameter distributions. We have revised the methods to include this information (Simulation of heterogenous NFκB dynamical responses, paragraph 3). We have appended the corresponding text here for reviewer’s convenience.

      “The ODE model was then fitted to the population of single-cell trajectories to recapitulate the cell-to-cell heterogeneity in the experimental data (2). This is achieved by solving the non-linear mixed effects model (NLME) using the stochastic approximation expectation-maximization (SAEM) algorithm (3–6). Seventeen parameters were estimated. Within the core module, the estimated parameters included the rates governing TAK1 activation (k52, k65), the time delays of IκBα transcription regulated by NFκB (k99, k101), and the total cellular NFκB abundance (tot NFκB). Within the receptor module, receptor synthesis rates (k54 for TNF, k68 for Pam, k85 for CpG, k35 for LPS, k77 for pIC), degradation rates of the receptor–ligand complexes (k56, k61, k64 for TNF; k75 for Pam; k93 for CpG; k44 for LPS; k83 for pIC), and endosomal uptake rates (k87 for CpG; k36 and k40 for LPS; k79 for pIC) were fitted. All remaining parameters were fixed at literature-suggested values (1). The single-cell parameters inferred from experimental individual-cell trajectories then served as empirical distributions for generating the new dataset (see Supplementary Dataset 2).”

      'matching cells with similar core model...' - it's difficult to follow the logic as to why this is done, so I think this needs to be a little clearer. My guess would be that the assumption is that simulated cells with similar 'core' parameters have a similar downstream signalling response, and therefore the receptors can be 'transplanted'. So it would be nice to see exactly what these distributions are and what the effect of a bad match would be. Response: We thank the reviewer for this comment. In the revision, we have explained the rationale for matching cells with similar core module (Lines 145-152).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit for the relevant single ligand experimental data (Figure 1A, Step 1), but other receptor modules’ parameter values were not determined. To simulate stimulus responses to more than two ligands, we imputed the other ligand-receptor module parameters using shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions.

      This nearest-neighbor hot-deck imputation approach (the core module matching method) was shown to outperform other approaches, including random matching and rescaled-similarity matching (Guo et al. 2025, Supplementary Figure S11). For the reviewer’s convenience, we have also appended the corresponding figure below.

      Figure S11 from (Guo et al., 2025). Assessment of matching techniques for predicting single-cell responses to various ligand stimuli (a-d). Heatmaps illustrating the Wasserstein distance between the signaling codon distributions predicted by the model and those observed in experiments. The analysis employs four distinct matching methods to align the five ligand-receptor module parameters: (a) “Random Matching”, (b) “Similarity Matching” (the method used in our study), (c) “Rescaled-Similarity Matching”, and (d) “Sampling Approximated Distribution”. In the heatmaps, rows represent signaling codons, columns denote ligands, and the color intensity indicates the Wasserstein distance, providing a visual metric of similarity between model predictions and experimental data. e-f. Histogram of the average Wasserstein distance between the model-predicted and experimentally observed signaling codon distributions, summarized across signaling codons (e) and ligands (f).
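
      As a minimal sketch of the metric shown in these heatmaps, the snippet below computes a one-dimensional Wasserstein distance between two samples of a single signaling codon. It assumes the order-1 distance between empirical distributions; the function name `wasserstein_1d` and the toy values are illustrative and are not taken from Guo et al. (2025).

```python
from typing import Sequence


def wasserstein_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Order-1 Wasserstein distance between two empirical 1D samples.

    Computed as the area between the two empirical CDFs, so the samples
    may have different sizes.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    grid = sorted(xs_sorted + ys_sorted)

    distance = 0.0
    i = j = 0  # number of points <= current grid value in each sample
    for k in range(len(grid) - 1):
        value = grid[k]
        while i < len(xs_sorted) and xs_sorted[i] <= value:
            i += 1
        while j < len(ys_sorted) and ys_sorted[j] <= value:
            j += 1
        cdf_gap = abs(i / len(xs_sorted) - j / len(ys_sorted))
        distance += cdf_gap * (grid[k + 1] - value)
    return distance


# Hypothetical codon values (e.g. peak amplitude) for model and experiment.
predicted = [0.10, 0.20, 0.30]
observed = [0.15, 0.25, 0.35]
print(wasserstein_1d(predicted, observed))
```

      A distance of zero means that the predicted and observed samples have the same empirical distribution for that codon; larger values indicate a greater mismatch.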

      Some explanation of how this relates to the experimental data the parameters are fit on would also be useful. (a) Is there a correspondence between individual simulated cells and the experimental data for the single ligand stimulation, and then the smallest set of these is taken? Is there also a matching from the simulated multi-receptor modules and the multi-receptor data, and if so, is this done in the same way? Response: This comment helped us clarify the correspondence between model simulations and experimental data.

      Yes, there is a correspondence between individual simulated cells and the previously published experimental data (Guo et al., 2025b) for single-ligand stimulation. We have revised the first paragraph of the Results (Lines 136–148) and the Methods (Lines 544-557) to clarify how the model simulations were fit to the previous experimental dataset. See Reviewer 1, Comment 10 for the updates in Methods. We have pasted in the revised Results section below for the reviewer’s reference.

      By fitting the single-cell experimental data set with a non-linear mixed effect statistical model (coupling with 52-dimensional NFκB ODE model), the parameter distributions for the single cell population were inferred.

      'six signaling codons' - here it would be good to recapitulate what these represent, but also what the 'strength' and 'activity' correspond to (total integrated value, maximum value, etc) Response: We thank the reviewer for the suggestion and have clarified this point (Lines 169-175, Figure S2B).

      'pre-defined thresholds' - no need to state these numerically in the text (although giving some sense of how/why these were chosen would give some context), but I couldn't find the values of these, nor values corresponding to the signaling codons. Response: We appreciate the reviewer’s comment. We have added this information in the figure legend (Figure 1B-C) and Methods -- “Responder fraction” (Lines 666-672). Specifically, for the model simulation data, the integral thresholds are 0.4 (µM·h), 0.5 (µM·h), and 0.6 (µM·h). The peak thresholds are 0.12 (µM), 0.14 (µM), and 0.16 (µM). For the experimental data, the integral thresholds are 0.2 (A.U.·h), 0.3 (A.U.·h), and 0.4 (A.U.·h). The peak thresholds are 0.14 (A.U.), 0.18 (A.U.), and 0.22 (A.U.). Thresholds were selected so that the medium threshold yields 50% responder cells under single-ligand conditions, while the responder ratio remains unsaturated under three-ligand stimulation.
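
      To illustrate how such thresholds translate into a responder fraction, here is a minimal sketch. It assumes uniformly sampled trajectories, trapezoidal integration, and that a cell counts as a responder when the chosen feature strictly exceeds the threshold; the function names and the toy trajectories are hypothetical, and only the two medium simulation thresholds (0.5 µM·h and 0.14 µM) come from the response above.

```python
from typing import List, Sequence


def trapezoid_integral(values: Sequence[float], dt_hours: float) -> float:
    """Integrate a uniformly sampled trajectory with the trapezoidal rule."""
    if len(values) < 2:
        return 0.0
    return dt_hours * (sum(values) - 0.5 * (values[0] + values[-1]))


def responder_fraction(
    trajectories: List[Sequence[float]],
    dt_hours: float,
    threshold: float,
    feature: str = "integral",
) -> float:
    """Fraction of cells whose integral or peak exceeds the threshold."""
    if not trajectories:
        raise ValueError("need at least one trajectory")
    if feature == "integral":
        scores = [trapezoid_integral(t, dt_hours) for t in trajectories]
    elif feature == "peak":
        scores = [max(t) for t in trajectories]
    else:
        raise ValueError("feature must be 'integral' or 'peak'")
    return sum(score > threshold for score in scores) / len(scores)


# Hypothetical nuclear NFkB trajectories (uM), sampled once per hour.
toy_cells = [
    [0.0, 0.20, 0.20, 0.15, 0.10, 0.0],
    [0.0, 0.05, 0.05, 0.0, 0.0, 0.0],
    [0.0, 0.15, 0.10, 0.05, 0.0, 0.0],
    [0.0, 0.10, 0.10, 0.10, 0.10, 0.10],
]
print(responder_fraction(toy_cells, 1.0, 0.5, "integral"))
print(responder_fraction(toy_cells, 1.0, 0.14, "peak"))
```

      Sweeping the low, medium, and high thresholds in this way gives a simple check that the reported responder fractions do not hinge on one particular cutoff.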

      'non-responder cells are likely a result of cellular heterogeneity in receptor modules rather than the core module' - is this the 'ill health' referenced earlier? If so make this clear. Response: Yes, this is the ‘ill health’ referenced earlier, and we have clarified this (Lines 198-199).

      It's also very difficult to follow this chain of logic, given that the reader at this point doesn't have any knowledge of what the 'core' module is, nor the significance of the thresholds on the signaling codons. I would suggest making this much clearer, with reference to each of these. Response: We apologize for the poor explanation. We have now explained in the Introduction (Lines 95-106) and the results (Lines 129-141) how the model is structured into receptor-proximal modules that converge on the common core module. We have also added a schematic for clarity (Figure S1). For further clarification of the math models, we have significantly revised the Methods (Simulation of heterogenous NFκB dynamical responses). The defined thresholds are clarified in the Methods -- “Responder fraction”.

      '...but the model represented these as independent mass action reactions' - the significance of this may not be clear to someone not familiar with biophysical models, so probably better to make it explicit. Response: We thank the reviewer for this reminder, and we have added a description of the significance of this point (Lines 225-227).

      '...we trained a random forest classifier...' - is this trained on the 'raw' experimental time series data, or on the signaling codons? Response: It is trained on the signaling codons calculated from model simulations of NFκB trajectories. We have clarified this (Lines 260-261).

      'We also applied a Long Short-Term Memory (LSTM) machine learning model...' - it might be good to reference these three approaches at the beginning of this section, otherwise they seem to come out of the blue a little. Response: We have added the references of these three approaches in the beginning of this section (Lines 242-246).

      'We then used machine learning classifiers...' - random forests, LSTMs, or a different model? Response: We have clarified that this is a random forest classifier (Line 276).

      Discussion:

      1. '...over statistical models...' - suggest maybe 'purely statistical models' Response: We thank the reviewer for this suggestion. We have rewritten the whole Discussion to include the new insights of antagonism and synergy and their roles in maintaining unexpectedly high SRS performance. Thus, this sentence was removed.

      'We found that endosomal transport...' - A paper by Huang et al. (https://www.jneurosci.org/content/40/33/6428) observed a synergistic phagocytic response between CpG and pIC stimulation in microglia. This is still consistent with a saturation effect dependent on dose, but may be worth a mention. Response: We thank the reviewer for pointing us to this interesting paper; this comment helped us to improve the Discussion of inflammatory signaling pathways besides NFκB. This paper demonstrates synergistic effects between CpG and pIC in inhibiting tumor growth and promoting the production of cytokines (Huang et al., 2020), such as IFN-β and TNF-α, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). This does not contradict our finding that CpG and pIC act antagonistically in the NFκB signaling pathway, because several pathways act combinatorially on gene expression: CpG can activate the MAPK signaling pathway (Luecke et al., 2024) but not the IRF signaling pathway, whereas pIC activates the IRF signaling pathway (Akira and Takeda, 2004) but only weakly the MAPK pathway. Therefore, their combination can synergistically regulate inflammatory responses. We have added this to the Discussion (Lines 515-522).

      '...features termed...' -> 'features, termed' Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      '...we applied a Long Short-Term Memory (LSTM) machine learning model..' - maybe make clear that this is on the time-series data (also LSTM has already been defined). Response: We thank the reviewer for their careful reading, and we have rewritten the Discussion.

      Materials and methods:

      1. The descriptions in this section are quite vague, so I would suggest expanding this with more detail from the supplementary material, where things are quite well explained. Response: We thank the reviewer for this suggestion, and we have rewritten the whole Methods as suggested.

      'sampling distribution' - not clear what this refers to in this context Response: We have clarified this in the revision (Methods -- Simulation of heterogenous NFκB dynamical responses, paragraph 3). The single-cell signaling-pathway parameter values used for bootstrapping sampling to generate model simulations are given in Supplementary dataset 2.

      'RelA-mVenus mouse strain' - it would be good to mention the relevance of the reporter for NFkB signaling Response: We have added the relevance of the reporter for NFkB signaling (Methods, Lines 624-626).

      '...A random forest classifier...' -> a random forest classifier

      Response: We have rewritten the methods.

      Significance

      This study provides mechanistically interpretable insight on the important question of how immune cells perform target recognition in realistic scenarios, and also provides validation of existing mathematical models by extending these beyond their original domain. The paper uses 'signaling codons' as a proxy for information processing, however in this instance it is cross-validated with an LSTM model that is applied directly to the time series data. Nevertheless, the scope of the paper is such that it does not deal with the question of how these signals are transmitted or used in a downstream immune response. To my knowledge, this is the first time that a well established existing mathematical model of signalling response has been extended and applied to heterogeneous ligand mixtures. These results will be of interest to those studying immune cell responses, and to those interested in basic research on mathematical models of signaling and cellular information processing more generally.

      My background is in biophysical models, machine learning, and signaling in cancer. I have a basic understanding of immunology, but no experience in experimental cell biology.

      Response: We thank the reviewer for highlighting the novelty of our study. We appreciate the reviewer’s recognition that our work advances the understanding of cellular information processing in the context of ligand mixtures, particularly as the first to extend computational models to investigate signaling fidelity under mixed-ligand conditions.

      We agree that this work will interest computational biologists focused on signaling network modeling and information processing. In addition, we believe it will also be valuable for all signaling biologists, as we provide fundamental insights. For experimental biologists in particular, our model provides an efficient, quantitative framework for exploring and generating testable hypotheses.

      We would also like to gently emphasize that evaluating specificity within signaling pathways is as essential as studying downstream functional responses. While immune function outcomes are certainly important, they rely on the upstream signaling pathways that first respond to environmental cues. Understanding how these signaling pathways achieve specificity and discriminability is therefore crucial. For example, this is particularly relevant for drug development targeting pathways such as NFκB, where assessing the direct signaling output—NFκB activation dynamics—can provide valuable insight into the effects of pharmacological interventions.

      Reviewer #2

      Evidence, reproducibility and clarity

      Guo et al. developed a heterogeneous, single-cell ODE model of NFκB signaling parameterized on five individual ligands (TNF, Pam, LPS, CpG, pIC) and extended it, via core-module parameter matching, to predict responses to all 31 combinations of up to five ligands. They found that simulated responder fractions and signaling codon features generally agreed with live-cell imaging data. A notable discrepancy emerged for the CpG (TLR9) + pIC (TLR3) pair: experiments exhibited non-integrative antagonism unpredicted by the original model. This issue was resolved by incorporating a Hill-type term for competitive, limited endosomal trafficking of these ligands. Finally, by decomposing NFκB trajectories into six "signaling codons" and applying Wasserstein distances plus random-forest and LSTM classifiers, the authors showed that stimulus-response specificity (SRS) declines with ligand complexity but remains statistically significant even for quintuple mixtures. This is a well written and scientifically sound manuscript about complexities of cellular signaling, especially considering the limitations of in vitro experiments in recapitulating in vivo dynamics.

      Response: We thank the reviewer for carefully reading the manuscript and for this endorsement. We have significantly improved the manuscript thanks to the reviewer’s insightful comments (see below for point-by-point responses).

      Besides addressing the reviewer’s questions, we have further extended our work to investigate how ligand pairs interact across all doses and how those interactions affect stimulus-response specificity. As the reviewer pointed out, experimental studies are limited in recapitulating the multitude of complex physiological contexts. The model is helpful for exploring more complex scenarios beyond the feasibility of in-vitro experimental setups. Using computational simulations, we have further explored 360 conditions generated from 10 ligand pairs, each evaluated at 6 doses spanning non-responsive to saturating levels, with 1000 cells simulated per condition to capture the heterogeneity of the population.
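
      The condition counts quoted in this exchange can be reproduced with a short enumeration. The sketch below assumes that, within each ligand pair, both ligands are varied independently over the 6 dose levels (a 6 × 6 grid per pair), which is one reading that yields 360 conditions; the helper names are hypothetical.

```python
from itertools import combinations, product
from typing import List, Sequence, Tuple

LIGANDS = ["TNF", "Pam", "LPS", "CpG", "pIC"]


def nonempty_mixtures(ligands: Sequence[str]) -> List[Tuple[str, ...]]:
    """All combinations of one or more ligands (single ligands included)."""
    return [
        combo
        for size in range(1, len(ligands) + 1)
        for combo in combinations(ligands, size)
    ]


def pairwise_dose_grid(
    ligands: Sequence[str], n_doses: int
) -> List[Tuple[str, str, int, int]]:
    """Every ligand pair crossed with every pair of dose indices.

    Assumes each ligand of a pair is dosed independently of the other.
    """
    return [
        (first, second, dose_first, dose_second)
        for first, second in combinations(ligands, 2)
        for dose_first, dose_second in product(range(n_doses), repeat=2)
    ]


print(len(nonempty_mixtures(LIGANDS)))      # mixtures of up to five ligands
print(len(list(combinations(LIGANDS, 2))))  # distinct ligand pairs
print(len(pairwise_dose_grid(LIGANDS, 6)))  # pair-by-dose conditions
```

      Under this reading, five ligands give 2^5 − 1 = 31 non-empty mixtures and 10 pairs, and the pairwise dose grid contains 10 × 6 × 6 = 360 conditions.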

      From this extended analysis, we identified the mechanistic bases for observations of both synergy and antagonism. Synergy for certain low-dose ligand combinations can be explained by ultrasensitive IKK activation (Figure 4), while antagonism between LPS and Pam arises from competition for the cofactor CD14 (Figure 5). We show that these phenomena are dependent on the signaling network state and therefore are not observed in all cells of the population. We define the network conditions that must be met for antagonism and synergy to occur. Importantly, we then show that antagonism can contribute to stimulus-response specificity in ligand mixtures (Figure 5).

      Here are a few comments and recommendations:

      1. The modeling approach used in this manuscript, while interesting, might need further validation. Inferring multi-ligand receptor parameters by matching single-ligand cells on core-module similarity may not capture true co-variation in receptor expression or adaptor availability. Single cell measurements of receptor expressions could be done (e.g. via flow cytometry) to ground this assumption in real data. If the authors think this is out of scope for this manuscript, they could fit core-matched single cell models with two receptor modules from scratch to the two-ligand experimental data. Would this fitted model produce similar receptor parameters compared to the presented approach? At least the authors should add a bit more explanation for why their modeling approach is better (or valid) than fitting the models with 2/3/4/5 receptor modules from scratch to the experimental data.

      Response: We thank the reviewer for this comment, which helped us improve the explanation of the methodology, the rationale, and the validation. The methodology is based on the well-established statistical method of nearest-neighbor hot-deck imputation (Andridge and Little, 2010). In this implementation, the core module functions as a stabilizing “anchor” (common variables) to harmonize various receptor-specific parameter distributions. Similar methodologies have been successfully applied to correct batch effects or integrate single-cell RNAseq datasets using anchor cell types (Stuart et al., 2019). Our workflow has been validated on single-ligand stimulus conditions in a previous study (Guo et al., 2025) (see the third paragraph below). Here, we used this method to generate predictions for ligand mixtures and validated them with experimental studies of dual-ligand stimuli; the predictions align well with the experimental data. As the reviewer suggested in point 3, in the revision we also added experimental validation of the binary classifiers that determine whether a specific stimulus is present in the ligand mixture. The question we are interested in in this work is how macrophages process ligand-specific information in the context of ligand mixtures. For this question, the experimental results align with the model predictions, reaching consistent conclusions.

      In the revision, we have explained the rationale for using the nearest-neighbor hot-deck imputation by matching cells with similar core module (Lines 143-150).

      Previous work determined parameter distributions for only the cognate receptor module (and the core module) that provided the best fit for the single ligand experimental data (Figure 1A, Step 1), but parameter information for the other receptor modules is missing. To simulate stimulus responses to more than two ligands, we imputed the other ligand–receptor module parameters using shared core-module parameters as common variables and employing nearest-neighbor hot-deck imputation (35). In this setup, the core module functions as an “anchor” to harmonize two or more receptor-specific parameter distributions. This was achieved by minimizing Euclidean distance between the core module parameters associated with the independently parameterized single-ligand models (Figure 1A, Step 2).
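
      The matching step described above can be sketched as follows. This is an illustrative reading of nearest-neighbor hot-deck imputation, not the authors' implementation: the dictionary layout, the function names, and all numeric values are hypothetical, and only the parameter labels `k54` (TNF receptor synthesis) and `k35` (LPS receptor synthesis) are taken from the Methods text quoted earlier.

```python
import math
from typing import Dict, List, Sequence

Cell = Dict[str, object]  # {"core": [floats], "receptor": {name: value}}


def nearest_donor(recipient_core: Sequence[float],
                  donor_cores: List[Sequence[float]]) -> int:
    """Index of the donor with the smallest Euclidean core-module distance."""
    return min(
        range(len(donor_cores)),
        key=lambda idx: math.dist(recipient_core, donor_cores[idx]),
    )


def impute_receptor_params(recipients: List[Cell],
                           donors: List[Cell]) -> List[Cell]:
    """Give each recipient cell the receptor parameters of its nearest donor.

    The recipient keeps its own core and receptor parameters; the missing
    receptor module is copied from the best-matching donor cell.
    """
    donor_cores = [donor["core"] for donor in donors]
    merged = []
    for cell in recipients:
        donor = donors[nearest_donor(cell["core"], donor_cores)]
        merged.append({
            "core": cell["core"],
            "receptor": {**cell["receptor"], **donor["receptor"]},
        })
    return merged


# Hypothetical cells fitted to TNF-only and LPS-only data, respectively.
tnf_cells = [
    {"core": [1.0, 2.0], "receptor": {"k54": 0.1}},
    {"core": [5.0, 5.0], "receptor": {"k54": 0.2}},
]
lps_cells = [
    {"core": [5.2, 4.9], "receptor": {"k35": 0.7}},
    {"core": [0.9, 2.1], "receptor": {"k35": 0.3}},
]
virtual_cells = impute_receptor_params(tnf_cells, lps_cells)
print(virtual_cells[0]["receptor"])
```

      In practice the core-module parameters would probably need to be put on a comparable scale (for example log-transformed or standardized) before distances are computed; that preprocessing is an assumption and is omitted here.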

      In Guo et al. (2025) (see Supplementary Figure S11), the nearest-neighbor hot-deck imputation approach (core module similarity matching method) was compared with other approaches, including random matching and rescaled-similarity matching. The results show that, after matching, the core module method best preserves the single-ligand stimulus signaling codon distributions. For the reviewer’s convenience, we have also appended the figure in the response to Reviewer 1, Comment 11.

      The advantage of our workflow is that it does not need to be fit to new experimental data and still gives reliable predictions on signaling dynamics. For the reviewer’s interest, we have tried to fit core-matched single cell models with two receptor modules. As fitting parameters require sufficiently large and high-quality datasets, single-ligand stimulation data with more than 1,000 cells can be adequate to estimate 6~7 parameters (Guo et al., 2025) (approx. 1400 cells to 2000 cells per ligand). However, our current experimental dataset for combinatorial-ligand conditions contains only 500~1,000 cells, and we have tested these datasets but results show a poor fit of heterogeneous signaling dynamics. This is due to an insufficient number of cells for estimating 8~10 parameters. We estimate that at least ~1,500 cells would be needed for reliable parameter estimation under dual-ligand stimulation (and more cells may be needed for combinatorial ligand stimuli involving more ligands). This is currently not feasible to obtain for mixed ligands given the large number of combinatorial conditions.

      Overall, in this paper, the nearest-neighbor hot-deck imputation approach is presented as a feasible and acceptable approach that best reflects our current understanding of the signaling network. Importantly, it helps identify potential gaps by highlighting discrepancies between model predictions and experimental observations.

      (a) The refined model posits competitive, saturable endosomal transport for CpG and pIC, but no direct measurements of endosomal uptake rates or compartmental saturation thresholds are provided, leaving the Hill parameters under-constrained. The authors could produce dose-response curves for CpG and pIC individually and in combination across a range of concentrations to fit the Hill parameters for competitive uptake. (b) If this is out of scope for this paper, the authors should at least comment on why the endosome hypothesis is better than others e.g. crosstalks and other parallel pathway activations. Especially given that even the refined model simulations with Hill equations for CpG and pIC do not quite match with the experimental data (Fig 2 B,E).

      Response: (a) The reviewer’s comments helped us to improve our work by employing Michaelis-Menten kinetics for substrate competition reactions, which increases the mathematical rigor of the CpG-pIC competition model. In this updated model, there are no free parameters to tune, as all Vmax and Kd values should be consistent with the single-ligand scenario. The Hill coefficient is the same as in the single-ligand case, equal to 1.
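
      For readers unfamiliar with substrate competition, the sketch below shows the textbook Michaelis-Menten rate for one substrate when a second substrate competes for the same saturable carrier. The exact equations of the revised model are not given in this response, so the function name, the argument names, and the toy values are assumptions; the half-saturation constants here play the role of the Kd values mentioned above, and the Hill coefficient is fixed at 1.

```python
def competitive_uptake_rate(
    substrate: float,
    competitor: float,
    vmax: float,
    k_half: float,
    k_half_competitor: float,
) -> float:
    """Uptake rate of `substrate` in the presence of a competing substrate.

    Textbook form: v = vmax * S / (K_S * (1 + C / K_C) + S).
    With no competitor it reduces to ordinary Michaelis-Menten kinetics.
    """
    if substrate < 0 or competitor < 0:
        raise ValueError("concentrations must be non-negative")
    if vmax <= 0 or k_half <= 0 or k_half_competitor <= 0:
        raise ValueError("vmax and half-saturation constants must be positive")
    effective_k = k_half * (1.0 + competitor / k_half_competitor)
    return vmax * substrate / (effective_k + substrate)


# Hypothetical values in arbitrary units.
alone = competitive_uptake_rate(1.0, 0.0, vmax=2.0, k_half=1.0,
                                k_half_competitor=1.0)
mixed = competitive_uptake_rate(1.0, 1.0, vmax=2.0, k_half=1.0,
                                k_half_competitor=1.0)
print(alone, mixed)
```

      Because the competitor only raises the effective half-saturation constant, the uptake of each ligand in a mixture can never exceed its single-ligand uptake, which is the qualitative behavior needed for antagonism without introducing new free parameters.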

      The comments on examining dose-response curves for CpG and pIC inspired us to extend the dose-response curves to all ligand pair combinations, allowing us to identify synergy in low-dose ligand pairs and antagonism for high-dose LPS-Pam, in addition to CpG-pIC (new Figures 4 and 5).

      (b) Regarding alternative hypotheses for antagonism, such as crosstalk or parallel-pathway activation: any antagonistic effect would have to arise from negative regulation acting within the first 30 min. However, IκBα-mediated feedback only becomes appreciable after ~30 min (Hoffmann et al., 2002), and A20-dependent attenuation requires ≥2 h (Werner et al., 2005). Beyond these delayed feedback mechanisms, NFκB activation depends primarily on phosphorylation and K63-linked ubiquitination, for which no mechanism produces true antagonism; at most, combinatorial inputs saturate the response to the level of the strongest single ligand. We have added this rationale to the Discussion to explain why we favor the endosome saturation hypothesis over other mechanisms (Lines 459-465). While this may not capture every nuance, it represents the simplest model extension capable of reproducing the observed antagonism.

      The authors assess the distinguishability of single-ligand stimuli and combinatorial-ligand stimuli using the simulations from the refined model. While this is informative, the simulated data could propagate deviations from the experimental data to the classifiers. How would the classifiers fare when the experimental data is used to assess the single-stimulus distinguishability? The authors could use the experimental data they already have and confirm their main claim of the paper, that cells retain stimulus-response specificity even with multiple ligand exposure. In short, how would Fig 3E look when trained/validated on available experimental data?

      Response: We thank the reviewer for these valuable comments, which helped us strengthen the rigor of our analysis by incorporating cross-model testing. Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, although, as expected, they show somewhat lower ligand distinguishability than the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also pasted the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”
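
      To make the two metrics concrete, the sketch below computes ROC AUC and balanced accuracy for a toy ligand-present versus ligand-absent problem in plain Python. The labels, scores, and 0.5 threshold are made-up illustrative values, and the helper functions are hypothetical stand-ins rather than the analysis code used for Figure 3 or Figure S8.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen positive sample
    scores higher than a randomly chosen negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def balanced_accuracy(labels, predictions):
    """Mean of the true-positive rate and the true-negative rate, so the
    majority class cannot dominate the score."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 0)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return 0.5 * (tp / n_pos + tn / n_neg)


# Imbalanced toy data: four ligand-present cells, two ligand-absent cells.
labels = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.4, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]
print(roc_auc(labels, scores), balanced_accuracy(labels, predictions))
```

      ROC AUC does not depend on a decision threshold, and balanced accuracy weights both classes equally. This is why the pair is a reasonable choice when some combinatorial conditions are missing from the data.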

      While the approach presented here with multiple simultaneous ligand exposures is a major step towards in vivo-like conditions, the temporal aspect is still missing. That is, temporal phasing, i.e. sequential exposure to multiple ligands as one would expect in vivo rather than all at once. This is probably out of scope for this paper, but the authors could comment on how their work could be taken forward in such a direction and whether the SRS would be better or worse in such conditions.

      Response: We thank the reviewer for this insightful comment. We have added “the temporal aspect of multiple ligand exposures” to the discussion (Lines 503-510), and we have pasted the corresponding paragraph here for the reviewer’s reference (black font is the previous version, and blue font is the newly revised text):

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (58, 59). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (58), to coordinate the pathogen removal and tissue repairing process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies for NFκB (15, 60).

      There is no caption for Figure 3F in the figure legend nor a reference in the main text.

      Response: In the revised manuscript we actually removed Figure 3F.

      Significance

      General assessment: This is a good manuscript in its present form which could get better with revision. More supporting data and validation are needed to back the main claim presented in the manuscript.

      Significance/impact/readership: When revised, this manuscript could be of interest to a broad community involving single-cell biology, cell and immune signaling, and mathematical modeling. In particular, the models presented here could be used as a starting point for more complex and detailed modeling approaches.

      Response: We thank the reviewer for this endorsement. The reviewer’s constructive suggestion helped us significantly improve the clarity and rigor of our main conclusion.

      In summary, we have strengthened the computational framework in several ways. We improved the model’s fit to experimental single-ligand training data and reformulated the antagonistic CpG-pIC model using Michaelis–Menten kinetics, thereby reducing parameter arbitrariness and increasing mechanistic interpretability. These changes led to better agreement between model predictions and experimental observations for combinatorial ligand responses (Updated Figure 2 and Figure S2), which we hope will further increase experimentalists’ confidence in the modeling results. We have also validated one key conclusion (“cells retain stimulus-response specificity even with multiple ligand exposure”) using the experimental dataset, and it aligns with the model predictions.

      In addition, we have further extended our analysis and its scope. Inspired by the reviewer’s advice (and Reviewer 3’s comment 1b) on a dose-combination study for the CpG-pIC pair, we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figures 4-5). This additional comprehensive analysis allowed us to identify the mechanism of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that ultrasensitive IKK activation may lead to synergistic single-cell responses to low-dose ligand combinations. We also found that competition for CD14-mediated uptake between LPS and Pam may lead to antagonistic/non-integrative combinations. Our simulation-based finding of a non-integrative response to combined LPS-Pam stimuli aligns with a previous independent experimental finding for the LPS and Pam combination (Kellogg et al., 2017), which validates our model prediction.
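
      The link between ultrasensitivity and low-dose synergy can be illustrated with a generic Hill function. The sketch below assumes that two ligand inputs simply add before one ultrasensitive step; the function names, the half-maximal constant `k`, and the exponent `n` are hypothetical choices for illustration and do not reproduce the IKK module of the model.

```python
def hill(x, k=1.0, n=2.0):
    """Generic Hill response; n > 1 gives an ultrasensitive (sigmoidal) curve."""
    return x**n / (k**n + x**n)


def combination_excess(dose_a, dose_b, k=1.0, n=2.0):
    """Response to the summed input minus the sum of the two single-input
    responses. Positive values indicate synergy, negative values indicate
    a sub-additive (saturating) combination."""
    combined = hill(dose_a + dose_b, k, n)
    separate = hill(dose_a, k, n) + hill(dose_b, k, n)
    return combined - separate


# Two low doses on the convex part of the curve combine synergistically.
print(combination_excess(0.2, 0.2) > 0)
# Two high doses saturate, so the combination is sub-additive.
print(combination_excess(5.0, 5.0) < 0)
# Without ultrasensitivity (n = 1) the low-dose synergy disappears.
print(combination_excess(0.2, 0.2, n=1.0) < 0)
```

      Under these assumptions, synergy is confined to doses below the inflection of the sigmoidal curve, which is consistent with the low-dose regime described above.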

      We further analyzed stimulus-response specificity under conditions predicted to exhibit synergy or antagonism. Our results indicate that antagonistic combinations of ligands can increase stimulus-response specificity in the context of ligand mixtures.

      Reviewer #3

      Evidence, reproducibility and clarity

      The authors investigate experimentally single macrophages' NF-kB responses to five ligands, separately and to 3 pairs of ligands. Using the single ligand stimulations, they train an existing mathematical model to replicate single-cell NF-kB nuclear trajectories. From what I understand, for each single cell trajectory in response to a given ligand, the best fit parameters of the core module and the receptor module (specific for the given ligand) are found.

      Then (again, from what I understand), single ligand models are used to generate responses to combinations of ligands. The parametrizations of single ligand models (to be combined) are chosen to have the most similar core modules. It is not described how the responses to more than one ligand are calculated - I expect that respective receptor modules work in parallel, providing signals to the core module. After observing that the response to CpG+pIC is lower (in terms of duration and total) than for CpG alone, the model is modified to account for competition for endosomal transport required by both ligands.

      Having the trained model, simulations of responses to all 31 combinations of ligands are performed, and each NF-κB trajectory is described by six signaling codons-Speed, Peak, Duration, Total, Early vs. Late, and Oscillations. Next, these codons are used to reconstruct (using a random forest model) the stimuli (which may be the combination of ligands). The single and even the two ligand stimuli are relatively well recognized, which is interpreted as the ability of macrophages to distinguish ligands even if present in combination.

      We thank the reviewer for careful reading of the manuscript.

      Major comments

      1) The demonstrated ability to recognize stimuli is based on several key assumptions that can hardly be met in reality.

      Response: We thank the reviewer for this comment, which prompted us to carefully reflect on the rigor of our work, inspired us to extend our analysis to a broad range of ligand-dose combinations, and helped us clarify the limitations of our approach. Please see our detailed responses below.

      a) The cell knows the stimulation time, and then it can use speed as a codon. Look at Fig. S4A: the trajectories in response to pIC are similar to those in response to TNF, but just delayed.

      Response: We thank the reviewer for this comment. We updated the model parameterization to better fit the single-ligand pIC condition (Lines 557-559). In the updated model, the simulated responses to TNF and pIC are quite different (Fig. S2A-B, Fig. S5A-B). Specifically, the Peak, Duration, EarlyVsLate, and Total signaling codons have different values. In addition, the literature suggests that timing differences in NFκB activation are sufficient to elicit differences in downstream gene expression responses, especially for the early response genes (ERG) and intermediate response genes (ING) (Figure 1 in Ando et al., 2021). For the reviewer’s convenience, we have also appended the figures. Specifically, within the first 60 minutes, the ctrl condition exhibits a higher Speed of NFκB activation, and the NFκB-regulated ERG and ING show differences in the first 60 minutes (Fig. 1a,b below). Ando et al. then identified the gene regulatory mechanism that is able to distinguish between differences in the Speed codon. Importantly, this mechanism does not require knowledge of t=0, i.e., when the timer was started.

      The signaling codon Speed, which is based on derivatives, is one way to quantify such timing differences in activation. It was selected from a library of more than 900 different dynamic features using an information maximizing algorithm (Adelaja et al., 2021). It is possible that other ways of measuring time, e.g. time to half-max, might not be distinguished that well by these regulatory mechanisms.

      b) The increase of stimulus concentration typically increases Peak, Duration, and Total, so a similar effect can be achieved by changing the ligand or concentration.

      Response: This (“the increase of stimulus concentration typically increases Peak, Duration, and Total”) is not an assumption. What the reviewer described (“a similar effect can be achieved by changing the ligand or concentration”) may or may not occur. The six informative signaling codons can vary under different ligands or doses. For example, with increasing doses of Pam, the NFκB response shows a higher peak, potentially making it appear more like LPS stimulation. However, as the Pam dose increases, the response duration decreases, which distinguishes it from LPS stimulation (see the experimental data shown in Figure 4A, second row, and Figure 3A, second row, in Luecke et al. (2024); we have also pasted the corresponding figure below for the reviewer’s convenience).

      Figure 4A and Figure 3A from Luecke et al., (2024). Figure 4A: NFκB activity dynamics in the single cells in response to 0, 0.01, 0.1, 1, 10, and 100 ng/ml P3C4 stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the p38 or NFκB signaling trajectory of one cell. Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted. Total # of cells: 898, 834, 827, 787, 778, and 923. Figure 3A: NFκB activity dynamics in the single cells in response to 100 ng/ml LPS stimulation. Eight hours were measured by fluorescence microscopy of reporter hMPDMs. Each row of the heatmap represents the NFκB signaling trajectory of one cell (with p38 measured shown in the original paper). Trajectories are sorted by the maximum amplitude of p38 activity. Data from two pooled biological replicates are depicted.

      Inspired by the reviewer’s comment (and also Reviewer 2’s comments), in the revision we expanded our research to dose-response relationships for all dual-ligand combinations (Lines 302-406, Figures 4-5). This additional comprehensive analysis allowed us to identify the mechanism of synergistic and antagonistic effects in single-cell responses and to pinpoint the corresponding dose ranges among different ligand pairs.

      Interestingly, we found that IKK ultrasensitive activation may lead to synergistic responses to low-dose ligand combinations but only in a subset of single cells. We also found that CD14 uptake competition between LPS and Pam may lead to antagonistic/non-integrative combination. Our simulation-based finding of non-integrative combination of LPS-Pam stimuli aligns with previous independent experimental findings of non-integrative response for LPS and Pam combination (Kellogg et al., 2017).

      c) Distinguishing a given ligand in the presence of some others, even stronger ones, is based on the assumption that these ligands were given at the same time, which is hardly justified.

      Response: We agree with the reviewer that ligands could be given at different times. Considering time delays between ligands (their onset and also removal) dramatically adds to the combinatorial complexity. Some initial studies by the Tay lab are beginning to explore scenarios of time-shifted ligand pairs (Wang et al., 2025). Here we focus on a systematic exploration of all ligand combinations at 6 different doses. The fact that we do not consider time delays is not an assumption but admittedly a limitation that may well be addressed in future studies. We have included a brief discussion of this issue in the Discussion (Lines 503-514). We have appended it here for the reviewer’s convenience.

      Cells may be expected to interpret not only the combination of signals but also their timing and duration to mount appropriate transcriptional responses (Kumar et al., 2004; Son et al., 2023). For example, acute inflammation integrates pathogen-derived cues with pro- and anti-inflammatory signals over a timeframe of hours to days (Kumar et al., 2004), to coordinate the pathogen removal and tissue repairing process. Investigating sequential stimulus combinations in our model is therefore crucial for understanding how cells process complex physiological inputs. Simulations that account for longer timescales may require additional feedback mechanisms, as described in some of our previous studies for NFκB (Werner et al., 2008, 2005).

      We would like to suggest that despite (or maybe because of) limiting our study to coincident stimuli, we made some noteworthy discoveries.

      2) For single ligands, it would be nice to see how the random forest classifier works on experimental data, not only on in silico data (even if generated by a fitted model).

      Response: This comment and Reviewer 2 comment 3 have helped us strengthen the rigor of our analysis by incorporating cross-model testing. We pasted the response below.

      Specifically, we refined our analysis of ligand presence/absence classification by including ROC AUC and balanced accuracy metrics. This adjustment accounts for the fact that the experimental data did not cover all combinatorial conditions, thereby mitigating potential biases from data imbalance and threshold choice. The experimental results are qualitatively consistent with the simulations, though—as expected—they show somewhat lower ligand distinguishability compared to the noise-free simulated dataset. We have updated Figures 3E–F (previously Figure 3E), added Figure S8, and revised the manuscript accordingly (Lines 292–301). For the reviewer’s convenience, we have also included the revised manuscript text below.

      “Classifiers trained to distinguish TNF-present from TNF-absent conditions achieved a Receiver Operating Characteristic-Area Under the Curve (ROC AUC) of 0.96, significantly above the 0.5 baseline (Figure 3D, Figure S8A). Extending this analysis to other ligands, cells detected LPS (0.85), Pam (0.84), pIC (0.73), and CpG (0.63) in mixtures (Figure 3D, S8A). Using experimental data from double- and triple-ligand stimuli (Figure 1D), ROC AUC values were TNF 0.74, LPS 0.74, Pam 0.66, pIC 0.75, and CpG 0.66 (Figure 3E, S8B). Classifier accuracies yielded consistent results (Figure S8C-D). These results indicated a remarkable capability of preserving ligand-specific dynamic features within complex NFκB signal trajectories that enable nuclear detection of extracellular ligands even in complex stimulus mixtures.”

      3) My understanding of ligand discrimination is such that it is rather based on a combination of pathways triggered than solely on a single transcription factor response trajectory, which varies with ligand concentration and ligand concentration time profile (no reason to assume it is OFF-ON-OFF). For example, some of the considered ligands (pIC and CpG) activate IRF3/IRF7 in addition to NF-kB, which leads to IFN production and activation of STATs. This should at least be discussed.

      Response: We thank the reviewer for this comment and fully agree. In the previous version, we discussed how different signaling pathways combinatorially distinguish stimuli. In the revision, we have extended this discussion to include the example of pIC and CpG activation, as suggested (Lines 515-522). We have pasted the corresponding text below.

      “Furthermore, innate immune responses do not solely rely on NFκB but also involve the critical functions of AP1, p38, and the IRF3-ISGF3 axis. The additional pathways are likely activated in a coordinated manner and provide additional information (Luecke et al., 2021). This is exemplified by the studies demonstrating synergistic effects between CpG and pIC in inhibiting tumor growth and promoting cytokine production (Huang et al., 2020), such as IFNβ and TNFα, whose expression is also regulated by the IRF and MAPK signaling pathways (Luecke et al., 2021; Sheu et al., 2023). Therefore, including the parallel pathways of AP1 and MAPK, as well as the type I interferon network (Cheng et al., 2015; Davies et al., 2020; Hanson and Batchelor, 2022; Luecke et al., 2024; Paek et al., 2016; Peterson et al., 2022), is a next step for expanding the mathematical models presented here.”

      Technical comments

      1) Reference 25: X. Guo, A. Adelaja, A. Singh, W. Roy, A. Hoffmann, Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications (2025) does not lead to any paper with the same or a similar title and author list. This Ref is given as a reference to the model. Fortunately, Ref 8 is helpful. Nevertheless, authors should include a schematic of the model.

      Response: We apologize that the paper was not accessible in time. It is now available. We have also added a schematic of the model as suggested (Figure S1) and have added a detailed description of the model and simulations in the Introduction (Lines 95-106), Results (Lines 129-141), and Methods (Simulation of heterogenous NFκB dynamical responses).

      2) Also Mendeley Data DOI:10.17632/bv957x6frk.1 and GitHub https://github.com/Xiaolu-Guo/Combinatorial_ligand_NFkB lead to nowhere.

      Response: We thank the reviewer for this comment, and we have made the GitHub code public. Mendeley Data DOI:10.17632/bv957x6frk.1 can be accessed via the shared link https://data.mendeley.com/preview/bv957x6frk?a=6d56e079-d7b0-482e-951f-8a8e06ee8797 and will be made public once the paper is accepted.

      3) Dataset 1 is not described. Possibly it contains sets of parameters of receptor modules (different numbers of sets for each module, why?), but the names of parameters never appear in the text, which makes it impossible to reproduce the data.

      Response: We thank the reviewer for this comment, and we have added the description of the dataset (S3 SupplementaryDataset2_NFkB_network_single_cell_parameter_distribution.xlsx) and added the parameter names in the methods (Simulation of heterogenous NFκB dynamical responses).


      4) It is difficult to understand how the simulations in response to more than one ligand are performed.

      Response: We thank the reviewer for this comment, and we have improved the explanation of the methods (Results, Lines 145-152) and included a detailed description of the model and simulations for combinatorial ligands (Methods, Predicting heterogeneous single-cell responses to combinatorial-ligand stimulation).

      Significance

      A lot of work has been done, the methodology is interesting, but the biological conclusions are overstated.

      Response: We thank the reviewer for their interest in the methodology. We have revised the title and the abstract, and added discussion of our findings to document more accurately what we have found. In the revision, we have increased the clarity and rigor of the work. For the key conclusion that macrophages maintain some level of NFκB signaling fidelity in response to ligand mixtures, we have validated the binary classifier results on experimental data, as the reviewer suggested.

      In the revision, we have also extended our methodology to explore the dose-response curves for different dose combinations of ligand pairs. This further work allowed us to identify the synergistic and antagonistic regimes. By comparing the stimulus-response specificity of the antagonistic model with that of the non-antagonistic model, we demonstrated that signaling antagonism may increase the distinguishability of the presence or absence of specific ligands within complex ligand mixtures. This provides a mechanism for how signaling fidelity is maintained to the surprising degree we reported.

      REFERENCES

      Adelaja, A., Taylor, B., Sheu, K.M., Liu, Y., Luecke, S., Hoffmann, A., 2021. Six distinct NFκB signaling codons convey discrete information to distinguish stimuli and enable appropriate macrophage responses. Immunity 54, 916-930.e7. https://doi.org/10.1016/j.immuni.2021.04.011

      Akira, S., Takeda, K., 2004. Toll-like receptor signalling. Nat Rev Immunol 4, 499–511. https://doi.org/10.1038/nri1391

      Andridge, R.R., Little, R.J.A., 2010. A Review of Hot Deck Imputation for Survey Non-response. Int Stat Rev 78, 40–64. https://doi.org/10.1111/j.1751-5823.2010.00103.x

      Cheng, Z., Taylor, B., Ourthiague, D.R., Hoffmann, A., 2015. Distinct single-cell signaling characteristics are conferred by the MyD88 and TRIF pathways during TLR4 activation. Sci Signal 8, ra69. https://doi.org/10.1126/scisignal.aaa5208

      Davies, A.E., Pargett, M., Siebert, S., Gillies, T.E., Choi, Y., Tobin, S.J., Ram, A.R., Murthy, V., Juliano, C., Quon, G., Bissell, M.J., Albeck, J.G., 2020. Systems-Level Properties of EGFR-RAS-ERK Signaling Amplify Local Signals to Generate Dynamic Gene Expression Heterogeneity. Cell Systems 11, 161-175.e5. https://doi.org/10.1016/j.cels.2020.07.004

      Guo, X., Adelaja, A., Singh, A., Roy, W., Hoffmann, A., 2025a. Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications.

      Guo, X., Adelaja, A., Singh, A., Wollman, R., Hoffmann, A., 2025b. Modeling heterogeneous signaling dynamics of macrophages reveals principles of information transmission in stimulus responses. Nat Commun 16, 5986. https://doi.org/10.1038/s41467-025-60901-3

      Hanson, R.L., Batchelor, E., 2022. Coordination of MAPK and p53 dynamics in the cellular responses to DNA damage and oxidative stress. Molecular Systems Biology 18, e11401. https://doi.org/10.15252/msb.202211401

      Huang, Y., Zhang, Q., Lubas, M., Yuan, Y., Yalcin, F., Efe, I.E., Xia, P., Motta, E., Buonfiglioli, A., Lehnardt, S., Dzaye, O., Flueh, C., Synowitz, M., Hu, F., Kettenmann, H., 2020. Synergistic Toll-like Receptor 3/9 Signaling Affects Properties and Impairs Glioma-Promoting Activity of Microglia. J. Neurosci. 40, 6428–6443. https://doi.org/10.1523/JNEUROSCI.0666-20.2020

      Kellogg, R.A., Tian, C., Etzrodt, M., Tay, S., 2017. Cellular Decision Making by Non-Integrative Processing of TLR Inputs. Cell Rep 19, 125–135. https://doi.org/10.1016/j.celrep.2017.03.027

      Kumar, R., Clermont, G., Vodovotz, Y., Chow, C.C., 2004. The dynamics of acute inflammation. Journal of Theoretical Biology 230, 145–155. https://doi.org/10.1016/j.jtbi.2004.04.044

      Luecke, S., Guo, X., Sheu, K.M., Singh, A., Lowe, S.C., Han, M., Diaz, J., Lopes, F., Wollman, R., Hoffmann, A., 2024. Dynamical and combinatorial coding by MAPK p38 and NFκB in the inflammatory response of macrophages. Molecular Systems Biology 20, 898–932. https://doi.org/10.1038/s44320-024-00047-4

      Luecke, S., Sheu, K.M., Hoffmann, A., 2021. Stimulus-specific responses in innate immunity: Multilayered regulatory circuits. Immunity 54, 1915–1932. https://doi.org/10.1016/j.immuni.2021.08.018

      Paek, A.L., Liu, J.C., Loewer, A., Forrester, W.C., Lahav, G., 2016. Cell-to-Cell Variation in p53 Dynamics Leads to Fractional Killing. Cell 165, 631–642. https://doi.org/10.1016/j.cell.2016.03.025

      Peterson, A.F., Ingram, K., Huang, E.J., Parksong, J., McKenney, C., Bever, G.S., Regot, S., 2022. Systematic analysis of the MAPK signaling network reveals MAP3K-driven control of cell fate. Cell Systems 13, 885-894.e4. https://doi.org/10.1016/j.cels.2022.10.003

      Sheu, K.M., Guru, A.A., Hoffmann, A., 2023. Quantifying stimulus-response specificity to probe the functional state of macrophages. Cell Systems 14, 180-195.e5. https://doi.org/10.1016/j.cels.2022.12.012

      Son, M., Wang, A.G., Keisham, B., Tay, S., 2023. Processing stimulus dynamics by the NF-κB network in single cells. Exp Mol Med 55, 2531–2540. https://doi.org/10.1038/s12276-023-01133-7

      Stuart, T., Butler, A., Hoffman, P., Hafemeister, C., Papalexi, E., Mauck, W.M., Hao, Y., Stoeckius, M., Smibert, P., Satija, R., 2019. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21. https://doi.org/10.1016/j.cell.2019.05.031

      Werner, S.L., Barken, D., Hoffmann, A., 2005. Stimulus Specificity of Gene Expression Programs Determined by Temporal Control of IKK Activity. Science 309, 1857–1861. https://doi.org/10.1126/science.1113319

      Werner, S.L., Kearns, J.D., Zadorozhnaya, V., Lynch, C., O’Dea, E., Boldin, M.P., Ma, A., Baltimore, D., Hoffmann, A., 2008. Encoding NF-kappaB temporal control in response to TNF: distinct roles for the negative regulators IkappaBalpha and A20. Genes Dev 22, 2093–2101. https://doi.org/10.1101/gad.1680708

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The authors investigate experimentally single macrophages' NF-kB responses to five ligands, separately and to 3 pairs of ligands. Using the single ligand stimulations, they train an existing mathematical model to replicate single-cell NF-kB nuclear trajectories. From what I understand, for each single cell trajectory in response to a given ligand, the best fit parameters of the core module and the receptor module (specific for the given ligand) are found. Then (again, from what I understand), single ligand models are used to generate responses to combinations of ligands. The parametrizations of single ligand models (to be combined) are chosen to have the most similar core modules. It is not described how the responses to more than one ligand are calculated - I expect that respective receptor modules work in parallel, providing signals to the core module. After observing that the response to CpG+pIC is lower (in terms of duration and total) than for CpG alone, the model is modified to account for competition for endosomal transport required by both ligands.

      Having the trained model, simulations of responses to all 31 combinations of ligands are performed, and each NF-κB trajectory is described by six signaling codons-Speed, Peak, Duration, Total, Early vs. Late, and Oscillations. Next, these codons are used to reconstruct (using a random forest model) the stimuli (which may be the combination of ligands). The single and even the two ligand stimuli are relatively well recognized, which is interpreted as the ability of macrophages to distinguish ligands even if present in combination.

      Major comments

      1. The demonstrated ability to recognize stimuli is based on several key assumptions that can hardly be met in reality.

      a) The cell knows the stimulation time, and then it can use speed as a codon. Look at Fig. S4A: the trajectories in response to pIC are similar to those in response to TNF, but just delayed.

      b) The increase of stimulus concentration typically increases Peak, Duration, and Total, so a similar effect can be achieved by changing the ligand or concentration.

      c) Distinguishing a given ligand in the presence of some others, even stronger ones, is based on the assumption that these ligands were given at the same time, which is hardly justified.

      2. For single ligands, it would be nice to see how the random forest classifier works on experimental data, not only on in silico data (even if generated by a fitted model).

      3. My understanding of ligand discrimination is such that it is rather based on a combination of pathways triggered than solely on a single transcription factor response trajectory, which varies with ligand concentration and ligand concentration time profile (no reason to assume it is OFF-ON-OFF). For example, some of the considered ligands (pIC and CpG) activate IRF3/IRF7 in addition to NF-kB, which leads to IFN production and activation of STATs. This should at least be discussed.

      Technical comments

      1. Reference 25: X. Guo, A. Adelaja, A. Singh, W. Roy, A. Hoffmann, Modeling single-cell heterogeneity in signaling dynamics of macrophages reveals principles of information transmission. Nature Communications (2025) does not lead to any paper with the same or a similar title and author list. This Ref is given as a reference to the model. Fortunately, Ref 8 is helpful. Nevertheless, authors should include a schematic of the model.
      2. Also Mendeley Data DOI:10.17632/bv957x6frk.1 and GitHub https://github.com/Xiaolu-Guo/Combinatorial_ligand_NFkB lead to nowhere.
      3. Dataset 1 is not described. Possibly it contains sets of parameters of receptor modules (different numbers of sets for each module, why?), but the names of parameters never appear in the text, which makes it impossible to reproduce the data.
      4. It is difficult to understand how the simulations in response to more than one ligand are performed.

      Significance

      A lot of work has been done, the methodology is interesting, but the biological conclusions are overstated.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.



      Referee #2

      Evidence, reproducibility and clarity

      Guo et al. developed a heterogeneous, single-cell ODE model of NFκB signaling parameterized on five individual ligands (TNF, Pam, LPS, CpG, pIC) and extended it, via core-module parameter matching, to predict responses to all 31 combinations of up to five ligands. They found that simulated responder fractions and signaling codon features generally agreed with live-cell imaging data . A notable discrepancy emerged for the CpG (TLR9) + pIC (TLR3) pair: experiments exhibited non-integrative antagonism unpredicted by the original model. This issue was resolved by incorporating a Hill-type term for competitive, limited endosomal trafficking of these ligands. Finally, by decomposing NFκB trajectories into six "signaling codons" and applying Wasserstein distances plus random-forest and LSTM classifiers, the authors showed that stimulus-response specificity (SRS) declines with ligand complexity but remains statistically significant even for quintuple mixtures. This is a well written and scientifically sound manuscript about complexities of cellular signaling, especially considering the limitations of in vitro experiments in recapitulating in vivo dynamics. Here are a few comments and recommendations:

      1. The modeling approach used in this manuscript, while interesting, might need further validation. Inferring multi-ligand receptor parameters by matching single-ligand cells on core-module similarity may not capture true co-variation in receptor expression or adaptor availability. Single-cell measurements of receptor expression could be made (e.g. via flow cytometry) to ground this assumption in real data. If the authors think this is out of scope for this manuscript, they could fit core-matched single-cell models with two receptor modules from scratch to the two-ligand experimental data. Would this fitted model produce receptor parameters similar to those of the presented approach? At the very least, the authors should explain in more detail why their modeling approach is better than (or as valid as) fitting models with 2/3/4/5 receptor modules from scratch to the experimental data.
      2. The refined model posits competitive, saturable endosomal transport for CpG and pIC, but no direct measurements of endosomal uptake rates or compartmental saturation thresholds are provided, leaving the Hill parameters under-constrained. The authors could produce dose-response curves for CpG and pIC individually and in combination across a range of concentrations to fit the Hill parameters for competitive uptake. If this is out of scope for this paper, the authors should at least comment on why the endosome hypothesis is better than alternatives, e.g. crosstalk and other parallel pathway activations, especially given that even the refined model simulations with Hill equations for CpG and pIC do not quite match the experimental data (Fig 2 B,E).
      3. The authors assess the distinguishability of single-ligand and combinatorial-ligand stimuli using simulations from the refined model. While this is informative, the simulated data could propagate deviations from the experimental data to the classifiers. How would the classifiers fare when the experimental data are used to assess single-stimulus distinguishability? The authors could use the experimental data they already have to confirm the main claim of the paper, that cells retain stimulus-response specificity even under exposure to multiple ligands. In short, how would Fig 3E look when trained/validated on the available experimental data?
      4. While the approach presented here, with multiple simultaneous ligand exposures, is a major step towards in vivo-like conditions, the temporal aspect is still missing: temporal phasing, i.e. sequential exposure to multiple ligands, as one would expect in vivo, rather than exposure to all at once. This is probably out of scope for this paper, but the authors could comment on how their work could be taken forward in this direction and whether the SRS would be better or worse under such conditions.
      5. There is no caption for Figure 3F in the figure legend nor a reference in the main text.
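      To make comment 2 concrete, the following is a minimal sketch of one common way to write a Hill-type rate for two ligands competing for a shared, saturable uptake capacity. It is an illustration only, not the authors' model: the function name `competitive_uptake`, the parameters `vmax`, `k`, and `n`, and the shared-denominator form are all assumptions chosen for this example.

```python
def competitive_uptake(ligand, competitor, vmax=1.0, k=1.0, n=1.0):
    """Hill-type uptake rate of `ligand` when `competitor` shares the same
    saturable transport capacity (hypothetical form, for illustration).

    vmax: maximal uptake rate, k: half-saturation constant, n: Hill coefficient.
    """
    own = ligand ** n
    # The competitor enters only the denominator: it occupies capacity
    # without contributing to uptake of `ligand`.
    return vmax * own / (k ** n + own + competitor ** n)


# Same CpG dose, without and with an equal dose of pIC (arbitrary units).
cpg_alone = competitive_uptake(1.0, 0.0)
cpg_with_pic = competitive_uptake(1.0, 1.0)
print(cpg_alone, cpg_with_pic)
```

      Under this form, adding the second ligand lowers the uptake of the first at a fixed dose, and the combined uptake of both ligands stays below `vmax`. Dose-response data for each ligand alone and in combination would be what constrains `k` and `n`.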

      Significance

      General assessment: This is a good manuscript in its present form that could improve with revision. More supporting data and validation are needed to back the main claim presented in the manuscript.

      Significance/impact/readership: When revised, this manuscript could be of interest to a broad community spanning single-cell biology, cell and immune signaling, and mathematical modeling. In particular, the models presented here could be used as a starting point for more complex and detailed modeling approaches.

    1. Classroom Management: Realities and Possible Solutions

      This summary document recaps the key points of the training delivered by Elfa Hakimi and Ian Ducharme for the Centre franco during the Institut d'hiver 2025. It explores the contemporary challenges of classroom management and proposes theoretical and practical frameworks for fostering an optimal learning environment.

      Executive summary

      Classroom management is not limited to discipline; it is a multidimensional challenge that requires rigorous planning of resources, the building of authentic relationships, and explicit pedagogical communication. The highlights of this analysis include:

      Nancy Gaudreau's systemic approach: Using the metaphor of the "five fingers of the hand" to structure management (resources, expectations, relationships, engagement, indiscipline).

      The shift from reaction to proaction: The importance of anticipating behaviors through explicit teaching of routines and in-depth knowledge of students' profiles.

      Relational balance: Adopting an Adult stance, in the sense of transactional analysis, to avoid the "drama triangle" (Persecutor, Rescuer, Victim).

      Engagement through clarity: Using visible learning outcomes (résultats d'apprentissage, RA) and success criteria (critères de réussite, CR) to give meaning to tasks.

      --------------------------------------------------------------------------------

      1. The challenges of contemporary classroom management

      Classroom management is an unavoidable challenge that directly affects how smoothly learning proceeds.

      Disruptive behaviors (chatter, distraction, disobedience, aggression) stem from a variety of factors:

      Intrinsic disorders: Attention disorders or emotional difficulties.

      Extrinsic factors: Interpersonal conflicts or complex family situations.

      Disengagement: Competition from external stimuli (e.g., video games).

      The training stresses that the teacher must act as a facilitator who can "sell" the lesson (« vendre sa salade ») by making tasks attractive and accessible.

      --------------------------------------------------------------------------------

      2. The reference framework: Nancy Gaudreau's five ingredients

      Inspired by Nancy Gaudreau's book, this model uses the fingers of the hand to symbolize the pillars of effective management.

      A. The thumb: Resource management

      This covers material and human organization:

      Time and space: Space is regarded as the "third teacher". It must be versatile (whole-group work, pairs, reading centers).

      Human resources: Using students as "timekeepers", and involving parents, special education teachers (orthopédagogues), and technicians.

      Technology: Integrating coding and digital literacy to increase motivation.

      B. The index finger: Clear expectations

      This pillar concerns the definition of rules and routines:

      Explicit teaching: Take nothing for granted. The teacher models the behavior ("I do"), the class practices it together ("We do"), then the student performs it alone ("You do").

      Visual signage: Using pictograms or color systems (green, yellow, red) to define the noise levels allowed for each activity (free time vs. transition).

      C. The middle finger: Positive social relationships

      The quality of the teacher-student bond is paramount:

      Authenticity: Learning first names quickly, taking an interest in what students care about (e.g., sports), and chatting informally.

      Mutual respect: Using a calm tone, even in conflict, and separating the behavior from the person.

      D. The ring finger: Attention and engagement

      Keeping interest on the object of learning:

      Zone of proximal development: Offering tasks that are neither too simple nor too complex, to avoid discouragement.

      Attention-getting strategies: Using "reset" techniques (switching off the lights, rhythmic hand clapping, nonverbal signals such as a finger on the nose).

      E. The little finger: Managing indiscipline

      Although smaller, this finger is crucial for dealing with unacceptable behaviors:

      Proaction: Anticipating crises by knowing the students' school records (DSO).

      Self-regulation: Teaching empathy and emotion management through communication circles.

      --------------------------------------------------------------------------------

      3. Theoretical frameworks for support

      Reality theory (William Glasser)

      This eight-step process aims to make students take responsibility rather than to punish them:

      1. Build a connection.

      2. Identify the behavior.

      3. Have the student evaluate the behavior ("Is this helping you?").

      4. Make a plan.

      5. Obtain a commitment.

      6. Show trust.

      7. Accept no excuses, and do not punish needlessly.

      8. Persevere.

      Transactional analysis (Eric Berne)

      Classroom interactions are shaped by three ego states:

      The Parent (Normative or Nurturing): Sets expectations or provides support.

      The Adult: A rational, balanced state to be favored for problem solving.

      The Child (Spontaneous, Submissive, or Rebellious): The seat of emotions.

      The drama triangle to avoid:

      The Persecutor: Dominates and punishes ("You are unbearable").

      The Rescuer: Does the work in the student's place, undermining their autonomy.

      The Victim: Feels powerless and avoids responsibility ("I'm useless").

      --------------------------------------------------------------------------------

      4. Practical and methodological approaches

      | Theme | Suggested strategies |
      | --- | --- |
      | Communication | Replace "Do you understand?" with "Can you rephrase that in your own words?". |
      | Literacy | Use of learning centers and Structured Literacy (80% whole group, 15% small group, 5% individual). |
      | Numeracy | High-impact teaching practices, manipulation of concrete materials, robotics, and "collaboréflexives" classes. |
      | Feedback | Favor positive reinforcement ("strokes") and celebrate progress with privileges or certificates of merit. |

      Conclusion

      Effective classroom management rests on the teacher's ability to stay flexible and adapt their style (autocratic, democratic, or permissive) to the situation.

      By making learning visible and structuring the environment predictably, the teacher reduces opportunities for indiscipline and supports the success of all students.

    1. Outside of the historically black tradition, an additional 15% of African-Americans are members of evangelical denominations, such as the Southern Baptist Convention or Assemblies of God, and 4% are members of mainline denominations, such as the Disciples of Christ. Overall, the membership of historically black Protestant denominations is 92% black, while African-Americans make up relatively small portions of the membership of evangelical (6%) and mainline (2%) churches.

      I am a black person who is now an Episcopalian

    1. Briefing: An analysis of common misconceptions about youth activity work (animation jeunesse)

      Summary

      This document summarizes the work of the Institut national de la jeunesse et de l'éducation populaire (INJEP) presented on the publication of the collective volume Idées reçues sur l'animation jeunesse.

      The animation sector in France, although it reaches nearly 4 million young people and mobilizes more than 350,000 workers, suffers from a lack of recognition and from social representations that are often reductive.

      The analysis shows that animation is not a mere "childminding" or recreational leisure service, but a historical and structural pillar of the French educational ecosystem.

      The main issues identified are precarious employment conditions (particularly in out-of-school-hours care, the périscolaire), increasingly complex duties (handling disability, sexist and sexual violence), and the constant tension between "volunteer" (occasional) animation and professional animation.

      Despite the sector's image as "not very serious", social science research stresses that play and group activities are fundamental vehicles of learning that complement school.

      --------------------------------------------------------------------------------

      1. Historical development and structuring of the sector

      Contemporary animation is the product of a long history linking popular education movements to the construction of the republican model.

      Origins and pedagogical continuity: From the end of the 19th century, the first experiments (holiday camps, youth clubs known as patronages) aimed to fill the time left vacant outside school.

      These initiatives were often led by teachers seeking to try out active pedagogies outside the formal setting.

      Professionalization: A shift in terminology and status can be seen over the decades: from "moniteur" (monitor) to "éducateur" (educator), then to "animateur" (activity leader) in the 1960s.

      Public support and the associative network: The sector took shape through a combination of national associative initiatives (CMA, Francas, etc.) and state support in the form of accreditations, subsidies, and the creation of professional corps within the Ministry of Youth and Sports.

      Reorientation toward integration: Between the 1970s and the 1990s, under the effect of the economic crisis, animation was gradually absorbed into youth policy, with an emphasis on the social and professional integration (insertion) of young people.

      --------------------------------------------------------------------------------

      2. Portrait of the professional world: Between commitment and precarity

      The animation sector is characterized by specific profiles and often degraded working conditions.

      Profiles of activity leaders (animateurs and animatrices)

      | Indicator | Key figures |
      | --- | --- |
      | Feminization | Three quarters of the workforce are women (overrepresented in the périscolaire). |
      | Age | 50% are under 34; 25% are under 25. |
      | Main employers | 60% are hired by local authorities. |
      | Level of education | 70% hold a qualification at or below the baccalauréat. |

      Employment conditions

      Instability: Heavy use of short contracts and involuntary part-time work, particularly in the périscolaire, where working time is split up (morning, midday, evening).

      Pay: The average net full-time-equivalent salary is €450 below the average for other sectors (about €1,800 net).

      Turnover: A high turnover rate, with 30% of staff having been in their organization for less than a year.

      --------------------------------------------------------------------------------

      3. Training issues: From the BAFA to professional diplomas

      Training is a major point of tension in the recognition of the profession.

      Predominance of the BAFA: Although it is only a certificate for occasional animation, the BAFA remains the main entry point (50,000 awarded per year, against 3,000 professional diplomas of the BPJEPS type).

      More technical content: The BAFA has become denser. Trainees are now taught to handle complex issues: bullying, discrimination, sexist and sexual violence, and the inclusion of children with disabilities.

      Lower entry age: Lowering the age of entry into training to 16 has not revolutionized the sector, but it requires pedagogical adjustments to support these very young supervisors.

      Neglect of longer diplomas: Employers, municipalities in particular, often prefer the BAFA because it is cheaper and faster than professional university diplomas (BUT) or specialized animation diplomas.

      --------------------------------------------------------------------------------

      4. The impact of animation on young people

      Animation plays a crucial role in the socialization and development of children and adolescents.

      Peer learning: The closeness in age between activity leaders and young people fosters a transmission of knowledge that differs from the school setting, without removing the educational hierarchy.

      Educational value of play: Research refutes the idea that children "no longer know how to play". Play is a space for learning autonomy, negotiation, and public speaking.

      Social inequalities: The most privileged classes make greater use of the range of offerings (culture, sport, leisure), while some segments of the working classes prefer family care at home.

      Overloaded schedules: Children are often exhausted by the stacking of school and out-of-school activities, which limits the time they have for "free play" of their own.

      --------------------------------------------------------------------------------

      5. Contemporary challenges and blind spots in research

      The document highlights several emerging themes that call for closer attention.

      Sexual violence: Group facilities for minors (accueils collectifs de mineurs, ACM) are statistically safer places than the family setting. However, research shows that girls experience a continuum of sexist violence from early childhood to adulthood.

      Disability: This question is identified as a major blind spot in current research. Although it is covered in training, the actual inclusion of young people and activity leaders with disabilities remains poorly documented.

      Oversight and regulation: The sector is subject to a proliferation of standards (safety, food, hygiene) that is transforming professional practice.

      Territorial disparities: The supply of animation varies widely by region and by the local associative fabric (with notable differences between Brittany and the PACA region, for example).

      --------------------------------------------------------------------------------

      Key quotations

      "Although this sector reaches nearly 4 million young people and more than 350,000 activity leaders, it remains largely unknown. It is often associated with leisure and relegated to the margins of school."

      "The BAFA is the entry point for most people... Some train for the BAFA without knowing that they will later move into animation as a career."

      "The issue is less that children are unable to play than that they cannot do so given all the activities demanded of them... at the end of which they are regularly exhausted."

    1. Arguments for Utilitarianism

      Table of Contents

      Introduction: Moral Methodology & Reflective Equilibrium
      Arguments for Utilitarianism
      What Fundamentally Matters
      The Veil of Ignorance
      Ex Ante Pareto
      Expanding the Moral Circle
      The Poverty of the Alternatives
      The Paradox of Deontology
      The Hope Objection
      Skepticism About the Distinction Between Doing and Allowing
      Status Quo Bias
      Evolutionary Debunking Arguments
      Conclusion
      Resources and Further Reading

      Introduction: Moral Methodology & Reflective Equilibrium

      You cannot prove a moral theory. Whatever arguments you come up with, it’s always possible for someone else to reject your premises, if they are willing to accept the costs of doing so. Different theories offer different advantages. This chapter will set out some of the major considerations that plausibly count in favor of utilitarianism. A complete view also needs to consider the costs of utilitarianism (or the advantages of its competitors), which are addressed in Chapter 8: Objections to Utilitarianism. You can then reach an all-things-considered judgment as to which moral theory strikes you as overall best or most plausible.

      To this end, moral philosophers typically use the methodology of reflective equilibrium.[1] This involves balancing two broad kinds of evidence as applied to moral theories:

      1. Intuitions about specific cases (thought experiments).
      2. General theoretical considerations, including the plausibility of the theory’s principles or systematic claims about what matters.

      General principles can be challenged by coming up with putative counterexamples, or cases in which they give an intuitively incorrect verdict. In response to such putative counterexamples, we must weigh the force of the case-based intuition against the inherent plausibility of the principle being challenged.
This could lead you to either revise the principle to accommodate your intuitions about cases or to reconsider your verdict about the specific case, if you judge the general principle to be better supported (especially if you are able to “explain away” the opposing intuition as resting on some implicit mistake or confusion).

      As we will see, the arguments in favor of utilitarianism rest overwhelmingly on general theoretical considerations. Challenges to the view can take either form, but many of the most pressing objections involve thought experiments in which utilitarianism is held to yield counterintuitive verdicts.

      There is no neutral, non-question-begging answer to how one ought to resolve such conflicts.[2] It takes judgment, and different people may be disposed to react in different ways depending on their philosophical temperament. As a general rule, those of a temperament that favors systematic theorizing are more likely to be drawn to utilitarianism (and related views), whereas those who hew close to common sense intuitions are less likely to be swayed by its theoretical virtues. Considering the arguments below may thus do more than just illuminate utilitarianism; it may also help you to discern your own philosophical temperament!

      While our presentation focuses on utilitarianism, it’s worth noting that many of the arguments below could also be taken to support other forms of welfarist consequentialism (just as many of the objections to utilitarianism also apply to these related views). This chapter explores arguments for utilitarianism and closely related views over non-consequentialist approaches to ethics.

      Arguments for Utilitarianism

      What Fundamentally Matters

      Moral theories serve to specify what fundamentally matters, and utilitarianism offers a particularly compelling answer to this question.

      Almost anyone would agree with utilitarianism that suffering is bad, and well-being is good. What could be more obvious?
If anything matters morally, human well-being surely does. And it would be arbitrary to limit moral concern to our own species, so we should instead conclude that well-being generally is what matters. That is, we ought to want the lives of sentient beings to go as well as possible (whether that ultimately comes down to maximizing happiness, desire satisfaction, or other welfare goods).

      Could anything else be more important? Such a suggestion can seem puzzling. Consider: it is (usually) wrong to steal.[3] But that is plausibly because stealing tends to be harmful, reducing people’s well-being.[4] By contrast, most people are open to redistributive taxation, if it allows governments to provide benefits that reliably raise the overall level of well-being in society. So it’s not that individuals just have a natural right to not be interfered with no matter what. When judging institutional arrangements (such as property and tax law), we recognize that what matters is coming up with arrangements that tend to secure overall good results, and that the most important factor in what makes a result good is that it promotes well-being.[5]

      Such reasoning may justify viewing utilitarianism as the default starting point for moral theorizing.[6] If someone wants to claim that there is some other moral consideration that can override overall well-being (trumping the importance of saving lives, reducing suffering, and promoting flourishing), they face the challenge of explaining how that could possibly be so. Many common moral rules (like those that prohibit theft, lying, or breaking promises), while not explicitly utilitarian in content, nonetheless have a clear utilitarian rationale. If they did not generally promote well-being, but instead actively harmed people, it’s hard to see what reason we would have to still want people to follow them.
To follow and enforce harmful moral rules (such as rules prohibiting same-sex relationships) would seem like a kind of “rule worship”, and not truly ethical at all.[7] Since the only moral rules that seem plausible are those that tend to promote well-being, that’s some reason to think that moral rules are, as utilitarianism suggests, purely instrumental to promoting well-being.

      Similar judgments apply to hypothetical cases in which you somehow know for sure that a typically reliable rule is, in this particular instance, counterproductive. In the extreme case, we all recognize that you ought to lie or break a promise if lives are on the line. In practice, of course, the best way to achieve good results over the long run is to respect commonsense moral rules and virtues while seeking opportunities to help others. (It’s important not to mistake the hypothetical verdicts utilitarianism offers in stylized thought experiments with the practical guidance it offers in real life.) The key point is just that utilitarianism offers a seemingly unbeatable answer to the question of what fundamentally matters: protecting and promoting the interests of all sentient beings to make the world as good as it can be.

      The Veil of Ignorance

      Humans are masters of self-deception and motivated reasoning. If something benefits us personally, it’s all too easy to convince ourselves that it must be okay. We are also more easily swayed by the interests of more salient or sympathetic individuals (favoring puppies over pigs, for example). To correct for such biases, it can be helpful to force impartiality by imagining that you are looking down on the world from behind a “veil of ignorance”. This veil reveals the facts about each individual’s circumstances in society (their income, happiness level, preferences, etc.) and the effects that each choice would have on each person, while hiding from you the knowledge of which of these individuals you are.
[8] To more fairly determine what ideally ought to be done, we may ask what everyone would have most personal reason to prefer from behind this veil of ignorance. If you’re equally likely to end up being anyone in the world, it would seem prudent to maximize overall well-being, just as utilitarianism prescribes.[9]

      How much weight should we give to the verdicts that would be chosen, on self-interested grounds, from behind the veil? The veil thought experiment highlights how utilitarianism gives equal weight to everyone’s interests, without bias. That is, utilitarianism is just what we get when we are beneficent to all: extending to everyone the kind of careful concern that prudent people have for their own interests.[10] But it may seem question-begging to those who reject welfarism, and so deny that interests are all that matter. For example, the veil thought experiment clearly doesn’t speak to whether non-sentient life or natural beauty has intrinsic value. It’s restricted to that sub-domain of morality that concerns what we owe to each other, where this includes just those individuals over whom our veil-induced uncertainty about our identity extends: presently existing sentient beings, perhaps.[11] Accordingly, any verdicts reached via the veil of ignorance will still need to be weighed against what we might yet owe to any excluded others (such as future generations, or non-welfarist values).

      Still, in many contexts other factors will not be relevant, and the question of what we morally ought to do will reduce to the question of how we should treat each other. Many of the deepest disagreements between utilitarians and their critics concern precisely this question. And the veil of ignorance seems relevant here.
The fact that some action is what everyone affected would personally prefer from behind the veil of ignorance seems to undermine critics’ claims that any individual has been mistreated by, or has grounds to complain about, that action.

      Ex Ante Pareto

      A Pareto improvement is better for some people, and worse for none. When outcomes are uncertain, we may instead assess the prospect associated with an action: the range of possible outcomes, weighted by their probabilities. A prospect can be assessed as better for you when it offers you greater well-being in expectation, or ex ante.[12] Putting these concepts together, we may formulate the following principle:

      Ex ante Pareto: in a choice between two prospects, one is morally preferable to another if it offers a better prospect for some individuals and a worse prospect for none.

      This bridge between personal value (or well-being) and moral assessment is further developed in economist John Harsanyi’s aggregation theorem.[13] But the underlying idea, that reasonable beneficence requires us to wish well to all, and prefer prospects that are in everyone’s ex ante interests, has also been defended and developed in more intuitive terms by philosophers.[14]

      A powerful objection to most non-utilitarian views is that they sometimes violate ex ante Pareto, such as when choosing policies from behind the veil of ignorance. Many rival views imply, absurdly, that prospect Y could be morally preferable to prospect X, even when Y is worse in expectation for everyone involved.

      Caspar Hare illustrates the point with a Trolley case in which all six possible victims are stuffed inside suitcases: one is atop a footbridge, five are on the tracks below, and a train will hit and kill the five unless you topple the one on the footbridge (in which case the train will instead kill this one and then stop before reaching the others).[15] As the suitcases have recently been shuffled, nobody knows which position they are in.
So, from each victim’s perspective, their prospects are best if you topple the one suitcase off the footbridge, increasing their chances of survival from 1/6 to 5/6. Given that this is in everyone’s ex ante interests, it’s deeply puzzling to think that it would be morally preferable to override this unanimous preference, shared by everyone involved, and instead let five of the six die; yet that is the implication of most non-utilitarian views.[16]

      Expanding the Moral Circle

      When we look back on past moral atrocities (like slavery or denying women equal rights), we recognize that they were often sanctioned by the dominant societal norms at the time. The perpetrators of these atrocities were grievously wrong to exclude their victims from their “circle” of moral concern.[17] That is, they were wrong to be indifferent towards (or even delight in) their victims’ suffering. But such exclusion seemed normal to people at the time. So we should question whether we might likewise be blindly accepting of some practices that future generations will see as evil but that seem “normal” to us.[18] The best protection against making such an error ourselves would be to deliberately expand our moral concern outward, to include all sentient beings (anyone who can suffer), and so recognize that we have strong moral reasons to reduce suffering and promote well-being wherever we can, no matter who it is that is experiencing it.

      While this conclusion is not yet all the way to full-blown utilitarianism, since it’s compatible with, for example, holding that there are side-constraints limiting one’s pursuit of the good, it is likely sufficient to secure agreement with the most important practical implications of utilitarianism (stemming from cosmopolitanism, anti-speciesism, and longtermism).

      The Poverty of the Alternatives

      We’ve seen that there is a strong presumptive case in favor of utilitarianism.
If no competing view can be shown to be superior, then utilitarianism has a strong claim to be the “default” moral theory. In fact, one of the strongest considerations in favor of utilitarianism (and related consequentialist views) is the deficiencies of the alternatives. Deontological (or rule-based) theories, in particular, seem to rest on questionable foundations. [19]

Deontological theories are explicitly non-consequentialist: instead of morally assessing actions by evaluating their consequences, these theories tend to take certain types of action (such as killing an innocent person) to be intrinsically wrong. [20] There are reasons to be dubious of this approach to ethics, however.

The Paradox of Deontology

Deontologists hold that there is a constraint against killing: that it’s wrong to kill an innocent person even if this would save five other innocent people from being killed. This verdict can seem puzzling on its face. [21] After all, given how terrible killing is, should we not want there to be less of it? Rational choice in general tends to be goal-directed, a conception which fits poorly with deontic constraints. [22] A deontologist might claim that their goal is simply to avoid violating moral constraints themselves, which they can best achieve by not killing anyone, even if that results in more individuals being killed. While this explanation can render deontological verdicts coherent, it does so at the cost of making them seem awfully narcissistic, as though the deontologist’s central concern was just to maintain their own moral purity or “clean hands”.

Deontologists might push back against this characterization by instead insisting that moral action need not be goal-directed at all.
[23] Rather than only seeking to promote value (or minimize harm), they claim that moral agents may sometimes be called upon to respect another’s value (by not harming them, even as a means to preventing greater harm to others), which would seem an appropriately outwardly-directed, non-narcissistic motivation.

The challenge remains that such a proposal makes moral norms puzzlingly divergent from other kinds of practical norms. If morality sometimes calls for respecting value rather than promoting it, why is the same not true of prudence? (Given that pain is bad for you, for example, it would not seem prudent to refuse a painful operation now if the refusal commits you to five comparably painful operations in future.) Deontologists may offer various answers to this question, but insofar as we are inclined to think, pre-theoretically, that ethics ought to be continuous with other forms of rational choice, that gives us some reason to prefer consequentialist accounts. [24]

Deontologists also face a tricky question about where to draw the line. Is it at least okay to kill one person to prevent a hundred killings? Or a million? Absolutists never permit killing, no matter the stakes. But such a view seems too extreme for many. Moderate deontologists allow that sufficiently high stakes can justify violations. But how high? Any answer they offer is apt to seem arbitrary and unprincipled. Between the principled options of consequentialism or absolutism, many will find consequentialism to be the more plausible of the two.

The Hope Objection

Impartial observers should want and hope for the best outcome. Non-consequentialists claim, nonetheless, that it’s sometimes wrong to bring about the best outcome. Putting the two claims together yields the striking result that you should sometimes hope that others act wrongly.

Suppose it would be wrong for some stranger, call him Jack, to kill one innocent person to prevent five other (morally comparable) killings.
Non-consequentialists may claim that Jack has a special responsibility to ensure that he does not kill anyone, even if this results in more killings by others. But you are not Jack. From your perspective as an impartial observer, Jack’s killing one innocent person is no more or less intrinsically bad than any of the five other killings that would thereby be prevented. You have most reason to hope that there is only one killing rather than five. So you have reason to hope that Jack acts “wrongly” (killing one to save five). But that seems odd.

More than merely being odd, this might even be taken to undermine the claim that deontic constraints matter, or are genuinely important to abide by. After all, to be important just is to be worth caring about. For example, we should care if others are harmed, which validates the claim that others’ interests are morally important. But if we should not care more about Jack’s abiding by the moral constraint against killing than we should about his saving five lives, that would seem to suggest that the constraint against killing is not in fact more morally important than saving five lives.

Finally, since our moral obligations ought to track what is genuinely morally important, if deontic constraints are not in fact important then we cannot be obligated to abide by them. [25] We cannot be obliged to prioritize deontic constraints over others’ lives, if we ought to care more about others’ lives than about deontic constraints. So deontic constraints must not accurately describe our obligations after all. Jack really ought to do whatever would do the most good overall, and so should we.

Skepticism About the Distinction Between Doing and Allowing

You might wonder: if respect for others requires not harming them (even to help others more), why does it not equally require not allowing them to be harmed?
Deontological moral theories place great weight on distinctions such as those between doing and allowing harm, or killing and letting die, or intended versus merely foreseen harms. But why should these be treated so differently? If a victim ends up equally dead either way, whether they were killed or “merely” allowed to die would not seem to make much difference to them; surely what matters to them is just their death. Consequentialism accordingly denies any fundamental significance to these distinctions. [26]

Indeed, it’s far from clear that there is any robust distinction between “doing” and “allowing”. Sometimes you might “do” something by remaining perfectly still. [27] Also, when a doctor unplugs a terminal patient from life support machines, this is typically thought of as “letting die”; but if a mafioso, worried about an informant’s potentially incriminating testimony, snuck into the hospital and unplugged the informant’s life support, we are more likely to judge it to constitute “killing”. [28] Jonathan Bennett argues at length that there is no satisfactory, fully general distinction between doing and allowing, or at least none that would vindicate the moral significance that deontologists want to attribute to such a distinction. [29] If Bennett is right, then that might force us towards some form of consequentialism (such as utilitarianism) instead.

Status Quo Bias

Opposition to utilitarian trade-offs (that is, benefiting some at a lesser cost to others) arguably amounts to a kind of status quo bias, prioritizing the preservation of privilege over promoting well-being more generally.

Such conservatism might stem from the Just World fallacy: the mistake of assuming that the status quo is just, and that people naturally get what they deserve. Of course, reality offers no such guarantees of justice.
What circumstances one is born into depends on sheer luck, including one’s endowment of physical and cognitive abilities which may pave the way for future success or failure. Thus, even later in life we never manage to fully wrest back control from the whimsies of fortune and, consequently, some people are vastly better off than others despite being no more deserving. In such cases, why should we not be willing to benefit one person at a lesser cost to privileged others? They have no special entitlement to the extra well-being that fortune has granted them. [30] Clearly, it’s good for people to be well-off, and we certainly would not want to harm anyone unnecessarily. [31] However, if we can increase overall well-being by benefiting one person at the lesser cost to another, we should not refrain from doing so merely due to a prejudice in favor of the existing distribution. [32] It’s easy to see why traditional elites would want to promote a “morality” which favors their entrenched interests. It’s less clear why others should go along with such a distorted view of what (and who) matters.

It can similarly be argued that there is no real distinction between imposing harms and withholding benefits. The only difference between the two cases concerns what we understand to be the status quo, which lacks moral significance. Suppose scenario A is better for someone than B. Then to shift from A to B would be a “harm”, while to prevent a shift from B to A would be to “withhold a benefit”. But this is merely a descriptive difference. If we deny that the historically given starting point provides a morally privileged baseline, then we must say that the cost in either case is the same, namely the difference in well-being between A and B. In principle, it should not matter where we start from. [33]

Now suppose that scenario B is vastly better for someone else than A is: perhaps it will save their life, at the cost of the first person’s arm.
Nobody would think it okay to kill a person just to save another’s arm (that is, to shift from B to A). So if we are to avoid status quo bias, we must similarly judge that it would be wrong to oppose the shift from A to B; that is, we should not object to saving someone’s life at the cost of another’s arm. [34] We should not care especially about preserving the privilege of whoever stood to benefit by default; such conservatism is not truly fair or just. Instead, our goal should be to bring about whatever outcome would be best overall, counting everyone equally, just as utilitarianism prescribes.

Evolutionary Debunking Arguments

Against these powerful theoretical objections, the main consideration that deontological theories have going for them is closer conformity with our intuitions about particular cases. But if these intuitions cannot be supported by independently plausible principles, that may undermine their force, or suggest that we should interpret these intuitions as good rules of thumb for practical guidance, rather than as indicating what fundamentally matters.

The force of deontological intuitions may also be undermined if it can be demonstrated that they result from an unreliable process. For example, evolutionary processes may have endowed us with an emotional bias favoring those who look, speak, and behave like ourselves; this, however, offers no justification for discriminating against those unlike ourselves. Evolution is a blind, amoral process whose only “goal” is the propagation of genes, not the promotion of well-being or moral rightness. Our moral intuitions require scrutiny, especially in scenarios very different from our evolutionary environment. If we identify a moral intuition as stemming from our evolutionary ancestry, we may decide not to give much weight to it in our moral reasoning, a practice known as evolutionary debunking.
[35]

Katarzyna de Lazari-Radek and Peter Singer argue that views permitting partiality are especially susceptible to evolutionary debunking, whereas impartial views like utilitarianism are more likely to result from undistorted reasoning. [36] Joshua Greene offers a different psychological debunking argument. He argues that deontological judgments (for instance, in response to trolley cases) tend to stem from unreliable and inconsistent emotional responses, including our favoritism of identifiable over faceless victims and our aversion to harming someone up close rather than from afar. By contrast, utilitarian judgments involve the more deliberate application of widely respected moral principles. [37]

Such debunking arguments raise worries about whether they “prove too much”: after all, the foundational moral judgment that pain is bad would itself seem emotionally-laden and susceptible to evolutionary explanation, since physically vulnerable creatures would have powerful evolutionary reasons to want to avoid pain whether or not it was objectively bad! [38]

However, debunking arguments may be most applicable in cases where we feel that a principled explanation for the truth of the judgment is lacking. We do not tend to feel any such lack regarding the badness of pain; that is surely an intrinsically plausible judgment if anything is. Some intuitions may be over-determined: explicable both by evolutionary causes and by their rational merits. In such a case, we need not take the evolutionary explanation to undermine the judgment, because the judgment also results from a reliable process (namely, rationality). By contrast, deontological principles and partiality are far less self-evidently justified, and so may be considered more vulnerable to debunking.
Once we have an explanation for these psychological intuitions that can explain why we would have them even if they were rationally baseless, we may be more justified in concluding that they are indeed rationally baseless.

As such, debunking objections are unlikely to change the mind of one who is drawn to the target view (or regards it as independently justified and defensible). But they may help to confirm the doubts of those who already felt there were some grounds for scepticism regarding the intrinsic merits of the target view.

Conclusion

Utilitarianism can be supported by several theoretical arguments, the strongest perhaps being its ability to capture what fundamentally matters. Its main competitors, by contrast, seem to rely on dubious distinctions (like “doing” vs. “allowing”) and built-in status quo bias. At least, that is how things are apt to look to one who is broadly sympathetic to a utilitarian approach. Given the flexibility inherent in reflective equilibrium, these arguments are unlikely to sway a committed opponent of the view. For those readers who find a utilitarian approach to ethics deeply unappealing, we hope that this chapter may at least help you to better understand what appeal others might see in the view.

However strong you judge the arguments in favor of utilitarianism to be, your ultimate verdict on the theory will also depend upon how well the view is able to counter the influential objections that critics have raised against it.

The next chapter discusses theories of well-being, or what counts as being good for an individual.

How to Cite This Page

Chappell, R.Y. and Meissner, D. (2023). Arguments for Utilitarianism. In R.Y. Chappell, D. Meissner, and W. MacAskill (eds.), An Introduction to Utilitarianism, <https://www.utilitarianism.net/arguments-for-utilitarianism>, accessed 2/13/2026.
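The ex ante Pareto comparison in Hare's suitcase case, discussed earlier in this chapter, can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that each person's prospect can be summarized by their probability of survival (a stand-in for expected well-being), and the function names `survival_chance` and `ex_ante_pareto_prefers` are hypothetical, not taken from the text.

```python
from fractions import Fraction


def survival_chance(positions_killed: int, total_positions: int) -> Fraction:
    """Chance of surviving for someone equally likely to occupy any position."""
    return 1 - Fraction(positions_killed, total_positions)


def ex_ante_pareto_prefers(prospect_a, prospect_b) -> bool:
    """True if A is at least as good for everyone and strictly better for someone."""
    pairs = list(zip(prospect_a, prospect_b))
    return all(a >= b for a, b in pairs) and any(a > b for a, b in pairs)


N = 6  # six shuffled suitcases, as in Hare's case

# Assumption: survival probability stands in for each person's ex ante prospect.
topple = [survival_chance(1, N)] * N   # one position is fatal
refrain = [survival_chance(5, N)] * N  # five positions are fatal

print(topple[0], refrain[0])                    # 5/6 1/6
print(ex_ante_pareto_prefers(topple, refrain))  # True
```

Under these assumptions, toppling the suitcase is better in prospect for all six people and worse for none, which is exactly the condition the principle names.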
    1. Summary Report: Authority, Truth, and Information Challenges Looking Ahead to 2050

      Executive Summary

      This document summarizes the contributions of Pierre Rosanvallon, David Chavalarias, and Antoine Bayet before the Senate delegation on how the values of authority and truth are evolving in the face of social networks and changes in the media.

      The key points identified are:

      The crisis of authority: Authority cannot be decreed; it is an "invisible institution" that is recognized from below. Rebuilding it requires showcasing the scientific process (trial and error, confrontation of ideas) rather than simply stating distant truths.

      The systemic threat of platforms: Through their engagement-maximizing algorithms, social networks structurally favor toxic content (+49% toxicity measured on X) and enable geopolitical manipulation (Russia, United States) aimed at undermining European democracies.

      The emergence of "Dark Information": Part of the population, often educated and socially integrated, is abandoning traditional media for alternative channels that imitate the codes of professional journalism ("Canada Dry" information) to spread activist or truncated narratives.

      2050 scenarios: The future of information wavers between a miracle of citizen reappropriation, a total collapse of truth, and a lasting fragmentation of reality into sealed bubbles.

      Courses of action: The response lies in algorithmic transparency, media literacy education extended to AI, digital sovereignty, and the adoption of new modes of deliberation and voting.

      --------------------------------------------------------------------------------

      1. The Nature of Authority and Legitimacy

      1.1. Authority as an "Invisible Institution"

      Authority is fundamentally distinct from power. Whereas power has means of coercion (police, rules), authority, like trust and legitimacy, cannot be imposed by decree.

      Bottom-up recognition: Authority "comes from below". It is granted by those who recognize it, not by the one who claims to exercise it.

      The medieval university model: Historically, authority was built not through the word of a single person but through critical confrontation and discussion (quodlibetal procedures).

      1.2. The Crisis of Scientific Authority

      The scientist is now perceived as a distant figure, shut away in a bubble. To restore this authority, it is necessary to:

      Make the process tangible: Show the "tinkering", hesitation, and trial and error inherent in research.

      Favor proximity: As the scientists of the 1930s or François Arago in the 19th century showed, authority is earned by serving the community and remaining accessible.

      Embrace indeterminacy: Democracy must accept responsibility for addressing citizens' doubts and prejudices rather than seeking to "re-educate minds" from the top down.

      --------------------------------------------------------------------------------

      2. Social Networks: Infrastructures of Manipulation

      2.1. A Geopolitical "Pincer" Context

      Europe faces two types of outside influence seeking to alter citizens' perceptions:

      The East (Russia): Use of the KGB doctrine aimed at undermining democracies by targeting the media and disorienting public opinion.

      The West (United States/Big Tech): A strategy of "flooding the zone" with confusing content to discredit traditional sources of authority in favor of authoritarian or supremacist models.

      2.2. Algorithmic Toxicity

      Digital platforms are not neutral channels. They practice a harmful algorithmic "editorialization":

      Engagement maximization: To hold attention, algorithms favor conflict and hostility.

      Distortion of the feed: On X (formerly Twitter), the arrival of Elon Musk raised the share of toxic content in news feeds from 32% to 49%.

      Invisibilization of subscriptions: On average, a user sees only 3% of what their actual social environment produces; the rest is selected by the platform.
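A change in a share such as the one reported for X (from 32% to 49% of feed content) can be expressed either in percentage points or as a relative increase, and the two numbers differ. The short sketch below computes both from the two figures given in the report; the helper name `share_change` is a hypothetical illustration, not something from the hearing.

```python
def share_change(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in %) between two shares."""
    point_change = after_pct - before_pct
    relative_change = point_change / before_pct * 100
    return point_change, relative_change


# Figures as reported above: toxic share of feeds before and after.
points, relative = share_change(32, 49)
print(f"{points:.0f} points, {relative:.1f}% relative")  # 17 points, 53.1% relative
```

Stating which of the two measures is meant avoids ambiguity when such figures are quoted in summaries.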

      2.3. Systemic Risks and Sovereignty

      Astroturfing: Creation of fake crowds (bots, AI) to simulate popular support for a cause (e.g., MacronLeaks in 2017, support for the AfD in Germany).

      Infrastructure dependence: The case of Starlink illustrates the risk that, by 2050, a private actor could cut off a state's internet access to impose its political will.

      "Authoritarian Tech": Steering of democracy through opaque, centralized technological tools.

      --------------------------------------------------------------------------------

      3. The New Faces of Information

      3.1. The Information "Dropouts"

      Contrary to stereotypes, citizens who reject traditional ("mainstream") media are often:

      • Highly integrated socially (executives, doctors, lawyers, elected officials).

      • Educated and digitally active.

      • In search of an "alternative legitimacy".

      3.2. "Dark Information" or "Canada Dry" Information

      This form of information perfectly imitates professional codes, the better to deceive:

      Staging: Studio interviews, prominently displayed experts, journalistic vocabulary.

      Greater virality: During the first lockdown, content from a pro-Didier Raoult Facebook group was shared more than that of six major media outlets combined (BFM, Le Monde, Le Figaro, etc.).

      The "Holdup" effect: Use of credible figures (former ministers, researchers) to validate truncated or manipulated narratives.

      3.3. The Crisis of Context

      Modern information suffers from systematic decontextualization. An image or video removed from its setting becomes a weapon. The fight for truth now runs through the "war of context" and the long time span of the archive.

      --------------------------------------------------------------------------------

      4. Foresight: The Worlds of Information in 2050

      Three contrasting scenarios were developed to anticipate how the system might evolve:

      | Scenario | Description | Key Characteristics |
      | --- | --- | --- |
      | The Miracle | Citizens take back control | Information as a common good, audited algorithms, AI in the service of context, willingness to pay. |
      | The Dark | Collapse of truth | Disappearance of independence, citizens' information fatigue, totally dominant platforms, vulnerable democracy. |
      | The Chiaroscuro | Fragmentation (the most likely) | Coexistence of several regimes of truth; high-quality information for an elite vs. closed information bubbles for everyone else. |

      --------------------------------------------------------------------------------

      5. Possible Solutions and Recommendations

      To counter the destruction of democratic debate, several levers are identified:

      1. Reform of Voting Systems: Move away from single-name (uninominal) voting, which is vulnerable to manipulation between the two rounds, toward systems such as Majority Judgment, which reduces "tactical voting" and hateful division.

      2. Transparency and Regulation: Strictly enforce the Digital Services Act (DSA) to open algorithmic "black boxes", while developing "digital commons" and public information services.

      3. Comprehensive Education: Extend media and information literacy (EMI) to include AI education from middle school onward. The aim is not only to check facts (fact-checking) but also to understand the logistics of information production and the biases of the tools.

      4. Digital Sovereignty: Break free from captive infrastructures (United States/China) to guarantee the rule of law.

      5. Pedagogy of the Craft: Journalists and researchers must "show the seams" of their work, accept saying "I don't know", and spell out their methods in order to regain trust.
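To make the first recommendation more concrete, here is a minimal sketch of how Majority Judgment picks a winner: each voter grades every candidate, and the candidate with the best median grade wins. The grade scale, the candidates, and the ballots below are invented for illustration, and the tie-breaking procedure of the full method is deliberately left out.

```python
# Illustrative grade scale, ordered from worst to best (an assumption).
GRADES = ["Reject", "Poor", "Fair", "Good", "Excellent"]


def median_grade(ballots: list[str]) -> str:
    """Majority grade: the lower median of the grades a candidate received."""
    ranks = sorted(GRADES.index(grade) for grade in ballots)
    return GRADES[ranks[(len(ranks) - 1) // 2]]


def majority_judgment(ballots_by_candidate: dict[str, list[str]]) -> str:
    """Return the candidate with the highest majority grade.

    Simplification: ties are not resolved; the first candidate reaching
    the best grade is returned.
    """
    return max(
        ballots_by_candidate,
        key=lambda c: GRADES.index(median_grade(ballots_by_candidate[c])),
    )


# Hypothetical ballots: A is polarizing, B is broadly acceptable.
ballots = {
    "A": ["Excellent", "Excellent", "Reject", "Reject", "Poor"],
    "B": ["Good", "Good", "Fair", "Good", "Fair"],
}

print(median_grade(ballots["A"]))  # Poor
print(median_grade(ballots["B"]))  # Good
print(majority_judgment(ballots))  # B
```

In this toy example the polarizing candidate loses to the broadly acceptable one, which illustrates why the method is said to dampen tactical voting and divisive campaigning.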

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This study generated 3D cell constructs from endometrial cell mixtures that were seeded in the Matrigel scaffold. The cell assemblies were treated with hormones to induce a "window of implantation" (WOI) state. Although many bioinformatic analyses point in this direction, there are major concerns that must be addressed.

      Strengths:

      The addition of 3 hormones to enhance the WOI state (although not clearly supported in comparison to the secretory state).

      Comments on revisions:

      The authors did their best to revise their study according to the Reviewers' comments. However, the study remains unconvincing, incomplete and at the same time still too dense and not focused enough.

      Reviewer #2 (Public review):

      Zhang et al. have developed an advanced three-dimensional culture system of human endometrial cells, termed a receptive endometrial assembloid, that models the uterine lining during the crucial window of implantation (WOI). During this mid-secretory phase of the menstrual cycle, the endometrium becomes receptive to an embryo, undergoing distinctive changes. In this work, endometrial cells (epithelial glands, stromal cells, and immune cells from patient samples) were grown into spheroid assembloids and treated with a sequence of hormones to mimic the natural cycle. Notably, the authors added pregnancy-related factors (such as hCG and placental lactogen) on top of estrogen and progesterone, pushing the tissue construct into a highly differentiated, receptive state. The resulting WOI assembloid closely resembles a natural receptive endometrium in both structure and function. The cultures form characteristic surface structures like pinopodes and exhibit abundant motile cilia on the epithelial cells, both known hallmarks of the mid-secretory phase. The assembloids also show signs of stromal cell decidualization and an epithelial-mesenchymal transition-like process at the implantation interface, reflecting how real endometrial cells prepare for possible embryo invasion.

      Although the WOI assembloid represents an important step forward, it still has limitations: the supportive stromal and immune cell populations decrease over time in culture, so only early-passage assembloids retain full complexity. Additionally, the differences between the WOI assembloid and a conventional secretory-phase organoid are more quantitative than absolute; both respond to hormones and develop secretory features, but the WOI assembloid achieves a higher degree of differentiation due to the addition of "pregnancy" signals. Overall, while it's a reinforced model (not an exact replica of the natural endometrium), it provides a valuable in vitro system for implantation studies and testing potential interventions, with opportunities to improve its long-term stability and biological fidelity in the future.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This study generated 3D cell constructs (i.e., assembloids) that were treated with hormones to induce a 'window of implantation' (WOI) state. While the authors have made large efforts to address the reviewers' feedback, the study's findings remain unconvincing and incomplete.

      (1) The authors have appropriately revised the terminology from 'organoids' to 'assembloids' in several parts of the manuscript. However, this revision remains incomplete, as the main title, figure legends, and figure titles still contain the incorrect term. A thorough review of the entire manuscript is recommended to ensure consistent and accurate use of terminology.

      Thank you for your meticulous review. We have now conducted a full check and confirmed that terminology is used consistently and accurately throughout the text.

      (1) Previous comments raised concerns about the feasibility of robustly passaging assembloid structures - comprising epithelial, stromal and immune cells - under epithelial growth conditions. The authors responded by stating that they optimized the expansion medium with a stromal cell-promoting factor. Additionally, rather than conducting scRNA-seq on both early and late passages (P6-P10) as suggested, they performed immunofluorescence staining, which confirmed the persistence of stromal cells at passage 6. However, the presence of immune cells was not addressed. Confirmation of their presence is essential for all further claims. Moreover, a more zoomed-out view of the immunostaining would help clarify the overall cellular composition across the entire well and facilitate comparison with corresponding brightfield images.

      Whole-mount immunofluorescence of the sixth-generation assembloids revealed that CD45<sup>+</sup> immune cells surrounded FOXA2<sup>+</sup> glands; a more zoomed-out view is provided.

      Author response image 1.

      Whole-mount immunofluorescence showed that CD45<sup>+</sup> cells (immune cells) were arranged around the FOXA2<sup>+</sup> glandular spheres. Scale bars = 50 μm (left) and 30 μm (right).

      In their response, the authors mention using the first three passages to ensure optimal cell diversity and viability. However, the manuscript states that 'assembloids derived from the first generation are used for experiments' (line 106). This discrepancy must be clarified.

      Thank you for your suggestion. We have revised the relevant content to “The assembloids derived from the first three generation are used for experiments” (Line 90-91).

      (2) The authors have made a commendable effort to bring more focus to the manuscript, which has improved readability.

      We thank you for your insightful suggestions, which have greatly improved the quality of our manuscript.

      (3) The "embryo implantation" part remains very unconvincing. How did the authors define "the blastoids could grow within the endometrial assembloids and interact with them"? What did they mean by "grow"? Did blastoids further differentiate? Normally, blastoids cannot further "grow". "Survival rates of blastoids" is not equal to "growth". It is not clear how the survival rate was quantified. Besides, regarding the "interaction rates", how did the authors define and quantify it? Actually, blastoids are able to attach to Matrigel efficiently (even without any endometrial cells), so the authors cannot simply define the "interaction" as the co-localization of blastoids and assembloids via brightfield images. In addition, for the assembloids as the 3D structures grow in the Matrigel, the epithelial parts are normally apical-in, while the blastoids attach to the apical (lumen) side of the epithelial cells, so physiologically, blastoids should interact with the apical part of the epithelial cells instead of the outside of the assembloids.

      (1) What did they mean by "grow"? Did blastoids further differentiate?

      By "grow" we refer to two observations. First, the volume and morphology of the blastoids undergo continuous dynamic changes. Second, the blastoids differentiate further: at the blastocyst stage only the inner cell mass (ICM) and trophectoderm exist, and the ICM subsequently differentiates into OCT4<sup>+</sup> epiblast and GATA6<sup>+</sup> hypoblast.

      (2) "Survival rates of blastoids" is not equal to "growth". It is not clear how the survival rate was quantified.

      We define a blastoid as surviving when two criteria are met: morphologically, the blastocoel remains non-collapsed and the cell boundaries are distinct (with no obvious cell detachment); molecularly, the markers of epiblast, hypoblast, and trophectoderm are expressed. The survival rate is calculated as the ratio of viable embryoids to the total number of embryoids.
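The survival-rate definition above can be sketched as a simple scoring routine. This is only an illustration of the stated criteria, not the authors' analysis code: the class `BlastoidObservation`, its field names, and the example scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BlastoidObservation:
    """One scored blastoid (field names are illustrative assumptions)."""
    blastocoel_intact: bool          # morphological: blastocoel not collapsed
    boundaries_distinct: bool        # morphological: no obvious cell detachment
    lineage_markers_expressed: bool  # molecular: epiblast, hypoblast, trophectoderm

    def is_viable(self) -> bool:
        # Viable only if the morphological and molecular criteria all hold.
        return (self.blastocoel_intact
                and self.boundaries_distinct
                and self.lineage_markers_expressed)


def survival_rate(observations: list[BlastoidObservation]) -> float:
    """Ratio of viable blastoids to all blastoids scored."""
    if not observations:
        raise ValueError("no blastoids were scored")
    return sum(o.is_viable() for o in observations) / len(observations)


# Hypothetical scores for four blastoids; two meet every criterion.
scored = [
    BlastoidObservation(True, True, True),
    BlastoidObservation(True, True, False),
    BlastoidObservation(False, True, True),
    BlastoidObservation(True, True, True),
]
print(survival_rate(scored))  # 0.5
```

Requiring every criterion to hold means a blastoid that looks intact but lacks lineage marker expression is not counted as surviving.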

      (3) Besides, regarding the "interaction rates", how did the authors define and quantify it? Actually, blastoids are able to attach to Matrigel efficiently (even without any endometrial cells), so the authors cannot simply define the "interaction" as the co-localization of blastoids and assembloids via brightfield images.

      The criteria for determining interaction include not only attachment between the blastoids and assembloids observed via brightfield images, but also their sustained tight adhesion against external mechanical perturbations (e.g., medium replacement, immunostaining procedures).

      (4) In addition, for the assembloids as the 3D structures grow in the Matrigel, the epithelial parts are normally apical-in, while the blastoids attach to the apical (lumen) side of the epithelial cells, so physiologically, blastoids should interact with the apical part of the epithelial cells instead of the outside of the assembloids.

      You are absolutely correct. In vivo, the embryo indeed makes initial contact with the apical side of the epithelial cells. The introduction of the blastoid co-culture model herein is intended to demonstrate that these receptive endometrial assembloids can better support blastoid growth and development.

      (4) Previous comments highlighted the absence of distinct shifts in gene expression profiles between SEC assembloids and WOI assembloids, which contrasts with findings from primary endometrial tissue reported by Wang et al. (2020). While the authors have expanded their analysis using the Mfuzz algorithm and identified changes in mitochondria- and cilia-associated genes, the manuscript still lacks evidence of significant transcriptional changes in key WOI marker genes, as described in Wang et al. This discrepancy must be addressed and discussed in greater depth to clarify the biological relevance of their model.

      The endometrium in vivo involves complex crosstalk among multiple cell types and is tightly regulated by the hypothalamic-pituitary-ovarian (HPO) axis, thus exhibiting distinct shifts in gene expression during the peri-implantation period.

      In our in vitro model, alterations in mitochondria- and cilia-related genes were observed, which to a certain extent demonstrates that these window of implantation (WOI) assembloids possess receptive-phase characteristics and can be employed to investigate WOI-associated scientific questions or conduct in vitro drug screening.

      However, substantial efforts are still required to optimize the current model for fully recapitulating the dynamic changes in endometrial gene expression across different phases in vivo, and this aspect is further addressed in the Limitations section of our discussion (Line 342-353).

      “However, our WOI endometrial assembloids also exhibit some limitations. It is undeniable that the assembloids cannot perfectly replicate the in vivo endometrium, which comprises functional and basal layers with a greater abundance of cell subtypes, under superior regulation by hypothalamic-pituitary-ovarian (HPO) axis. Specifically, stromal and immune cells are challenging to stably passage, and their proportion is lower than in the in vivo endometrium. While the in vivo peri-implantation period exhibits intricate gene expression dynamics driven by systemic regulation, our models only partially recapitulate these changes, primarily in mitochondria- and cilia-associated genes. Nevertheless, to some extent, these WOI assembloids possess receptivity characteristics and can be utilized for investigating receptivity-related scientific questions or conducting in vitro drug screening. Further refinements are required to fully simulate the dynamic endometrial gene expression patterns across all menstrual cycle stages. We are looking forward to integrating stem cell induction, 3D printing, and microfluidic systems to modify the culture environment.”

      (5) In the authors' response document, they present data integrating their results with those of Garcia Alonso et al. (2021). However, these integrated analyses are not included in the revised manuscript (which should be, if answering a major concern).

      Thanks for your valuable suggestions. We have now integrated the findings of Garcia Alonso et al. (2021) into the revised manuscript (Line 132) and Figure S2E–F.

      (8) Fig 2D: The authors have clarified that CD45+ staining is used. However, they have not yet adapted the typo in the figure legend of the right picture.

      Thanks for your thorough review. The left panel of Figure 2D is stained with CD45 to label immune cells, while the right panel is stained with CD44. These details have been clearly indicated in both the manuscript and the figure legend.  

      (9) All quantification analyses (as described in the authors' response document) should be clearly described in the Materials & Methods section.  

      Thanks for your valuable suggestions. All quantification analyses have now been added to the Supporting Materials and Methods section (Line 94-104, Line 110-111, Line 241-244).

      (10) The authors have provided clarification regarding their method for quantifying immunofluorescence staining (e.g., OLFM4 expression in Fig. 3C) in their response document. However, these methodological details are not included in the revised manuscript. It is important that such information is incorporated into the manuscript itself to ensure transparency and reproducibility for others.

      Thanks for your valuable suggestions. All quantification analyses have now been added to the Supporting Materials and Methods section (Line 94-104).

      (13) It is needed to include the author's response to the comment about literature showing the opposite of increased number of cilia during the WOI into the discussion part of the paper.

      We appreciate your suggestions. The relevant content has now been added to the Discussion section (Lines 319–323).

      (14) In the authors' response, they explain the difference between pinopodes and microvilli. They should include this explanation briefly in the manuscript. Moreover, Fig. 3F lacks a picture of cilia structure in CTRL condition. In addition, the structures that are indicated as cilia with an orange arrow seem to not be attached to the endometrial cells (anymore). It would be useful to show another more representative picture for the cilia.

      (1) Thank you for your valuable suggestions. The distinction between pinopodes and microvilli has now been added to the Supporting Materials and Methods section (Line 230-236).

      (2) You are probably referring to Figure 2F—we did not observe ciliary structures in the CTRL group.

      (3) The cilia structure was visualized via transmission electron microscopy (TEM), which requires ultrathin sectioning. Thus, the cilia shown in the image correspond to a single cross-section of the captured assembloids. Owing to technical limitations, three-dimensional visualization of cilia on the cells cannot be achieved.

      (17) The results on co-culturing blastoids with the WOI assembloids is not convincing. The blastoids are exposed to the basolateral side of the endometrial epithelial cells, while in vivo, blastocysts interact with the apical side of the endometrial epithelial cells first (apposition and attachment), followed by invasion into the endometrium. This means that the interaction shown here is not physiological. Therefore, it is not justified to say that this platform holds promise to investigate maternal-fetal interactions.

      We agree with your perspective that discrepancies exist between this model and the physiological processes in vivo. However, such differences do not negate the scientific value of the model.

      The core merit of this study lies in the successful establishment of co-culture systems for blastoids and WOI assembloids. Notably, genuine cross-talk occurs between the two components, thereby providing a practical and operational tool for subsequent research.

      Although the current contact orientation differs from that observed in vivo, future optimization of the cell culture protocol (via modulation of cell polarity) will enable the model to better recapitulate physiological conditions. Therefore, the innovation and operability of this model within specific research contexts still render it a robust platform for investigating maternal-fetal interactions.

      Overall, it is highly recommended that the authors carefully review the manuscript for grammatical errors, inconsistencies and issues with scientific phrasing. The language throughout the text requires substantial editing to improve clarity, readability and precision. 

      We appreciate your suggestions. A full manuscript check was performed to rectify grammatical errors, inconsistencies, and inappropriate scientific phrasing, with further language refinement by a native English-speaking specialist.

      Fig 1A: This overview is unclear. How many days do the assembloids grow before being stimulated with hormones? Are CTRL assembloids only kept in culture until day 2 and SEC and WOI assembloids until day 8? This is also not clear from the Materials and Methods section. Should be clarified.

      Thanks for your valuable suggestions. We have now updated the overview (Figure 1A) and Materials and Methods section (Line 370-371, Line 379-381).

      “Hormonal treatment was initiated following the assembly of the endometrial assembloids (about 7-day growth period).”

      “The CTRL group was cultured in ExM without hormone supplementation and subjected to parallel culture for 8 days along with the two aforementioned groups.”

      Fig 1B: From these brightfield images, it appears that the size of the assembloids remains relatively consistent from Day 0 to Day 3 and up to Day 11 (especially in CTRL). However, in Fig S1A, the assembloids on Day 11 appear significantly larger compared to those on Day 2 (or Day 4). Authors should clarify this discrepancy (since both of the figures are shown as "brightfield of endometrial assembloids").

      You are probably referring to the observation that the assembloids at Day 11 in Fig. S1A are smaller in size than those at Day 2 (or Day 4) in Fig. 1B. This discrepancy arises because the time points in Fig. 1B are calculated starting from the initiation of hormone treatment for the SEC and WOI groups, rather than from the beginning of the overall culture as in Fig. S1A. In addition, assembloids exhibit size variability during the same culture period due to individual heterogeneity.

      To eliminate ambiguity, we have now labeled “Hormone Day 0, Day 2, Day 8” in Fig. 1B and revised the corresponding figure legend to read: “Endometrial assembloids from the CTRL, SEC, and WOI groups, which were subjected to hormone treatment on Days 0, 2, and 8, exhibited comparable growth patterns throughout the culture period.”

      Fig 2G: authors still used the description "organoids" here instead of "assembloids".

      We appreciate your careful review. Corrections have been made accordingly.

      Fig. 3C: For the OLFM4 staining quantification, in the Y-axis authors wrote "proportion of OLFM4 (+) cells (OLFM4 (+)/total", but in the rebuttal letter they mention "its fluorescence intensity (quantified as mean grey value) was significantly stronger in both the SEC and WOI groups compared to the CTRL group". This is confounding and should be clarified.

      We apologize for incorrectly writing "fluorescence intensity" in the rebuttal letter; the correct term should be the "proportion of OLFM4 (+) cells (OLFM4 (+)/total)" as shown in Fig. 3C.

      Fig 5D: Acetyl-α-tubulin is the marker of ciliated cells and should be expressed in the cilia instead of the whole cells. It is very strange to quantify as "mean fluorescence intensity (acetyl-α-tubulin/DAPI)" to assess the cilia. Please clarify.

      Thank you for your insightful comment. To clarify, the ratio "mean fluorescence intensity (acetyl-α-tubulin/DAPI)" was calculated within individual acetyl-α-tubulin<sup>+</sup> ciliated cells. Acetyl-α-tubulin fluorescence was normalized to the DAPI signal of the same cell nucleus, not the whole-cell population. This corrected for variations in cell number and staining efficiency to ensure data accuracy.
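
      The per-cell normalization described above can be sketched as follows. This is an illustrative sketch only: the function name, the input layout, and the choice to average the per-cell ratios are assumptions, not the authors' documented procedure.

```python
from statistics import mean


def mean_normalized_intensity(cells):
    """Average the acetyl-alpha-tubulin/DAPI ratio over ciliated cells.

    `cells` is a list of (tubulin_intensity, dapi_intensity) pairs, one pair
    per acetyl-alpha-tubulin-positive cell (hypothetical input layout).
    """
    ratios = []
    for tubulin, dapi in cells:
        if dapi <= 0:
            # Skip cells without a usable nuclear signal to avoid division by zero.
            continue
        ratios.append(tubulin / dapi)
    if not ratios:
        raise ValueError("no cells with a positive DAPI signal")
    return mean(ratios)


# Made-up intensities for two ciliated cells: ratios are 2.0 and 3.0.
example_cells = [(120.0, 60.0), (90.0, 30.0)]
print(mean_normalized_intensity(example_cells))  # 2.5
```

      Dividing within each cell before averaging keeps a brightly stained nucleus in one cell from skewing the ratio of another, which is the stated purpose of the normalization.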

      Fig 5F: it is very bizarre that unciliated epithelium was transformed from ciliated epithelium, and CTRL was transformed from SEC and WOI. Should be clarified and discussed.

      Pseudotime analysis sorts discrete cells along a "pseudotime axis" based on similarities and differences in cellular gene expression, thereby simulating cell state transitions.

      Ciliated epithelium → unciliated epithelium: During the menstrual cycle, ciliated and unciliated epithelia undergo mutual transformation from the secretory phase (or mid-secretory phase) to the menstrual phase, and then to the proliferative phase. Here, we demonstrate the transition of ciliated cells to unciliated cells from the SEC and WOI stages to the CTRL stage.

      Notably, the two cell types coexist, and what is presented here merely reflects a transformation trend. Relevant content has been incorporated into the Discussion section (Line 319-321).

      “Throughout the menstrual cycle, ciliated and unciliated epithelia undergo mutual transformation from the secretory phase (or mid-secretory phase) to the menstrual phase, and then to the proliferative phase.”

      Fig 5H: To show "enhanced invasion ability", authors must provide some quantification and statistical analysis. It is very hard to see the difference between the CTRL and SEC regarding ROR2-Wnt5A.

      We appreciate your suggestion. Quantification and statistical analysis have been added to Figure 5H.

      Fig 6A: please elaborate the "mIVC1" and "mIVC2" in the figure legends.

      Additions have been made to the figure legends accordingly, as follows: "mIVC1: modified In Vitro Culture Medium 1; mIVC2: modified In Vitro Culture Medium 2."

      Fig S1D: Is the PAS staining also done in CTRL assembloids? In addition, it is stated that the assembloids secrete glycogen because of a positive PAS staining, while it could also be neutral mucins, glycoproteins, etc, which are all detected by PAS staining. So, the authors should be more careful in stating that it is glycogen, or a PAS staining with diastase digestion should be done.

      The PAS staining results for the CTRL group are presented in Fig. S1I. In addition, results of PAS staining with diastase digestion are included in Figure S1.

      Line 120: references?

      The reference has been added accordingly.

      Line 178: The term 'Endometrial Receptivity Test (ERT)' is used. Do the authors mean Endometrial Receptivity Analysis (ERA) test? ERA is the commonly used abbreviation for this test. Moreover, the authors describe ERA as 'a kind of gene analysis-based test.' This should be rephrased more scientifically correct.

      Thank you for your valuable suggestion. We have revised the term to ERA, and modified the phrase "a kind of gene analysis-based test" to "gene expression profiling-based diagnostic assay" (Lines 160–163).

      “We performed Endometrial Receptivity Analysis (ERA), a gene expression profiling-based diagnostic assay that integrates high-throughput sequencing and machine learning to quantify the expression of endometrial receptivity-associated genes.”

      Line 83: assemblies → assembloids

      We appreciate your suggestion. The text has been updated to “the endometrial assembloids progressed from epithelial organoids, to assemblies of epithelial and stromal cells and then to stem cell-laden 3D artificial endometrium”.

      The Materials and Methods section currently lacks the needed details. Authors should substantially expand this section to clearly describe all experimental and analytical procedures, including, among others, immunofluorescence staining, quantification methods, bioinformatics analyses and statistical approaches. Providing comprehensive methodological information is essential.

      A detailed description of these methods is provided in the Supporting Materials and Methods section.

      Reviewer #2 (Recommendations for the authors): 

      The revised manuscript is much improved in clarity, focus, and experimental support. The authors have thoughtfully addressed the major concerns from the previous review. In particular, the logic and flow of the paper are clearer, it now guides the reader through the rationale (constructing a WOI model), the comparative analysis against in vivo tissue and simpler organoids, and the key features that distinguish the WOI assembloid. The added functional validation (especially the blastoid co-culture experiment) significantly strengthens the work by showing a tangible outcome of "receptivity" beyond molecular profiling. The distinction between the standard secretory-phase organoid and the WOI assembloid is now more convincing, as the authors highlight several specific differences in morphology (more cilia, pinopodes), metabolism, and implantation success that favor the WOI model. The manuscript also reads cleaner with the bioinformatic sections condensed to the most important findings (excess detail was trimmed or moved to supplements) and the rationale for gene/pathway selection explicitly stated.

      The manuscript has been significantly strengthened through the addition of functional assays (like the blastoid co-culture), clearer transcriptomic and proteomic data, and detailed analyses of hormone treatments, cilia biology, and stromal and immune cell behavior in early passages. These updates confirm that the WOI assembloid supports embryo attachment and outperforms standard secretory organoids, while integrating external references and clarifications on terminology. Minor suggestions remain, such as clarifying statistical significance and adding functional interpretations for certain observations, but overall, the manuscript is now more robust and biologically convincing.

      Remaining points for clarification: There are a few minor points that still merit attention:

      - Use of the Endometrial Receptivity Test (ERT): As previously mentioned, if the authors have ERT data for the SEC organoid group, including that information would further support the claim that the WOI assembloid is uniquely receptive. If not, it would be helpful to add a statement clarifying that the ERT was employed specifically as a confirmatory test for the WOI assembloids, rather than as a comparative measure across all groups.

      Thank you for your valuable suggestion. We have now supplemented the description in the Supporting Materials and Methods section (Lines 160–162) as follows: “ERA was employed specifically as a confirmatory test for the WOI assembloids, rather than as a comparative measure across all groups.”

      - Because the assembloids are created from primary tissue samples, it would be helpful to briefly comment on how consistent the findings were across different patient-derived samples. For example, did all biological replicates show similar expression of receptivity markers and comparable capacity to support blastoid attachment? Although this seems implied, including a sentence in the Methods or Results sections that specifies the number of donor lines tested would help readers assess the model's variability and reproducibility.

      We appreciate your advice. The relevant statement has been added to the Supporting Materials and Methods section (Line 312-313).

      “All biological replicates (fourteen individuals) of endometrial assembloids show similar expression of receptivity markers and comparable capacity to support blastoid attachment.”

      - The authors mention promising future directions, such as integrating 3D printing and microfluidics to further enhance the model, which is an excellent forward-looking statement. It would also be valuable to suggest the inclusion of additional cell types, like more robust immune cell populations or endothelial components, as future improvements to create an even more comprehensive model of the endometrial lining.

      Thank you for your valuable suggestion. 3D printing and microfluidics serve as approaches for introducing multiple cell types. We have supplemented the following statement in the manuscript: “We are looking forward to integrating stem cell induction, 3D printing, and microfluidic systems to modify the culture environment.” (Line 352-353).

      We are grateful for your valuable feedback and constructive criticism, which have helped us improve the quality of our work in terms of content and presentation. We have diligently revised the manuscript and made the necessary changes. Here, we have attached the revised manuscript, figures, and all supplementary materials for your re-evaluation. Thank you again for your continued support; we look forward to your favorable decision.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This paper presents maRQup, a Python pipeline for automating the quantitative analysis of preclinical cancer immunotherapy experiments using bioluminescent imaging in mice. maRQup processes images to quantify tumor burden over time and across anatomical regions, enabling large-scale analysis of over 1,000 mice. The study uses this tool to compare different CAR-T cell constructs and doses, identifying differences in initial tumor control and relapse rates, particularly noting that CD19.CD28 CAR-T cells show faster initial killing but higher relapse compared to CD19.4-1BB CAR-T cells. Furthermore, maRQup facilitates the spatiotemporal analysis of tumor dynamics, revealing differences in growth patterns based on anatomical location, such as the snout exhibiting more resistance to treatment than bone marrow.

      Strengths:

      (1) The maRQup pipeline enables the automatic processing of a large dataset of over 1,000 mice, providing investigators with a rapid and efficient method for analyzing extensive bioluminescent tumor image data.

      (2) Through image processing steps like tail removal and vertical scaling, maRQup normalizes mouse dimensions to facilitate the alignment of anatomical regions across images. This process enables the reliable demarcation of nine distinct anatomical regions within each mouse image, serving as a basis for spatiotemporal analysis of tumor burden within these consistent regions by quantifying average radiance per pixel.

      Weaknesses:

      (1) While the pipeline aims to standardize images for regional assessment, the reliance on scaling primarily along the vertical axis after tail removal may introduce limitations to the quantitative robustness of the anatomically defined regions. This approach does not account for potential non-linear growth across dimensions in animals of different ages or sizes, which could result in relative stretching or shrinking of subjects compared to an average reference.

      Our answer to this comment is included in the Supplemental Methods. The standard deviation of the mouse pixels was calculated to ensure that the image processing steps did not alter the shape or size of the mice. Such consistency is particularly striking because our dataset was accrued by nine lab members over the last five years, before we conceived and carried out our analysis (cf. answer to point #2). In fact, it is the very consistency of this IVIS measurement that led us to conceive our pipeline. As seen from Supplemental Figure 4G, there is minimal difference in the shape or size of the mice across 7,534 images. A total of 99 images were removed either due to being too slanted (91/7,633, 1.2%) or due to processing errors (8/7,633, 0.1%). Also, the vertical scaling was conducted while keeping the aspect ratio unchanged to prevent any non-anatomical scaling. Hence, we did not record any nonlinear growth of the mice that would warrant more convoluted alignment and/or batch correction for our images.
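
      The image counts and the aspect-ratio-preserving scaling mentioned above can be checked with a short sketch. It is illustrative only: `scale_to_height` is a hypothetical helper and not part of maRQup, and the starting total is taken as the retained images plus the removed images (7,534 + 99).

```python
def scale_to_height(width: int, height: int, target_height: int) -> tuple[int, int]:
    """Rescale to a target height while keeping width/height unchanged.

    Hypothetical helper: one scale factor is applied to both axes, so the
    mouse is never stretched along a single dimension.
    """
    if height <= 0 or target_height <= 0:
        raise ValueError("heights must be positive")
    factor = target_height / height
    return round(width * factor), target_height


# Image accounting from the response: retained + removed = total processed.
retained = 7534
slanted = 91
errors = 8
total = retained + slanted + errors

print(total)                     # 7633
print(f"{slanted / total:.1%}")  # 1.2%
print(f"{errors / total:.1%}")   # 0.1%

# Made-up crop of 200 x 400 pixels rescaled to a reference height of 300.
print(scale_to_height(200, 400, 300))  # (150, 300)
```

      Because both axes share one factor, the width-to-height ratio of the example crop (0.5) is the same before and after scaling.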

      (2) Furthermore, despite excluding severely slanted images, the pipeline does not fully normalize for variations in animal pose during image acquisition (e.g., tucked body, leaning). This pose variability not only impacts the precise relative positioning of internal anatomical regions, potentially making their definition based on relative image coordinates more qualitative than truly quantitative for precise regional analysis, but it also means that the bioluminescent light signal from the tumor will not propagate equally to the camera, as photons will travel differentially through the tissue. This differing light path through tissues due to variable positioning can introduce large variability in the measured radiance that was not accounted for in the analysis algorithm. Achieving more robust anatomical and quantitative normalization might require methods that control animal posture using a rigid structure during imaging.

      Reviewer #1 is correct that different mouse postures would be an issue when aligning the images and normalizing for size. However, all experiments are conducted for luminescence measurements in the IVIS system (i.e., this requires anesthesia and long integration time for imaging). In our experience and in our 1000+ mouse dataset, we noticed that all experiments (n=37) did place the anesthetized mice in a stretched/elongated position. Of note, these experiments were conducted by nine different researchers who were not instructed on how to place the mice on the machine for ideal image processing, thus showing that the standard protocol of imaging mice on IVIS does not introduce large variations in animal pose during image acquisition. We think the issue raised by Reviewer #1 is moot in the context of classical settings for mouse luminescence imaging.

      Reviewer #2 (Public review):

      Summary:

      The authors developed a method that automatically processes bioluminescent tumor images for quantitative analysis and used it to describe the spatiotemporal distribution of tumor cells in response to CD19-targeting CAR-T cells, comprising CD28 or 4-1BB costimulatory domains. The conclusion highlights the dependence of tumor decay and relapse on the number of injected cells, the type of cells, and the initial growth rate of tumors (where initial is intended from the first day of therapy). The authors also determined the spatiotemporal analysis of tumor response to CAR T therapy in different regions of the mouse body in a model of acute lymphoblastic leukemia (ALL).

      Strengths:

      The analysis is based on a large number of images and accounts for many variables. The results of the analysis largely support their claims that the kinetics of tumor decay and relapse are dependent on the CAR T co-stimulatory domain and number of cells injected and tumor growth rates. 

      Weaknesses:

      The study does not specify how a) differences in mouse positioning (and whether they excluded not-aligned mice) and b) tumor spread at the start of therapy influenced their data. The study does not take into account the potential heterogeneity of CAR T cells in terms of CAR T expression or T cell immunophenotype (differentiation, exhaustion, fitness...).

      See answer #2 to Reviewer #1.

      Author response image 1.

      Author response image 1 shows the average tumor radiance on day zero (when CAR-T cell therapy was administered) for all mice. While there is some spread, most mice had tumor localized to the liver or bone marrow.

      Reviewer #3 (Public review):

      Summary:

      The paper "The 1000+ mouse project: large-scale spatiotemporal parametrization and modeling of preclinical cancer immunotherapies" is focused on developing a novel methodology for automatic processing of bioluminescence imaging data. It provides quantitative and statistically robust insights into preclinical experiments that will contribute to optimizing cell-based therapies. There is an enormous demand for such methods and approaches that enable the spatiotemporal evaluation of cell monitoring in large cohorts of experimental animals.

      Strengths:

      The manuscript is generally well written, and the experiments are scientifically sound. The conclusions reflect the soundness of experimental data. This approach seems to be quite innovative and promising to improve the statistical accuracy of BLI data quantification. 

      This methodology can be used as a universal quantification tool for BLI data for in vivo assessment of adoptively transferred cells due to the versatility of the technology.

      Weaknesses: 

      No weaknesses were identified by this Reviewer. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      In this paper, the authors propose a significant advancement in optical image data analysis by employing automation. They effectively demonstrate the valuable insights that can be gained from analyzing extensive datasets with a more unbiased methodology. At present, I do not have any specific suggestions for improvement.

      However, it is important to note that this work is limited in its operational scope. Specifically, it relies on predefined ROIs rather than aligning the signal site with anatomical systems. The scaling model and image cropping are simplistic, animal pose is not taken into account, and the data output needs to be called semi-quantitative or qualitative, and would have been stronger utilizing an AI agent. Nevertheless, this work underscores the potential of automated systems in preclinical image analysis, which is a crucial step towards developing more sophisticated approaches to optical image data analysis.

      While our analysis used predefined ROIs, the maRQup pipeline allows users to manually draw ROIs on the mouse image.

      Reviewer #2 (Recommendations for the authors):

      The writing and presentation of data are clear and accurate, but some additional information should be added regarding the imaging protocol used to acquire the original data. 

      The authors mention fluorescence in Figure 1. I expected all the data to be generated from bioluminescent NALM-6 tumors, since bioluminescence is indeed measured in average radiance and can be per pixel (p/sec/cm2/sr/pixel). Fluorescence should be measured using radiance efficiency (p/sec/cm2/sr)/(µW/cm2), a unit that compensates for non-uniform excitation light pattern in the instrument. Would the author find different results if fluorescence data were analyzed separately?

      Reviewer #2 is correct that the unit for fluorescence would be radiance efficiency. The word “fluorescent” was included in the label of Figure 1a to highlight that our workflow could be applied to other types of light-generating methods (i.e., fluorescence vs. bioluminescence). However, in this study, measurements of bioluminescent tumors only were analyzed. If fluorescence measurements are to be analyzed, our methods of image acquisition and processing would be directly applicable.

      Did the author ever check the signal of the snout in mice with no tumor?

      In mice with no tumor, there is no detectable signal in the snout (or anywhere else, for that matter).

      The urine of mice contains phosphor, and might give a background signal, especially if longer exposure is used at the end of the study.

      For the mice with no tumor injection, the luminescence signal was below background (<10<sup>2</sup> p/sec/cm<sup>2</sup>/sr/pixel). In particular, we do not detect any signal in the bladder/urine. Additionally, as described in the Supplemental Methods and Figure 1b, only pixels that were on the mouse as determined from the brightfield image were used to calculate the tumor burden from the radiance of the luminescent image. This method ensures that any background signal (e.g., from phosphor in mouse urine) would be excluded in the radiance quantification and not bias the results.

      Additionally, as described in the Methods, the exposure time was held constant at 30 seconds for each IVIS measurement across all 37 experiments.
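
      The masking step described above, in which only pixels lying on the mouse contribute to the radiance measurement, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the nested-list image layout, and the example values are hypothetical and do not reproduce the maRQup implementation.

```python
def mean_radiance_on_mouse(radiance, mouse_mask):
    """Average radiance over pixels flagged as belonging to the mouse.

    `radiance` is a 2D list of per-pixel radiance values and `mouse_mask` is
    a 2D list of booleans of the same shape, assumed to come from
    thresholding the brightfield image (both layouts are assumptions).
    """
    total = 0.0
    count = 0
    for radiance_row, mask_row in zip(radiance, mouse_mask):
        for value, on_mouse in zip(radiance_row, mask_row):
            if on_mouse:
                total += value
                count += 1
    if count == 0:
        raise ValueError("mask contains no mouse pixels")
    return total / count


# Made-up 2 x 2 image: the right-hand column lies off the mouse, so its
# values (including a bright off-body spot) are ignored.
radiance = [[5e3, 1e2],
            [2e4, 9e5]]
mouse_mask = [[True, False],
              [True, False]]
print(mean_radiance_on_mouse(radiance, mouse_mask))  # 12500.0
```

      Signal outside the mask never enters the average, which is why an off-body source would not bias the quantification in this scheme.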

      The data using more than 2 million cells comes from only 10 mice, and maybe the biological relevance of this group is limited since it will not be achievable and translatable in humans (PMID: 33653113).

      We appreciate Reviewer #2’s attention to this issue. The effect observed in our study is large enough to reach statistical significance despite the small number of mice. Note that the dosing regimen used was optimized for the murine NSG model and would require appropriate scaling before clinical application. Nonetheless, NSG mice remain the gold standard for pre‑clinical in vivo evaluation and their use is generally required by regulatory agencies, such as the FDA, for assessing novel CAR‑T cell therapies; thus these findings are relevant for advancing such treatments.

    1. Briefing: Preparing the 10th ESS at School Week (Semaine de l'ESS à l'école, SESSE 2026)

      Executive Summary

      This document summarizes the key points of the webinar organized by the association L'ESPER in preparation for the 10th edition of the Social and Solidarity Economy (ESS) at School Week, which will take place from March 23 to 28, 2026.

      Co-led with the OCCE, the event aims to introduce students, from primary school through higher education, to alternative economic models based on democracy, social justice, and the general interest.

      The webinar highlights a twofold ambition: educating about the ESS (understanding the models) and through the ESS (experimenting with collective projects).

      The presentations showcase concrete programs, testimonials from practitioners in the field (notably Scops and Scics), and a range of ready-to-use teaching tools for teachers.

      The ultimate goal is to transform society by integrating these principles into individuals' educational and civic development.

      --------------------------------------------------------------------------------

      1. Institutional Framework and Educational Ambitions

      The association L'ESPER, which brings together 41 education and ESS organizations, champions a strong political and pedagogical vision for the French education system.

      Vision and Advocacy

      L'ESPER regards the ESS as a necessary lever for transforming the economy. Its ambitions are organized around two axes:

      Education about the ESS: Building understanding of a model of society based on social justice and the general interest. An advocacy paper published in August 2025 calls for integrating the ESS into school curricula starting in collège (lower secondary school).

      Education through the ESS: Fostering individual and collective emancipation by carrying out concrete projects in class, allowing students to discover cooperation through action.

      The ESS at School Week (SESSE)

      Listed on the calendar of the Éducation Nationale, this annual week allows three modes of engagement:

      1. Teaching teams: Showcasing year-long projects or organizing one-off activities.

      2. ESS organizations: Hosting classes at their premises or speaking directly in schools.

      3. Students: Setting up independent projects and raising awareness among their peers.

      --------------------------------------------------------------------------------

      2. Fundamentals of the Social and Solidarity Economy

      The ESS is not a recent form of economy, but it has become institutionalized, notably through the Hamon Law of July 31, 2014.

      The 5 types of ESS structures

      | Type of structure | Main characteristics |
      | --- | --- |
      | Associations | Groups of people who come together voluntarily around a non-profit project. |
      | Foundations | Irrevocable allocation of assets to a cause of general interest. |
      | Cooperatives | Businesses in which the members share power and profits. |
      | Mutuals (mutuelles) | Non-profit organizations practicing solidarity among members. |
      | ESS commercial companies | Private companies that adhere to ESS principles. |

      Core Principles and Values

      All of these organizations share a common foundation:

      A purpose of general or collective interest.

      Limited profit-making: Profits are reinvested primarily in the project.

      Democratic governance: Application of the "one person, one vote" principle, regardless of the capital held.

      --------------------------------------------------------------------------------

      3. Feedback and Testimonials from Practitioners

      The Union Régionale des Scops et Scics (Occitanie)

      Eugénie Bruni stresses the importance of promoting the cooperative model to young people.

      Typical activities: Two-hour sessions presenting the history, distinctive features, and concrete examples of cooperatives.

      Impact: Broadening students' career horizons by showing that cooperation is a viable economic model (4,558 cooperative companies in France generating €10.2 billion in revenue).

      Advice: Do not hesitate to contact the Unions Régionales, which have delegates across the country to support projects.

      The Scop Morasuti (printing company, AURA region)

      Testimonial from Damien on the employees' takeover of the company through court proceedings.

      The social struggle: Conversion to a Scop in July 2024. The model made it possible to eliminate unpaid sick-leave waiting days (jours de carence) and to rebalance salaries to correct seniority-based inequalities.

      School engagement: Free provision of offcut materials to schools and technical support (design, desktop publishing) for exhibition projects.

      Observation on democracy: Students are often surprised by the dual role of "worker and boss." Damien explains: "No one can agree with everything... democracy means putting it to a vote."

      --------------------------------------------------------------------------------

      4. Resources and Teaching Tools

      L'ESPER offers tools that have been tested and adapted for different levels (lower secondary, upper secondary, higher education).

      Ready-to-use awareness tools

      | Tool | Objective | Method |
      | --- | --- | --- |
      | Junior Coopérative | Introduce project methodology. | Puzzle on the stages of a project and real case studies. |
      | Idées reçues sur l'ESS (misconceptions about the ESS) | Break down preconceptions. | Moving debate (débat mouvant) based on "True/False" cards. |
      | Filmographie ESS (ESS filmography) | Illustrate the realities of the ESS. | Selection of documentaries with teaching guides. |
      | Fiches Pratiques (practical guides) | Organize a session. | Logistical guides for company visits or classroom sessions. |

      Recommendations for speakers

      Adaptation: Simplify the message for lower secondary students by focusing on the pillars (solidarity, sharing of wealth, democracy) rather than on legal details.

      Interactivity: Use video materials (e.g., the series "Ma boîte en Scop") and encourage dialogue.

      Preparation: Plan about one hour of prior discussion between the teacher and the speaker to frame the activity.

      --------------------------------------------------------------------------------

      5. Calendar and Registration

      Registration: Open on the L'ESPER website. The staff team connects schools with ESS organizations.

      February 25, 2026: Second preparatory webinar, devoted to a detailed presentation of the ESS with the expert Hervé de Falvar.

      March 23 to 28, 2026: The ESS at School Week takes place. Activities are showcased on L'ESPER's social media and newsletters.

      Key quote: "The ESS carries a model of society based notably on democracy, social justice, and the general interest [...] to achieve a fairer society in which individuals are emancipated individually but also collectively."

    1. Scientific State of Play of Manual Therapies: Between Myths and Realities

      Executive Summary

      This summary document analyzes the current state of scientific knowledge on manual therapies (physiotherapy, osteopathy, chiropractic, etiopathy), with particular emphasis on back pain, the leading reason for consultation.

      The key points are as follows:

      The primacy of movement: Modern science shows that the most effective treatment for low back pain is active movement.

      Passive therapies should not be used in isolation.

      Legal and ethical obligations: Unlike pseudo-medicines, physiotherapy is bound by the obligation to use means consistent with "established scientific knowledge" (données acquises de la science), a legal principle rooted in the Mercier ruling of 1936.

      Debunking myths: The concepts of a "displaced vertebra" or a "misaligned pelvis" are figments of the imagination with no anatomical reality.

      Manual palpation, although reassuring, lacks the scientific reliability needed to diagnose tissue texture or blockage.

      Risks and social consequences: Beyond the placebo or contextual effect, certain manipulations (notably cervical ones) carry serious risks such as stroke.

      In addition, these practices can interfere with public health messaging and undermine patients' health literacy.

      --------------------------------------------------------------------------------

      1. The Evolution of Science on Back Pain

      The medical approach to low back pain has changed radically over the past thirty years, shifting from a logic of rest to a logic of action.

      Timeline of paradigm shifts

      1986: A study in the New England Journal of Medicine suggests that two days of bed rest are more beneficial than seven days.

      1995: A pivotal study shows that the "control" group (continuing to live normally) recovers better than groups assigned to strict rest or overly cautious exercises.

      2019: The Haute Autorité de Santé (HAS) and the Assurance Maladie issue official recommendations: "The right treatment is movement" (« Le bon traitement, c'est le mouvement »).

      Passive therapies used in isolation are declared ineffective on the course of low back pain.

      The physiological benefit of movement

      Contrary to popular belief, activities such as running improve disc physiology.

      The alternation of compression and decompression (about 1 Hz) during running helps hydrate the intervertebral discs. Statistically, long-distance runners suffer less from back pain than other athletes.

      --------------------------------------------------------------------------------

      2. Legal and Ethical Framework: Science as an Obligation

      The distinction between physiotherapy and alternative therapies rests on a historic legal foundation.

      The Mercier Ruling (1936)

      This landmark decision of the Cour de cassation established three major principles:

      1. The care contract: A contractual relationship exists between the caregiver and the patient.

      2. The obligation of means: The caregiver has no obligation of result (cure) but must deploy all necessary means.

      3. Established scientific knowledge: The means chosen must be consistent with current scientific knowledge.

      Evolution of physiotherapy practice

      The code of ethics requires physiotherapists to abandon invalidated practices. For example:

      Bronchiolitis: Pediatric respiratory physiotherapy has not been recommended since 2019 for otherwise healthy infants, because the benefit is deemed insufficient relative to the traumatic nature of the treatment.

      Massage: Its use is now limited (scars, edema), and it is no longer recommended as a first-line treatment for back pain.

      --------------------------------------------------------------------------------

      3. Critical Analysis of Manual Therapies

      The limits of palpation and manual diagnosis

      Science shows that practitioners' sense of touch is prone to illusion.

      Lack of reliability: Two examiners rarely agree on the texture (hard/soft) or the "blocked" character of a tissue.

      Anatomical accuracy: When palpating a structure that is obvious under the skin, the average error is 5 cm.

      Mechanical impossibility: It is impossible to mobilize a single vertebra in isolation; a manipulation affects at least three.

      "Gate Control" effect and placebo

      Manual therapies produce a real but transient analgesic effect:

      Sensory distraction: The nervous system prioritizes sensations of touch, heat, or cold over pain. This is a short-term effect (a few minutes to a few hours).

      Contextual effect: The ritual of the consultation, the practitioner's attention, and natural regression to the mean (pain often subsides on its own around the time one seeks care) reinforce the illusion of efficacy.

      --------------------------------------------------------------------------------

      4. History and Foundations of Manual Pseudo-medicines

      Therapies such as osteopathy and chiropractic rest on vitalism, a 19th-century philosophy postulating the existence of a non-physical "vital force."

      | Discipline | Origin | Ideological foundations | Current status in Europe |
      | --- | --- | --- | --- |
      | Osteopathy | A.T. Still (1874) | "The body is God's pharmacy." Blood flow as synonymous with health. | "Purist" branch (Littlejohn) very prevalent, focused on craniosacral and fluidic approaches. |
      | Chiropractic | D.D. Palmer (1895) | Central nervous system as master of the body. Use of high-velocity manipulations ("cracking"). | Practice has remained close to the original concepts, with a strong presence on social media. |
      | Etiopathy | C. Trédaniel (France) | Search for the origin of the pathology in joint adjustment. | Very similar to osteopathy, with no real scientific distinction. |

      Note on the American exception: In the United States, osteopathy became medicalized following the Flexner Report (1910). "DOs" there are general practitioners who practice almost no manual therapy, unlike the European branch, which has remained mystical.

      --------------------------------------------------------------------------------

      5. Risks and Societal Impacts

      Safety and lost opportunity for care (perte de chance)

      Serious risks: Cervical manipulations can cause vertebral artery dissections, leading to strokes or locked-in syndrome (total paralysis with preserved consciousness).

      Diagnostic errors: Turning directly to these therapies without medical advice can delay the management of serious conditions (e.g., undetected fractures).

      Interference with the medical message

      The "medical veneer" used by these disciplines (words such as "diagnosis," "anamnesis," "consultation") creates confusion among patients:

      Harm to health literacy: By entrenching erroneous concepts (displaced vertebra, shorter leg), practitioners create dependence and a fear of movement (kinesiophobia).

      Social factors: The main factor in the persistence of low back pain is not mechanical but linked to job dissatisfaction or societal problems. Manual therapies, by focusing on "crack and go," ignore this complexity.

      Conclusion

      While manual therapies offer temporary relief and relational comfort, they do not constitute a lasting solution to back pain.

      Science recommends an approach centered on therapeutic education, motivation management and, imperatively, the patient's active movement.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review): 

      Strengths:

      (1) The use of chronic two-photon Ca<sup>2+</sup> imaging in awake, behaving mice represents a major technical strength, minimizing confounds introduced by anesthesia. The development of a Pf4Cre:GCaMP6s reporter line, combined with high-resolution intravital imaging, enables long-term and subcellular analysis of macrophage Ca<sup>2+</sup> dynamics in the meninges.

      (2) The comparison between perivascular and non-perivascular macrophages reveals clear niche-dependent differences in Ca<sup>2+</sup> signaling properties. The identification of macrophage Ca<sup>2+</sup> activity temporally coupled to dural vasomotion is particularly intriguing and highlights a potential macrophage-vascular functional unit in the dura.

      (3) By linking macrophage Ca<sup>2+</sup> responses to CSD and implicating CGRP/RAMP1 signaling in a subset of these responses, the study connects meningeal macrophage activity to clinically relevant neuroimmune pathways involved in migraine and other neurological disorders.

      Thank you for recognizing the strengths in our work.

      Weaknesses: 

      (1) The manuscript relies heavily on Pf4Cre-driven GCaMP6s expression to selectively image meningeal macrophages. Although prior studies are cited to support Pf4 specificity, Pf4 is not an exclusively macrophage-restricted marker, and developmental recombination cannot be excluded. The authors should provide direct validation of reporter specificity in the adult meninges (e.g., co-labeling with established macrophage markers and exclusion of other Pf4-expressing lineages). At minimum, the limitations of Pf4Cre-based labeling should be discussed more explicitly, particularly regarding how off-target expression might affect Ca<sup>2+</sup> signal interpretation.

      We acknowledge that PF4 is not an exclusively macrophage-restricted marker. Yet, among meningeal immunocytes, it is almost exclusively expressed in macrophages (1, 2). Furthermore, in the adult mouse meninges, Pf4<sup>Cre</sup>-based reporter lines label nearly all dural and leptomeningeal macrophages and almost no other cells (3, 4). This Cre line has also been used to target border-associated macrophages (2, 4). Moreover, a recent study suggests that the bacterial artificial chromosome used to generate the Pf4<sup>Cre</sup> line does not affect meningeal macrophage activity (4). Nonetheless, while we already discussed PF4 expression in meningeal megakaryocytes, in a revised version, we plan to discuss the possibility that a very small population of other meningeal immune cells may also be labeled.

      (2) The manuscript offers an extensive characterization of Ca<sup>2+</sup> event features (frequency spectra, propagation patterns, synchrony), but the biological significance of these signals is largely speculative. There is no direct link established between Ca<sup>2+</sup> activity patterns and macrophage function (e.g., activation state, motility, cytokine release, or interaction with other meningeal components). The discussion frequently implies functional specialization based on Ca<sup>2+</sup> dynamics without experimental validation. To strengthen the conceptual impact, a clearer framing of the study as a foundational descriptive resource, rather than a functional dissection, would improve alignment between data and conclusions.

      In our discussion, we indicated that “the exact link between the distinct Ca<sup>2+</sup> signal properties of meningeal macrophage subsets observed herein and their homeostatic function remains to be established”. In a revised version, we plan to further acknowledge that this is primarily a descriptive study that provides a foundational landscape of Ca<sup>2+</sup> dynamics in meningeal macrophages.

      (3) The GLM analysis revealing coupling between dural perivascular macrophage Ca<sup>2+</sup> activity and vasomotion is technically sophisticated and intriguing. However, the directionality of this relationship remains unresolved. The current data do not distinguish whether macrophages actively regulate vasomotion, respond to mechanical or hemodynamic changes, or are co-modulated by neural activity. Statements suggesting that macrophages may "mediate" vasomotion are therefore premature. The authors should reframe these conclusions more cautiously, emphasizing correlation rather than causation, and expand the discussion to explicitly outline experimental strategies required to establish causality (e.g., macrophage-specific Ca<sup>2+</sup> manipulation). 

      In the results section, we indicated that our data suggest that dural perivascular macrophages are functionally coupled to locomotion-driven dural vasomotion, either responding to it or mediating it. Furthermore, in our discussion, we discussed the possibilities that 1) macrophages sense vascular-related mechanical changes and 2) macrophage Ca<sup>2+</sup> signaling may regulate dural vasomotion. Moreover, we explicitly state that studying causality will require an experimental approach that has yet to be developed, enabling selective manipulation of dural perivascular macrophages.

      (4) The authors conclude that synchronous Ca<sup>2+</sup> events across macrophages are driven by extrinsic signals rather than intercellular communication, based primarily on distance-time analyses. This conclusion is not sufficiently supported, as spatial independence alone does not exclude paracrine signaling, vascular cues, or network-level coordination. No perturbation experiments are presented to test alternative mechanisms. The authors can either provide additional experimental evidence or rephrase the conclusion to acknowledge that the source of synchrony remains unresolved. 

      Thank you for this suggestion. In the revision, we will indicate that the source of synchrony remains unresolved.

      (5) A major and potentially important finding is that the dominant macrophage response to CSD is a persistent decrease in Ca<sup>2+</sup> activity, which is independent of CGRP/RAMP1 signaling. However, this phenomenon is not mechanistically explored. It remains unclear whether Ca<sup>2+</sup> suppression reflects macrophage inhibition, altered viability, homeostatic resetting, or an anti-inflammatory program. At a minimum, the discussion should engage more deeply with possible interpretations and implications of this finding.

      While we propose that the decrease in macrophage calcium signaling following CSD could indicate that a hyperexcitable cortex dampens meningeal immunity, in the revised version, we plan to elaborate on the possible implications of this finding.

      (6) The pharmacological blockade of RAMP1 supports a role for CGRP signaling in persistent Ca<sup>2+</sup> increases after CSD, but the experiments are based on a relatively small number of cells and animals. The limited sample size constrains confidence in the generality of the conclusions. Pharmacological inhibition alone does not establish cell-autonomous effects in macrophages. The authors should acknowledge these limitations more explicitly and avoid overextension of the conclusions. 

      We plan to acknowledge these limitations.

      Reviewer #2 (Public review): 

      Using chronic intravital two-photon imaging of calcium dynamics in meningeal macrophages in Pf4Cre:TIGRE2.0-GCaMP6 mice, the study identified heterogeneous features of perivascular and non-perivascular meningeal macrophages at steady state and in response to cortical spreading depolarization (CSD). Analyses of calcium dynamics and blood vessels revealed a subpopulation of perivascular meningeal macrophages whose activity is coupled to behaviorally driven diameter fluctuations of their associated vessels. The analyses also investigated synchrony between different macrophage populations and revealed a role for CGRP/RAMP1 signaling in the CSD-induced increase, but not the decrease, in calcium transients.

      This is a timely study at both the technical and conceptual levels, examining calcium dynamics of meningeal macrophages in vivo. The conclusions are well supported by the findings and will provide an important foundation for future research on immune cell dynamics within the meninges in vivo. The paper is well written and clearly presented.

      Thank you.

      I have only minor comments. 

      (1) Please indicate the formal definition of perivascular versus non-perivascular macrophages in terms of distance from the blood vessel. This information is not provided in the main text or the Methods. In addition, please explain how the meningeal vasculature was imaged in the main text. 

      We did not measure the exact distance of the perivascular macrophages from the blood vessels, but defined them as such based on previous data showing that these cells reside along the abluminal surface and maintain tight interactions with mural cells (5). We plan to provide this information in the revised manuscript.

      (2) Similarly, the method used to induce acute CSD (pin prick) is not described in the main text and is only mentioned in the figure legends and Methods. Additional background on the neurobiology of acute CSD, as well as the resulting brain activity and neuroinflammatory responses, could be helpful.

      We plan to add the method for inducing CSD (i.e., a pinprick in the frontal cortex) to the Results section and provide more background in the Introduction section.

      Reviewer #3 (Public review):

      Strengths: 

      Sophisticated in vivo imaging of meningeal immune cells is employed in the study, which has not been performed previously. A detailed analysis of the distinct calcium dynamics in various subtypes of meningeal macrophages is provided. Functional relevance of the responses is also noted in relation to CSD events.

      Thank you for recognizing the strengths of our paper.

      Weaknesses:

      (1) The specificity of the methods used to target both meningeal macrophages and RAMP1 is limited. Additional discussion points on the functional relevance of the two subtypes of meningeal macrophages and their calcium responses are warranted. A section on potential pitfalls should be included. 

      We plan to address these issues in the revision.

      References

      (1) H. Van Hove et al., A single-cell atlas of mouse brain macrophages reveals unique transcriptional identities shaped by ontogeny and tissue environment. Nat Neurosci 22, 1021-1035 (2019).

      (2) F. A. Pinho-Ribeiro et al., Bacteria hijack a meningeal neuroimmune axis to facilitate brain invasion. Nature 615, 472-481 (2023).

      (3) G. L. McKinsey et al., A new genetic strategy for targeting microglia in development and disease. eLife 9 (2020).

      (4) H. J. Barr et al., The circadian clock regulates scavenging of fluid-borne substrates by brain border-associated macrophages. bioRxiv (2025).

      (5) H. Min et al., Mural cells interact with macrophages in the dura mater to regulate CNS immune surveillance. J Exp Med 221 (2024).

    1. Reviewer #1 (Public review):

      Summary:

      In this study, the authors' aim was to determine whether hepatic palmitoylation is a physiologically relevant regulator of systemic metabolism. The data demonstrate that loss of DHHC7 in hepatocytes disrupts Gαi palmitoylation, enhances cAMP-PKA-CREB signaling, and drives transcriptional upregulation and secretion of Prg4. The KO mice display increased body weight, fat mass, and plasma cholesterol, but at 12 weeks on HFD, do not exhibit insulin resistance. The potential mechanism underlying the metabolic phenotype was examined by assessing adipocyte signaling and by exploring whether Prg4 acts through GPR146. Through this pathway, the authors intend to link DHHC7-dependent palmitoylation to the regulation of hepatokines that exert systemic metabolic effects.

      Strengths:

      (1) Hepatic palmitoylation in systemic metabolic regulation is largely unexplored. The authors demonstrate the role of DHHC7 in vivo using a successful liver-specific knockout mouse model that causes HFD-dependent obesity without insulin resistance.

      (2) Several studies were performed on chow and HFD, as well as male and female mice.

      (3) Plasma proteomics identified Prg4 as a circulating factor elevated in KO mice. Prg4 overexpression phenocopied the KO mice.

      (4) There is solid mechanistic data supporting the hypothesis that hepatic DHHC7 loss selectively increases Prg4 secretion as a hepatokine.

      (5) There is convincing evidence for the DHHC7 mechanism in liver: DHHC7 controls cAMP-PKA-CREB via Gαi palmitoylation. The authors recognize that the palmitoylation change is causative rather than correlated, and this needs to be more fully explored in the future.

      (6) Strong in vitro data support that Prg4 acts through adipocyte GPR146 via its SMB domain.

      Weaknesses:

      (1) The assessment of liver and adipose tissue responses to DHHC7 loss is insufficient to support claims that it alters systemic lipolysis. In this new mouse model, liver histology is necessary, especially given the cholesterol increase in the KO. As this is a newly established mouse line, common assessments of the liver during HFD feeding would be important for interpreting the phenotype.

      (2) The data show DHHC7 loss causes adipose tissue dysfunction and alterations in lipid metabolism. Beyond that, I suggest not stating more regarding the phenotype of the DHHC7 mice for this work. A thorough analysis would be needed to determine which factor drives the obesity and changes in energy balance in the mice. For example, the KO mice had lower oxygen consumption (but no change in CO2 production, which usually changes in parallel), suggesting a CNS component could drive obesity. However, since the data are not normalized for lean mass and there is no information about locomotor activity, this analysis is incomplete. RER may be informative if available. A broad, conservative description of the KO phenotype would be more accurate, since Prg4 has many paracrine targets and likely has autocrine signaling in the liver.

      (3) Most references to lipolysis or systemic lipolytic flux would be inaccurate. To suggest a suppression of lipolysis, serum NEFA would need to be measured, and in vivo or in vitro lipolysis assays performed to test the effect of DHHC7 loss or the specificity of Prg4 action on adipocytes in vivo. To demonstrate adipose tissue dysfunction, lipogenesis markers, canonical markers of insulin sensitivity, and mitochondrial dysfunction should be analyzed.

      (4) Line 179: The experiment showing that Prg4 does not affect p-CREB (Figure S8) was performed in brown adipocytes, yet it appears under the heading "DHHC7 controls hepatic PKA-CREB activity through Gαi palmitoylation to regulate Prg4 transcription." Unless it is repeated using liver lysate, the conclusions stated throughout the paper should be revised.

      (5) It appears that the serum and liver proteomics were only assessed for factors that increased in KO mice? Were proteins that were significantly decreased analyzed?

      (6) The beige adipocyte culture method is unclear. The methods do not describe the fat pad used, and the protocol suggests the cells would be differentiated into mature white adipocytes. If they are beige cells, a reference for the method, gene expression, and cell images could support that claim.

      (7) The use of tamoxifen can confound adipocyte studies, as it increases beiging and weight gain even after a brief initiation period. Both groups were treated with tamoxifen, but another way to induce Cre would be ideal.

      (8) Evidence for the lack of the glucose phenotype is incomplete. One reason could be due to the IP route of glucose administration, which has a large impact on glucose handling during a GTT. To confirm the absence of a glucose tolerance phenotype, an OGTT should be performed, as it is more physiological. In addition, the mice should be fed for 16 weeks. Prg4 affects immune cells, changing how adipose tissue expands, and 12 weeks of HFD feeding is often not long enough to see the effects of adipose tissue inflammation spilling over into the system.

      (9) There may be liver-adipose tissue crosstalk in KO mice, but this was not fully assessed in this study and would be difficult to determine in any setting, given the diverse cell types that Prg4 targets. The crosstalk claim is not needed to convey the basic premises: there is the DHHC7 mechanism/phenotype and the Prg4 mechanism/phenotype, and although no direct Prg4 mechanism in adipose tissue is established, the paper can be successfully reframed.

      (10) Although DHHC7 loss on the chow diet did not result in a phenotype, did Prg4 increase in the KO mice on chow? This would determine whether i) Prg4 expression depends on HFD/obesity, or ii) circulating Prg4 has effects only under HFD conditions. The receptors may also change on HFD, especially in adipocytes.

      Impact:

      This work would significantly contribute to the study of liver metabolism, provided it includes data describing the liver. The role of Prg4 in adipocytes and other cell types is of substantial value to the field of metabolism. By reframing the paper and conducting some key experiments, its quality and impact can be increased.

    2. Reviewer #2 (Public review):

      In the current report, Sun and colleagues sought to determine the liver-specific role that DHHC7, a DHHC palmitoyltransferase protein, plays in regulating whole-body energy balance and hepatic crosstalk with adipose tissues. The authors generated an inducible, liver-specific DHHC7 knockout mouse to determine how altered palmitoylation in hepatocytes alters hepatokine production/secretion, and in turn, systemic metabolism. The ablation of DHHC7 was found to alter the production of proteoglycan 4 (Prg4), a hepatokine previously linked to metabolic regulation. The authors propose that the change in Prg4 production is mediated by the loss of Gαi palmitoylation, due to DHHC7 ablation, thereby augmenting cAMP-PKA-CREB signaling in hepatocytes, which alleviates the 'brake' on Prg4 production. The authors further propose that Prg4 overexpression leads to excessive binding to GPR146 on adipocytes, which in turn suppresses PKA-mediated HSL activation, promoting impairments in lipolysis, leading to obesity. The report is interesting and generally well-written, but it appears to have some clear gaps in additional data that would aid in interpretation. The addition of confirmatory culture studies would be incredibly helpful for testing the hypotheses being explored. My comments, concerns, and/or suggestions are outlined below in no particular order.

      (1) Figures: All data should be presented in dot-boxplot format so the reader knows how many samples were analyzed for each assay and group. n=3 for some assays/experiments is incredibly low, particularly when considering the heterogeneity in responsiveness to HFD, food intake, etc.

      (2) Figure 1E-F: It is unclear when the food intake measure was performed. Mice can alter their feeding behavior based on a myriad of environmental and biological cues. It would also be interesting to show food intake data normalized to body mass over time. Mice can counterregulate anorexigenic cues by altering neuropeptide production over time. It is not clear if this is occurring in these mice, but the timing of measuring food intake is important. Additionally, the VO2 measure appears to be presented as being normalized to total body mass, when in fact, it would probably be more accurate to normalize this to lean body mass. Normalizing to total body mass provides a denominator effect due to excessive adiposity, but white fat is not as metabolically active as other high-glucose-consuming tissues. If my memory serves me right, several reports have discussed appropriate normalizations in circumstances such as this.

      (3) Figure 1J-N: It is not all that surprising that fasting glucose and/or TGs were found to be similar between groups. It is well-established that mice have an incredible ability to become hyperinsulinemic in an effort to maintain euglycemia and lipid metabolism dynamics. A few relatively easy assays can be performed to glean better insights into the metabolic status of the authors' model. First, fasting insulin concentrations will be incredibly helpful. Secondly, if the authors want to tease out which adipose depot is most adversely affected by ablation, they could take an additional set of CON and KO mice, fast them for 5-6 hours, provide a bolus injection of insulin (similar to that provided during an insulin tolerance test), and then quickly harvest the animals ~15 minutes after insulin injections, followed by evaluating AKT phosphorylation. This will really tell them if these tissues have impairments in insulin signaling. The gold-standard approach would be to perform a hyperinsulinemic-euglycemic clamp in the CON and KO mice. I now see GTT and ITT data, but the aforementioned assays could help provide insight.

      (4) Figure 3A: This looks overexposed to me.

      (5) Figures 3-4: It appears that several of these assays could be complemented with culture-based models, which would almost certainly be cleaner. The conditioned media could then be used from hepatocyte cultures to treat differentiated adipocytes.

      (6) Figure 4: It is unclear how to interpret the phospho-HSL data because the fasting state can affect this readout. It needs to be made clear how the harvest was done. Moreover, insulin and glucagon were never measured, and these hormones have a significant influence over HSL activity. I suspect the KO mice have established hyperinsulinemia, which would likely affect HSL activity. This provides an example of why performing some of these experiments in a dish would make for cleaner outcomes that are easier to interpret.

    3. Reviewer #3 (Public review):

      Summary:

      In the current manuscript, Sun et al. aimed to determine the metabolic function of hepatocyte DHHC7, one of the key enzymes in protein palmitoylation. They generated inducible liver-specific Dhhc7 knockout mice and discovered that Dhhc7-LKO mice are more prone to gain weight and develop adipose expansion and obesity. Via unbiased proteomic analysis, they identified PRG4 as one of the top secreted factors in the liver of Dhhc7-LKO mice. Hepatic overexpression of PRG4 recapitulates the obesity phenotype observed in Dhhc7-LKO mice. At the mechanistic level, PRG4, once secreted from the liver, can bind to GPR146 on adipocytes and inhibit PKA-HSL signaling and lipolysis. Taken together, their findings suggest a novel pathway by which the liver communicates with adipose tissue and impacts systemic metabolism.

      Strengths:

      (1) The systemic metabolic homeostasis depends on coordination among metabolically active tissues. Thus, active communication between the liver and adipose tissue when facing nutritional challenges (such as high-fat diet feeding) is crucial for achieving metabolic health. The concept that the liver can communicate with adipose tissue and impact the lipolysis process via secreted hepatokines is quite significant but remains poorly understood.

      (2) Hepatocyte Dhhc7 knockout mice developed a significant obesity phenotype, which is associated with adipose expansion.

      (3) Unbiased proteomic analysis identified PRG4 as one of the top secreted factors in the liver of Dhhc7-LKO mice. Hepatic overexpression of PRG4 recapitulates the obesity phenotype observed in Dhhc7-LKO mice.

      (4) In vitro cell-based assay showed that PRG4 can bind to adipocyte GPR146, inhibit PKA-mediated HSL phosphorylation, and subsequently, the lipolysis process.

      Weaknesses:

      (1) Lack of a causal-effect study to generate evidence directly linking hepatocyte DHHC7 and PRG4 in driving adipose expansion and obesity upon HFD feeding.

      (2) Lack of direct evidence to support that PRG4 inhibits adipocyte lipolysis via GPR146. A functional assay demonstrating adipocyte lipolysis is required.

      (3) The conclusion is largely based on correlative evidence.

    4. Author response:

      Public reviews:

      Reviewer #1 (Public review):

      Weaknesses:

      (1) The assessment of liver and adipose tissue responses to DHHC7 loss is insufficient to support claims that it alters systemic lipolysis. In this new mouse model, liver histology is necessary, especially given the cholesterol increase in the KO. As this is a newly established mouse line, common assessments of the liver during HFD feeding would be important for interpreting the phenotype.

      We will add liver histology data in the revised version.

      (2) The data show DHHC7 loss causes adipose tissue dysfunction and alterations in lipid metabolism. Beyond that, I suggest not stating more regarding the phenotype of the DHHC7 mice for this work. A thorough analysis would be needed to determine which factor drives the obesity and changes in energy balance in the mice. For example, the KO mice had lower oxygen consumption (but no change in CO2 production, which is also usually similarly altered), suggesting a CNS component could drive obesity. However, since the data are not normalized for lean mass and there is no information about locomotor activity, this analysis is incomplete. RER may be informative if available. A broad conservative description of the KO phenotype would be more accurate since Prg4 has many paracrine targets and likely has autocrine signaling in the liver.

      We will add CO2 production, locomotor activity, and RER data in the revised version.

      (3) Most references to lipolysis or lipolysis flux systemically would be inaccurate. To suggest a suppression of lipolysis, serum NEFA would need to be measured, and in vivo or in vitro lipolysis assays performed to test the effect of DHHC7 loss or the specificity of PRG4 action on adipocytes in vivo. To demonstrate adipose tissue dysfunction, analysis of lipogenesis markers, canonical markers for insulin sensitivity, and mitochondrial dysfunction should be performed/measured.

      We will measure serum NEFA to test the effect of DHHC7 loss, and we will analyze lipogenesis markers, canonical markers of insulin sensitivity, and markers of mitochondrial dysfunction.

      (4) Line 179: The experiment was performed in brown adipocytes to show that Prg4 does not affect p-CREB (Figure S8), yet it is reported under the heading: "DHHC7 controls hepatic PKA-CREB activity through Gαi palmitoylation to regulate Prg4 transcription." Unless repeated using liver lysate, the conclusions stated in the text throughout the paper should be revised.

      Figure S8 demonstrates that Prg4 has no impact on forskolin-induced CREB phosphorylation at Ser133, providing evidence that Prg4 acts upstream of adenylyl cyclase. We will revise the description.

      (5) It appears that the serum and liver proteomics were only assessed for factors that increased in KO mice? Were proteins that were significantly decreased analyzed?

      We are analyzing the decreased proteins in a follow-up project.

      (6) The beige adipocyte culture method is unclear. The methods do not describe the fat pad used, and the protocol suggests the cells would be differentiated into mature white adipocytes. If they are beige cells, a reference for the method, gene expression, and cell images could support that claim.

      We will add a reference for the method, gene expression data, and cell images.

      (7) The use of tamoxifen can confound adipocyte studies, as it increases beigeing and weight gain even after a brief initiation period. Both groups were treated with Tam, but another way to induce Cre would be ideal.

      We will use a doxycycline-inducible system in the future.

      (8) Evidence for the lack of the glucose phenotype is incomplete. One reason could be due to the IP route of glucose administration, which has a large impact on glucose handling during a GTT. To confirm the absence of a glucose tolerance phenotype, an OGTT should be performed, as it is more physiological. In addition, the mice should be fed for 16 weeks. Prg4 affects immune cells, changing how adipose tissue expands, and 12 weeks of HFD feeding is often not long enough to see the effects of adipose tissue inflammation spilling over into the system.

      In future work, we will perform an OGTT and extend HFD feeding to 16 weeks.

      (9) There may be liver-adipose tissue crosstalk in KO mice, but this was not fully assessed in this study and would be difficult to determine in any setting, given the diverse cell types that are targets of Prg4. The crosstalk claim is unnecessary to share the basic premises; there is the DHHC7 mechanism/phenotype and the Prg4 mechanism/phenotype, and while there is no direct Prg4 adipose mechanism, the paper can be successfully reframed.

      We will reframe the paper.

      (10) Although the DHHC7 loss on the chow diet did not result in a phenotype, did Prg4 increase in the KO mice on chow? This would determine whether either i) the expression of Prg4 is dependent on HFD/obesity, or ii) circulating Prg4 has effects only in an HFD condition. The receptors may also change on HFD, especially in adipocytes.

      We will measure Prg4 in the KO mice on a chow diet.

      Reviewer #2 (Public review):

      (1) Figures: All data should be presented in dot-boxplot format so the reader knows how many samples were analyzed for each assay and group. n=3 for some assays/experiments is incredibly low, particularly when considering the heterogeneity in responsiveness to HFD, food intake, etc.

      We will present the data in dot-boxplot format.

      (2) Figure 1E-F: It is unclear when the food intake measure was performed. Mice can alter their feeding behavior based on a myriad of environmental and biological cues. It would also be interesting to show food intake data normalized to body mass over time. Mice can counterregulate anorexigenic cues by altering neuropeptide production over time. It is not clear if this is occurring in these mice, but the timing of measuring food intake is important. Additionally, the VO2 measure appears to be presented as being normalized to total body mass, when in fact, it would probably be more accurate to normalize this to lean body mass. Normalizing to total body mass provides a denominator effect due to excessive adiposity, but white fat is not as metabolically active as other high-glucose-consuming tissues. If my memory serves me right, several reports have discussed appropriate normalizations in circumstances such as this.

      We will examine how to normalize these data more accurately.

      (3) Figure 1J-N: It is not all that surprising that fasting glucose and/or TGs were found to be similar between groups. It is well-established that mice have an incredible ability to become hyperinsulinemic in an effort to maintain euglycemia and lipid metabolism dynamics. A few relatively easy assays can be performed to glean better insights into the metabolic status of the authors' model. First, fasting insulin concentrations will be incredibly helpful. Secondly, if the authors want to tease out which adipose depot is most adversely affected by ablation, they could take an additional set of CON and KO mice, fast them for 5-6 hours, provide a bolus injection of insulin (similar to that provided during an insulin tolerance test), and then quickly harvest the animals ~15 minutes after insulin injections, followed by evaluating AKT phosphorylation. This will really tell them if these tissues have impairments in insulin signaling. The gold-standard approach would be to perform a hyperinsulinemic-euglycemic clamp in the CON and KO mice. I now see GTT and ITT data, but the aforementioned assays could help provide insight.

      We have the data for evaluating AKT phosphorylation and will add it in the revised version.

      (4) Figure 3A: This looks overexposed to me.

      We will replace it with a shorter exposure.

      (5) Figures 3-4: It appears that several of these assays could be complemented with culture-based models, which would almost certainly be cleaner. The conditioned media could then be used from hepatocyte cultures to treat differentiated adipocytes.

      We will perform cell culture experiments for Figures 3-4.

      (6) Figure 4: It is unclear how to interpret the phospho-HSL data because the fasting state can affect this readout. It needs to be made clear how the harvest was done. Moreover, insulin and glucagon were never measured, and these hormones have a significant influence over HSL activity. I suspect the KO mice have established hyperinsulinemia, which would likely affect HSL activity. This provides an example of why performing some of these experiments in a dish would make for cleaner outcomes that are easier to interpret.

      We will perform some of these experiments in cell culture.

      Reviewer #3 (Public review):

      Weaknesses:

      (1) Lack of a causal-effect study to generate evidence directly linking hepatocyte DHHC7 and PRG4 in driving adipose expansion and obesity upon HFD feeding.

      We will perform a causal-effect study to test the hypothesis.

      (2) Lack of direct evidence to support that PRG4 inhibits adipocyte lipolysis via GPR146. A functional assay demonstrating adipocyte lipolysis is required.

      We will add the direct evidence in the revised version.

      (3) The conclusion is largely based on correlative evidence.

      We will perform experiments to strengthen the conclusion on the basis of a causal-effect study.

    1. Analysis of Conspiracist Rhetoric: Mechanisms, Discourse, and the "Sheep" Allegory

      This summary document analyzes the research and reflections of Loïc Massaia, a science communicator for the Utopia project, on the rhetoric used in conspiracist circles.

      It details the argumentative structures, the psychological functions of the discourse, and the specific use of the insult "sheep" as a tool for social distinction and for shutting down debate.

      Summary

      The analysis of conspiracist rhetoric reveals a system of communication aimed less at establishing a truth than at gaining ascendancy over the audience.

      This rhetoric is characterized by a circular (tautological) structure and a systematic reliance on essentialism.

      The use of terms such as "sheep" serves a threefold function: an ad personam attack that avoids substantive debate, an accusation of passive complicity, and a mechanism of distinction that bolsters the speaker's self-esteem.

      By freeing itself from the rules of "healthy debate", conspiracist discourse establishes itself as a closed system in which the conclusion (the existence of a conspiracy) is already contained in the premises.

      -------------------------------------------------------------------------------

      1. Definition and Categorization of Conspiracist Rhetoric

      The document proposes defining rhetoric as the set of means deployed in a discourse to convince, to shine, to manipulate, or to gain ascendancy over others.

      A complementary definition describes it as the "negotiation of difference between individuals on a given question".

      In the context of conspiracism, recurring expressions can be classified along four main dimensions:

      | Dimension | Typical phrases | Intended objective |
      | --- | --- | --- |
      | Accusatory | "Journalopes", "Merdias" (derogatory French coinages for journalists and the media), "They're not telling you everything" | Discredit official sources of information. |
      | Incitement | "Do your own research", "Wake up" | Push the interlocutor to adopt the same conclusion through an illusion of autonomy. |
      | Denial of chance | "Coincidence? I think not", "Everything is connected" | Reject contingency in favor of a hidden design. |
      | Overconfidence and distinction | "They're all sheep", "We were right" | Place oneself above the ignorant "masses". |

      --------------------------------------------------------------------------------

      2. Structural Analysis of the Argumentation

      The Toulmin Model

      To assess the soundness of an argument, the document draws on the Toulmin model, which identifies the components of an optimal argument:

      1. Data: The basic information.

      2. Claim (conclusion): What one seeks to demonstrate.

      3. Warrants: The logical link between data and claim.

      4. Backing: What makes the warrant solid and accepted.

      5. Rebuttal: The inclusion of limits and conditions that could contradict the argument.
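
      The five components listed above can be sketched as a small data structure. This is only an illustrative sketch: the class name `ToulminArgument`, the method `missing_components`, and the wording of the example fields are assumptions introduced here, not part of the source talk. The example paraphrases the "government is a cult" argument discussed in this summary, and its backing is filled in with an assumed premise for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class ToulminArgument:
    """Hypothetical container for the five Toulmin components."""
    data: str                       # the basic information
    claim: str                      # what one seeks to demonstrate
    warrant: str                    # logical link between data and claim
    backing: Optional[str] = None   # what makes the warrant accepted
    rebuttal: Optional[str] = None  # limits that could contradict the argument

    def missing_components(self) -> List[str]:
        """Return the names of components left empty, in declaration order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative paraphrase; the backing is an assumed premise, not a quotation.
cult_argument = ToulminArgument(
    data="The government fights cult-like abuses.",
    claim="The government is itself a cult.",
    warrant="It fights them in order to stifle dissent.",
    backing="Assumed premise: the government is malevolent by nature.",
)

print(cult_argument.missing_components())  # prints ['rebuttal']
```

      Listing the empty components makes the structural gap explicit: an argument that never states the conditions under which it would fail has no rebuttal slot filled in.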

      The failure of conspiracist discourse

      The analysis shows that conspiracist discourse generally omits the rebuttal.

      For example, the argument that the government is a cult because it fights cult-like abuses (in order to stifle dissent) collapses once other factors distinguishing a state from a cult are introduced.

      Circularity and Essentialism

      Conspiracist discourse is described as a closed system, or a tautology.

      It rests on essentialization: the deep "nature" of an entity (the government, the elites) is declared to be malevolent.

      From then on, any action by that entity, even one that appears positive, is interpreted as further proof of its malevolence.

      The conspiracy must exist from the outset to explain the facts, which then serve to prove the existence of the conspiracy.

      --------------------------------------------------------------------------------

      3. The "Sheep" Allegory: Origins and Uses

      The expression "they're all sheep" is an animal idiom found in several languages (French, Italian, English, Polish).

      Literary Origin

      The image of the sheep that follows blindly goes back notably to Rabelais (the episode of Panurge's sheep), in which the animals jump into the water and die simply because the first one jumped.

      This underscores a "natural" or essentialist dimension of the animal: the need to follow.

      Functions in conspiracist discourse

      1. Identifying the conspirator: If there are sheep, there must necessarily be a "shepherd" or a "master" (the conspirator).

      2. The accusation of complicity: Non-conspiracists are deemed stupid, but also complicit through their passivity.

      3. The need for distinction: Declaring oneself a "non-sheep" is a way to set oneself apart from the masses. According to the work of Anthony Lantian (2015), adherence to conspiracy theories may be a way of raising initially low self-esteem by feeling that one holds superior knowledge.

      --------------------------------------------------------------------------------

      4. Rhetoric as a Breakdown of Debate

      The use of the insult "sheep" is characterized as an ad personam argument.

      Theorized by Schopenhauer, this tactic consists of attacking the individual rather than their arguments in order to end a discussion that cannot be won on the merits.

      Violation of the rules of honorable controversy

      Drawing on the work of Levi Hedge (19th century), the document identifies three fundamental rules of healthy debate that conspiracist rhetoric systematically violates:

      Rule 4: No personal attacks.

      Rule 5: No accusing the opponent of hidden motives.

      Rule 7: Truth, not victory, must be the goal. The use of ridicule or mockery (calling the other person a sheep) is a violation of this rule.

      However, the document stresses that these failings are not unique to conspiracists; they are frequently found in any public debate where the participants' aim is to "win" rather than to seek the truth.

      --------------------------------------------------------------------------------

      5. Critical Perspectives

      In conclusion, the document invites reflection on the very nature of the critique of conspiracism.

      If conspiracist rhetoric is defined as being "by nature" a tautology based on essentialism, one runs the risk of producing a closed, essentialist discourse oneself.

      This mise en abyme suggests that the analysis of conspiracism must itself remain vigilant about its own argumentative structures so as not to fall into the very flaws it denounces.

    1. Briefing: Becoming a Parent, a Major Challenge (Analysis of Systemic, Medical, and Social Obstacles)

      Executive Summary

      This document summarizes the discussions of a round table devoted to the major challenges of access to parenthood.

      The analysis reveals a deep gap between the societal injunction to have children and the reality of "atypical" paths (infertility, disability, adoption).

      Parents and prospective parents face a threefold ordeal:

      1. Persistent prejudices: Stigmatization of male infertility and denial of the parenting competence of people with disabilities.

      2. A failure of support: A lack of neutral information and of training for medical staff, sometimes pushing individuals toward ideological excesses or pseudoscience.

      3. Violent systemic barriers: Exhausting administrative adoption procedures and intrusive surveillance by social services that can lead to serious family trauma (wrongful child placements).

      Despite these obstacles, critical thinking and involvement in associations emerge as essential tools of resilience for navigating these complex systems.

      --------------------------------------------------------------------------------

      1. Infertility: Between Biological Realities and Social Myths

      Infertility is often wrongly perceived as an essentially female problem.

      Scientific data and personal testimonies correct this view.

      Distribution of the causes of infertility

      According to Marjorie Whitfield (researcher at Inserm), responsibility for infertility is evenly distributed:

      One third of cases are of female origin.

      One third of cases are of male origin.

      One third of cases are of mixed origin (involving both partners).

      The weight of prejudices about male infertility

      Male infertility is particularly prone to psychological and social conflations:

      Confusion with impotence: Society often confuses the ability to procreate (sperm production) with virility or sexual performance. A sterile man can have normal sexual function.

      Blow to virility: For many, the inability to conceive is experienced as a breach of the "contract" of virility.

      Denial of fatherhood: When a donor is used, social prejudice tends to deny the role of father in favor of genetics alone.

      --------------------------------------------------------------------------------

      2. Parenthood and Disability: A Discriminatory Obstacle Course

      Leitha's testimony highlights a health system and a social-support framework that are deeply centered on the able-bodied ("validocentrés"), in which disability is systematically perceived as an impediment, or even a danger.

      Medical stigmatization

      Health professionals often show a total lack of understanding when a person with a disability wishes to become pregnant:

      Erasure of sexuality: Caregivers express astonishment at the conception ("How did you do it?").

      Systematic referral toward abortion: Patients are offered termination of pregnancy by default, without their choice or their parenting plans being considered.

      Lack of adapted equipment: The absence of gynecological examination tables or instruments suited to caring for wheelchair users, leading to gynecological violence.

      Suspicion from social services

      Once they become parents, people with disabilities are subjected to disproportionate surveillance:

      Contradictory injunctions: Social services impose rigid and shifting frameworks without offering concrete solutions to the everyday difficulties linked to disability.

      The default "report": Unfounded concerns or prejudices about the ability to protect the child can lead to placement proceedings.

      Family trauma: Children are sometimes removed from their parents for several years on the basis of suspicions of danger that are never substantiated by facts.

      --------------------------------------------------------------------------------

      3. Administrative and Legislative Hurdles

      Access to parenthood is also conditioned by heavy bureaucratic mechanisms that can discourage applicants.

      | Type of path | Nature of the obstacles identified |
      | --- | --- |
      | Adoption | Long approval times (5 years), intrusive social inquiries (neighbors, family), obsolete psychological tests (e.g., the Rorschach test), and closures by foreign countries following changes in French law (e.g., the "Mariage pour tous" same-sex marriage law). |
      | Medically assisted reproduction (PMA) | Longer delays for people with disabilities (additional examinations), a limit on the number of covered attempts, and the high cost of procedures abroad. |
      | Social follow-up | Unrequested psychosocial monitoring and a feeling of being "scrutinized under a magnifying glass", unlike biological parents with no apparent difficulties. |

      --------------------------------------------------------------------------------

      4. The Danger of Lack of Information and Isolation

      The lack of support from official structures creates a dangerous vacuum that is filled by organizations with varied agendas.

      Ideological excesses: In the absence of public resources to support pregnancies involving disability, anti-abortion associations sometimes become the only holders of practical information, using this help to psychologically manipulate expectant mothers.

      Pseudo-medicine: The desire for parenthood is a lucrative market for miracle cures or courses that promise to "boost" fertility without any scientific basis.

      Psychological isolation: Guilt, often induced by medical discourse ("You can't do that to a child"), isolates parents and undermines their mental health.

      --------------------------------------------------------------------------------

      5. The Crucial Role of Critical Thinking

      Critical thinking is presented as a fundamental lever for regaining control over one's path to parenthood.

      1. Filtering information: Learning to check sources and not to accept medical opinion as absolute truth, especially when it is laden with value judgments.

      2. Defusing guilt: Understanding systemic mechanisms makes it possible to realize that failure or difficulty is not an individual fault but the result of a lack of support.

      3. Creating resources: Given the absence of suitable structures, involvement in associations (such as creating neutral resource websites) makes it possible to break the isolation and offer support based on experience and evidence (EBM, evidence-based medicine).

      --------------------------------------------------------------------------------

      Conclusion: A Matter of Dignity and Rights

      The journeys of Sylvain Rozier and Leitha show that becoming a parent, when one departs from the biological or social norm, is an act of resistance.

      Despite the harshness of the ordeals (11 years of struggle for one, years of legal battle for the other), the positive outcome of these journeys underscores the urgent need to reform support for parenthood:

      Training health and social-services staff on disability issues.

      Neutral and accessible medical information.

      Logistical support rather than repressive surveillance.

      "Parenthood is a path strewn with pitfalls [...] but on atypical paths, we are really at another level of pitfalls, ones that isolate." (Marjorie Whitfield)

    1. Critical Thinking at the Heart of Specialized Private Investigation: An Analysis of Benoît Judde's Practices

      This briefing document analyzes the remarks of Benoît Judde, a specialized private detective, on the evolution of the detective profession in France, the legal framework governing cult-like abuses (dérives sectaires), and the use of critical thinking as a fundamental methodological tool for establishing evidence.

      Summary

      The private detective profession in France, now strictly regulated and supervised by the CNAPS under the authority of the Ministry of the Interior, has become a de facto auxiliary to the defense of private interests and to the justice system.

      Benoît Judde, who specializes in manipulation and cult-like abuses, shows that an investigator's effectiveness rests on a rigorous command of the legal framework and on the application of critical thinking.

      This approach, grounded in experimental cognitive and social psychology, makes it possible to turn subjective phenomena such as "psychological subjection" (sujétion psychologique) into objective, substantiated evidence that is admissible in court.

      The recent (2024) establishment of psychological subjection as a standalone offense reinforces the need for technical expertise capable of characterizing manipulative tactics without falling into confirmation bias.

      --------------------------------------------------------------------------------

      1. The Legal and Ethical Framework of the Profession

      The profession of private detective, officially termed "agent de recherche privée" (private research agent), is defined by the Code de la sécurité intérieure (CSI, the French Internal Security Code).

      Definition and Prerogatives

      Under article L621-1 of the CSI, the detective is an independent professional whose task is to gather information or intelligence for third parties, with a view to defending their interests.

      Investigative anonymity: This is the only para-legal profession whose members are authorized to investigate without disclosing their status, their real identity, or the purpose of their assignment. Unlike commissaires de justice (formerly huissiers, i.e. bailiffs), a detective may act under a fictitious identity.

      Admissibility of evidence: Detective reports must be "detailed, substantiated and precise" (« détaillés, circonstanciés et précis », DCP) to be admissible in court, according to Cour de cassation case law dating from 1962.

      Regulation and Training

      The profession has moved from a "freestyle" state to strict oversight:

      CNAPS oversight: The Conseil national des activités privées de sécurité (under the authority of the Ministry of the Interior) issues three separate authorizations (for the individual, for the legal entity, and the professional card), renewable every 5 years after a thorough background check.

      Mandatory training: A Bac+3 level (licence professionnelle, a three-year vocational bachelor's degree) is required. There are only four schools in France (two universities and two private schools), training about 120 new professionals per year.

      Professional ethics: Detectives are bound by professional secrecy and by a duty to advise. In particular, they must verify the legitimacy of a request to avoid serving revenge plans or malicious searches.

      --------------------------------------------------------------------------------

      2. Specialized Investigation of Cult-like Abuses

      Detectives' field of action is broad (locating people, counterfeiting, insurance fraud), but Benoît Judde specializes in mental manipulation.

      The MIVILUDES Criteria

      To establish a cult-like abuse objectively, the investigator relies on the framework of the Mission interministérielle de vigilance et de lutte contre les dérives sectaires (MIVILUDES), which identifies 10 main criteria.

      | Category of harm | Examples of sub-criteria |
      | --- | --- |
      | Harm to persons | Break with the original environment, loss of critical thinking, indoctrination of children, deprivation of sleep or food. |
      | Harm to property | Disproportionate financial demands, indebtedness, undeclared work (e.g. misuse of the WWOOFing concept). |
      | Social and democratic life | Antisocial discourse, disturbance of public order, diversion of economic channels. |

      Interdisciplinary Collaboration

      The investigator works in tandem with a psychologist (specialized in scientific, cognitive and social psychology) to confirm that the coercive control (emprise) is real.

      This collaboration provides a credible "psychological voice" that neither the lawyer nor the detective can offer alone, particularly to characterize the harm or the subjection before a judge.

      --------------------------------------------------------------------------------

      3. Recent Legislative Developments (2024 Law)

      The French legal framework has recently evolved to make cult-like abuses easier to prosecute, which makes the role of evidence both more complex and more crucial.

      Standalone offense of psychological subjection: Previously tied to abuse of weakness (abus de faiblesse), which required proving a prior state of weakness and a resulting harm, "placing a person in a state of psychological subjection" became a standalone offense in 2024.

      It is now sufficient to prove the use of pressure or manipulation techniques that impair judgment.

      Diverting someone from medical treatment: A new offense punishes inciting a person to abandon a therapeutic or prophylactic medical treatment (e.g. vaccination) in favor of pseudo-scientific practices.

      Fraud and cyber-malice: In the digital realm, 95% of scams rely on social engineering (human manipulation) rather than on purely technical vulnerabilities.

      --------------------------------------------------------------------------------

      4. Critical Thinking as an Investigative Methodology

      For Benoît Judde, critical thinking is not an intellectual posture but a working tool that helps avoid confirmation bias and ensures the objectivity of the report.

      The Three Pillars of Manipulation

      The investigator analyzes situations through three mechanisms identified by experimental psychology:

      1. Self-manipulation: Exploiting individuals' natural cognitive biases.

      2. Freely consented submission: Techniques such as the "foot in the door" (securing a small commitment in order to obtain a larger one) or the "door in the face" (asking for something excessive in order to obtain something reasonable).

      3. Submission to authority: A reference to the Milgram experiment. Manipulation succeeds if the authority is perceived as legitimate (e.g. wearing a white coat, the title "brother of Jesus", etc.).

      The Objectivity of Evidence

      Use of technology: Hidden cameras are used during infiltrations to provide raw, indisputable evidence, avoiding both the fallibility of human memory and accusations of bias.

      Necessity and proportionality: The investigator must show that the intrusion into private life (infiltration, surveillance) was strictly indispensable to establishing the truth and proportionate to what was at stake (right to evidence vs. right to privacy).

      --------------------------------------------------------------------------------

      5. Conclusion: Toward a Security Continuum

      The document stresses that the State cannot by itself monitor every risk, particularly in the complex areas of cult-like and therapeutic abuses.

      Public-private synergy: The private detective steps in where the police can no longer act (disappearances not deemed worrying, pre-criminal inquiries to strengthen a complaint).

      Auxiliary of justice: By contributing elements based on a scientific consensus (experimental psychology), the detective helps the judge base a decision on facts rather than on conflicting testimony.

      Complementarity: The aim is not an "Americanization" of the system but mutual validation, in which the private sector complements the State's sovereign action by providing specific technical and field expertise.

    1. Clinical Overview: Understanding and Supporting Co-occurring ASD and ADHD (AuDHD)

      Executive Summary

      This document offers an in-depth analysis of the co-occurrence of Autism Spectrum Disorder (ASD; TSA in French) and Attention-Deficit/Hyperactivity Disorder (ADHD; TDAH in French), a profile often referred to by the English-language term "AuDHD".

      Long ignored by official classifications (notably before the DSM-5 in 2013), this dual condition is now recognized as a clinical entity in its own right, not a mere sum of symptoms.

      The key points of this analysis include:

      High prevalence: More than 40% of individuals with ASD have associated ADHD.

      Clinical complexity: The combination of the two conditions brings greater symptom severity, major fatigue (autistic burnout) and complex sensory profiles.

      Specific care: The approach should be multidisciplinary, favoring psychoeducation and cautious pharmacology, while avoiding systematic recourse to antipsychotics.

      Paradigm shift: It is crucial to move from a symptom-centered view to one focused on overall functioning and the quality of the environment.

      --------------------------------------------------------------------------------

      1. Diagnosis and Prevalence

      1.1 Evolution of the Classifications

      Before 2013, the DSM-IV formally prohibited a dual diagnosis of ASD and ADHD; the DSM-5, published in 2013, lifted that exclusion. Yet clinical practice was already revealing patients with marked features of both conditions. Since the prohibition was lifted, the scientific literature and field experience have confirmed that the two frequently overlap.

      1.2 Co-occurrence Statistics

      Current data point to an asymmetry in comorbidity:

      ASD with ADHD: More than 40% of autistic people also meet the criteria for ADHD.

      ADHD with ASD: About 13% to 20% of people with ADHD show associated autistic traits.

      1.3 The Importance of Differential Diagnosis

      It is essential to identify the origin of symptoms to avoid wrongly stacking diagnoses. For example:

      • Social difficulties in ADHD are often linked to impulsivity or inattention, whereas in ASD they stem from social cognition.

      • Attention problems in ASD are often the consequence of sensory hypersensitivity or restricted interests rather than of an intrinsic ADHD mechanism.

      --------------------------------------------------------------------------------

      2. Clinical Manifestations and Functional Impact

      The association of the two conditions (AuDHD) creates a distinctive picture in which symptoms influence each other, increasing overall severity.

      | Domain of functioning | Impact of co-occurring ASD + ADHD |
      | --- | --- |
      | Executive functions | More marked difficulties (inhibition, flexibility, attention); a profile close to isolated ADHD but more severe. |
      | Social cognition | Greater social difficulties, less eye contact and little spontaneous improvement over time. |
      | Sensory processing | Cumulative hypersensitivities; a complex and particularly intense sensory profile. |
      | Mental health | Increased risk of depressive disorders, sleep disorders, major exhaustion and autistic burnout. |
      | Adaptation | Greater economic precariousness and major psychosocial difficulties. |

      2.1 "Disorder" vs. "Functioning"

      A crucial point of the analysis is the distinction between having a neurodivergent mode of functioning and having a disorder. A disorder exists only when there is a negative functional impact. That impact is closely tied to the quality of the environment (for example, a teacher's personality or how well a workstation is adapted).

      --------------------------------------------------------------------------------

      3. Therapeutic Strategies and Support

      3.1 Psychoeducation: The Central Pillar

      Psychoeducation must be "sextuple" (involving the child, the parents and the siblings). Its goals are to:

      • Give meaning to symptoms.

      • Put an end to misconceptions and prejudices (notably those of care providers).

      • Reduce self-stigma and guilt.

      • Limit "masking" (permanent over-adaptation), which is a major cause of exhaustion and burnout.

      3.2 Medication (Methylphenidate)

      Methylphenidate can be used but requires fine clinical expertise:

      Heightened sensitivity: Patients with ASD are often hypersensitive to substances (acute perception of bodily changes).

      Dosage: It is recommended to start with very low doses (e.g. 5 mg) and to increase very gradually.

      Vigilance: Watch for a possible increase in stereotypies or irritability.

      Criticism of current practice: The document denounces as a "heresy" the first-line use of antipsychotics (such as Haldol or Risperdal) in France, at the expense of methylphenidate.

      3.3 "Grandma's Therapy" and Body-Based Approaches

      Lifestyle and the body are fundamental levers:

      Lifestyle: Mediterranean diet, good-quality sleep and regulated screen exposure.

      Physical activity: Its effectiveness in regulating ADHD is strongly supported by the literature.

      Emotional regulation: Use of cardiac coherence tools (e.g. RespiRelax) to act on the autonomic nervous system.

      Alternative approaches: Music therapy and dance therapy are particularly effective because they work through frequencies and the body rather than through verbal language.

      --------------------------------------------------------------------------------

      4. Neurodiversity: Strengths and Evolutionary Perspectives

      It is essential not to reduce the individual to their symptoms but to recognize the strengths inherent in these profiles.

      Strengths of ADHD: Empathy, creativity (arising from the coping strategies developed), curiosity, enthusiasm, intuition and speed.

      Strengths of ASD: Precision, seriousness, honesty, punctuality and attention to detail.

      Evolutionary reading: The persistence of neurodevelopmental disorders (NDDs; TND in French) throughout human evolution suggests that they are socially useful: for example, ADHD for exploration and rapid problem-solving, and ASD for vigilance and technical expertise within a group.

      Toward inclusive environments

      The "Atypie-Friendly" project illustrates the needed shift toward a society (notably universities) able to adapt to these singular ways of functioning, rather than demanding systematic over-adaptation from the people concerned.

      --------------------------------------------------------------------------------

      Conclusion

      The ASD-ADHD profile (AuDHD) requires particular attention and closer coordination among professionals (psychomotor therapists, child psychiatrists, educators).

      The challenge is not only to treat symptoms but to meet the person's specific needs so as to foster their autonomy and quality of life, while valuing the strengths linked to their neurodivergence.

    1. THE AMERICAN YAWP: 6. A New Nation

      “The Federal Pillars,” from The Massachusetts Centinel, August 2, 1789. Library of Congress.

      I. Introduction

      On July 4, 1788, Philadelphians turned out for a “grand federal procession” in honor of the new national constitution. Workers in various trades and professions demonstrated. Blacksmiths carted around a working forge, on which they symbolically beat swords into farm tools. Potters proudly carried a sign paraphrasing from the Bible, “The potter hath power over his clay,” linking God’s power with an artisan’s work and a citizen’s control over the country. Christian clergymen meanwhile marched arm-in-arm with Jewish leaders. The grand procession represented what many Americans hoped the United States would become: a diverse but cohesive, prosperous nation.1

      Over the next few years, Americans would celebrate more of these patriotic holidays. In April 1789, for example, thousands gathered in New York to see George Washington take the presidential oath of office.
      That November, Washington called his fellow citizens to celebrate with a day of thanksgiving, particularly for “the peaceable and rational manner” in which the government had been established.2

      But the new nation was never as cohesive as its champions had hoped. Although the officials of the new federal government—and the people who supported it—placed great emphasis on unity and cooperation, the country was often anything but unified. The Constitution itself had been a controversial document adopted to strengthen the government so that it could withstand internal conflicts. Whatever the later celebrations, the new nation had looked to the future with uncertainty. Less than two years before the national celebrations of 1788 and 1789, the United States had faced the threat of collapse.

      II. Shays’s Rebellion

      Daniel Shays became a divisive figure, to some a violent rebel seeking to upend the new American government, to others an upholder of the true revolutionary virtues Shays and others fought for. This contemporary depiction of Shays and his accomplice Job Shattuck portrays them in the latter light as rising “illustrious from the Jail.” Unidentified artist, Daniel Shays and Job Shattuck, 1787. Wikimedia.

      In 1786 and 1787, a few years after the Revolution ended, thousands of farmers in western Massachusetts were struggling under a heavy burden of debt. Their problems were made worse by weak local and national economies. Many political leaders saw both the debt and the struggling economy as a consequence of the Articles of Confederation, which provided the federal government with no way to raise revenue and did little to create a cohesive nation out of the various states. The farmers wanted the Massachusetts government to protect them from their creditors, but the state supported the lenders instead. As creditors threatened to foreclose on their property, many of these farmers, including Revolutionary War veterans, took up arms.
      Led by a fellow veteran named Daniel Shays, these armed men, the “Shaysites,” resorted to tactics like the patriots had used before the Revolution, forming blockades around courthouses to keep judges from issuing foreclosure orders. These protesters saw their cause and their methods as an extension of the “Spirit of 1776”; they were protecting their rights and demanding redress for the people’s grievances.

      Governor James Bowdoin, however, saw the Shaysites as rebels who wanted to rule the government through mob violence. He called up thousands of militiamen to disperse them. A former Revolutionary general, Benjamin Lincoln, led the state force, insisting that Massachusetts must prevent “a state of anarchy, confusion and slavery.”3 In January 1787, Lincoln’s militia arrested more than one thousand Shaysites and reopened the courts.

      Daniel Shays and other leaders were indicted for treason, and several were sentenced to death, but eventually Shays and most of his followers received pardons. Their protest, which became known as Shays’s Rebellion, generated intense national debate. While some Americans, like Thomas Jefferson, thought “a little rebellion now and then” helped keep the country free, others feared the nation was sliding toward anarchy and complained that the states could not maintain control. For nationalists like James Madison of Virginia, Shays’s Rebellion was a prime example of why the country needed a strong central government. “Liberty,” Madison warned, “may be endangered by the abuses of liberty as well as the abuses of power.”4

      III. The Constitutional Convention

      The uprising in Massachusetts convinced leaders around the country to act. After years of goading by James Madison and other nationalists, delegates from twelve of the thirteen states met at the Pennsylvania state house in Philadelphia in the summer of 1787. Only Rhode Island declined to send a representative.
      The delegates arrived at the convention with instructions to revise the Articles of Confederation. The biggest problem the convention needed to solve was the federal government’s inability to levy taxes. That weakness meant that the burden of paying back debt from the Revolutionary War fell on the states. The states, in turn, found themselves beholden to the lenders who had bought up their war bonds. That was part of why Massachusetts had chosen to side with its wealthy bondholders over poor western farmers.5

      James Madison, however, had no intention of simply revising the Articles of Confederation. He intended to produce a completely new national constitution. In the preceding year, he had completed two extensive research projects—one on the history of government in the United States, the other on the history of republics around the world. He used this research as the basis for a proposal he brought with him to Philadelphia. It came to be called the Virginia Plan, named after Madison’s home state.6

      James Madison was a central figure in the reconfiguration of the national government. Madison’s Virginia Plan was a guiding document in the formation of a new government under the Constitution. John Vanderlyn, Portrait of James Madison, 1816. Wikimedia.

      The Virginia Plan was daring. Classical learning said that a republican form of government required a small and homogenous state: the Roman republic, or a small country like Denmark, for example. Citizens who were too far apart or too different could not govern themselves successfully. Conventional wisdom said the United States needed to have a very weak central government, which should simply represent the states on certain matters they had in common. Otherwise, power should stay at the state or local level. But Madison’s research had led him in a different direction. He believed it was possible to create “an extended republic” encompassing a diversity of people, climates, and customs.
      The Virginia Plan, therefore, proposed that the United States should have a strong federal government. It was to have three branches—legislative, executive, and judicial—with power to act on any issues of national concern. The legislature, or Congress, would have two houses, in which every state would be represented according to its population size or tax base. The national legislature would have veto power over state laws.7

      Other delegates to the convention generally agreed with Madison that the Articles of Confederation had failed. But they did not agree on what kind of government should replace them. In particular, they disagreed about the best method of representation in the new Congress. Representation was an important issue that influenced a host of other decisions, including deciding how the national executive branch should work, what specific powers the federal government should have, and even what to do about the divisive issue of slavery.

      For more than a decade, each state had enjoyed a single vote in the Continental Congress. William Paterson’s New Jersey Plan proposed to keep things that way. The Connecticut delegate Roger Sherman, furthermore, argued that members of Congress should be appointed by the state legislatures. Ordinary voters, Sherman said, lacked information, were “constantly liable to be misled” and “should have as little to do as may be” about most national decisions.8

      Large states, however, preferred the Virginia Plan, which would give their citizens far more power over the legislative branch. James Wilson of Pennsylvania argued that since the Virginia Plan would vastly increase the powers of the national government, representation should be drawn as directly as possible from the public. No government, he warned, “could long subsist without the confidence of the people.”9

      Ultimately, Roger Sherman suggested a compromise.
      Congress would have a lower house, the House of Representatives, in which members were assigned according to each state’s population, and an upper house, which became the Senate, in which each state would have one vote. This proposal, after months of debate, was adopted in a slightly altered form as the Great Compromise: each state would have two senators, who could vote independently. In addition to establishing both types of representation, this compromise also counted three-fifths of a state’s enslaved population for representation and tax purposes.

      The delegates took even longer to decide on the form of the national executive branch. Should executive power be in the hands of a committee or a single person? How should its officeholders be chosen? On June 1, James Wilson moved that the national executive power reside in a single person. Coming only four years after the American Revolution, that proposal was extremely contentious; it conjured up images of an elected monarchy.10 The delegates also worried about how to protect the executive branch from corruption or undue control. They endlessly debated these questions, and not until early September did they decide the president would be elected by a special electoral college.

      In the end, the Constitutional Convention proposed a government unlike any other, combining elements copied from ancient republics and English political tradition but making some limited democratic innovations—all while trying to maintain a delicate balance between national and state sovereignty. It was a complicated and highly controversial scheme.

      IV. Ratifying the Constitution

      Delegates to the Constitutional Convention assembled, argued, and finally agreed in this room, styled in the same manner as during the Convention. Photograph of the Assembly Room, Independence Hall, Philadelphia, Pennsylvania. Wikimedia. Creative Commons Attribution-Share Alike 3.0 Unported.
      The convention voted to send its proposed Constitution to Congress, which was then sitting in New York, with a cover letter from George Washington. The plan for adopting the new Constitution, however, required approval from special state ratification conventions, not just Congress.

      During the ratification process, critics of the Constitution organized to persuade voters in the different states to oppose it. Importantly, the Constitutional Convention had voted down a proposal from Virginia’s George Mason, the author of Virginia’s state Declaration of Rights, for a national bill of rights. This omission became a rallying point for opponents of the document. Many of these Anti-Federalists argued that without such a guarantee of specific rights, American citizens risked losing their personal liberty to the powerful federal government. The pro-ratification Federalists, on the other hand, argued that including a bill of rights was not only redundant but dangerous; it could limit future citizens from adding new rights.11

      Citizens debated the merits of the Constitution in newspaper articles, letters, sermons, and coffeehouse quarrels across America. Some of the most famous, and most important, arguments came from Alexander Hamilton, John Jay, and James Madison in the Federalist Papers, which were published in various New York newspapers in 1787 and 1788.12

      The first crucial vote came at the beginning of 1788 in Massachusetts. At first, the Anti-Federalists at the Massachusetts ratifying convention probably had the upper hand, but after weeks of debate, enough delegates changed their votes to narrowly approve the Constitution. But they also approved a number of proposed amendments, which were to be submitted to the first Congress. This pattern—ratifying the Constitution but attaching proposed amendments—was followed by other state conventions.
      The most high-profile convention was held in Richmond, Virginia, in June 1788, when Federalists like James Madison, Edmund Randolph, and John Marshall squared off against equally influential Anti-Federalists like Patrick Henry and George Mason. Virginia was America’s most populous state, it had produced some of the country’s highest-profile leaders, and the success of the new government rested upon its cooperation. After nearly a month of debate, Virginia voted 89 to 79 in favor of ratification.13

      On July 2, 1788, Congress announced that a majority of states had ratified the Constitution and that the document was now in effect. Yet this did not mean the debates were over. North Carolina, New York, and Rhode Island had not completed their ratification conventions, and Anti-Federalists still argued that the Constitution would lead to tyranny. The New York convention would ratify the Constitution by just three votes, and finally Rhode Island would ratify it by two votes—a full year after George Washington was inaugurated as president.

      V. Rights and Compromises

      Although debates continued, Washington’s election as president cemented the Constitution’s authority. By 1793, the term Anti-Federalist would be essentially meaningless. Yet the debates produced a piece of the Constitution that seems irreplaceable today. Ten amendments were added in 1791. Together, they constitute the Bill of Rights. James Madison, against his original wishes, supported these amendments as an act of political compromise and necessity. He had won election to the House of Representatives only by promising his Virginia constituents such a list of rights.

      There was much the Bill of Rights did not cover. Women found no special protections or guarantee of a voice in government. Many states continued to restrict voting only to men who owned significant amounts of property. And slavery not only continued to exist; it was condoned and protected by the Constitution.
      Of all the compromises that formed the Constitution, perhaps none would be more important than the compromise over the slave trade. Americans generally perceived the transatlantic slave trade as more violent and immoral than slavery itself. Many northerners opposed it on moral grounds. But they also understood that letting southern states import more Africans would increase their political power. The Constitution counted each enslaved individual as three fifths of a person for purposes of representation, so in districts with many enslaved people, the white voters had extra influence. On the other hand, the states of the Upper South also welcomed a ban on the Atlantic trade because they already had a surplus of enslaved laborers. Banning importation meant enslavers in Virginia and Maryland could get higher prices when they sold their enslaved laborers to states like South Carolina and Georgia that were dependent on a continued slave trade.

      New England and the Deep South agreed to what was called a “dirty compromise” at the Constitutional Convention in 1787. New Englanders agreed to include a constitutional provision that protected the foreign slave trade for twenty years; in exchange, South Carolina and Georgia delegates had agreed to support a constitutional clause that made it easier for Congress to pass commercial legislation. As a result, the Atlantic slave trade resumed until 1808 when it was outlawed for three reasons. First, Britain was also in the process of outlawing the slave trade in 1807, and the United States did not want to concede any moral high ground to its rival. Second, the Haitian Revolution (1791–1804), a successful slave revolt against French colonial rule in the West Indies, had changed the stakes in the debate. The image of thousands of armed Black revolutionaries terrified white Americans.
Third, the Haitian Revolution had ended France’s plans to expand its presence in the Americas, so in 1803, the United States had purchased the Louisiana Territory from the French at a fire-sale price. This massive new territory, which had doubled the size of the United States, had put the question of slavery’s expansion at the top of the national agenda. Many white Americans, including President Thomas Jefferson, thought that ending the external slave trade and dispersing the domestic slave population would keep the United States a white man’s republic and perhaps even lead to the disappearance of slavery. The ban on the slave trade, however, lacked effective enforcement measures and funding. Moreover, instead of freeing illegally imported Africans, the act left their fate to the individual states, and many of those states simply sold intercepted enslaved people at auction. Thus, the ban preserved the logic of property ownership in human beings. The new federal government protected slavery as much as it expanded democratic rights and privileges for white men.14   VI. Hamilton’s Financial System Alexander Hamilton saw America’s future as a metropolitan, commercial, industrial society, in contrast to Thomas Jefferson’s nation of small farmers. While both men had the ear of President Washington, Hamilton’s vision proved most appealing and enduring. John Trumbull, Portrait of Alexander Hamilton, 1806. Wikimedia. President George Washington’s cabinet choices reflected continuing political tensions over the size and power of the federal government. The vice president was John Adams, and Washington chose Alexander Hamilton to be his secretary of the treasury. Both men wanted an active government that would promote prosperity by supporting American industry. However, Washington chose Thomas Jefferson to be his secretary of state, and Jefferson was committed to restricting federal power and preserving an economy based on agriculture. 
Almost from the beginning, Washington struggled to reconcile the Federalist and Republican (or Democratic-Republican) factions within his own administration.15 Alexander Hamilton believed that self-interest was the “most powerful incentive of human actions.” Self-interest drove humans to accumulate property, and that effort created commerce and industry. According to Hamilton, government had important roles to play in this process. First, the state should protect private property from theft. Second, according to Hamilton, the state should use human “passions” and “make them subservient to the public good.”16 In other words, a wise government would harness its citizens’ desire for property so that both private individuals and the state would benefit. Hamilton, like many of his contemporary statesmen, did not believe the state should ensure an equal distribution of property. Inequality was understood as “the great & fundamental distinction in Society,” and Hamilton saw no reason why this should change. Instead, Hamilton wanted to tie the economic interests of wealthy Americans, or “monied men,” to the federal government’s financial health. If the rich needed the government, then they would direct their energies to making sure it remained solvent.17 Hamilton, therefore, believed that the federal government must be “a Repository of the Rights of the wealthy.”18 As the nation’s first secretary of the treasury, he proposed an ambitious financial plan to achieve just that. The first part of Hamilton’s plan involved federal “assumption” of state debts, which were mostly left over from the Revolutionary War. The federal government would assume responsibility for the states’ unpaid debts, which totaled about $25 million. Second, Hamilton wanted Congress to create a bank—a Bank of the United States. The goal of these proposals was to link federal power and the country’s economic vitality. 
Under the assumption proposal, the states’ creditors (people who owned state bonds or promissory notes) would turn their old notes in to the treasury and receive new federal notes of the same face value. Hamilton foresaw that these bonds would circulate like money, acting as “an engine of business, and instrument of industry and commerce.”19 This part of his plan, however, was controversial for two reasons. First, many taxpayers objected to paying the full face value on old notes, which had fallen in market value. Often the current holders had purchased them from the original creditors for pennies on the dollar. To pay them at full face value, therefore, would mean rewarding speculators at taxpayer expense. Hamilton countered that government debts must be honored in full, or else citizens would lose all trust in the government. Second, many southerners objected that they had already paid their outstanding state debts, so federal assumption would mean forcing them to pay again for the debts of New Englanders. Nevertheless, President Washington and Congress both accepted Hamilton’s argument. By the end of 1794, 98 percent of the country’s domestic debt had been converted into new federal bonds.20 Hamilton’s plan for a Bank of the United States, similarly, won congressional approval despite strong opposition. Thomas Jefferson and other Republicans argued that the plan was unconstitutional; the Constitution did not authorize Congress to create a bank. Hamilton, however, argued that the bank was not only constitutional but also important for the country’s prosperity. The Bank of the United States would fulfill several needs. It would act as a convenient depository for federal funds. It would print paper banknotes backed by specie (gold or silver). Its agents would also help control inflation by periodically taking state bank notes to their banks of origin and demanding specie in exchange, limiting the amount of notes the state banks printed. 
Furthermore, it would give wealthy people a vested interest in the federal government’s finances. The government would control just 20 percent of the bank’s stock; the other 80 percent would be owned by private investors. Thus, an “intimate connexion” between the government and wealthy men would benefit both, and this connection would promote American commerce. In 1791, therefore, Congress approved a twenty-year charter for the Bank of the United States. The bank’s stocks, together with federal bonds, created over $70 million in new financial instruments. These spurred the formation of securities markets, which allowed the federal government to borrow more money and underwrote the rapid spread of state-chartered banks and other private business corporations in the 1790s. For Federalists, this was one of the major purposes of the federal government. For opponents who wanted a more limited role for industry, however, or who lived on the frontier and lacked access to capital, Hamilton’s system seemed to reinforce class boundaries and give the rich inordinate power over the federal government. Hamilton’s plan, furthermore, had another highly controversial element. In order to pay what it owed on the new bonds, the federal government needed reliable sources of tax revenue. In 1791, Hamilton proposed a federal excise tax on the production, sale, and consumption of a number of goods, including whiskey.

VII. The Whiskey Rebellion and Jay’s Treaty

Grain was the most valuable cash crop for many American farmers. In the West, selling grain to a local distillery for alcohol production was typically more profitable than shipping it over the Appalachians to eastern markets. Hamilton’s whiskey tax thus placed a special burden on western farmers. It seemed to divide the young republic in half: geographically between the East and West, economically between merchants and farmers, and culturally between cities and the countryside.
In the fall of 1791, sixteen men in western Pennsylvania, disguised in women’s clothes, assaulted a tax collector named Robert Johnson. They tarred and feathered him, and the local deputy marshals seeking justice met similar fates. They were robbed and beaten, whipped and flogged, tarred and feathered, and tied up and left for dead. The rebel farmers also adopted other protest methods from the Revolution and Shays’s Rebellion, writing local petitions and erecting liberty poles. For the next two years, tax collections in the region dwindled. Then, in July 1794, groups of armed farmers attacked federal marshals and tax collectors, burning down at least two tax collectors’ homes. At the end of the month, an armed force of about seven thousand, led by the radical attorney David Bradford, robbed the U.S. mail and gathered about eight miles east of Pittsburgh. President Washington responded quickly. First, Washington dispatched a committee of three distinguished Pennsylvanians to meet with the rebels and try to bring about a peaceful resolution. Meanwhile, he gathered an army of thirteen thousand militiamen in Carlisle, Pennsylvania. On September 19, Washington became the only sitting president to lead troops in the field, though he quickly turned over the army to the command of Henry Lee, a Revolutionary hero and the current governor of Virginia. As the federal army moved westward, the farmers scattered. Hoping to make a dramatic display of federal authority, Alexander Hamilton oversaw the arrest and trial of a number of rebels. Many were released because of a lack of evidence, and most of those who remained, including two men sentenced to death for treason, were soon pardoned by the president. The Whiskey Rebellion had shown that the federal government was capable of quelling internal unrest. But it also demonstrated that some citizens, especially poor westerners, viewed it as their enemy.21 Around the same time, another national issue also aroused fierce protest. 
Along with his vision of a strong financial system, Hamilton also had a vision of a nation busily engaged in foreign trade. In his mind, that meant pursuing a friendly relationship with one nation in particular: Great Britain. America’s relationship with Britain since the end of the Revolution had been tense, partly because of warfare between the British and French. Their naval war threatened American shipping, and the impressment of men into Britain’s navy terrorized American sailors. American trade could be risky and expensive, and impressment threatened seafaring families. Nevertheless, President Washington was conscious of American weakness and was determined not to take sides. In April 1793, he officially declared that the United States would remain neutral.22 With his blessing, Hamilton’s political ally John Jay, who was currently serving as chief justice of the Supreme Court, sailed to London to negotiate a treaty that would satisfy both Britain and the United States. Jefferson and Madison strongly opposed these negotiations. They mistrusted Britain and saw the treaty as the American state favoring Britain over France. The French had recently overthrown their own monarchy, and Republicans thought the United States should be glad to have the friendship of a new revolutionary state. They also suspected that a treaty with Britain would favor northern merchants and manufacturers over the agricultural South. In November 1794, despite their misgivings, John Jay signed a “treaty of amity, commerce, and navigation” with the British. Jay’s Treaty, as it was commonly called, required Britain to abandon its military positions in the Northwest Territory (especially Fort Detroit, Fort Mackinac, and Fort Niagara) by 1796. Britain also agreed to compensate American merchants for their losses. The United States, in return, agreed to treat Britain as its most prized trade partner, which meant tacitly supporting Britain in its current conflict with France. 
Unfortunately, Jay had failed to secure an end to impressment.23 For Federalists, this treaty was a significant accomplishment. Jay’s Treaty gave the United States, a relatively weak power, the ability to stay officially neutral in European wars, and it preserved American prosperity by protecting trade. For Jefferson’s Republicans, however, the treaty was proof of Federalist treachery. The Federalists had sided with a monarchy against a republic, and they had submitted to British influence in American affairs without even ending impressment. In Congress, debate over the treaty transformed the Federalists and Republicans from temporary factions into two distinct (though still loosely organized) political parties.   VIII. The French Revolution and the Limits of Liberty The mounting body count of the French Revolution included that of the queen and king, who were beheaded in a public ceremony in early 1793, as depicted in the engraving. While Americans disdained the concept of monarchy, the execution of King Louis XVI was regarded by many Americans as an abomination, an indication of the chaos and savagery reigning in France at the time. Charles Monnet (artist), Antoine-Jean Duclos and Isidore-Stanislas Helman (engravers), Day of 21 January 1793 the death of Louis Capet on the Place de la Révolution, 1794. Wikimedia. In part, the Federalists were turning toward Britain because they feared the most radical forms of democratic thought. In the wake of Shays’s Rebellion, the Whiskey Rebellion, and other internal protests, Federalists sought to preserve social stability. The course of the French Revolution seemed to justify their concerns. In 1789, news had arrived in America that the French had revolted against their king. Most Americans imagined that liberty was spreading from America to Europe, carried there by the returning French heroes who had taken part in the American Revolution. Initially, nearly all Americans had praised the French Revolution. 
Towns all over the country hosted speeches and parades on July 14 to commemorate the day it began. Women had worn neoclassical dress to honor republican principles, and men had pinned revolutionary cockades to their hats. John Randolph, a Virginia planter, named two of his favorite horses Jacobin and Sans-Culotte after French revolutionary factions.24 In April 1793, a new French ambassador, “Citizen” Edmond-Charles Genêt, arrived in the United States. During his tour of several cities, Americans greeted him with wild enthusiasm. Citizen Genêt encouraged Americans to act against Spain, a British ally, by attacking its colonies of Florida and Louisiana. When President Washington refused, Genêt threatened to appeal to the American people directly. In response, Washington demanded that France recall its diplomat. In the meantime, however, Genêt’s faction had fallen from power in France. Knowing that a return home might cost him his head, he decided to remain in America. Genêt’s intuition was correct. A radical coalition of revolutionaries had seized power in France. They initiated a bloody purge of their enemies, the Reign of Terror. As Americans learned about Genêt’s impropriety and the mounting body count in France, many began to have second thoughts about the French Revolution. Americans who feared that the French Revolution was spiraling out of control tended to become Federalists. Those who remained hopeful about the revolution tended to become Republicans. Not deterred by the violence, Thomas Jefferson declared that he would rather see “half the earth desolated” than see the French Revolution fail. “Were there but an Adam and an Eve left in every country, and left free,” he wrote, “it would be better than as it now is.”25 Meanwhile, the Federalists sought closer ties with Britain. Despite the political rancor, in late 1796 there came one sign of hope: the United States peacefully elected a new president. 
For now, as Washington stepped down and executive power changed hands, the country did not descend into the anarchy that many leaders feared. The new president was John Adams, Washington’s vice president. Adams was less beloved than the old general, and he governed a deeply divided nation. The foreign crisis also presented him with a major test. In response to Jay’s Treaty, the French government authorized its vessels to attack American shipping. To resolve this, President Adams sent envoys to France in 1797. The French insulted these diplomats. Some officials, whom the Americans code-named X, Y, and Z in their correspondence, hinted that negotiations could begin only after the Americans offered a bribe. When the story became public, this XYZ Affair infuriated American citizens. Dozens of towns wrote addresses to President Adams, pledging him their support against France. Many people seemed eager for war. “Millions for defense,” toasted South Carolina representative Robert Goodloe Harper, “but not one cent for tribute.”26 By 1798, the people of Charleston watched the ocean’s horizon apprehensively because they feared the arrival of the French navy at any moment. Many people now worried that the same ships that had aided Americans during the Revolutionary War might discharge an invasion force on their shores. Some southerners were sure that this force would consist of Black troops from France’s Caribbean colonies, who would attack the southern states and cause their enslaved laborers to revolt. Many Americans also worried that France had covert agents in the country. In the streets of Charleston, armed bands of young men searched for French disorganizers. Even the little children prepared for the looming conflict by fighting with sticks.27 Meanwhile, during the crisis, New Englanders were some of the most outspoken opponents of France. In 1798, they found a new reason for Francophobia. 
An influential Massachusetts minister, Jedidiah Morse, announced to his congregation that the French Revolution had been hatched in a conspiracy led by a mysterious anti-Christian organization called the Illuminati. The story was a hoax, but rumors of Illuminati infiltration spread throughout New England like wildfire, adding a new dimension to the foreign threat.28 Against this backdrop of fear, the French Quasi-War, as it would come to be known, was fought on the Atlantic, mostly between French naval vessels and American merchant ships. During this crisis, however, anxiety about foreign agents ran high, and members of Congress took action to prevent internal subversion. The most controversial of these steps were the Alien and Sedition Acts. These two laws, passed in 1798, were intended to prevent French agents and sympathizers from compromising America’s resistance, but they also attacked Americans who criticized the president and the Federalist Party. The Alien Act allowed the federal government to deport foreign nationals, or “aliens,” who seemed to pose a national security threat. Even more dramatically, the Sedition Act allowed the government to prosecute anyone found to be speaking or publishing “false, scandalous, and malicious writing” against the government.29 These laws were not simply brought on by war hysteria. They reflected common assumptions about the nature of the American Revolution and the limits of liberty. In fact, most of the advocates for the Constitution and the First Amendment accepted that free speech simply meant a lack of prior censorship or restraint, not a guarantee against punishment. According to this logic, “licentious” or unruly speech made society less free, not more. James Wilson, one of the principal architects of the Constitution, argued that “every author is responsible when he attacks the security or welfare of the government.”30 In 1798, most Federalists were inclined to agree. 
Under the terms of the Sedition Act, they indicted and prosecuted several Republican printers—and even a Republican congressman who had criticized President Adams. Meanwhile, although the Adams administration never enforced the Alien Act, its passage was enough to convince some foreign nationals to leave the country. For the president and most other Federalists, the Alien and Sedition Acts represented a continuation of a conservative rather than radical American Revolution. However, the Alien and Sedition Acts caused a backlash in two ways. First, shocked opponents articulated a new and expansive vision for liberty. The New York lawyer Tunis Wortman, for example, demanded an “absolute independence” of the press.31 Likewise, the Virginia judge George Hay called for “any publication whatever criminal” to be exempt from legal punishment.32 Many Americans began to argue that free speech meant the ability to say virtually anything without fear of prosecution. Second, James Madison and Thomas Jefferson helped organize opposition from state governments. Ironically, both of them had expressed support for the principle behind the Sedition Act in previous years. Jefferson, for example, had written to Madison in 1789 that the nation should punish citizens for speaking “false facts” that injured the country.33 Nevertheless, both men now opposed the Alien and Sedition Acts on constitutional grounds. In 1798, Jefferson made this point in a resolution adopted by the Kentucky state legislature. A short time later, the Virginia legislature adopted a similar document written by Madison. The Kentucky and Virginia Resolutions argued that the national government’s authority was limited to the powers expressly granted by the U.S. Constitution. More importantly, they asserted that the states could declare federal laws unconstitutional. For the time being, these resolutions were simply gestures of defiance. Their bold claim, however, would have important effects in later decades. 
In just a few years, many Americans’ feelings toward France had changed dramatically. Far from rejoicing in the “light of freedom,” many Americans now feared the “contagion” of French-style liberty. Debates over the French Revolution in the 1790s gave Americans some of their earliest opportunities to articulate what it meant to be American. Did American national character rest on a radical and universal vision of human liberty? Or was America supposed to be essentially pious and traditional, an outgrowth of Great Britain? They couldn’t agree. It was on this cracked foundation that many conflicts of the nineteenth century would rest.   IX. Religious Freedom One reason the debates over the French Revolution became so heated was that Americans were unsure about their own religious future. The Illuminati scare of 1798 was just one manifestation of this fear. Across the United States, a slow but profound shift in attitudes toward religion and government began. In 1776, none of the American state governments observed the separation of church and state. On the contrary, all thirteen states either had established, official, and tax-supported state churches, or at least required their officeholders to profess a certain faith. Most officials believed this was necessary to protect morality and social order. Over the next six decades, however, that changed. In 1833, the final state, Massachusetts, stopped supporting an official religious denomination. Historians call that gradual process disestablishment. In many states, the process of disestablishment had started before the creation of the Constitution. South Carolina, for example, had been nominally Anglican before the Revolution, but it had dropped denominational restrictions in its 1778 constitution. Instead, it now allowed any church consisting of at least fifteen adult males to become “incorporated,” or recognized for tax purposes as a state-supported church. 
Churches needed only to agree to a set of basic Christian theological tenets, which were vague enough that most denominations could support them.34 South Carolina tried to balance religious freedom with the religious practice that was supposed to be necessary for social order. Officeholders were still expected to be Christians; their oaths were witnessed by God, they were compelled by their religious beliefs to tell the truth, and they were called to live according to the Bible. This list of minimal requirements came to define acceptable Christianity in many states. As new Christian denominations proliferated between 1780 and 1840, however, more and more Christians fell outside this definition. South Carolina continued its general establishment law until 1790, when a constitutional revision removed the establishment clause and religious restrictions on officeholders. Many other states, though, continued to support an established church well into the nineteenth century. The federal Constitution did not prevent this. The religious freedom clause in the Bill of Rights, during these decades, limited the federal government but not state governments. It was not until 1833 that a state supreme court decision ended Massachusetts’s support for the Congregational Church. Many political leaders, including Thomas Jefferson and James Madison, favored disestablishment because they saw the relationship between church and state as a tool of oppression. Jefferson proposed a Statute for Religious Freedom in the Virginia state assembly in 1779, but his bill failed in the overwhelmingly Anglican legislature. Madison proposed it again in 1785, and it defeated a rival bill that would have given equal revenue to all Protestant churches. Instead Virginia would not use public money to support religion. 
“The Religion then of every man,” Jefferson wrote, “must be left to the conviction and conscience of every man; and it is the right of every man to exercise it as these may dictate.”35 At the federal level, the delegates to the Constitutional Convention of 1787 easily agreed that the national government should not have an official religion. This principle was upheld in 1791 when the First Amendment was ratified, with its guarantee of religious liberty. The limits of federal disestablishment, however, required discussion. The federal government, for example, supported Native American missionaries and congressional chaplains. Well into the nineteenth century, debate raged over whether the postal service should operate on Sundays, and whether non-Christians could act as witnesses in federal courts. Americans continued to struggle to understand what it meant for Congress not to “establish” a religion.   X. The Election of 1800 The year 1800 brought about a host of changes in government, in particular the first successful and peaceful transfer of power from one political party to another. But the year was important for another reason: the U.S. Capitol in Washington, D.C. (pictured here in 1800) was finally opened to be occupied by Congress, the Supreme Court, the Library of Congress, and the courts of the District of Columbia. William Russell Birch, A view of the Capitol of Washington before it was burnt down by the British, c. 1800. Wikimedia. Meanwhile, the Sedition and Alien Acts expired in 1800 and 1801. They had been relatively ineffective at suppressing dissent. On the contrary, they were much more important for the loud reactions they had inspired. They had helped many Americans decide what they didn’t want from their national government. By 1800, therefore, President Adams had lost the confidence of many Americans. They had let him know it. In 1798, for instance, he had issued a national thanksgiving proclamation. 
Instead of enjoying a day of celebration and thankfulness, Adams and his family had been forced by rioters to flee the capital city of Philadelphia until the day was over. Conversely, his prickly independence had also put him at odds with Alexander Hamilton, the leader of his own party, who offered him little support. After four years in office, Adams found himself widely reviled. In the election of 1800, therefore, the Republicans defeated Adams in a bitter and complicated presidential race. During the election, one Federalist newspaper article predicted that a Republican victory would fill America with “murder, robbery, rape, adultery, and incest.”36 A Republican newspaper, on the other hand, flung sexual slurs against President Adams, saying he had “neither the force and firmness of a man, nor the gentleness and sensibility of a woman.” Both sides predicted disaster and possibly war if the other should win.37 In the end, the contest came down to a tie between two Republicans, Thomas Jefferson of Virginia and Aaron Burr of New York, who each had seventy-three electoral votes. (Adams had sixty-five.) Burr was supposed to be a candidate for vice president, not president, but under the Constitution’s original rules, a tie-breaking vote had to take place in the House of Representatives. It was controlled by Federalists bitter at Jefferson. House members voted dozens of times without breaking the tie. On the thirty-sixth ballot, Thomas Jefferson emerged victorious. Republicans believed they had saved the United States from grave danger. An assembly of Republicans in New York City called the election a “bloodless revolution.” They thought of their victory as a revolution in part because the Constitution (and eighteenth-century political theory) made no provision for political parties. The Republicans thought they were fighting to rescue the country from an aristocratic takeover, not just taking part in a normal constitutional process. 
This image attacks Jefferson’s support of the French Revolution and religious freedom. The letter, “To Mazzei,” refers to a 1796 correspondence that criticized the Federalists and, by association, President Washington. Providential Detection, 1797. Courtesy American Antiquarian Society. Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

In his first inaugural address, however, Thomas Jefferson offered an olive branch to the Federalists. He pledged to follow the will of the American majority, who he believed were Republicans, but to respect the rights of the Federalist minority. His election set an important precedent. Adams accepted his electoral defeat and left the White House peacefully. “The revolution of 1800,” Jefferson wrote years later, did for American principles what the Revolution of 1776 had done for its structure. But this time, the revolution was accomplished not “by the sword” but “by the rational and peaceable instrument of reform, the suffrage of the people.”38 Four years later, when the Twelfth Amendment changed the rules for presidential elections to prevent future deadlocks, it was designed to accommodate the way political parties worked. Despite Adams’s and Jefferson’s attempts to tame party politics, though, the tension between federal power and the liberties of states and individuals would exist long into the nineteenth century. And while Jefferson’s administration attempted to decrease federal influence, Chief Justice John Marshall, an Adams appointee, worked to increase the authority of the Supreme Court. These competing agendas clashed most famously in the 1803 case of Marbury v. Madison, which Marshall used to establish a major precedent. The Marbury case seemed insignificant at first. The night before leaving office in early 1801, Adams had appointed several men to serve as justices of the peace in Washington, D.C.
By making these “midnight appointments,” Adams had sought to put Federalists into vacant positions at the last minute. On taking office, however, Jefferson and his secretary of state, James Madison, had refused to deliver the federal commissions to the men Adams had appointed. Several of the appointees, including William Marbury, sued the government, and the case was argued before the Supreme Court. Marshall used Marbury’s case to make a clever ruling. On the issue of the commissions, the Supreme Court ruled in favor of the Jefferson administration. But Chief Justice Marshall went further in his decision, ruling that the Supreme Court reserved the right to decide whether an act of Congress violated the Constitution. In other words, the court assumed the power of judicial review. This was a major (and lasting) blow to the Republican agenda, especially after 1810, when the Supreme Court extended judicial review to state laws. Jefferson was particularly frustrated by the decision, arguing that the power of judicial review “would make the Judiciary a despotic branch.”39   XI. Conclusion A grand debate over political power engulfed the young United States. The Constitution ensured that there would be a strong federal government capable of taxing, waging war, and making law, but it could never resolve the young nation’s many conflicting constituencies. The Whiskey Rebellion proved that the nation could stifle internal dissent but exposed a new threat to liberty. Hamilton’s banking system provided the nation with credit but also constrained frontier farmers. The Constitution’s guarantee of religious liberty conflicted with many popular prerogatives. Dissension only deepened, and as the 1790s progressed, Americans became bitterly divided over political parties and foreign war. During the ratification debates, Alexander Hamilton had written of the wonders of the Constitution. 
“A nation, without a national government,” he wrote, would be “an awful spectacle.” But, he added, “the establishment of a Constitution, in time of profound peace, by the voluntary consent of a whole people, is a prodigy,” a miracle that should be witnessed “with trembling anxiety.”40 Anti-Federalists had grave concerns about the Constitution, but even they could celebrate the idea of national unity. By 1795, even the staunchest critics would have grudgingly agreed with Hamilton’s convictions about the Constitution. Yet these same individuals could also take the cautions in Washington’s 1796 farewell address to heart. “There is an opinion,” Washington wrote, “that parties in free countries are useful checks upon the administration of the government and serve to keep alive the spirit of liberty.” This, he conceded, was probably true, but in a republic, he said, the danger was not too little partisanship, but too much. “A fire not to be quenched,” Washington warned, “it demands a uniform vigilance to prevent its bursting into a flame, lest, instead of warming, it should consume.”41 For every parade, thanksgiving proclamation, or grand procession honoring the unity of the nation, there was also some political controversy reminding American citizens of how fragile their union was. And as party differences and regional quarrels tested the federal government, the new nation increasingly explored the limits of its democracy.   XII. Primary Sources 1. Hector St. Jean de Crèvecœur describes the American people, 1782 Hector St. John de Crèvecœur was born in France, but relocated to the colony of New York and married a local woman named Mehitable Tippet. For a period of several years, de Crèvecœur wrote about the people he encountered in North America. The resulting work was widely successful in Europe. In this passage, Crèvecœur attempts to reflect on the difference between life in Europe and life in North America. 2. 
A Confederation of Native peoples seek peace with the United States, 1786 In 1786, half a year before the Constitutional Convention, a collection of Native American leaders gathered on the banks of the Detroit River to offer a unified message to the Congress of the United States. Despite this proposal, American surveyors, settlers, and others continued to cross the Ohio River. 3. Mary Smith Cranch comments on politics, 1786-87 In the aftermath of the Revolution, politics became a sport consumed by both men and women. In a series of letters sent to her sister, Mary Smith Cranch comments on a series of political events including the lack of support for diplomats, the circulation of paper or hard currency, legal reform, tariffs against imported tea tables, Shays’s rebellion, and the role of women in supporting the nation’s interests. 4. James Madison, Memorial and Remonstrance Against Religious Assessments, 1785 Before the American Revolution, Virginia supported local Anglican churches through taxes. After the American Revolution, Virginia had to decide what to do with this policy. Some founding fathers, including Patrick Henry, wanted to equally distribute tax dollars to all churches. In this document, James Madison explains why he did not want any government money to support religious causes in Virginia. 5. George Washington, “Farewell Address,” 1796 George Washington used his final public address as president to warn against what he understood as the two greatest dangers to American prosperity: political parties and foreign wars. Washington urged the American people to avoid political partisanship and entanglements with European wars.  6. Venture Smith, A Narrative of the Life and Adventures of Venture Smith, 1798 Venture Smith’s autobiography is one of the earliest slave narratives to circulate in the Atlantic World. Slave narratives grew into the most important genre of antislavery literature and bore testimony to the injustices of the slave system. 
Smith was unusually lucky in that he was able to purchase his freedom, but his story nonetheless reveals the hardships faced by even the most fortunate enslaved men and women. 7. Susannah Rowson, Charlotte Temple, 1794 In Charlotte Temple, the first novel written in America, Susannah Rowson offered a cautionary tale of a woman deceived and then abandoned by a roguish man. Americans throughout the new nation read the book with rapt attention and many even traveled to New York City to visit the supposed grave of this fictional character. 8. Constitutional ratification cartoon, 1789 The Massachusetts Centinel ran a series of cartoons depicting the ratification of the Constitution. Each vertical pillar represents a state that has ratified the new government. In this cartoon, North Carolina’s pillar is being guided into place (it would vote for ratification in November 1789). Rhode Island’s pillar, however, is crumbling and shows the uncertainty of the vote there. 9. Anti-Thomas Jefferson Cartoon, 1797 This image attacks Jefferson’s support of the French Revolution and religious freedom. The Altar to “Gallic Despotism” mocks Jefferson’s allegiance to the French. The letter, “To Mazzei,” refers to a 1796 correspondence that criticized the Federalists and, by association, President Washington.    XIII. Reference Material This chapter was edited by Tara Strauch, with content contributions by Marco Basile, Nathaniel C. Green, Brenden Kennedy, Spencer McBride, Andrea Nero, Cara Rogers, Tara Strauch, Michael Harrison Taylor, Jordan Taylor, Kevin Wisniewski, and Ben Wright. Recommended citation: Marco Basile et al., “A New Nation,” Tara Strauch, ed., in The American Yawp, eds. Joseph Locke and Ben Wright (Stanford, CA: Stanford University Press, 2018).   Recommended Reading Allgor, Catherine. Parlor Politics: In Which the Ladies of Washington Help Build a City and a Government. Charlottesville: University of Virginia Press, 2000. Appleby, Joyce. Inheriting the Revolution: The First Generation of Americans. Cambridge, MA: Belknap Press, 2001. Bartolini-Tuazon, Kathleen. For Fear of an Elective King: George Washington and the Presidential Title Controversy of 1789. Ithaca, NY: Cornell University Press, 2014. Beeman, Richard, Stephen Botein, and Edward C. Carter II, eds. Beyond Confederation: Origins of the Constitution and American National Identity. 
Chapel Hill: University of North Carolina Press, 1987. Bilder, Mary Sarah. Madison’s Hand: Revising the Constitutional Convention. Cambridge, MA: Harvard University Press, 2015. Bouton, Terry. “A Road Closed: Rural Insurgency in Post-Independence Pennsylvania.” Journal of American History 87, no. 3 (December 2000): 855–887. Cunningham, Noble E. The Jeffersonian Republicans: The Formation of Party Organization, 1789–1801. Chapel Hill: University of North Carolina Press, 1967. Dunn, Susan. Jefferson’s Second Revolution: The Election of 1800 and the Triumph of Republicanism. Boston: Houghton Mifflin, 2004. Edling, Max. A Revolution in Favor of Government: Origins of the U.S. Constitution and the Making of the American State. New York: Oxford University Press, 2003. Gordon-Reed, Annette. The Hemingses of Monticello: An American Family. New York: Norton, 2008. Halperin, Terri Diane. The Alien and Sedition Acts of 1798: Testing the Constitution. Baltimore: Johns Hopkins University Press, 2016. Holton, Woody. Unruly Americans and the Origins of the Constitution. New York: Hill and Wang, 2007. Kierner, Cynthia A. Martha Jefferson Randolph, Daughter of Monticello: Her Life and Times. Chapel Hill: University of North Carolina Press, 2012. Maier, Pauline. Ratification: The People Debate the Constitution, 1787–1788. New York: Simon and Schuster, 2010. Papenfuse, Eric Robert. “Unleashing the ‘Wildness’: The Mobilization of Grassroots Antifederalism in Maryland.” Journal of the Early Republic 16, no. 1 (Spring 1996): 73–106. Pasley, Jeffrey L. The First Presidential Contest: 1796 and the Founding of American Democracy. Lawrence: University of Kansas Press, 2013. Rakove, Jack N. Original Meanings: Politics and Ideas in the Making of the Constitution. New York: Vintage Books, 1996. Salmon, Marylynn. Women and the Law of Property in Early America. Chapel Hill: University of North Carolina Press, 1989. Sharp, James Roger. 
American Politics in the Early Republic: The New Nation in Crisis. New Haven, CT: Yale University Press, 1993. Slaughter, Thomas P. The Whiskey Rebellion: Frontier Epilogue to the American Revolution. New York: Oxford University Press, 1986. Smith-Rosenberg, Carroll. “Dis-Covering the Subject of the ‘Great Constitutional Discussion,’ 1786–1789.” Journal of American History 79, no. 3 (December 1992): 841–873. Taylor, Alan. William Cooper’s Town: Power and Persuasion on the Frontier of the Early American Republic. New York: Vintage, 1996. Waldstreicher, David. In the Midst of Perpetual Fetes: The Making of American Nationalism, 1776–1820. Chapel Hill: University of North Carolina Press, 1997. Wood, Gordon. Empire of Liberty: A History of the Early Republic, 1789–1815. Oxford, UK: Oxford University Press, 2011. Zagarri, Rosemarie. Revolutionary Backlash: Women and Politics in the Early American Republic. Philadelphia: University of Pennsylvania Press, 2007.

Notes
1. Francis Hopkinson, An Account of the Grand Federal Procession, Philadelphia, July 4, 1788 (Philadelphia: Carey, 1788).
2. George Washington, Thanksgiving Proclamation, October 3, 1789; Fed. Reg., Presidential Proclamations, 1791–1991.
3. Hampshire Gazette (CT), September 13, 1786.
4. James Madison, The Federalist Papers (New York: Signet Classics, 2003), no. 63.
5. Woody Holton, Unruly Americans and the Origins of the Constitution (New York: Hill and Wang, 2007), 8–9.
6. Madison took an active role during the convention. He also did more than anyone else to shape historians’ understandings of the convention by taking meticulous notes. Many of the quotes included here come from Madison’s notes. To learn more about this important document, read Mary Sarah Bilder, Madison’s Hand: Revising the Constitutional Convention (Cambridge, MA: Harvard University Press, 2015).
7. Virginia (Randolph) Plan as Amended (National Archives Microfilm Publication M866, 1 roll); The Official Records of the Constitutional Convention; Records of the Continental and Confederation Congresses and the Constitutional Convention, 1774–1789, Record Group 360; National Archives.
8. Richard Beeman, Plain, Honest Men: The Making of the American Constitution (New York: Random House, 2009), 114.
9. Herbert J. Storing, What the Anti-Federalists Were For: The Political Thought of the Opponents of the Constitution (Chicago: University of Chicago Press, 1981), 16.
10. Ray Raphael, Mr. President: How and Why the Founders Created a Chief Executive (New York: Knopf, 2012), 50. See also Kathleen Bartoloni-Tuazon, For Fear of an Elective King: George Washington and the Presidential Title Controversy of 1789 (Ithaca, NY: Cornell University Press, 2014).
11. David J. Siemers, Ratifying the Republic: Antifederalists and Federalists in Constitutional Time (Stanford, CA: Stanford University Press, 2002).
12. Alexander Hamilton, James Madison, and John Jay, The Federalist Papers, ed. Ian Shapiro (New Haven, CT: Yale University Press, 2009).
13. Pauline Maier, Ratification: The People Debate the Constitution, 1787–1788 (New York: Simon and Schuster, 2010), 225–237.
14. David Waldstreicher, Slavery’s Constitution: From Revolution to Ratification (New York: Hill and Wang, 2009).
15. Carson Holloway, Hamilton Versus Jefferson in the Washington Administration: Completing the Founding or Betraying the Founding? (New York: Cambridge University Press, 2015).
16. Alexander Hamilton, The Works of Alexander Hamilton, Volume 1, ed. Henry Cabot Lodge (New York: Putnam, 1904), 70, 408.
17. Alexander Hamilton, Report on Manufactures (New York: Childs and Swaine, 1791).
18. James H. Hutson, ed., Supplement to Max Farrand’s the Records of the Federal Convention of 1787 (New Haven, CT: Yale University Press, 1987), 119.
19. Hamilton, Report on Manufactures.
20. Richard Sylla, “National Foundations: Public Credit, the National Bank, and Securities Markets,” in Founding Choices: American Economic Policy in the 1790s, ed. Douglas A. Irwin and Richard Sylla (Chicago: University of Chicago Press, 2011), 68.
21. Thomas P. Slaughter, The Whiskey Rebellion: Frontier Epilogue to the American Revolution (New York: Oxford University Press, 1986).
22. “Proclamation of Neutrality, 1793,” in A Compilation of the Messages and Papers of the Presidents Prepared Under the Direction of the Joint Committee on Printing, of the House and Senate Pursuant to an Act of the Fifty-Second Congress of the United States (New York: Bureau of National Literature, 1897).
23. United States, Treaty of Amity, Commerce, and Navigation, signed at London November 19, 1794, Submitted to the Senate June 8, Resolution of Advice and Consent, on condition, June 24, 1795. Ratified by the United States August 14, 1795. Ratified by Great Britain October 28, 1795. Ratifications exchanged at London October 28, 1795. Proclaimed February 29, 1796.
24. Elizabeth Fox-Genovese and Eugene D. Genovese, The Mind of the Master Class: History and Faith in the Southern Slaveholders Worldview (New York: Cambridge University Press, 2005), 18.
25. “From Thomas Jefferson to William Short, 3 January 1793,” Founders Online, National Archives. http://founders.archives.gov/documents/Jefferson/01-25-02-0016, last modified June 29, 2015; The Papers of Thomas Jefferson, vol. 25, 1 January–10 May 1793, ed. John Catanzariti (Princeton, NJ: Princeton University Press, 1992), 14–17.
26. Robert Goodloe Harper, June 18, 1798, quoted in American Daily Advertiser (Philadelphia), June 20, 1798.
27. Robert J. Alderson Jr., This Bright Era of Happy Revolutions: French Consul Michel-Ange-Bernard Mangourit and International Republicanism in Charleston, 1792–1794 (Columbia: University of South Carolina Press, 2008).
28. Rachel Hope Cleves, The Reign of Terror in America: Visions of Violence from Anti-Jacobinism to Antislavery (New York: Cambridge University Press, 2012), 47.
29. Alien Act, July 6, 1798, and An Act in Addition to the Act, Entitled “An Act for the Punishment of Certain Crimes Against the United States,” July 14, 1798; Fifth Congress; Enrolled Acts and Resolutions; General Records of the United States Government; Record Group 11; National Archives.
30. James Wilson, Congressional Debate, December 1, 1787, in Jonathan Elliot, ed., The Debates in the Several State Conventions on the Adoption of the Federal Constitution as Recommended by the General Convention at Philadelphia in 1787, Vol. 2 (New York: s.n., 1888), 448–450.
31. Tunis Wortman, A Treatise Concerning Political Enquiry, and the Liberty of the Press (New York: Forman, 1800), 181.
32. George Hay, An Essay on the Liberty of the Press (Philadelphia: s.n., 1799), 43.
33. Thomas Jefferson to James Madison, August 28, 1789, from The Works of Thomas Jefferson in Twelve Volumes, Federal Edition, ed. Paul Leicester Ford. http://www.loc.gov/resource/mtj1.011_0853_0861
34. Francis Newton Thorpe, ed., The Federal and State Constitutions, Colonial Charters, and Other Organic Laws of the States, Territories, and Colonies Now or Heretofore Forming the United States of America Compiled and Edited Under the Act of Congress of June 30, 1906 (Washington, DC: U.S. Government Printing Office, 1909).
35. Thomas Jefferson, An Act for Establishing Religious Freedom, 16 January 1786, Manuscript, Records of the General Assembly, Enrolled Bills, Record Group 78, Library of Virginia.
36. Catherine Allgor, Parlor Politics: In Which the Ladies of Washington Help Build a City and a Government (Charlottesville: University of Virginia Press, 2000), 14.
37. James T. Callender, The Prospect Before Us (Richmond: s.n., 1800).
38. Letter from Thomas Jefferson to Spencer Roane, September 6, 1819, in The Writings of Thomas Jefferson, 20 vols., ed. Albert Ellery Bergh (Washington, DC: Thomas Jefferson Memorial Association of the United States, 1903), 142.
39. Harold H. Bruff, Untrodden Ground: How Presidents Interpret the Constitution (Chicago: University of Chicago Press, 2015), 65.
40. Alexander Hamilton, The Federalist Papers (New York: Signet Classics, 2003), no. 85.
41. George Washington, Farewell Address, Annals of Congress, 4th Congress, 2869–2870.

      The discussion of Shays’s Rebellion reveals how economic struggles and weak national power under the Articles of Confederation created serious unrest among farmers. While some leaders viewed the rebellion as a dangerous threat to order, others believed it represented the same revolutionary spirit that founded the country.

  2. onlinelibrary.wiley.com onlinelibrary.wiley.com
    1. The college counseling landscape has evolved quite a bit. Over the past decade, we have witnessed 2-year and 4-year colleges being shaped by increased attention to mental health issues, crisis response and triage procedures, and students coming to campus already taking prescribed psychotropic medication.

      Test comment for annotation.

    1. Summary Dossier: The Psychology of Commitment

      Executive Summary

      This document summarizes the key concepts of the psychology of commitment, as presented by Professor Fabien Girandola.

      The central thesis is that traditional persuasion, based on information and argument, is largely ineffective at producing lasting behavior change.

      In contrast, commitment theory proposes a counterintuitive but powerful approach:

      • lead individuals to perform a first act, low in cost and freely chosen, so as to bind them to that act, and
      • encourage them to adopt more significant behaviors afterward.

      Techniques such as the "foot-in-the-door" and "labeling", validated by decades of experimental research, show that actions can be influenced by structuring the situation rather than by trying to convince minds.

      A major psychological effect of these techniques is "naturalization": individuals attribute their new behavior to their own nature ("I am altruistic") without being aware of the situational manipulation that is its real cause.

      Mastery of these techniques raises fundamental ethical questions, since they sit on the line between influence and manipulation.

      1. The Ineffectiveness of Persuasion: The Gap Between Opinion and Behavior

      The classic approach to changing behavior relies on persuasion: the idea that supplying information and convincing arguments changes people's opinions, which in turn changes their actions.

      1.1. The Premise of Persuasion

      The persuasive approach assumes a direct causal chain:

      1. Information: Present facts (e.g., "Smoking kills").

      2. Conviction: The individual takes in the information and changes their opinion.

      3. Action: The individual adjusts their behavior to be consistent with the new opinion.

      1.2. Evidence of Failure

      Decades of research in social psychology, since the 1960s, show that this link is weak or even nonexistent.

      Knowing something does not guarantee acting in accordance with that knowledge.

      Common examples:

      ◦ Smokers know that tobacco is harmful but keep smoking.

      ◦ Most people agree that ecology matters but adopt few pro-environmental behaviors.

      Bickman's experiment (1972): This seminal study illustrates the gap between stated opinion and actual behavior.

      | Phase of the experiment | Result |
      | --- | --- |
      | Opinion survey | 95% of passers-by say it is important to keep the streets clean. |
      | Real-life situation | Faced with a piece of litter to pick up in the street, only 2% of the same people do so. |

      This foundational experiment shows that endorsing an idea (cleanliness) does not automatically translate into action.

      2. Commitment Theory: Act First, Think Later

      Given the limits of persuasion, commitment theory, developed notably by researchers such as Kiesler, Jean-Léon Beauvois, and Robert-Vincent Joule, proposes reversing the logic.

      Instead of targeting opinions in order to change acts, it targets acts in order to influence opinions and future behavior afterward.

      2.1. Definition and Principles

      Definition (Kiesler, 1971): Commitment is "the link that binds the individual to their act".

      Fundamental principle: It is not the individual who commits on their own; it is the situation that commits them.

      The goal is to lead a person to perform small, progressive acts that draw them toward costlier behaviors they would not have performed spontaneously.

      2.2. The Key Factors of Commitment

      For a situation to be committing, several factors must be present.

      | Factor | Description | Example |
      | --- | --- | --- |
      | The feeling of freedom | This is the most crucial factor. The individual must feel that they freely chose to perform the act. Phrases such as "You are free to accept or refuse" or "Do as you wish" considerably increase the acceptance rate because they create a feeling of freedom, even if that freedom is constrained by the context. | Asking someone to sign a petition and adding "but you are free to refuse" raises the acceptance rate from 15% to 45%. |
      | The public nature of the act | An act performed publicly (signing a petition, speaking up) is more committing than a private one. The name and signature left behind bind the individual to their action. | Signing a petition with one's full name. |
      | Repetition of the act | Repeating a behavior strengthens the bond of commitment. After lending an object several times, it becomes hard to refuse. | Lending a tool to a neighbor every week. |
      | The cost of the act | An act that requires effort or sacrifice (in time, money, or energy) is more committing. | Lending one's car is more committing than lending a pen. |
      | Labeling (internal attribution) | Attributing a quality to a person ("I know you are helpful") commits them to behaving in line with that label. The act then seems "natural" to the individual. | Telling someone "You are truly a good person". |

      Important note: Commitment does not work in the presence of rewards or punishments.

      If a person is paid or threatened into doing something, the act is attributed not to an internal decision but to the external constraint.

      There is therefore no psychological commitment.

      3. Techniques of Freely Consented Submission

      These theoretical principles have been turned into concrete techniques for inducing behavior, grouped under the paradoxical name "freely consented submission":

      the individual submits to a request while feeling that they acted freely.

      3.1. The Foot-in-the-Door: Ask for Little to Get More

      This is the best-known technique.

      It consists of getting a first, very low-cost request accepted (the preparatory act) in order to significantly raise the chances that the person accepts a second, much costlier request (the target behavior).

      Freedman & Fraser experiment (1966) - Scenario 1: The home survey

      | Experimental condition | Request | Acceptance rate |
      | --- | --- | --- |
      | Control | Direct request: agree to a 2-3 hour visit by a team of surveyors who will search the house. | 22% |
      | Foot-in-the-door | 1. Preparatory act: answer a short telephone questionnaire (accepted by all). 2. Final request (3 days later): agree to the visit by the team of surveyors. | 53% |

      Freedman & Fraser experiment (1966) - Scenario 2: The sign in the yard

      | Experimental condition | Request | Acceptance rate |
      | --- | --- | --- |
      | Control | Direct request: put up a large 4x4 m road-safety sign in one's yard. | 17% |
      | Foot-in-the-door | 1. Preparatory act: put a small road-safety sticker on one's window (accepted by all). 2. Final request (3 days later): agree to put up the large sign. | 76% |

      3.2. Labeling and the Implicit Foot-in-the-Door

      This approach combines the preparatory act with praise of the person, prompting them to perform a costly behavior on their own initiative, without being explicitly asked.

      Joule et al. experiment (2002) - The lost banknote in Aix-en-Provence

      The target behavior is altruism: returning a €10 note that has fallen from a confederate's pocket.

      The preparatory act consists of giving directions on a map to a "tourist" (another confederate).

      The key variable is how the tourist thanks the person.

      | Condition | The "tourist's" response after being helped | Banknote return rate |
      | --- | --- | --- |
      | Control | No prior interaction with the tourist. | 30% |
      | Foot-in-the-door (simple thanks) | "Thank you." | 43% |
      | Foot-in-the-door (service) | "You did me a great favor." | 48% |
      | Foot-in-the-door + labeling 1 | "You are helpful." | 70% |
      | Foot-in-the-door + labeling 2 | "You are truly a good person." | 78% |

      This experiment shows that the rate of altruism can be moved from 30% to 78% solely by changing a trivial interaction a few minutes earlier.

      4. Psychological and Ethical Consequences

      4.1. The Naturalization of Behavior

      The most remarkable effect of commitment is that individuals are unaware of having been influenced. Asked why they acted as they did (e.g., returning the banknote), they consistently answer:

      "It's normal, I'm an altruistic/generous person".

      Meaning vs. Determination:

      Meaning: The explanation the individual gives for their behavior (internal, tied to their personality).

      Determination: The real cause of the behavior (external, tied to the situation created by the experimenter).

      Individuals have no access to what truly determines their acts and replace it with a meaning that flatters their "self".

      4.2. The Boundary with Manipulation

      Professor Girandola stresses that these techniques are powerful and sit at the edge of manipulation.

      Knowing them is essential not only for using them well (public health, education) but also for guarding against them.

      He points out that psychologists' use of these techniques is governed by a strict code of ethics: "there is no action without ethics".

      5. Recommended Reading and Resources

      To go deeper, several books and articles were mentioned:

      Reference works:

      ◦ Petit traité de manipulation à l'usage des honnêtes gens by R.-V. Joule and J.-L. Beauvois.

      ◦ La soumission librement consentie by the same authors.

      ◦ Psychologie sociale et Attitude et comportement by F. Girandola.

      Online articles:

      ◦ Popular-science articles on the platform The Conversation, notably on Donald Trump's use of manipulation techniques and on their use during sales periods.

      Video:

      ◦ A filmed reconstruction of the "lost banknote" experiment is available online.

      6. Conclusion and Outlook

      The presentation focused on the foundations of commitment theory and the foot-in-the-door technique.

      It was noted that other important aspects were not covered, in particular:

      • The effects of commitment on opinions (via cognitive dissonance theory).

      • Escalation of commitment, a process in which an individual persists in a decision or behavior that is turning out badly simply because they initially committed to it.

    1. Students' Situational Interest in Strength Training: An Analysis of Self-Regulation Formats

      Executive Summary

      This document summarizes the findings of a study conducted by Arthur Lefebvre as part of the REFPS project, on the situational interest of high school students during strength training sessions.

      The central aim was to determine whether there is an ideal load self-regulation format for fostering student engagement according to level of expertise (novice, intermediate, expert).

      The main conclusions are as follows:

      Heterogeneity is best handled by the RPE 8 format (perceived exertion scale), which proves to be the most inclusive format, creating almost no difference in interest between levels.

      The more complex formats (APRE 10, Time 2, RIR 2) consistently favor expert students, creating a significant gap in enjoyment and exploration intention relative to novices.

      Perceived challenge is higher among novices, which can undermine their enjoyment if the task is perceived as too complex.

      A pedagogical progression is recommended, starting with the RPE format to engage novices, before introducing more demanding formats such as APRE to develop attention and precision in relating to the load.

      --------------------------------------------------------------------------------

      1. Theoretical Framework and Study Objectives

      The study builds on prior work on situational interest, defined by Chen (2006) as the appealing effect of a task's characteristics on an individual.

      Unlike earlier studies focused on badminton (an opposition-based, competitive activity), this research explores strength training, a self-referenced, non-competitive activity.

      The Dimensions of Situational Interest

      The analysis draws on four of the five dimensions of the Tienen (2014) model:

      1. Instant enjoyment: The immediate satisfaction derived from practice.

      2. Challenge: The perceived complexity of the task.

      3. Attention demand: The concentration the activity requires.

      4. Exploration intention: The desire to discover and learn new things.

      --------------------------------------------------------------------------------

      2. Research Methodology

      The study followed a rigorous protocol over a complete sequence of 9 lessons:

      Participants: 164 students (mean age 17) drawn from 5 high school classes and from STAPS (sport science) university students.

      Classification by expertise: 47 novices, 68 intermediates, 49 experts (determined by an individual-interest questionnaire).

      Formats tested: Four formats based on load self-regulation:

      APRE 10: Progressive regulation based on performance.

      Time 2: Format based on working time.

      RIR 2 (Repetitions in Reserve): Subjective estimate of how many more repetitions would be possible.

      RPE 8 (Rating of Perceived Exertion): Rating of perceived effort on a scale from 1 to 10.

      --------------------------------------------------------------------------------

      3. Comparative Analysis of Practice Formats

      The results show that student interest varies considerably with the format used and the student's initial level.

      | Format | Impact on Experts | Impact on Novices | Pedagogical Conclusion |
      | --- | --- | --- | --- |
      | APRE 10 | Very favorable (high enjoyment, attention, exploration). | Less favorable; significant gap with experts. | Suits experienced students. |
      | Time 2 | Sustained interest. | Significant differences in favor of experts. | Demanding format for novices. |
      | RIR 2 | High enjoyment and exploration. | Marked gap with experts. | Favors expertise. |
      | RPE 8 | High, steady interest. | Interest almost identical to that of experts. | Ideal format for heterogeneity. |

      The specific case of the RPE 8 format

      The RPE 8 format stands out as the "format that best fits heterogeneity". It shows only one significant difference between novices and experts across the four dimensions studied. It is the format that "speaks most to everyone", regardless of level.

      --------------------------------------------------------------------------------

      4. Analysis by Dimension of Interest

      Challenge and Enjoyment

      Enjoyment and challenge are correlated. The study finds that the "challenge" dimension is significantly higher among novices (2.91 versus 2.54 for experts).

      If the challenge is too great, the task is no longer optimal and enjoyment declines.

      Exploration Intention

      The RPE format is identified as an excellent entry point for novices to begin exploring the activity.

      On its own, however, it appears insufficient to sustain this momentum over the long term, so students need to move to other, more complex formats as their expertise grows.

      Attention Demand

      The APRE 10 format produces the largest difference in attention between experts and novices.

      The results suggest that developing attention requires working specifically on how students relate to the load and to failure.

      --------------------------------------------------------------------------------

      5. Outlook and Pedagogical Innovations

      Proposed Innovation: The "Tonnage Format"

      To make up for the lack of a single ideal format, Arthur Lefebvre proposes a hybrid of APRE and RPE:

      Principle: 5 sets of 10 repetitions with the highest possible tonnage.

      Constraint: If the student fails to complete the 10 repetitions (failure or early stop), the score for that set is 0 kg.

      Objective: Combine sensory feedback (RPE) with the cognitive rigor of load management (APRE).
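
      The scoring rule of the tonnage format can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source gives the rule (5 sets of 10 repetitions, 0 kg for an incomplete set) but not the formula, so counting a completed set as load × 10 repetitions is an assumption, and the names `set_score` and `session_tonnage` and the sample session are hypothetical.

```python
from typing import Sequence, Tuple

TARGET_REPS = 10  # repetitions required per set (from the text)
NUM_SETS = 5      # sets per session (from the text)


def set_score(load_kg: float, reps_completed: int) -> float:
    """Tonnage credited for one set, in kg.

    Assumption: a completed set counts as load x 10 repetitions.
    A set stopped before the 10th repetition scores 0 kg, as the text states.
    """
    if reps_completed < TARGET_REPS:
        return 0.0
    return load_kg * TARGET_REPS


def session_tonnage(sets: Sequence[Tuple[float, int]]) -> float:
    """Total tonnage for a session; each item is (load_kg, reps_completed)."""
    if len(sets) != NUM_SETS:
        raise ValueError(f"expected {NUM_SETS} sets, got {len(sets)}")
    return sum(set_score(load, reps) for load, reps in sets)


# Hypothetical session: the fourth set is abandoned after 8 repetitions.
session = [(40.0, 10), (40.0, 10), (42.5, 10), (42.5, 8), (40.0, 10)]
print(session_tonnage(session))  # 1625.0
```

      In this hypothetical session, the set stopped at 8 repetitions contributes nothing, so an overly ambitious load costs the student more than a slightly lighter load completed in full.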

      Recommendations for the Teaching Sequence

      1. Start of the sequence: Favor the RPE 8 format to secure the engagement of all students, novices in particular.

      2. Middle of the sequence: Gradually introduce more subjective or more objective formats (RIR, Time).

      3. End of the sequence: Use APRE-type or hybrid formats to refine expertise and focus on performance.

      Limitations of the Study

      The author notes that the sample (164 participants) and the lack of any measure of temporality (how interest evolves over the long term, per the Hidi and Renninger model) are limitations to be taken into account in future research.

    1. Summary of Teaching Practices: Building Connection and Kindness in Preschool

      Executive Summary

      This document summarizes the reflections and practices of Mélanie, a preschool (école maternelle) teacher in Strasbourg with 12 years of experience. The core of her approach is building a deep emotional bond with her pupils, which she regards as the indispensable foundation for all learning. Deliberately breaking with traditional professional distance, she advocates "caring communication" and "caring firmness" to establish a climate of trust.

      The document details how this stance translates in practice into abandoning behavior-rating systems, adopting multi-age classes to foster gentler social relations, and organizing space and teaching around autonomy. The ultimate goal is to turn the classroom into a calm "cocoon" where the child, soothed and respected in their own rhythm, develops a lasting positive relationship with school.

      1. The Centrality of the Emotional Bond

      For Mélanie, attachment between teacher and pupil is not an obstacle but a pedagogical driver. This view evolved over her career, from initial reserve to an openly embraced affection for her pupils.

      Breaking down professional barriers

      Challenging the dogma of distance: The teacher rejects the idea, often passed on during initial training or by some mentors, that a strict distance must be maintained. She cites Daniel Pennac's book Chagrin d'école to illustrate this tacit prohibition on loving one's pupils.

      Expressing affection: She is comfortable using nicknames and does not hesitate to say "I love you" to her pupils. In her view, this closeness does not undermine respect for rules; on the contrary, the connection strengthens natural authority and mutual respect.

      The importance of stability: The bond is harder to build for substitute or part-time teachers (such as trainees covering a quarter-time post). Having one's own full-time class is presented as a decisive factor for professional and relational fulfillment.

      Impact on school climate

      The aim is to create a "cocoon" where calm is palpable. At the start of the year, the teacher deliberately prioritizes relationships, the working framework, and autonomy at the immediate expense of purely academic learning, so that the rest of the year runs smoothly.

      2. Classroom Management Built on Communication

      Communication in Mélanie's classroom rests on a deep understanding of the child and a rejection of conventional coercive methods.

      Rejecting behavior systems

      The teacher abandoned traditional behavior-management tools (colored lions, point systems, etc.) because she found that they often made things worse for the most vulnerable pupils. She now favors:

      Systematic discussion: Even if it can seem repetitive or sometimes leads nowhere, dialogue remains the main tool.

      Understanding needs: The effort goes into analyzing the cause of the behavior rather than into immediate punishment.

      Physical posture: She stresses the importance of speaking "at the child's level", a practice inspired by Scandinavian models.

      "Caring Firmness"

      This approach does not mean an absence of rules. An inspector has described it as "caring firmness".

      Example of conflict handling: Faced with a slip-up (e.g., accidentally writing on a table), the teacher plays it down ("you don't need to look sad") while holding the child responsible (cleaning it up with paper and water).

      Limits of patience: Irritation arises mainly from behavior that harms social relationships (mockery, repeated unkind remarks), rather than from material accidents.

      3. Pedagogical Organization and Class Structure

      The form of schooling itself is designed to support this caring approach and to adapt to children's biological and psychological rhythms.

      The benefits of multi-age classes

      Mélanie advocates mixing ages (Petits, Moyens, Grands, the three preschool year groups) for several reasons:

      Dampening group effects: Mixing breaks up overly tight and potentially conflict-prone groups that have been together since daycare.

      Fostering natural gentleness: The presence of "little ones" encourages the older children to be protective and calm, creating a "family-like" atmosphere.

      Social benefit: Placing a challenging child with younger children can help that child settle.

      Autonomy and differentiation

      Working autonomously avoids standardized tasks:

      • Children are not all required to do the same work at the same time.

      • This reduces the stress caused by ill-suited tasks (too complex or too simple).

      • The resulting calm makes pupils more available for learning.

      Layout of the working space

      The physical space is divided into specific zones that support different types of activity:

      | Area | Function / Characteristics |
      | --- | --- |
      | The Ellipse | A line marked on the floor in the middle of the classroom for group gatherings (preferred over tables). |
      | Scenario areas | Zones dedicated to role play or themed situations (e.g., yoga, dry cleaner's). |
      | Autonomous workshops | Storage units organized by domain (phonology, fine motor skills, etc.). |
      | Under the desk | The space under the teacher's desk is used to create a "listening corner" with story boxes. |
      | Dedicated tables | U-shaped table for teacher-led workshops, small table for painting. |

      4. The Teacher's Stance and Challenges

      Teaching in preschool demands personal investment and constant attention to one's own health.

      Growing serenity: Self-confidence comes with years of experience and a stable post, making it possible to invest more in the relational dimension.

      Vocal health: Mélanie stresses how hard the job is on the voice and ears (playground noise, calling out). She makes a point of not shouting, to protect her vocal cords and keep the room calm.

      Additional skills: Playing an instrument, such as the guitar (self-taught), serves as a further way to connect, much appreciated by pupils despite what she considers a modest technical level.

      Conclusion

      The approach described in this document shows that success in preschool rests on a balance between a flexible pedagogical structure (autonomy, multi-age classes) and a strong human relationship. By breaking down the traditional barrier of distance, the teacher creates a secure environment that fosters children's appetite for school from their earliest years.

    1. Trajectories of Young People in Care and Resilience Factors: Summary Note

      Executive Summary

      This document summarizes the contributions of Laëtitia Sauvage, a researcher in the anthropology of education and member of the Conseil national de la protection de l'enfance (National Council for Child Protection), on the resilience pathways of young people from the child protection system.

      The central thesis is that resilience is not an intrinsic individual skill but a complex, dynamic, and systemic process built through the interaction between the individual and their environment.

      School is identified as a potential "resilience tutor", provided it goes beyond the strictly subject-based framework and engages with the psychosocial dimension.

      The relationship to knowledge acts as an essential lever for mentalization, enabling the young person to project themselves beyond their traumas.

      The success of this process depends on a coordinated multidisciplinary approach (school, family, social workers) and on professionals' ability to decode "resistance" behaviors (aggression, provocation) as calls for an educational bond rather than as mere disciplinary breaches.

      --------------------------------------------------------------------------------

      1. A Theoretical Redefinition of Resilience

      Resilience should be understood not as a fixed capacity but as a psychosociological phenomenon that is constantly being redefined.

      A dynamic process: the "pinball" metaphor

      The individual is compared to a pinball, knocked about by traumas. Their resilience pathway breaks down into key stages:

      Resistance: Immediate reaction to avoid collapse or mental disorganization.

      Reconstruction: Medium-term repair mechanisms.

      Psychic reworking (Neo-development): Lasting, continuous transformation throughout life.

      Distinguishing between reaction mechanisms

      It is crucial not to confuse resilience with other ways of reacting to trauma:

      Resistance: A necessary confrontation with authority, often wrongly perceived as gratuitous aggression.

      Desilience: Total inability to mobilize oneself, which can lead to addiction or social withdrawal.

      Desistance: Giving up on one specific sphere (e.g., dropping out of school) while remaining invested in others (social life, community organizations).

      --------------------------------------------------------------------------------

      2. Systemic and Environmental Analysis

      The child's development is framed by Bronfenbrenner's ecological model, supplemented by the notion of the ontosystem.

      | System | Definition | Role in Resilience |
      | --- | --- | --- |
      | Ontosystem | The child's felt world, psyche, and innermost values. | Seat of sensitivity and traumatic affects. |
      | Microsystem | Immediate sphere (family, parental substitutes). | Often the site of the initial "shattering" in child protection. |
      | Mesosystem | Interactions between settings (school, sport, associations). | School plays a pivotal role here in breaking down silos. |
      | Macrosystem | Institutional norms and national policies. | Evolving toward better recognition of vulnerability. |

      Key quote: "Resilience is a knit that ties a developmental yarn together with an affective and social yarn. It is not a substance, it is a mesh."

      --------------------------------------------------------------------------------

      3. The Role of the School

      School can act as a resilience tutor by offering a secure framework and opportunities for mentalization.

      The relationship to knowledge as a lever

      The relationship to knowledge is not limited to acquiring facts; it supports the capacity to project oneself into the future.

      For young people in care, the institution of knowledge may be the only space of "full and complete safety".

      The importance of the "significant other"

      Simple, humanizing gestures, such as a caretaker's smile or a bus driver's greeting, are fundamental anchors.

      These interactions validate the child's existence and support their sense of belonging.

      Challenges and alarming statistics

      The current system has major failings in supporting young people placed in care:

      Access to higher education: Only 8% of young people from the child protection system (versus 52% of the general population).

      Falling behind at school: 40% of 11-year-olds in care are still in primary school (versus 10% in the general population).

      --------------------------------------------------------------------------------

      4. Risk and Protective Factors

      The analysis must address the balance between vulnerabilities and available resources.

      Risk factors (Obstacles)

      • Lack of coordination between teachers, families, and social workers.

      • School tracks chosen under pressure to become financially independent quickly.

      • Geographic instability (frequent changes of placement).

      • Institutional meetings scheduled during school hours, which disrupts schooling.

      Protective factors (Levers)

      Stable relationships: Presence of non-judgmental reference adults.

      Safe spaces: Access to libraries, common rooms, or rest rooms.

      Positive reinforcement: Systematic recognition of the student's character strengths and efforts.

      Psychosocial skills: Development of self-esteem and a sense of agency.

      --------------------------------------------------------------------------------

      5. Strategies and Operational Tools

      To turn a school into an environment that supports resilience, three stages of professional development are proposed:

      1. Identify and separate: Learn to distinguish defense mechanisms (often unconscious, such as over-intellectualization) from coping strategies (actively seeking information).

      2. Decode resistance: Understand that a young person's aggression can be a sign of trust, an "open door to the educational relationship" in a place where they finally allow themselves to express their trauma.

      3. Build on psychological resources: Draw on models such as Seligman's 24 character strengths or Pourtois's resources (affective, social, cognitive, conative).

      "Assisted resilience" programs mentioned:

      Spark: Use of playful materials to support mentalization.

      Care Commites (Netherlands): Integrated community approach.

      Mentoring (Spain): Support from peers or external tutors.

      Personal support projects: Creation of an educational alliance between the young person, a teacher of their choice, and their social educator.

      --------------------------------------------------------------------------------

      Conclusion

      Promoting resilience in schools requires a paradigm shift: the point is no longer to focus solely on trauma or subject-related gaps, but to adopt an inclusive, systemic approach.

      By identifying young people's intrinsic strengths and securing their relationship to knowledge, school becomes the ground for new development, allowing the student to turn their initial "shattering" into an original and lasting flourishing.

    1. Summary Report: Closing Remarks by the Grand Témoin (Guest Observer), Julien Gagnebien

      Executive Summary

      This document summarizes the address given by Julien Gagnebien, Inspecteur Général (Inspector General), at a seminar at INSPÉ Lille.

      The analysis highlights a necessary paradigm shift in the teaching of Physical Education and Sport (EPS). Key points include:

      Ethics at the heart of the profession: Teaching must balance relational ethics and conceptual ethics to meet students' fundamental needs.

      The transformation of Learning Field 5 (Champ d'Apprentissage 5, CA5): Although popular, CA5 (strength training, step, etc.) must reinvent itself to help students develop a critical eye in the face of the growing influence of social media and fitness influencers.

      From execution to design: Future teachers are encouraged to prioritize the pedagogical "what" and "why" over the "how", moving away from exclusionary or outdated teaching formats.

      The "solution seeker" stance: The institution is not asking for conformist teachers but for practitioners able to doubt, experiment, and collaborate to support the success of all students.

      --------------------------------------------------------------------------------

      1. Professional Stance and Analysis of Practice

      The address emphasizes the importance of observation and of collaboration between research and the field for the evolution of EPS practice.

      The importance of in situ observation

      Observation is not a waste of time but a major lever for transformation. Julien Gagnebien points out that:

      • Tool-supported observation prompts the teacher to re-examine their methods.

      • It helps gauge the cause-and-effect link between the context the teacher creates and students' actual engagement.

      • For candidates in the competitive recruitment exams (CAPEPS), this phase feeds directly into proposals for the oral examinations.

      The Research-Practitioner relationship

      The Inspection Générale places significant value on teacher-researchers. Their work is seen as an essential service in helping practitioners evolve, despite the financial constraints on laboratories. This synergy feeds the institution and drives new pedagogical dynamics.

      --------------------------------------------------------------------------------

      2. Ethics and Engagement: The "Flower of Needs"

      The teaching profession rests on a dual responsibility: leaving a positive mark on students' lives and adopting the right stance.

      Balancing the two ethics

      Excellent teachers strike a balance between two pillars:

      1. Relational ethics: The quality of the bond with students (currently a strength of EPS teachers).

      2. Conceptual ethics: The ability to design relevant learning contexts.

      Meeting fundamental needs

      Drawing on the work of André Canvel and Damien Tessier (self-regulation theories), the address presents the "flower of needs". Student engagement depends on the teacher's ability to nourish these bubbles:

      | Category of needs | Key elements |
      | --- | --- |
      | Safety and Trust | Creating a calm classroom climate. |
      | Fairness and Respect | Transparent, equitable assessments. |
      | Autonomy and Choice | Opportunity for students to express themselves and make decisions. |
      | Belonging and Esteem | Feeling part of the group and valuing oneself. |

      Observation: In a lesson, disengagement often sets in after the first quarter of an hour, when the student senses that the "menu" on offer does not meet these needs.

      --------------------------------------------------------------------------------

      3. Critical Analysis of Practice Formats

      Teachers have a duty to question traditional sporting formats that can become vehicles of exclusion.

      The paradox of the school cross-country race: The classic format (a distance race by age category) often becomes meaningless as early as 5ème (second year of middle school) for students who already know where they will finish. This format excludes three quarters of students even though EPS champions inclusion.

      The need for reinvention: It is imperative to design formats that retain the performance stakes (Learning Field 1) while guaranteeing accessibility and inclusion at school.

      --------------------------------------------------------------------------------

      4. Learning Field 5 (CA5): Stakes and Paradoxes

      CA5 (activities for developing personal resources) holds a prominent place but faces unprecedented challenges.

      An institutional and material success

      • Activities such as strength training are among those most often chosen by students (vocational and general/technological tracks).

      • Local authorities have invested heavily in dedicated facilities.

      • These activities foster autonomy and long-term carry-over into adult life.

      The challenge of legitimacy in the face of influencers

      The CA5 teacher is now competing with YouTube influencers.

      The credibility conflict: A student may challenge a teacher's instruction by citing an influencer whose physique seems more legitimate to them.

      The stakes of critical thinking: The real challenge of 2025 is to train students who can critically analyze outside training programs rather than passively absorb aesthetic or commercial models.

      --------------------------------------------------------------------------------

      5. Guidance for Future Teachers (Exams and Career)

      Julien Gagnebien offers strategic advice for exam candidates (notably for Oral 3) and for professional practice.

      Design priorities

      Examination panels expect a clear ranking of pedagogical intentions:

      1. The What and the Why: Define precisely what the student must build and the reasons behind the pedagogical scenario. This is the "entry ticket" to the profession.

      2. The How: Practical arrangements (exercises, situations) come second. Mistakes on the "how" are treated with more tolerance, since it reflects experience still being built.

      Avoiding scattered effort

      It is crucial to give up trying to "do everything". A sequence (in strength training or badminton) must target fundamental learning specific to each level (6ème vs. Terminale, i.e., first year of middle school vs. final year of high school) to avoid the "eternal beginner" syndrome.

      Working through "Dilemmas"

      One innovative avenue is to approach the learning fields through dilemmas (e.g., committing oneself vs. protecting oneself). Having students work through these trade-offs in class prepares them to make informed choices independently outside school.

      --------------------------------------------------------------------------------

      Conclusion: The Injunction to Be Free

      The document closes by deconstructing the myth of the "institutional injunction".

      Beyond safety, assessment, and fairness, the curricula leave a great deal of freedom.

      "We don't need teachers who conform; we need teachers who look for solutions and find them."

      The teacher's ultimate mission is to become a "solution seeker", able to doubt and experiment in order to respond to the complexity of school contexts and ensure the success of all students.

    1. Receptivity to Practice Formats in Strength Training and the Development of Interest

      Executive Summary

      This document summarizes Mehdi Belhouchat's research on students' psychological engagement in school strength training.

      The study draws on the four-phase model of interest development to assess how different practice formats influence student motivation. The main conclusions reveal a direct correlation between a student's initial level of interest and their receptivity to a specific format.

      Whereas expert students generate their own interest whatever the task, novice students (phases 1 and 2) are highly dependent on pedagogical design.

      Formats that provide external guidance (APRE) or internal guidance (RPE) prove the most effective at triggering engagement among beginners, while the "time-based" format should be used strategically and occasionally to prompt qualitative leaps in progress.

      Theoretical Framework and Research Question

      The research is grounded in interest theory, developed in the French-speaking context notably by Cédric Roure, and in learning-task design (Olivier Dieu).

      The Starting Point

      Growth of strength training: An activity that has expanded rapidly over the past 20 years, in schools and in society at large.

      Heterogeneous profiles: Classes bring together students with varied profiles, from the expert with a gym membership (well-developed individual interest) to the sedentary, disengaged student (little or no interest).

      Mismatch between formats: There is a gap between traditional school formats (often based on subjective feeling/RPE) and social practices that are more objective, more guided, and more intense.

      The Four-Phase Model of Interest

      The development of interest is analyzed as a shift from a fleeting psychological state to an integrated personality trait:

      1. Phase 1: Very low individual interest.

      2. Phase 2: Low individual interest.

      3. Phase 3: Emerging individual interest.

      4. Phase 4: Well-developed individual interest.

      Situational interest is measured by three factors: triggering, maintained interest based on feeling (affective valence), and maintained interest based on value (deep anchoring).

      --------------------------------------------------------------------------------

      Analysis of Practice Formats

      The study identifies three types of guidance in the self-regulation of training load:

      | Format | Type of guidance | Characteristics |
      | --- | --- | --- |
      | APRE (Autoregulatory Progressive Resistance Exercise) | External | Strict normative protocol (charts). The student has little choice; the environment dictates the action. |
      | Time-based ("Au Temps") | Mixed | Balance between the individual and the environment. The time reference is imposed, but the load decision is left to the student. |
      | RPE (Rating of Perceived Exertion) | Internal | The environment carries very little weight. The student is at the center of regulation decisions, based on perceived exertion. |

      --------------------------------------------------------------------------------

      Key Research Findings

      The study, conducted with 319 participants (10 high-school classes and university students), highlights several critical phenomena:

      1. The Independence of Experts

      Students in phases 3 and 4 (emerging or developed interest) show no specific receptiveness to any format.

      They project their own interest onto any situation and are able to redefine the goal of the task in order to get involved. They are psychologically independent of the pedagogical design.

      2. The Sensitivity of Novices

      For students in phases 1 and 2, the format is decisive:

      The APRE format (external guidance) is dominant for the novices furthest from the practice (Phase 1). It acts as a "powerful" environment that stimulates affect and physical intensity.

      The RPE format (internal guidance) is also effective in Phase 2, because it lets students connect their own experiences to the knowledge to be acquired.

      The time-based format ("Au Temps") is the least effective at triggering interest among novices.

      3. Dynamics of Interest Development

      Linearity of RPE: This format promotes a steady development of interest across all phases. It is ideal for managing heterogeneity within a class.

      Non-linearity of the time-based format: This format produces effects only during one specific transition, from low interest to emerging interest. It triggers a qualitative "leap".

      Instability of APRE: Its impact is described as "disorderly and fluctuating", suggesting that it should be used in a punchy but spaced-out way.

      --------------------------------------------------------------------------------

      Pedagogical Recommendations for the Teacher

      The goal for the teacher is to become a "pedagogical designer" able to arrange formats so as to maximize engagement, particularly among the least motivated students.

      Recommended Typical Sequence

      Rather than using a single format, the study suggests a strategic arrangement over the course of the strength-training unit:

      1. Start of the unit (novices): Prioritize hybrid formats or RPE. This accommodates the heterogeneity of the class and sets the interest-development process in motion.

      2. Middle of the unit: Introduce the APRE format occasionally to inject intensity and stimulate the external factors of interest.

      3. End of the unit: Use the time-based format. Less suited to complete beginners, it becomes relevant later on to confirm a qualitative leap toward emerging interest.

      4. Assessment: Use hybrid formats to consolidate what has been learned.

      "A good teaching sequence in strength training is the ability to arrange formats appropriately, especially for novices." (Mehdi Belhouchat)

    1. The interest of citizens in one or other form of government is not necessarily shared by their political representatives. Local politicians will lose, and central politicians gain, from centralisation, whatever the benefits to their citizens. Furthermore, if some politicians are potentially mobile, in the sense that they have a significant probability of forming the government under either central or local arrangements, their incentives are perverse. They will have an interest in arguing for centralisation only when its impact in reducing accountability outweighs its beneficial impact on internalising spillovers, since under these circumstances the level of effort they have to undertake will be reduced.

      Yes. Good. Let’s go painfully slow and do it sentence by sentence, very plainly.

      Here’s the paragraph again in pieces.


      Sentence 1:

      “The interest of citizens in one or other form of government is not necessarily shared by their political representatives.”

      Translation:

      What regular people want is not always what politicians want.

      Citizens care about good policy. Politicians care about their careers.


      Sentence 2:

      “Local politicians will lose, and central politicians gain, from centralisation, whatever the benefits to their citizens.”

      Translation:

      If power moves to the center:

      • Local politicians lose power.
      • National politicians gain power.

      And this is true even if centralisation is good for citizens.

      So politicians judge it by who gets power, not just by whether it helps people.


      Sentence 3:

      “Furthermore, if some politicians are potentially mobile, in the sense that they have a significant probability of forming the government under either central or local arrangements, their incentives are perverse.”

      Break this down.

      “Potentially mobile” = Some politicians could end up in power under either system (local or central).

      So they don’t care which level they rule at — they just want to rule somewhere.

      “Incentives are perverse” = Their motivations don’t line up nicely with what’s good for society.

      Why? We’re about to see.


      Sentence 4 (the big one):

      “They will have an interest in arguing for centralisation only when its impact in reducing accountability outweighs its beneficial impact on internalising spillovers…”

      Slow down.

      Centralisation does two things:

      1. Reduces accountability → Harder to remove politicians → Good for politicians

      2. Fixes spillovers → Makes policy more efficient → Good for society

      The sentence says:

      These mobile politicians support centralisation only if the reduction in accountability is more important (to them) than the spillover benefit.

      In other words:

      They support centralisation when it helps them personally more than it improves policy.


      Final part:

      “…since under these circumstances the level of effort they have to undertake will be reduced.”

      This is the key.

      Politicians have to put in effort to:

      • Make good policies
      • Satisfy voters
      • Win elections

      If accountability is high → they must work hard.

      If accountability drops → they can slack more.

      So they support centralisation when it reduces how hard they have to work.


      The whole paragraph in idiot language:

      Citizens might want the system that produces better policy.

      Politicians want the system that:

      • Gives them more power
      • Makes it harder to fire them
      • Lets them work less

      So some politicians will only support centralisation if it mainly makes their jobs safer and easier — not just because it improves coordination.


    1. Solidatech Reference Guide: Digital Solutions for Nonprofit Associations

      Operational Summary

      Solidatech is a digital solidarity program created in 2008 and run by Les Ateliers du Bocage, a social-utility cooperative that is a member of Emmaüs. Its main mission is to strengthen the impact of associations, foundations, and endowment funds through digital technology.

      The program rests on two strategic pillars: enabling organizations to make significant savings on their equipment (software and hardware) and supporting them as they build their skills.

      With more than 45,000 organizations supported, Solidatech has established itself as a key intermediary between the technology sector and the nonprofit world.

      The program is currently going through a major transition following the end of its long-standing partnership with the international TechSoup network, which has led to an internal restructuring and to a solutions catalog that now operates independently.

      --------------------------------------------------------------------------------

      1. Identity and Governance of Solidatech

      The organization stands out for its roots in the social and solidarity economy (ESS).

      Host organization: Les Ateliers du Bocage, a work-integration enterprise and adapted enterprise located in Deux-Sèvres (79).

      Affiliation: Member of the Emmaüs movement.

      Ecosystem: Supports about 45,000 associations, endowment funds, and foundations recognized as being of public utility.

      Accessibility: Registration in the program is entirely free for eligible organizations.

      --------------------------------------------------------------------------------

      2. The Economic Pillar: Equipment and Software

      Solidatech facilitates access to technology resources at preferential rates through a dedicated online store.

      Software Solutions

      The catalog is being rebuilt to favor solutions that are French, secure and, increasingly, open source.

      Areas covered: Collaborative work, communication, IT security, accounting, and management.

      Pricing model: Associations pay Solidatech for a coupon (administrative fee) in order to obtain substantial discounts (often 30% to 50%) on partners' annual or monthly subscriptions.

      Example offers: AssoConnect (association management), Kaspersky (security).

      Computer Hardware

      Most of the hardware is refurbished in France, at Les Ateliers du Bocage.

      "Les Cabossés" range: A specific offer of hardware with minor cosmetic defects (scratches) that is nonetheless fully functional, sold at even lower prices.

      Range of equipment: Laptops, desktop towers, monitors, tablets, smartphones, and accessories.

      Warranty: All hardware carries a 1-year warranty, with an optional one-year extension.

      Operating systems: Machines can be supplied with Windows, Linux (including PrimTux for children), or ChromeOS Flex.

      --------------------------------------------------------------------------------

      3. The Skills Pillar: Training and Support

      Beyond equipment, Solidatech offers an ecosystem of services to professionalize digital practices.

      Professional Training

      Certification: A Qualiopi-certified training body, which allows training to be financed through OPCO funding (the equivalent of the CPF for employer organizations).

      Topics: Artificial intelligence (AI), Canva, Microsoft 365, GDPR, digital communication, and collaborative work tools.

      Support and Diagnostics

      Digital diagnostic: A free self-assessment tool based on seven pillars of digital maturity, used to identify priorities for action.

      Migration services: Help with moving to cloud environments (Microsoft 365, Google Workspace) to secure data and foster collaboration.

      Prestatech: A platform listing trusted service providers selected by Solidatech, which often offer solidarity rates to associations.

      --------------------------------------------------------------------------------

      4. Strategic Developments and Structural Changes

      Solidatech's operating landscape changed significantly at the end of 2023.

      | Aspect | Previous situation | Current situation (after 31/12/2023) |
      | --- | --- | --- |
      | Major partnership | TechSoup Global (since 2008) | Partnership ended (TechSoup's decision) |
      | User support | Dedicated in-house support team | Support team eliminated (6 departures) |
      | License management | Centralized through TechSoup | Direct, through the partners or the new Solidatech catalog |
      | Catalog | Shared internationally | Independent catalog, currently being repopulated |

      Consequence for users: For legacy licenses acquired through TechSoup (e.g., older Microsoft or Adobe licenses), associations must now contact TechSoup Europe (based in Poland) or the relevant vendors directly, since Solidatech no longer has access to the data for these old accounts.

      --------------------------------------------------------------------------------

      5. Resources and Steering Digital Maturity

      Solidatech produces and shares knowledge to inform the nonprofit sector.

      National study: Triennial publication of the survey "La place du numérique dans le projet associatif" (the place of digital technology in the associative project; 5th edition available), co-produced with Recherches & Solidarités.

      Resource center: Advice articles, webinar replays, and practical guides (e.g., open-source alternatives to the Adobe suite).

      Monitoring and information: A monthly newsletter and regular webinars (short, one-hour format) on topical issues such as LinkedIn or AI.

      --------------------------------------------------------------------------------

      6. Practical Registration Steps

      To benefit from the services, an organization follows a simple process:

      1. Registration on solidatech.fr: Requires uploading the association's official documents.

      2. Store account creation: A one-time step that gives access to the hardware and software catalog.

      3. Updating contacts: Entering several contacts is recommended, to keep communication going despite turnover within associations.

      Solidatech actively encourages associations to report their specific needs through questionnaires, in order to guide future partnerships for the catalog under reconstruction.

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript by Lin et al. presents a timely, technically strong study that builds patient-specific midbrain-like organoids (MLOs) from hiPSCs carrying clinically relevant GBA1 mutations (L444P/P415R and L444P/RecNcil). The authors comprehensively characterize nGD phenotypes (GCase deficiency, GluCer/GluSph accumulation, altered transcriptome, impaired dopaminergic differentiation), perform CRISPR correction to produce an isogenic line, and test three therapeutic modalities (SapC-DOPS-fGCase nanoparticles, AAV9-GBA1, and SRT with GZ452). The model and multi-arm therapeutic evaluation are important advances with clear translational value.

      My overall recommendation is that the work undergo a major revision to address the experimental and interpretive gaps listed below.

      Strengths:

      (1) Human, patient-specific midbrain model: Use of clinically relevant compound heterozygous GBA1 alleles (L444P/P415R and L444P/RecNcil) makes the model highly relevant to human nGD and captures patient genetic context that mouse models often miss.

      (2) Robust multi-level phenotyping: Biochemical (GCase activity), lipidomic (GluCer/GluSph by UHPLC-MS/MS), molecular (bulk RNA-seq), and histological (TH/FOXA2, LAMP1, LC3) characterization are thorough and complementary.

      (3) Use of isogenic CRISPR correction: Generating an isogenic line (WT/P415R) and demonstrating partial rescue strengthens causal inference that the GBA1 mutation drives many observed phenotypes.

      (4) Parallel therapeutic testing in the same human platform: Comparing enzyme delivery (SapC-DOPS-fGCase), gene therapy (AAV9-GBA1), and substrate reduction (GZ452) within the same MLO system is an elegant demonstration of the platform's utility for preclinical evaluation.

      (5) Good methodological transparency: Detailed protocols for MLO generation, editing, lipidomics, and assays allow reproducibility.

      Weaknesses:

      (1) Limited genetic and biological replication

      (a) Single primary disease line for core mechanistic claims. Most mechanistic data derive from GD2-1260 (L444P/P415R); GD2-10-257 (L444P/RecNcil) appears mainly in therapeutic experiments. Relying primarily on one patient line risks conflating patient-specific variation with general nGD mechanisms.

      (b) Unclear biological replicate strategy. It is not always explicit how many independent differentiations and organoid batches were used (biological replicates vs. technical fields of view).

      (c) A significant disadvantage of employing brain organoids is the heterogeneity during induction and potential low reproducibility. In this study, it is unclear how many independent differentiation batches were evaluated and, for each test (for example, immunofluorescent stain and bulk RNA-seq), how many organoids from each group were used. Please add a statement accordingly and show replicates to verify consistency in the supplementary data.

      (d) Isogenic correction is partial. The corrected line is WT/P415R (single-allele correction); residual P415R complicates the interpretation of "full" rescue and leaves open whether the remaining pathology is due to incomplete correction or clonal/epigenetic effects.

      (e) The authors tested week 3, 4, 8, 15, and 28 old organoids in different settings. However, systematic markers of maturation should be analyzed, and different maturation stages should be compared, for example, comparing week 8 organoids to week 28 organoids, with immunofluorescent marker staining and bulk RNAseq.

      (f) The manuscript frequently refers to Wnt signaling dysregulation as a major finding. However, experimental validation is limited to transcriptomic data. Functional tests, such as the use of Wnt agonist/inhibitor, are needed to support this claim (see below).

      (g) Suggested fixes/experiments

      Add at least one more independent disease hiPSC line (or show expanded analysis from GD2-10-257) for key mechanistic endpoints (lipid accumulation, transcriptomics, DA markers)

      Generate and analyze a fully corrected isogenic WT/WT clone (or a P415R-only line) if feasible; at minimum, acknowledge this limitation more explicitly and soften claims.

      Report and increase independent differentiations (N = biological replicates) and present per-differentiation summary statistics.

      (2) Mechanistic validation is insufficient

      (a) RNA-seq pathways (Wnt, mTOR, lysosome) are not functionally probed. The manuscript shows pathway enrichment and some protein markers (p-4E-BP1) but lacks perturbation/rescue experiments to link these pathways causally to the DA phenotype.

      (b) Autophagy analysis lacks flux assays. LC3-II and LAMP1 are informative, but without flux assays (e.g., bafilomycin A1 or chloroquine), one cannot distinguish increased autophagosome formation from decreased clearance.

      (c) Dopaminergic dysfunction is superficially assessed. Dopamine in the medium and TH protein are shown, but no neuronal electrophysiology, synaptic marker co-localization, or viability measures are provided to demonstrate functional recovery after therapy.

      (d) Suggested fixes/experiments

      Perform targeted functional assays:

      (i) Wnt reporter assays (TOP/FOP flash) and/or treat organoids with Wnt agonists/antagonists to test whether Wnt modulation rescues DA differentiation.

      (ii) Test mTOR pathway causality using mTOR inhibitors (e.g., rapamycin) or 4E-BP1 perturbation and assay effects on DA markers and autophagy.

      Include autophagy flux assessment (LC3 turnover with bafilomycin), and measure cathepsin activity where relevant.

      Add at least one functional neuronal readout: calcium imaging, MEA recordings, or synaptic marker quantification (e.g., SYN1, PSD95) together with TH colocalization.

      (3) Therapeutic evaluation needs greater depth and standardization

      (a) Short windows and limited durability data. SapC-DOPS and AAV9 experiments range from 48 hours to 3 weeks; longer follow-up is needed to assess durability and whether biochemical rescue translates into restored neuronal function.

      (b) Dose-response and biodistribution are under-characterized. AAV injection sites/volumes are described, but transduction efficiency, vg copies per organoid, cell-type tropism quantification, and SapC-DOPS penetration/distribution are not rigorously quantified.

      (c) Specificity controls are missing. For SapC-DOPS, inclusion of a non-functional enzyme control (or heat-inactivated fGCase) would rule out non-specific nanoparticle effects. For AAV, assessment of off-target expression and potential cytotoxicity is needed.

      (d) Comparative efficacy lacking. It remains unclear which modality is most effective in the long term and in which cellular compartments.

      (e) Suggested fixes/experiments

      Extend follow-up (e.g., 6+ weeks) after AAV/SapC dosing and evaluate DA markers, electrophysiology, and lipid levels over time.

      Quantify AAV transduction by qPCR for vector genomes and by cell-type quantification of GFP+ cells (neurons vs astrocytes vs progenitors).

      Include SapC-DOPS control nanoparticles loaded with an inert protein and/or fluorescent cargo quantitation to show distribution and uptake kinetics.

      Provide head-to-head comparative graphs (activity, lipid clearance, DA restoration, and durability) with statistical tests.

      (4) Model limitations not fully accounted for in interpretation

      (a) Absence of microglia and vasculature limits recapitulation of neuroinflammatory responses and drug penetration, both of which are important in nGD. These absences could explain incomplete phenotypic rescues and must be emphasized when drawing conclusions about therapeutic translation.

      (b) Developmental vs degenerative phenotype conflation. Many phenotypes appear during differentiation (patterning defects). The manuscript sometimes interprets these as degenerative mechanisms; the distinction must be clarified.

      (c) Suggested fixes

      Tone down the language throughout (Abstract/Results/Discussion) to avoid overstatement that MLOs fully recapitulate nGD neuropathology.

      Add plans or pilot data (if available) for microglia incorporation or vascularization to indicate how future work will address these gaps.

      (5) Statistical and presentation issues

      (a) Missing or unclear sample sizes (n). For organoid-level assays, report the number of organoids and the number of independent differentiations.

      (b) Statistical assumptions not justified. Tests assume normality; where sample sizes are small, consider non-parametric tests and report exact p-values.

      (c) Quantification scope. Many image quantifications appear to be from selected fields of view, which are then averaged across organoids and differentiations.

      (d) RNA-seq QC and deposition. Provide mapping rates, batch correction details, and ensure the GEO accession is active. Include these in Methods/Supplement.

      (e) Suggested fixes

      Add a table summarizing biological replicates, technical replicates, and statistical tests used for each figure panel.

      Recompute statistics where appropriate (non-parametric if N is small) and report effect sizes and confidence intervals.
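
      As an illustration of the two points above (treating the independent differentiation as the unit of replication, and using a non-parametric test with an exact p-value when N is small), here is a minimal Python sketch. It is not the authors' analysis pipeline: the function names, the `(differentiation, organoid, value)` record layout, and all numbers are hypothetical and chosen only for the example. Fields of view are first averaged per organoid, organoids are then averaged per differentiation, and an exact two-sided permutation test is run on the per-differentiation means.

```python
from itertools import combinations
from statistics import mean


def per_differentiation_means(records):
    """Collapse field-of-view measurements to one value per differentiation.

    records: iterable of (differentiation_id, organoid_id, value) tuples,
    one tuple per field of view (hypothetical layout for this sketch).
    """
    by_organoid = {}
    for diff_id, organoid_id, value in records:
        by_organoid.setdefault((diff_id, organoid_id), []).append(value)

    by_differentiation = {}
    for (diff_id, _organoid_id), values in by_organoid.items():
        # One mean per organoid, grouped under its differentiation.
        by_differentiation.setdefault(diff_id, []).append(mean(values))

    return {d: mean(v) for d, v in by_differentiation.items()}


def exact_permutation_p(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    n_total, n_extreme = 0, 0
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        n_total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_total


# Hypothetical example: three differentiations per group.
wt = [10.0, 11.0, 12.0]
ngd = [1.0, 2.0, 3.0]
effect_size = mean(wt) - mean(ngd)      # raw difference in means
p_value = exact_permutation_p(wt, ngd)  # smallest possible with 3 vs 3 is 0.1
```

      With three differentiations per group, the smallest two-sided p-value this test can return is 2/20 = 0.1, which is one concrete reason to report the effect size and its uncertainty alongside the p-value and to increase the number of independent differentiations.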

      (6) Minor comments and clarifications

      (a) The authors should validate midbrain identity further with additional regional markers (EN1, OTX2) and show absence/low expression of forebrain markers (FOXG1) across replicates.

      (b) Extracellular dopamine ELISA should be complemented with intracellular dopamine or TH+ neuron counts normalized per organoid or per total neurons.

      (c) For CRISPR editing: the authors should report off-target analysis (GUIDE-seq or targeted sequencing of predicted off-targets) or at least in-silico off-target score and sequencing coverage of the edited locus.

      (d) It should be clarified as to whether lipidomics normalization is to total protein per organoid or per cell, and include representative LC-MS chromatograms or method QC.

      (e) Figure legends should be improved in order to state the number of organoids, the number of differentiations, and the exact statistical tests used (including multiple-comparison corrections).

      (f) In the title, the authors state "reveal disease mechanisms", but the studies mainly exhibit functional changes. They should consider toning down the statement.

      (7) Recommendations

      This reviewer recommends a major revision. The manuscript presents substantial novelty and strong potential impact but requires additional experimental validation and clearer, more conservative interpretation. Key items to address are:

      (a) Strengthening genetic and biological replication (additional lines or replicate differentiations).

      (b) Adding functional mechanistic validation for major pathways (Wnt/mTOR/autophagy) and providing autophagy flux data.

      (c) Including at least one neuronal functional readout (calcium imaging/MEA/patch) to demonstrate functional rescue.

      (d) Deepening therapeutic characterization (dose, biodistribution, durability) and including specificity controls.

      (e) Improving statistical reporting and explicitly stating biological replicate structure.

    2. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript by Lin et al. presents a timely, technically strong study that builds patient-specific midbrain-like organoids (MLOs) from hiPSCs carrying clinically relevant GBA1 mutations (L444P/P415R and L444P/RecNcil). The authors comprehensively characterize nGD phenotypes (GCase deficiency, GluCer/GluSph accumulation, altered transcriptome, impaired dopaminergic differentiation), perform CRISPR correction to produce an isogenic line, and test three therapeutic modalities (SapC-DOPS-fGCase nanoparticles, AAV9-GBA1, and SRT with GZ452). The model and multi-arm therapeutic evaluation are important advances with clear translational value.

      My overall recommendation is that the work undergo a major revision to address the experimental and interpretive gaps listed below.

      Strengths:

      (1) Human, patient-specific midbrain model: Use of clinically relevant compound heterozygous GBA1 alleles (L444P/P415R and L444P/RecNcil) makes the model highly relevant to human nGD and captures patient genetic context that mouse models often miss.

      (2) Robust multi-level phenotyping: Biochemical (GCase activity), lipidomic (GluCer/GluSph by UHPLC-MS/MS), molecular (bulk RNA-seq), and histological (TH/FOXA2, LAMP1, LC3) characterization are thorough and complementary.

      (3) Use of isogenic CRISPR correction: Generating an isogenic line (WT/P415R) and demonstrating partial rescue strengthens causal inference that the GBA1 mutation drives many observed phenotypes.

      (4) Parallel therapeutic testing in the same human platform: Comparing enzyme delivery (SapC-DOPS-fGCase), gene therapy (AAV9-GBA1), and substrate reduction (GZ452) within the same MLO system is an elegant demonstration of the platform's utility for preclinical evaluation.

      (5) Good methodological transparency: Detailed protocols for MLO generation, editing, lipidomics, and assays allow reproducibility.

      Weaknesses:

      (1) Limited genetic and biological replication

      (a) Single primary disease line for core mechanistic claims. Most mechanistic data derive from GD2-1260 (L444P/P415R); GD2-10-257 (L444P/RecNcil) appears mainly in therapeutic experiments. Relying primarily on one patient line risks conflating patient-specific variation with general nGD mechanisms.

      We thank the reviewer for highlighting the importance of genetic and biological replication. An additional patient-derived iPSC line was included in the manuscript; our study therefore includes two independent nGD patient-derived iPSC lines, GD2-1260 (GBA1<sup>L444P/P415R</sup>) and GD2-10-257 (GBA1<sup>L444P/RecNcil</sup>), both of which carry severe mutations associated with nGD. These two lines represent distinct genetic backgrounds and were used to demonstrate the consistency of key disease phenotypes (reduced GCase activity, elevated substrate, impaired dopaminergic neuron differentiation, etc.) across MLOs from different patients. Major experiments (e.g., GCase activity assays, substrate analysis, immunoblotting for the DA marker TH, and therapeutic testing with SapC-DOPS-fGCase and AAV9-GBA1) were performed using both patient lines, with results showing consistent phenotypes and therapeutic responses (see Figs. 2-6 and Supplementary Figs. 4-5). To ensure clarity and transparency, a new Supplementary Table 2 summarizes the characterization of both the GD2-1260 and GD2-10-257 lines.

      (b) Unclear biological replicate strategy. It is not always explicit how many independent differentiations and organoid batches were used (biological replicates vs. technical fields of view).

      Biological replication was ensured in our study by conducting experiments in at least 3 independent differentiations per line, and technical replicates (multiple organoids/fields per batch) were averaged accordingly. We have clarified the numbers of biological replicates and differentiations in the figure legends.

      (c) A significant disadvantage of employing brain organoids is the heterogeneity during induction and potential low reproducibility. In this study, it is unclear how many independent differentiation batches were evaluated and, for each test (for example, immunofluorescent stain and bulk RNA-seq), how many organoids from each group were used. Please add a statement accordingly and show replicates to verify consistency in the supplementary data.

      In the revision, we have clarified the numbers of biological replicates and differentiations in the figure legends for Fig. 1E; Fig. 2B, 2G; Fig. 3F, 3G; Fig. 4B-C, E, H-J, M-N; Fig. 6D; and Fig. 7A-C, I.

      (d) Isogenic correction is partial. The corrected line is WT/P415R (single-allele correction); residual P415R complicates the interpretation of "full" rescue and leaves open whether the remaining pathology is due to incomplete correction or clonal/epigenetic effects.

      We attempted to generate an isogenic iPSC line by correcting both GBA1 mutations (L444P and P415R). However, this was not feasible because GBA1 overlaps with a highly homologous pseudogene (PGBA), which makes precise editing technically challenging. Consequently, only the L444P mutation was successfully corrected, and the resulting isogenic line retains the P415R mutation in a heterozygous state. Because Gaucher disease is an autosomal recessive disorder, individuals carrying a single GBA1 mutation (heterozygous carriers) do not develop clinical symptoms. Therefore, the partially corrected isogenic line, which retains only the P415R allele, represents a clinically relevant carrier model. Consistent with this, our results show that GCase activity was restored to approximately 50% of wild-type levels (Fig.4B-C), supporting the expected heterozygous state. These findings also make it unlikely that the remaining differences observed are due to clonal variation or epigenetic effects.

      (e) The authors tested week 3, 4, 8, 15, and 28 old organoids in different settings. However, systematic markers of maturation should be analyzed, and different maturation stages should be compared, for example, comparing week 8 organoids to week 28 organoids, with immunofluorescent marker staining and bulk RNAseq.

      We agree that a systematic analysis of maturation stages is essential for validating the MLO model. Our data integrate a longitudinal comparison across multiple developmental windows (Weeks 3 to 28) to characterize the transition from progenitor to mature, functional states for nGD phenotyping and evaluation of therapeutic modalities: 1) DA differentiation (Wks 3 and 8 in Fig. 3): qPCR analysis demonstrated the progression of DA-specific programs. We observed a steady increase in the mature DA neuron marker TH and in ASCL1. This was accompanied by a gradual decrease in the early floor plate/progenitor markers FOXA2 and PLZF, indicating a successful differentiation path from progenitors to differentiated/mature DA neurons. 2) Glycosphingolipid substrate accumulation (Wks 15 and 28 in Fig. 2): To assess late-stage nGD phenotypes, we compared GluCer and GluSph at Week 15 and Week 28. This comparison highlights the progressive accumulation of substrates in nGD MLOs, reflecting the metabolic consequences of the disease at different stages of maturation. 3) Organoid growth dynamics (Wks 4, 8, and 15 in new Fig. 4): The new Fig. 4 tracks physical maturation through organoid size and growth rates across three key time points, providing a macro-scale verification of consistent development between WT and nGD groups. By comparing these early (Wk 3-8) and late (Wk 15-28) stages, we confirmed that our MLOs transition from a proliferative state to a post-mitotic, specialized neuronal state, satisfying the requirement for comparing distinct maturation stages.

      (f) The manuscript frequently refers to Wnt signaling dysregulation as a major finding. However, experimental validation is limited to transcriptomic data. Functional tests, such as the use of Wnt agonist/inhibitor, are needed to support this claim (see below).

      We agree that the suggested experiments could provide additional mechanistic insights into this study and will consider them in future work.

      (g) Suggested fixes / experiments

      Add at least one more independent disease hiPSC line (or show expanded analysis from GD2-10-257) for key mechanistic endpoints (lipid accumulation, transcriptomics, DA markers).

      MLOs derived from an additional iPSC line, GD2-10-257, were included in the manuscript. This was addressed above [see response to Weaknesses (1)-a].

      Generate and analyze a fully corrected isogenic WT/WT clone (or a P415R-only line) if feasible; at minimum, acknowledge this limitation more explicitly and soften claims.

      We attempted to generate an isogenic iPSC line by correcting both GBA1 mutations (L444P and P415R). However, this was unsuccessful because a pseudogene (PGBA), located 16 kb downstream of GBA1, shares 96-98% sequence similarity with GBA1 (Ref#1, #2), which complicates precise editing. GBA1 is longer (~7.6 kb) than PGBA (~5.7 kb). The primary exonic difference between GBA1 and PGBA is a 55-bp deletion in exon 9 of the pseudogene. As a result, the isogenic line we obtained carries only the P415R mutation, and L444P was corrected to the normal sequence. We have included this limitation in the Methods as “This gene editing strategy is expected to also target the GBA1 pseudogene due to the identical target sequence, which limits the gene correction on certain mutations (e.g., P415R)”.

      References:

      (1) Horowitz M., Wilder S., Horowitz Z., Reiner O., Gelbart T., Beutler E. The human glucocerebrosidase gene and pseudogene: structure and evolution. Genomics (1989). 4, 87–96. doi:10.1016/0888-7543(89)90319-4

      (2) Woo EG, Tayebi N, Sidransky E. Next-Generation Sequencing Analysis of GBA1: The Challenge of Detecting Complex Recombinant Alleles. Front Genet. (2021). 12:684067. doi:10.3389/fgene.2021.684067. PMCID: PMC8255797.

      Report and increase independent differentiations (N = biological replicates) and present per-differentiation summary statistics.

      This was addressed above [see response to Weaknesses (1)-b, (1)-c]. 

      (2) Mechanistic validation is insufficient

      (a) RNA-seq pathways (Wnt, mTOR, lysosome) are not functionally probed. The manuscript shows pathway enrichment and some protein markers (p-4E-BP1) but lacks perturbation/rescue experiments to link these pathways causally to the DA phenotype.

      (b) Autophagy analysis lacks flux assays. LC3-II and LAMP1 are informative, but without flux assays (e.g., bafilomycin A1 or chloroquine), one cannot distinguish increased autophagosome formation from decreased clearance.

      (c) Dopaminergic dysfunction is superficially assessed. Dopamine in the medium and TH protein are shown, but no neuronal electrophysiology, synaptic marker co-localization, or viability measures are provided to demonstrate functional recovery after therapy.

      (d) Suggested fixes/experiments

      Perform targeted functional assays:

      (i) Wnt reporter assays (TOP/FOP flash) and/or treat organoids with Wnt agonists/antagonists to test whether Wnt modulation rescues DA differentiation.

      (ii) Test mTOR pathway causality using mTOR inhibitors (e.g., rapamycin) or 4E-BP1 perturbation and assay effects on DA markers and autophagy.

      Include autophagy flux assessment (LC3 turnover with bafilomycin), and measure cathepsin activity where relevant.

      Add at least one functional neuronal readout: calcium imaging, MEA recordings, or synaptic marker quantification (e.g., SYN1, PSD95) together with TH colocalization.

      We thank the reviewer for these valuable suggestions. We agree that the suggested experiments could provide additional mechanistic insights into this study and will consider them in future work. Importantly, the primary conclusions of our manuscript, that GBA1 mutations in nGD MLOs resulted in nGD pathologies such as diminished enzymatic function, accumulation of lipid substrates, widespread transcriptomic changes, and impaired dopaminergic neuron differentiation, which can be corrected by several therapeutic strategies in this study, are supported by the evidence presented. The suggested experiments represent an important direction for future research using brain organoids.

      (3) Therapeutic evaluation needs greater depth and standardization

      (a) Short windows and limited durability data. SapC-DOPS and AAV9 experiments range from 48 hours to 3 weeks; longer follow-up is needed to assess durability and whether biochemical rescue translates into restored neuronal function.

      We agree with the reviewer. Because this is a proof-of-principle study, the treatment was designed within a short time window. Long-term studies with more comprehensive outcome assessments will be conducted in future work.

      (b) Dose-response and biodistribution are under-characterized. AAV injection sites/volumes are described, but transduction efficiency, vg copies per organoid, cell-type tropism quantification, and SapC-DOPS penetration/distribution are not rigorously quantified.

      We appreciate the reviewer’s concerns. This study was intended to demonstrate the feasibility and initial response of MLOs to AAV therapy. A comprehensive evaluation of AAV biodistribution will be considered in future studies.

      The penetration and distribution of SapC-DOPS have been extensively characterized in prior studies. In vivo biodistribution of SapC-DOPS coupled to CellVue Maroon, a fluorescent cargo, was examined in mice bearing human tumor xenografts using real-time fluorescence imaging; CellVue Maroon fluorescence in the tumor persisted for 48 hours (Ref. #3: Fig. 4B, mouse 1), 100 hours (Ref. #4: Fig. 5), and up to 216 hours (Ref. #5: Fig. 3). Uptake kinetics were also demonstrated in cells: flow cytometry quantification showed that SapC-DOPS nanovesicles coupled to fluorescent cargo were incorporated into human brain tumor cell membranes within minutes and remained stably incorporated for up to one hour (Ref. #6: Fig. 1a and Fig. 1b). Building on these findings, the present study focuses on evaluating the restoration of GCase function rather than reexamining biodistribution and uptake kinetics.

      References:

      (3) X. Qi, Z. Chu, Y.Y. Mahller, K.F. Stringer, D.P. Witte, T.P. Cripe. Cancer-selective targeting and cytotoxicity by liposomal-coupled lysosomal saposin C protein. Clin. Cancer Res. (2009) 15, 5840-5851. PMID: 19737950.

      (4) Z. Chu, S. Abu-Baker, M.B. Palascak, S.A. Ahmad, R.S. Franco, and X. Qi. Targeting and cytotoxicity of SapC-DOPS nanovesicles in pancreatic cancer. PLOS ONE (2013) 8, e75507. PMID: 24124494.

      (5) Z. Chu, K. LaSance, V.M. Blanco, C.-H. Kwon, B. Kaur, M. Frederick, S. Thornton, L. Lemen, and X. Qi. Multi-angle rotational optical imaging of brain tumors and arthritis using fluorescent SapC-DOPS nanovesicles. J. Vis. Exp. (2014) 87, e51187, 17. PMID: 24837630.

      (6) J. Wojton, Z. Chu, C-H. Kwon, L.M.L. Chow, M. Palascak, R. Franco, T. Bourdeau, S. Thornton, B. Kaur, and X. Qi. Systemic delivery of SapC-DOPS has antiangiogenic and antitumor effects against glioblastoma. Mol. Ther. (2013) 21, 1517-1525. PMID: 23732993.

      (c) Specificity controls are missing. For SapC-DOPS, inclusion of a non-functional enzyme control (or heat-inactivated fGCase) would rule out non-specific nanoparticle effects. For AAV, assessment of off-target expression and potential cytotoxicity is needed.

      Including inactive fGCase would confound the assessment of fGCase in MLOs by immunoblot and immunofluorescence; therefore, saposin C–DOPS was used as the control instead. 

      We agree that assessment of off-target expression and potential cytotoxicity for AAV is important; this will be included in future studies.

      (d) Comparative efficacy lacking. It remains unclear which modality is most effective in the long term and in which cellular compartments.

      To address this comment, we have added a new table (Supplementary Table 2) comparing the four therapeutic modalities and summarizing their respective outcomes. While this study focused on short-term responses as a proof-of-principle, future work will explore long-term therapeutic effects. 

      (e) Suggested fixes/experiments

      Extend follow-up (e.g., 6+ weeks) after AAV/SapC dosing and evaluate DA markers, electrophysiology, and lipid levels over time.

      We appreciate the reviewer’s suggestions. The therapeutic testing in patient-derived MLOs was designed as a proof-of-principle study to demonstrate feasibility and the primary response (rescue of GCase function) to the treatment. A comprehensive, long-term therapeutic evaluation of AAV and SapC-DOPS-fGCase is indeed important for a complete assessment; however, this represents a separate therapeutic study and is beyond the scope of the current work.

      Quantify AAV transduction by qPCR for vector genomes and by cell-type quantification of GFP+ cells (neurons vs astrocytes vs progenitors).

      For the AAV-treated experiments, we agree that measuring AAV copy number and GFP expression would provide additional information. However, the primary goal of this study was to demonstrate the key therapeutic outcome, rescue of GCase function by AAV-delivered normal GCase, which is directly relevant to the treatment objective.

      Include SapC-DOPS control nanoparticles loaded with an inert protein and/or fluorescent cargo quantitation to show distribution and uptake kinetics.

      As noted above [see response to Weakness (3)-c], using inert GCase would confound the assessment of fGCase uptake in MLOs; therefore, it was not suitable for this study. See response above for the distribution and uptake kinetics of SapC-DOPS [see response to Weaknesses (3)-b].

      Provide head-to-head comparative graphs (activity, lipid clearance, DA restoration, and durability) with statistical tests.

      We have added a new table (Supplementary Table 2) providing a head-to-head comparison of the treatment effects. 

      (4) Model limitations not fully accounted for in interpretation

      (a) Absence of microglia and vasculature limits recapitulation of neuroinflammatory responses and drug penetration, both of which are important in nGD. These absences could explain incomplete phenotypic rescues and must be emphasized when drawing conclusions about therapeutic translation.

      We agree that the absence of microglia and vasculature in midbrain-like organoids represents a limitation, as we have discussed in the manuscript. In this revision, we highlighted this limitation in the Discussion section and clarified that it may contribute to incomplete phenotyping and phenotypic rescue observed in our therapeutic experiments. Additionally, we have outlined future directions to incorporate microglia and vascularization into the organoid system to better recapitulate the in vivo environment and improve translational relevance (see 7th paragraph in the Discussion).

      (b) Developmental vs degenerative phenotype conflation. Many phenotypes appear during differentiation (patterning defects). The manuscript sometimes interprets these as degenerative mechanisms; the distinction must be clarified.

      We appreciate the reviewer’s comments. In the revised manuscript, we have clarified that certain abnormalities, such as patterning defects observed during early differentiation, likely reflect developmental consequences of GBA1 mutations rather than degenerative processes. Conversely, phenotypes such as substrate accumulation, lysosomal dysfunction, and impaired dopaminergic maturation at later stages are interpreted as degenerative features. We have updated the Results and Discussion sections to avoid conflating developmental defects with neurodegenerative mechanisms.

      (c) Suggested fixes

      Tone down the language throughout (Abstract/Results/Discussion) to avoid overstatement that MLOs fully recapitulate nGD neuropathology.

      The manuscript has been revised to avoid overstatements.

      Add plans or pilot data (if available) for microglia incorporation or vascularization to indicate how future work will address these gaps.

      The manuscript now includes further plans to incorporate microglia and vascularization, described in the last two paragraphs of the Discussion. A pilot study of microglia incorporation will be reported once it is completed.

      (5) Statistical and presentation issues

      (a) Missing or unclear sample sizes (n). For organoid-level assays, report the number of organoids and the number of independent differentiations.

      We have clarified biological replicates and differentiation in the figure legend [see response to Weaknesses (1)-b, (1)-c]. 

      (b) Statistical assumptions not justified. Tests assume normality; where sample sizes are small, consider non-parametric tests and report exact p-values.

      We have updated Statistical analysis in the methods as described below:

      “For comparisons between two groups, data were analyzed using unpaired two-tailed Student’s t-tests when the sample size was ≥6 per group and normality was confirmed by the Shapiro-Wilk test. When the normality assumption was not met or when sample sizes were small (n < 6), the non-parametric Mann-Whitney U test was used instead. For comparisons involving three or more groups, one-way ANOVA followed by Tukey’s multiple comparison test was applied when data were normally distributed; otherwise, the nonparametric Dunn’s multiple comparison test was used. Exclusion of outliers was made based on cut-offs of the mean ±2 standard deviations. All statistical analyses were performed using GraphPad Prism 10 software. Exact p-values are reported throughout the manuscript and figures where feasible. A p-value < 0.05 was considered statistically significant.”
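      The decision rule quoted above can be sketched in a few lines. This is only an illustration, not the authors' analysis (which was run in GraphPad Prism): the function names are hypothetical, the normality flag is assumed to come from a Shapiro-Wilk test performed elsewhere, and the use of the sample standard deviation for the outlier cut-off is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def choose_two_group_test(n_a: int, n_b: int, both_normal: bool) -> str:
    """Select the two-group test following the quoted rule.

    both_normal is assumed to be the outcome of a Shapiro-Wilk test
    run separately on each group (not computed here).
    """
    if min(n_a, n_b) >= 6 and both_normal:
        return "unpaired two-tailed Student's t-test"
    # Small samples (n < 6) or non-normal data fall back to the rank test.
    return "Mann-Whitney U test"


def choose_multi_group_test(all_normal: bool) -> str:
    """Select the test for three or more groups, as named in the quoted text."""
    if all_normal:
        return "one-way ANOVA with Tukey's multiple comparison test"
    return "Dunn's multiple comparison test"


def exclude_outliers(values, n_sd=2.0):
    """Keep values within mean +/- n_sd standard deviations.

    Assumption: the sample standard deviation (n - 1 denominator) is used.
    """
    centre = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - centre) <= n_sd * spread]
```

      Note that with very small groups a mean ± 2 SD cut-off rarely excludes anything, because a single extreme value inflates the standard deviation it is compared against.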

      (c) Quantification scope. Many image quantifications appear to be from selected fields of view, which are then averaged across organoids and differentiations.

      In this work, quantitative immunofluorescence analyses (e.g., cell counts for FOXP1+, FOXG1+, SOX2+ and Ki67+ cells, as well as marker colocalization) were performed on at least 3–5 randomly selected non-overlapping fields of view (FOVs) per organoid section, with a minimum of 3 organoids per differentiation batch. Each FOV was imaged at consistent magnification (60x) and z-stack depth to ensure comparable sampling across conditions. Data from individual FOVs were first averaged within each organoid to obtain an organoid-level mean, and then biological replicates (independent differentiations, n ≥ 3) were averaged to generate the final group mean ± SEM. This multilevel averaging approach minimizes bias from regional heterogeneity within organoids and accounts for variability across differentiations. Representative confocal images shown in the figures were selected to accurately reflect the quantified data. We believe this standardized quantification strategy ensures robust and reproducible results while appropriately representing the 3D architecture of the organoids.

      In the revision, we have clarified the method used for image analysis of sectioned MLOs as below:

      “Quantitative immunofluorescence analyses (e.g., cell counts for FOXP1+, FOXG1+, SOX2+ and Ki67+ cells, as well as marker colocalization) were performed using ImageJ (NIH) on at least 3–5 randomly selected non-overlapping fields of view (FOVs) per organoid section, with a minimum of 3 organoids per differentiation batch. Each FOV was imaged at consistent magnification (60x) and z-stack depth to ensure comparable sampling across conditions. Data from individual FOVs were first averaged within each organoid to obtain an organoid-level mean, and then biological replicates (independent differentiations, n ≥ 3) were averaged to generate the final group mean ± SEM.”
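      The multilevel averaging described above (FOVs within an organoid, then organoids, then independent differentiations) can be sketched as follows. This is a minimal sketch under stated assumptions: the nested-dictionary layout and the function name are hypothetical, and averaging organoid-level means within each differentiation before computing the group mean ± SEM is our reading of the text rather than a confirmed detail of the authors' workflow.

```python
from math import sqrt
from statistics import mean, stdev


def group_mean_sem(data):
    """Return (group mean, SEM, n) from nested FOV measurements.

    data: {differentiation_id: {organoid_id: [per-FOV values]}}
    The unit of replication is the independent differentiation.
    """
    differentiation_means = []
    for organoids in data.values():
        # Step 1: average FOVs within each organoid.
        organoid_means = [mean(fovs) for fovs in organoids.values()]
        # Step 2 (assumed): average organoids within each differentiation.
        differentiation_means.append(mean(organoid_means))
    n = len(differentiation_means)
    # Step 3: group mean and SEM across differentiations (requires n >= 2).
    sem = stdev(differentiation_means) / sqrt(n)
    return mean(differentiation_means), sem, n
```

      Treating the differentiation, rather than the FOV or the organoid, as the replicate keeps n equal to the number of independent batches and avoids inflating significance through pseudo-replication.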

      (d) RNA-seq QC and deposition. Provide mapping rates, batch correction details, and ensure the GEO accession is active. Include these in Methods/Supplement.

      RNA-seq data are from the same batch. The mapping rate is >90%. GEO accession will be active upon publication. These were included in the Methods.

      (e) Suggested fixes

      Add a table summarizing biological replicates, technical replicates, and statistical tests used for each figure panel.

      We have revised the figure legends to include replicates for each figure and statistical tests [see response in weaknesses (1)-b, (1)-c].

      Recompute statistics where appropriate (non-parametric if N is small) and report effect sizes and confidence intervals.

      The statistical analysis method is provided in the revision [see response in Weaknesses (5)-b].

      (6) Minor comments and clarifications

      (a) The authors should validate midbrain identity further with additional regional markers (EN1, OTX2) and show absence/low expression of forebrain markers (FOXG1) across replicates.

      We validated the MLO identity using 1) FOXG1 and 2) EN1. FOXG1 was barely detectable in Wk8 75.1_MLO but highly expressed in the ‘age-matched’ cerebral organoid (CO), suggesting that our culture method is midbrain-oriented. In nGD MLO, FOXG1 expression was significantly higher than in 75.1_MLO, indicating aberrant anterior-posterior brain specification, consistent with the transcriptomic dysregulation observed in our RNA-seq data.

      To further confirm midbrain identity, we examined the expression of EN1, an established midbrain-specific marker. Quantitative RT-PCR analysis demonstrated that EN1 expression increased progressively during differentiation in both WT-75.1 and nGD2-1260 MLOs at weeks 3 and 8 (Author response image 1). EN1 reached 34-fold and 373-fold higher levels than in WT-75.1 iPSCs at weeks 3 and 8, respectively, in WT-75.1 MLOs. In nGD MLOs, although EN1 expression showed a modest reduction at week 8, the levels were not significantly different from those observed in age-matched WT-75.1 MLOs (p > 0.05, ns).

      Author response image 1.

      qRT-PCR quantification of midbrain progenitor marker EN1 expression in WT-75.1 and GD2-1260 MLOs at Wk3 and Wk8. Data were normalized to WT-75.1 hiPSCs and presented as mean ± SEM (n = 3-4 MLOs per group). ns, not significant.
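      Fold changes relative to a calibrator sample, such as those reported for EN1 relative to iPSCs, are often computed with the 2^-ΔΔCt approach. The response does not state which method was used, so the sketch below is an assumption for illustration only: it presumes a reference (housekeeping) gene and roughly 100% amplification efficiency, and the function and parameter names are hypothetical.

```python
def fold_change(ct_target_sample: float, ct_ref_sample: float,
                ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Relative expression by the 2^-ddCt method (assumed, not confirmed).

    Each dCt is the target-gene Ct minus the reference-gene Ct; the
    calibrator plays the role of the iPSC baseline in the text.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)
```

      With these hypothetical Ct values, `fold_change(25.0, 18.0, 30.0, 18.0)` gives a 32-fold increase, because the target amplifies five cycles earlier in the sample than in the calibrator after reference-gene correction.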

      (b) Extracellular dopamine ELISA should be complemented with intracellular dopamine or TH+ neuron counts normalized per organoid or per total neurons.

      We quantified TH expression at both the mRNA level (Fig. 3F) and the protein level (Fig. 3G/H) from whole-organoid lysates, which provides a more consistent and integrative measure across samples. These TH expression levels correlated well with the corresponding extracellular (medium) dopamine concentrations for each genotype. In contrast, TH⁺ neuron counts may not reliably reflect total cellular dopamine levels because the number of cells captured on each organoid section varies substantially, making normalization difficult. Measuring intracellular dopamine is an alternative approach that will be considered in future studies.

      (c) For CRISPR editing: the authors should report off-target analysis (GUIDE-seq or targeted sequencing of predicted off-targets) or at least in-silico off-target score and sequencing coverage of the edited locus.

      Off-target effects were analyzed during gene editing, and the chance of editing off-target sites is low, as indicated by the low off-target scores ranked by the MIT Specificity Score analysis. The related method was also updated as stated below:

      “The chance to target other Off-targets is low due to low Off-target scores ranked based on the MIT Specificity Score analysis (Hsu, P., Scott, D., Weinstein, J. et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol 31, 827–832 (2013).https://doi.org/10.1038/nbt.2647).”

      (d) It should be clarified as to whether lipidomics normalization is to total protein per organoid or per cell, and include representative LC-MS chromatograms or method QC.

      Lipid levels were normalized to the total protein of the organoid lysate. This was clarified in the Methods section in the revision as stated below:

      “The GluCer and GluSph levels in MLO were normalized to total MLO protein (mg) that were used for glycosphingolipid analyses. Protein mass was determined by BCA assay and glycosphingolipid was expressed as pmol/mg protein. Additionally, GluSph levels in the culture medium were quantified and normalized to the medium volume (pmol/mL).”
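      The two normalizations described above amount to simple ratios. The sketch below only illustrates the units involved (pmol/mg protein for lysates, pmol/mL for medium); the function names and the input checks are hypothetical additions, not part of the authors' pipeline.

```python
def normalize_to_protein(lipid_pmol: float, protein_mg: float) -> float:
    """Lipid content of an MLO lysate expressed as pmol per mg total protein."""
    if protein_mg <= 0:
        raise ValueError("protein mass must be positive")
    return lipid_pmol / protein_mg


def normalize_to_volume(lipid_pmol: float, medium_ml: float) -> float:
    """Lipid content of culture medium expressed as pmol per mL."""
    if medium_ml <= 0:
        raise ValueError("medium volume must be positive")
    return lipid_pmol / medium_ml
```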

      Representative LC-MS chromatograms for both normal and GD MLOs have been included in a new figure, Supplementary Figure 2.

      (e) Figure legends should be improved in order to state the number of organoids, the number of differentiations, and the exact statistical tests used (including multiple-comparison corrections).

      This was addressed above [see response to Weaknesses (1)-b and (5)-b].

      (f) In the title, the authors state "reveal disease mechanisms", but the studies mainly exhibit functional changes. They should consider toning down the statement.

      The title was revised to: Patient-Specific Midbrain Organoids with CRISPR Correction Recapitulate Neuronopathic Gaucher Disease Phenotypes and Enable Evaluation of Novel Therapies

      (7) Recommendations

      This reviewer recommends a major revision. The manuscript presents substantial novelty and strong potential impact but requires additional experimental validation and clearer, more conservative interpretation. Key items to address are:

      (a) Strengthening genetic and biological replication (additional lines or replicate differentiations).

      This was addressed above [see response to Weaknesses (1)-a, (1)-b, (1)-c].

      (b) Adding functional mechanistic validation for major pathways (Wnt/mTOR/autophagy) and providing autophagy flux data.

      (c) Including at least one neuronal functional readout (calcium imaging/MEA/patch) to demonstrate functional rescue.

      As addressed above [see response to Weaknesses (2)], the suggested experiments in b) and c) would provide additional insights into this study and we will consider them in future work. 

      (d) Deepening therapeutic characterization (dose, biodistribution, durability) and including specificity controls.

      This was addressed above [see response to Weaknesses (3)-a to e].

      (e) Improving statistical reporting and explicitly stating biological replicate structure.

      This was addressed above [see response to Weaknesses (1)-b, (5)-b].

      Reviewer #2 (Public review):

      Sun et al. have developed a midbrain-like organoid (MLO) model for neuronopathic Gaucher disease (nGD). The MLOs recapitulate several features of nGD molecular pathology, including reduced GCase activity, sphingolipid accumulation, and impaired dopaminergic neuron development. They also characterize the transcriptome in the MLO nGD model. CRISPR correction of one of the GBA1 mutant alleles rescues most of the nGD molecular phenotypes. The MLO model was further deployed in proof-of-principle studies of investigational nGD therapies, including SapC-DOPS nanovesicles, AAV9-mediated GBA1 gene delivery, and substrate-reduction therapy (GZ452). This patient-specific 3D model provides a new platform for studying nGD mechanisms and accelerating therapy development. Overall, only modest weaknesses are noted.

      We thank the reviewer for the supportive remarks.

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors describe modeling of neuronopathic Gaucher disease (nGD) using midbrain-like organoids (MLOs) derived from hiPSCs carrying GBA1 L444P/P415R or L444P/RecNciI variants. These MLOs recapitulate several disease features, including GCase deficiency, reduced enzymatic activity, lipid substrate accumulation, and impaired dopaminergic neuron differentiation. Correction of the GBA1 L444P variant restored GCase activity, normalized lipid metabolism, and rescued dopaminergic neuronal defects, confirming its pathogenic role in the MLO model. The authors further leveraged this system to evaluate therapeutic strategies, including: (i) SapC-DOPS nanovesicles for GCase delivery, (ii) AAV9-mediated GBA1 gene therapy, and (iii) GZ452, a glucosylceramide synthase inhibitor. These treatments reduced lipid accumulation and ameliorated autophagic, lysosomal, and neurodevelopmental abnormalities.

      Strengths:

      This manuscript demonstrates that nGD patient-derived MLOs can serve as an additional platform for investigating nGD mechanisms and advancing therapeutic development.

      Comments:

      (1) It is interesting that GBA1 L444P/P415R MLOs show defects in midbrain patterning and dopaminergic neuron differentiation (Figure 3). One might wonder whether these abnormalities are specific to the combination of L444P and P415R variants or represent a general consequence of GBA1 loss. Do GBA1 L444P/RecNciI (GD2-10-257) MLOs also exhibit similar defects?

      We observed reduced dopaminergic neuron marker TH expression in GBA1 L444P/RecNciI (GD2-10-257) MLOs, suggesting that this line also exhibits defects in dopaminergic neuron differentiation. These data are provided in a new Supplementary Fig. 4E, and are summarized in new Supplementary Table 2 in the revision.

      (2) In Supplementary Figure 3, the authors examined GCase localization in SapC-DOPS-fGCase-treated nGD MLOs. These data indicate that GCase is delivered to TH⁺ neurons, GFAP⁺ glia, and various other unidentified cell types. In fruit flies, the GBA1 ortholog, Gba1b, is only expressed in glia (PMID: 35857503; 35961319). Neuronally produced GluCer is transferred to glia for GBA1-mediated degradation. These findings raise an important question: in wild-type MLOs, which cell type(s) normally express GBA1? Are they dopaminergic neurons, astrocytes, or other cell types?

      All cell types in wild-type MLOs are expected to express GBA1, as it is a housekeeping gene broadly expressed across neurons, astrocytes, and other brain cell types. Its lysosomal function is essential for cellular homeostasis and is therefore not restricted to any specific lineage. (https://www.proteinatlas.org/ENSG00000177628GBA1/brain/midbrain). 

      (3) The authors may consider switching Figures 2 and 3 so that the differentiation defects observed in nGD MLOs (Figure 3) are presented before the analysis of other phenotypic abnormalities, including the various transcriptional changes (Figure 2).

      We appreciate the reviewer’s suggestion; however, we respectfully prefer to retain the current order of Figures 2 and 3, as we believe this structure provides the clearest narrative flow. Figure 2 establishes the core biochemical hallmarks: reduced GCase activity, substrate accumulation, and global transcriptomic dysregulation (1,429 DEGs enriched in neural development, WNT signaling, and lysosomal pathways), which together provide essential molecular context for studying the specific cellular differentiation defects presented in Figure 3. Presenting the broader disease landscape first creates a coherent mechanistic link to the subsequent analyses of midbrain patterning and dopaminergic neuron impairment.

      To enhance readability, we have added a brief transitional sentence at the start of the Figure 3 paragraph: “Building on the molecular and transcriptomic hallmarks of GCase deficiency observed in nGD MLOs (Figure 2), we next investigated the impact on midbrain patterning and dopaminergic neuron differentiation (Figure 3).”

    1. Author response:

      The following is the authors’ response to the original reviews.

      Joint Public reviews:

      (1) Stable annual dynamics vs. episodic outbreaks

      We agree that RVF is classically described as producing periodic epidemics interspersed with long inter-epidemic periods, often linked to extreme rainfall events. Our model predicts more regular seasonal dynamics, which reflects the endemic transmission patterns we have observed in The Gambia through serological surveys. In this revision, we have:

      - clarified that while epidemics occur in other parts of sub-Saharan Africa, our results are consistent with the epidemiological narrative of RVF in The Gambia, characterised by sustained, moderate transmission without resulting in substantial outbreaks (hyperendemicity).

      - discussed how model assumptions (e.g. seasonality, homogenous mixing) may bias our results toward an endemic quasi-equilibrium dynamic.

      - highlighted the implications of this for interpretation and for public health decision-making.

      (2) Use of network analysis

      We acknowledge the reviewer’s concern. The network analysis was conducted descriptively to characterize cattle movement patterns and the structure of herd connections, but it was not formally incorporated into the model. In this revision we have:

      - clarified this distinction in the manuscript to avoid overinterpretation.

      - emphasized the need for future modelling work using finer-scale movement data, which could support more realistic herd metapopulation dynamics and better capture heterogeneity in transmission.

      (3) RVFV reproductive impacts

      While RVF outbreaks are known to cause substantial abortions and neonatal deaths, these events occur during sporadic epidemics. In the Gambian context, where we’re not observing large outbreaks but rather low-level circulation, the annual impact of RVF infection on births is likely modest compared to baseline herd turnover. Moreover, cattle demography is partly managed, with replacement and movement buffering birth rates against short-term losses.

      Our model includes birth as a constant demographic process; assuming a stable population is reasonable because we are not explicitly modelling outbreak-scale reproductive losses. This approach is consistent with other RVF transmission models that adopt a similar simplifying assumption. However, we have acknowledged this simplification as a limitation in the revised manuscript.

      (4) Missing ODEs for M herds in the dry season

      We thank the reviewer for identifying this omission. The ODEs for the M subpopulation in the dry season were not included in the appendix due to an oversight, though demographic turnover was implemented in the model code. We have now added the missing equations to the appendix.

      (5) Role of immunity loss and model structure (SIR vs. SIRS)

      We acknowledge that the decline of detectable antibodies over time (seropositivity decay) is an important consideration in RVFV serology; however, whether this decline reflects a true loss of protective immunity following natural infection remains unknown. Available evidence suggests that infected cattle likely develop long-lasting immunity, and findings in humans further support this assumption, although longitudinal field data regarding RVFV-specific antibody durability in animals are not available to the best of our knowledge. From a modelling perspective, our objective was to estimate FOI and use it to predict an age-seroprevalence curve consistent with the observed cross-sectional age-seroprevalence patterns. We therefore adopted a parsimonious SIR framework, interpreting loss of seropositivity as a potential explanation for discrepancies between observed and predicted age-seroprevalence rather than explicitly modelling waning immunity. We have now:

      - clarified this rationale, emphasising that there is no direct evidence for waning immunity following natural RVFV infection in cattle, although evidence of seropositivity decay has been suggested in humans.

      - highlighted that while an SEIS/SIRS framework could theoretically generate different long-term dynamics, evaluating this approach requires stronger evidence for true immunity loss.
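
      As a hedged illustration of the reasoning in point (5), the sketch below compares the age-seroprevalence curve expected when seropositivity is lifelong with the curve expected when antibodies decay. It assumes a constant force of infection and a constant seroreversion rate, a simplification chosen only for illustration (the manuscript estimates FOI within the transmission model). The function name and all parameter values are hypothetical.

```python
import math

def seroprevalence(age, foi, seroreversion=0.0):
    """Expected proportion seropositive at a given age (years).

    Assumes a constant force of infection `foi` (per year) and an
    optional constant seroreversion rate (per year). Both values are
    illustrative and are not taken from the manuscript.
    """
    total = foi + seroreversion
    if total == 0:
        return 0.0
    # Equilibrium fraction foi/total, approached at rate `total`.
    return (foi / total) * (1.0 - math.exp(-total * age))

ages = [1, 3, 5, 10]
# Lifelong seropositivity (SIR-like interpretation).
no_decay = [seroprevalence(a, foi=0.1) for a in ages]
# Same FOI, but antibodies decay at a hypothetical 0.05 per year.
with_decay = [seroprevalence(a, foi=0.1, seroreversion=0.05) for a in ages]
```

      Under these assumptions the curve with seroreversion lies below the lifelong-seropositivity curve at every age, which is the kind of discrepancy between observed and predicted seroprevalence that the response attributes to possible seropositivity decay.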

      (6) RVFV induced mortality in serocatalytic model

      We thank the reviewer for this comment and for raising an important conceptual point. However, the force of infection in our study is not estimated using a serocatalytic framework. Instead, FOI is estimated mechanistically within the transmission model as a function of the number of infectious cattle, rather than from age-stratified seroprevalence data.

      RVF-induced mortality is accounted for through its effect on the infectious compartment, where increased mortality reduces the number and duration of infectious cattle and therefore indirectly reduces FOI. Consequently, RVF-related cattle death does not need to be explicitly incorporated into the FOI expression itself. Seroreversion similarly does not influence FOI estimation under this modelling framework. We have clarified this distinction in the Methods section to avoid confusion between mechanistic transmission models and serocatalytic approaches.
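
      The indirect effect described above can be sketched as follows. This is a minimal illustration that assumes a frequency-dependent force of infection and exponentially distributed durations; the functional form used in the manuscript may differ, and all names and rates are hypothetical.

```python
def mean_infectious_period(recovery_rate, natural_mortality, disease_mortality):
    """Mean time an animal spends in the infectious compartment.

    All rates share one time unit; the result is in that unit.
    """
    return 1.0 / (recovery_rate + natural_mortality + disease_mortality)

def force_of_infection(beta, infectious, population):
    """Frequency-dependent FOI (an assumption): lambda = beta * I / N."""
    return beta * infectious / population

# Hypothetical rates: disease-induced mortality shortens the infectious period,
# which lowers the number of infectious animals feeding into the FOI.
without_mortality = mean_infectious_period(0.2, 0.001, 0.0)
with_mortality = mean_infectious_period(0.2, 0.001, 0.05)
```

      In this sketch mortality never appears in the FOI expression itself; it acts only through the number of infectious cattle, which matches the distinction drawn in the response.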

      (7) Clarifying previous vs. current study components

      We have revised the Methods and Appendix to make clearer distinctions between our previous work (e.g. household survey data collection, seroprevalence estimates) and the analyses undertaken for this manuscript (e.g. model development and fitting).

      (8) Limitations paragraph

      We have expanded the limitations section to identify the sparse household movement data as contributing most to uncertainty. We have outlined how these limitations may have implications for our conclusions, and may lead to under- or over-estimation of periods of heightened transmission risk.

      (9) Movement ban simulations & suitability of model for vaccination interventions

      We appreciate the reviewer’s concerns regarding the movement ban simulation. On reassessment, we agree that our model structure might not be ideally suited to exploring a movement ban. In this revised manuscript, we have removed this analysis. We are currently developing separate work focused on RVF vaccination strategies in cattle, where this model structure might be more directly applicable, and will reserve a deeper investigation of vaccination interventions for that forthcoming publication.

      Reviewer #1 (Recommendations for the authors):

      We thank the reviewer for the recommendations regarding the Introduction, Methods, Results, and Supplementary Figures. We have addressed these points below and revised the manuscript accordingly.

      (1) Introduction: Should avoid describing as "inaccessible" the regions that are inhabited by nomadic and transhumant pastoralists.

      We have revised the wording to “hard-to-reach” regions.

      (2) Methods: Can the authors state what share of the animals included in the household survey data were cattle as opposed to other small ruminants? It would be helpful to understand what share of the data is "excluded"

      We have now included the total number of cattle sampled, providing clarity on the proportion of data used in the analyses.

      (3) Methods: When introducing the deterministic model, it seems unnecessary to mention the initialization conditions (i.e., introduction of a single infected individual at time 0) when this is later repeated in the Estimation of model parameters section, where it seems simulations were first conducted.

      We have removed the redundant description.

      (4) Results: Could the negative correlation between geographic distance of connected herds and mean seroprevalence simply indicate proximal exposure rather than common risk factors?

      We acknowledge that both mechanisms are plausible. RVFV transmission is strongly influenced by shared environmental factors that shape mosquito dynamics; however, direct transmission between proximal cattle herds may also occur through close contact with infectious tissues, bodily fluids, or contaminated materials. We have clarified this interpretation in the Results section.

      (5) Figure S5: inconsistent notation for the scaling factor parameter (tau), which is expressed in equations and tables as psi.

      We thank the reviewer for identifying this issue and have corrected all instances to ensure consistent use of tau throughout the manuscript.

      (6) Figure S6: Why a density plot, isn't the number of temporary extinctions (x-axis) discrete?

      We have replaced the density plot with a bar plot in Figure S6.
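
      For a discrete quantity such as the number of temporary extinctions per simulation, the bar heights are simply the tallies of each integer value. The sketch below shows one way to compute them; the function name and the example data are hypothetical and are not the manuscript's results.

```python
from collections import Counter

def count_frequencies(values):
    """Return (value, count) pairs for every integer from min to max.

    Values that never occur are kept with a count of 0 so that gaps
    remain visible on the x-axis of a bar plot.
    """
    counts = Counter(values)
    return [(k, counts.get(k, 0)) for k in range(min(values), max(values) + 1)]

# Hypothetical extinction counts from four simulation runs.
bars = count_frequencies([0, 2, 2, 3])
```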

    1. Author response:

      eLife Assessment

      This useful study examines whether the sugar trehalose coordinates energy supply with the gene programs that build muscle in the cotton bollworm (Helicoverpa armigera). The evidence for this currently is incomplete. The central claim - that trehalose specifically regulates an E2F/Dp-driven myogenic program - is not supported by the specificity of the data: perturbations and sequencing are systemic, alternative explanations such as general energy or amino-acid scarcity remain plausible, and mechanistic anchors are also limited. The work will interest researchers in insect metabolism and development; focused, tissue-resolved measurements together with stronger mechanistic controls would substantially strengthen the conclusions.

      We thank the reviewer for the thoughtful and constructive evaluation of our work and for recognizing its potential relevance to researchers working on insect metabolism and development. We fully agree that our current evidence is preliminary and that the mechanistic link between trehalose and the E2F/Dp‑driven myogenic program needs to be strengthened.

      Our intention was to present trehalose-E2F/Dp coupling as a working model emerging from our data, rather than as a fully established pathway. We agree that systemic manipulations of trehalose and whole‑larval RNA‑seq cannot fully differentiate global metabolic stress from specific effects on myogenic programs. In the revision, we plan to include additional metabolic readouts (e.g., ATP/AMP ratio, key amino acids where available) to better discuss the overall energetic and nutritional state. We will reanalyze our RNA‑seq data to more clearly distinguish broad stress/metabolic signatures from cell‑cycle/myogenic signatures. Furthermore, we will reframe our discussion to explicitly state that we cannot completely rule out a contribution of general energy or amino‑acid scarcity at this stage.

      We acknowledge that, with our current experiments, the specificity for an E2F/Dp‑driven program is inferred mainly from enrichment of E2F targets among differentially expressed genes, and from expression changes in canonical E2F partners and downstream cell‑cycle/myogenic regulators. To address this more rigorously, we are performing targeted qRT-PCR for a panel of well‑characterized E2F/Dp target genes and myogenic markers in larval muscle versus non‑muscle tissues, following trehalose perturbation. Where technically feasible, we are also testing whether partial knockdown of HaE2F or HaDp modifies the effect of trehalose manipulation on selected myogenic markers. These data, even if limited, will help to provide a more direct functional link, and we will include them in the manuscript if completed in time. In parallel, we will soften statements that imply a fully established, trehalose‑specific regulation of E2F/Dp and instead present this as a strong candidate pathway suggested by the current data.
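
      To make concrete what "enrichment of E2F targets among differentially expressed genes" can mean statistically, the sketch below computes a one-sided hypergeometric (over-representation) p-value. This is a generic illustration and not necessarily the test used in the manuscript; the function name and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_targets, n_de, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_universe: genes tested; n_targets: annotated target genes;
    n_de: differentially expressed genes; n_overlap: DE genes that
    are also targets. Exact integer arithmetic via math.comb.
    """
    upper = min(n_targets, n_de)
    total = comb(n_universe, n_de)
    tail = sum(
        comb(n_targets, k) * comb(n_universe - n_targets, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical counts: 10,000 genes, 300 targets, 500 DE genes, 40 in both.
p_value = hypergeom_enrichment_p(10_000, 300, 500, 40)
```

      A small p-value here only indicates over-representation of annotated targets; as the response notes, it does not by itself separate a specific E2F/Dp effect from a broad metabolic stress signature.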

      We fully agree that tissue‑resolved analyses are essential to move from systemic correlations to causality in muscle. We are in the process of standardizing larval muscle dissections and isolating thoracic/abdominal body wall muscle for trehalose, glycogen, and expression assays. We will also compare expression of key metabolic and myogenic genes in muscle versus fat body and midgut under trehalose manipulation. These tissue‑resolved data will directly address whether the transcriptional changes we report are preferentially localized to muscle.

      We are grateful for the reviewer’s critical but encouraging comments. We will moderate our central claims and explicitly consider and discuss alternative explanations. Further, we will add tissue‑resolved and more focused mechanistic data as far as possible within the current revision. We believe these changes will substantially strengthen the manuscript and better align our conclusions with the evidence we presently have.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this work by Mohite et al., they have used transcriptomic and metabolic profiling of H. armigera, muscle development, and S. frugiperda to link energy trehalose metabolism and muscle development. They further used several different bioinformatics tools for network analysis to converge upon transcriptional control as a potential mechanism of metabolite-regulated transcriptional programming for muscle development. The authors have also done rescue experiments where trehalose was provided externally by feeding, which rescues the phenotype. Though the study is exciting, there are several concerns and gaps that lead to the current results as purely speculative. It is difficult to perform any genetic experiments in non-model insects; the authors seem to suggest a similar mechanism could also be applicable in systems like Drosophila; it might be possible to perform experiments to fill some missing mechanistic details.

      A few specific comments below:

      The authors used N-(phenylthio) phthalimide (NPP), a trehalose-6-phosphate phosphatase (TPP) inhibitor. They also find several genes, including enzymes of trehalose metabolism, that change. Further, several myogenic genes are downregulated in bulk RNA sequencing. The major caveat of this experiment is that the NPP treatment leads to reduced muscle development, and so the proportion of the samples from the muscles in bulk RNA sequencing will be relatively lower, which might have led to the results. So, a confirmatory experiment has to be performed where the muscle tissues are dissected and sequenced, or some of the interesting targets could be validated by qRT-PCR. Further to overcome the off-target effects of NPP, trehalose rescue experiments could be useful.

      Thank you for this valuable comment. We will validate the gene expression data using qRT-PCR on muscle tissue samples from both treated and control groups. This will help determine whether the gene expression patterns observed in the RNA-seq data are muscle-specific or systemic.
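
      As a hedged sketch of how such a muscle-specific qRT-PCR comparison is often quantified, the snippet below applies the widely used 2^-ΔΔCt calculation. The response does not state which quantification method will be used, so this is an assumption; it also assumes roughly 100% amplification efficiency, and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs control) by 2^-ddCt.

    Each Ct is normalised to a reference gene measured in the same
    sample; assumes near-100% amplification efficiency.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)

# Hypothetical Ct values: the target amplifies two cycles later in
# treated muscle, corresponding to a fourfold reduction.
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)
```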

      Even the reduction in the levels of ADP, NAD, NADH, and NMN, all of which are essential for efficient energy production and utilization, could be due to the loss of muscles, which perform predominantly metabolic functions due to their mitochondria-rich environment. So it becomes difficult to judge if the levels of these energy molecules' reduction are due to a cause or effect.

      We thank the reviewer for this thoughtful comment and agree that reduced levels of ADP, NAD, NADH, and NMN could arise either from a disturbance of energy metabolism or from loss of mitochondria‑rich muscles. Our current data cannot fully separate these two possibilities. Still, several studies support the interpretation that perturbing trehalose metabolism causes a primary systemic energy deficit that is coupled to mitochondrial function, not merely a passive consequence of tissue loss.

      For example:

      (1) Our previous study in H. armigera showed that chemical inhibition of trehalose synthesis results in depletion of trehalose, glucose, glucose‑6‑phosphate, and suppression of the TCA cycle, indicating reduced energy levels and dysregulated fatty‑acid oxidation (Tellis et al., 2023).

      (2) Chang et al. (2022) showed that trehalose catabolism and mitochondrial ATP production are mechanistically linked. HaTreh1 localizes to mitochondria and physically interacts with ATP synthase subunit α. 20‑hydroxyecdysone increases HaTreh1 expression, enhances its binding to ATP synthase, and elevates ATP content, while knockdown of HaTreh1 or HaATPs‑α reduces ATP levels.

      (3) Similarly, our previous study showed that inhibition of Treh activity in H. armigera generates an “energy‑deficient condition” characterized by deregulation of carbohydrate, protein, fatty‑acid, and mitochondria‑related pathways, and a concomitant reduction in key energy metabolites (Tellis et al., 2024).

      (4) The starvation study in H. armigera has shown that reduced hemolymph trehalose is associated with respiratory depression and large‑scale reprogramming of glycolysis and fatty‑acid metabolism (Jiang et al., 2019).

      These findings support a direct coupling between trehalose availability and systemic energy/redox state. Therefore, the coordinated decrease in ADP, NAD, NADH, and NMN following TPS/TPP silencing is consistent with a primary disturbance of systemic energy and mitochondrial metabolism rather than exclusively a secondary consequence of muscle loss. We agree, however, that the present whole‑larva metabolite measurements do not allow a quantitative partitioning between changes due to altered muscle mass and those due to intrinsic metabolic impairment at the cellular level. Thus, tissue-specific quantification of these metabolites would allow us to directly test whether altered energy metabolites are a cause or consequence of muscle loss.

      References:

      (1) Tellis, M. B., Mohite, S. D., Nair, V. S., Chaudhari, B. Y., Ahmed, S., Kotkar, H. M., & Joshi, R. S. (2024). Inhibition of Trehalose Synthesis in Lepidoptera Reduces Larval Fitness. Advanced Biology, 8(2), 2300404.

      (2) Chang, Y., Zhang, B., Du, M., Geng, Z., Wei, J., Guan, R., An, S. and Zhao, W., 2022. The vital hormone 20-hydroxyecdysone controls ATP production by upregulating the binding of trehalase 1 with ATP synthase subunit α in Helicoverpa armigera. Journal of Biological Chemistry, 298(2).

      (3) Tellis, M., Mohite, S. and Joshi, R., 2024. Trehalase inhibition in Helicoverpa armigera activates machinery for alternate energy acquisition. Journal of Biosciences, 49(3), p.74.

      (4) Jiang, T., Ma, L., Liu, X.Y., Xiao, H.J. and Zhang, W.N., 2019. Effects of starvation on respiratory metabolism and energy metabolism in the cotton bollworm Helicoverpa armigera (Hübner)(Lepidoptera: Noctuidae). Journal of Insect Physiology, 119, p.103951.

      The authors have used this transcriptomic data for pathway enrichment analysis, which led to the E2F family of transcription factors and a reduction in the level of when trehalose metabolism is perturbed. EMSA experiments, though, confirm a possibility of the E2F interaction with the HaTPS/TPP promoter, but it lacks proper controls and competition to test the actual specificity of this interaction. Several transcription factors have DNA-binding domains and could bind any given DNA weakly, and the specificity is ideally known only from competitive and non-competitive inhibition studies.

      We thank the reviewer for this important comment and fully agree that EMSA alone, without appropriate competition and control reactions, cannot establish the specificity or functional relevance of a transcription factor-DNA interaction. In our study, we identified the E2F family from GRN analysis of the RNA-seq data obtained upon HaTPS/TPP silencing, suggesting a potential regulatory connection. We then predicted E2F binding sites on the promoter of HaTPS/TPP. The EMSA experiments were intended as preliminary evidence that E2F can associate with the HaTPS/TPP promoter in vitro. We will clarify this in the manuscript by softening our conclusion to indicate that our data support a “possible E2F-HaTPS/TPP interaction”. We will also perform EMSA with specific and non‑specific competitors to confirm E2F binding to the HaTPS/TPP promoter.

      The work seems to have connected the trehalose metabolism with gene expression changes, though this is an interesting idea, there are no experiments that are conclusive in the current version of the manuscript. If the authors can search for domains in the E2F family of transcription factors that can bind to the metabolite, then, if not, a chip-seq is essential to conclusively suggest the role of E2F in regulating gene expression tuned by the metabolites.

      A previous study in D. melanogaster (Zappia et al., 2016) showed a vital role of E2F in skeletal muscle that is required for animal viability. They showed that Dp knockdown resulted in reduced expression of genes encoding structural and contractile proteins, such as Myosin heavy chain (Mhc), fln, Tropomyosin 1 (Tm1), Tropomyosin 2 (Tm2), Myosin light chain 2 (Mlc2), sarcomere length short (sals) and Act88F, and myogenic regulators, such as held out wings (how), Limpet (Lmpt), Myocyte enhancer factor 2 (Mef2) and spalt major (salm). Also, ChIP-qRT-PCR showed that upstream regions of myogenic genes, such as how, fln, Lmpt, sals, Tm1 and Mef2, were specifically enriched with E2f1, E2f2, and Dp antibodies in comparison with a nonspecific antibody. Further, Zappia et al. (2019) reported a ChIP-seq dataset suggesting that E2F/Dp directly activates the expression of glycolytic and mitochondrial genes during muscle development. Zappia et al. (2023) showed the regulation of one of the glycolytic genes, Phosphoglycerate kinase (Pgk), by E2F during Drosophila development.

      However, the regulation of trehalose metabolic genes by E2F/Dp, and vice versa, has not been studied previously. In our study, we therefore examined the relationship between trehalose metabolism and E2F/Dp in the muscle development of H. armigera.

      References:

      (1) Zappia, M.P. and Frolov, M.V., 2016. E2F function in muscle growth is necessary and sufficient for viability in Drosophila. Nature Communications, 7(1), p.10509.

      (2) Zappia, M.P., Rogers, A., Islam, A.B. and Frolov, M.V., 2019. Rbf activates the myogenic transcriptional program to promote skeletal muscle differentiation. Cell reports, 26(3), pp.702-719.

      (3) Zappia, M. P., Kwon, Y.-J., Westacott, A., Liseth, I., Lee, H. M., Islam, A. B., Kim, J., & Frolov, M. V. (2023a). E2F regulation of the Phosphoglycerate kinase gene is functionally important in Drosophila development. Proceedings of the National Academy of Sciences, 120(15), e2220770120.

      Some of the above concerns are partially addressed in experiments where silencing of E2F/Dp shows similar phenotypes as with NPP and dsRNA. It is also notable that silencing any key transcription factor can have several indirect effects, and delayed pupation and lethality could not be definitely linked to trehalose-dependent regulation.

      Yes. It’s true that silencing of any key transcription factor can have several indirect effects. Our intention was not to argue that delayed pupation and lethality are exclusively due to trehalose-dependent regulation, but that E2F/Dp and HaTPS/TPP silencing showed a consistent set of phenotypes and molecular changes, such as (i) transcriptomic enrichment of E2F targets upon trehalose perturbation, (ii) reduced HaTPS/TPP expression following E2F/Dp silencing, (iii) reduced myogenic gene expression that parallels the phenotypes observed with HaTPS/TPP silencing and (iv) restoration of E2F and Dp expression in E2F/Dp‑silenced insects upon trehalose feeding in the rescue assay. Together, these findings support a functional association between E2F/Dp and trehalose homeostasis. At the same time, we fully acknowledge that these results do not exclude additional, trehalose‑independent roles of E2F/Dp in development.

      Trehalose rescue experiments that rescue phenotype and gene expression are interesting. But is it possible that the fed trehalose is metabolized in the gut and might not reach the target tissue? In which case, the role of trehalose in directly regulating transcription factors becomes questionable. So, a confirmatory experiment is needed to demonstrate that the fed trehalose reaches the target tissues. This could possibly be done by measuring the trehalose levels in muscles post-rescue feeding. Also, rescue experiments need to be done with appropriate control sugars.

      Yes, it’s possible that, to some extent, trehalose is metabolized in the gut. Even though trehalase is present in the insect gut, some of the trehalose will be absorbed via trehalose transporters on the gut lining. The phenotype was not rescued in insects fed the control diet (empty vector and dsHaTPP), which contains chickpea powder, a rich source of amino acids and carbohydrates. Insects are rescued only on a trehalose-containing diet, and not on a control diet that contains other carbohydrates. We agree that direct measurement of trehalose in target tissues will provide important confirmation. For the revised manuscript, we will measure trehalose levels in muscle, gut, and haemolymph after trehalose feeding.

      No experiments are performed with non-target control dsRNA. All the experiments are done with an empty vector. But an appropriate control should be a non-target control.

      Yes, there was no experiment with non-target dsRNA. We previously optimized a protocol for dsRNA delivery and its effectiveness in target knockdown (concentration and time), and have published several research articles using a similar protocol:

      (1) Chaudhari, B.Y., Nichit, V.J., Barvkar, V.T. and Joshi, R.S., 2025. Mechanistic insights in the role of trehalose transporter in metabolic homeostasis in response to dietary trehalose. G3: Genes, Genomes, Genetics, p. jkaf303.

      (2) Barbole, R.S., Sharma, S., Patil, Y., Giri, A.P. and Joshi, R.S., 2024. Chitinase inhibition induces transcriptional dysregulation altering ecdysteroid-mediated control of Spodoptera frugiperda development. Iscience, 27(3).

      (3) Patil, Y.P., Wagh, D.S., Barvkar, V.T., Gawari, S.K., Pisalwar, P.D., Ahmed, S. and Joshi, R.S., 2025. Altered Octopamine synthesis impairs tyrosine metabolism affecting Helicoverpa armigera vitality. Pesticide Biochemistry and Physiology, 208, p.106323.

      (4) Tellis, M.B., Chaudhari, B.Y., Deshpande, S.V., Nikam, S.V., Barvkar, V.T., Kotkar, H.M. and Joshi, R.S., 2023. Trehalose transporter-like gene diversity and dynamics enhances stress response and recovery in Helicoverpa armigera. Gene, 862, p.147259.

      (5) Joshi, K.S., Barvkar, V.T., Hadapad, A.B., Hire, R.S. and Joshi, R.S., 2025. LDH-dsRNA nanocarrier-mediated spray-induced silencing of juvenile hormone degradation pathway genes for targeted control of Helicoverpa armigera. International Journal of Biological Macromolecules, p.148673.

      The same vector backbone and preparation procedures were used for both control and experimental constructs, allowing us to specifically compare the effects of the target dsRNA. The phenotypes and gene expression changes we observed were specific to the target genes and were not seen in the empty vector controls, suggesting that the effects are not due to nonspecific responses of dsRNA delivery or vector components.

      We acknowledge your suggestions, and in future studies, we will keep non-target dsRNA as a control in silencing assays.

      Reviewer #2 (Public review):

      Summary:

      This study shows that the knockdown of the effects of TPS/TPP in Helicoverpa armigera and Spodoptera frugiperda can be rescued by trehalose treatment. This suggests that trehalose metabolism is necessary for development in the tissues that NPP and dsRNA can reach.

      Strengths:

      This study examines an important metabolic process beyond model organisms, providing a new perspective on our understanding of species-specific metabolism equilibria, whether conserved or divergent.

      Weaknesses:

      While the effects observed may be truly conserved across Lepidopterans and may be muscle-specific, the study largely relies on one species and perturbation methods that are not muscle-specific. The technical limitations arising from investigations outside model systems, where solid methods are available, limit the specificity of inferences that may be drawn from the data.

      Thank you for pointing out this experimental weakness. We will validate the gene expression data using qRT-PCR on muscle tissue samples from both treated and control groups. We will also perform metabolite analysis with muscle samples. This will help to determine whether the observed gene expression patterns and metabolite changes are muscle-specific or systemic.

      Reviewer #3 (Public review):

      The hypothesis is that Trehalose metabolism regulates transcriptional control of muscle development in lepidopteran insects.

      The manuscript investigates the role of Trehalose metabolism in muscle development. Through sequencing and subsequent bioinformatics analysis of insects with perturbed trehalose metabolism (knockdown of TPS/TPP), the authors have identified transcription factor E2F, which was validated through RT-PCR. Their hypothesis is that trehalose metabolism regulates E2F, which then controls the myogenic genes. Counterintuitive to this hypothesis, the investigators perform EMSAs with the E2F protein and promoter of the TPP gene and show binding. Their knockdown experiments with Dp, the binding partner of E2F, show direct effect on several trehalose metabolism genes. Similar results are demonstrated in the trehalose feeding experiment, where feeding trehalose leads to partial rescue of the phenotype observed as a result of Dp knockdown. This seems contradictory to their hypothesis. Even more intriguing is a similar observation between paramyosin, a structural muscle protein, and E2F/Dp - they show that paramyosin regulates E2F/Dp and E2F/Dp regulated paramyosin. The only plausible way to explain the results is the existence of a feed-forward loop between TPP-E2F/Dp and paramyosin-E2F/Dp. But the authors have mentioned nothing in this line. Additionally, I think trehalose metabolism impacts amino acid content in insects, and that will have a direct bearing on muscle development. The sequencing analysis and follow-up GSEA studies have demonstrated enrichment of several amino acid biosynthetic genes. Yet authors make no efforts to measure amino acid levels or correlate them with muscle development. Any study aiming to link trehalose metabolism and muscle development and not considering the above points will be incomplete.

      We appreciate the reviewer’s careful evaluation of this manuscript and the constructive comments. Based on our data and earlier reports, we find it difficult to support a linear pathway (“trehalose → E2F → muscle”); we instead propose a regulatory module in which trehalose metabolism and E2F/Dp form an interdependent circuit controlling myogenic genes. E2F/Dp binds and activates trehalose metabolism genes (TPS/TPP, Treh1) and myogenic structural genes, consistent with EMSA (TPS/TPP-E2F) and predicted binding sites of E2F on metabolic genes, Treh1, Pgk, and myogenic genes such as Act88F, Prm, Tm1, Fln, etc. At the same time, perturbing trehalose synthesis reduces E2F/Dp expression and myogenic gene expression, and trehalose feeding partially restores all three. This bidirectional influence is similar to E2F‑dependent control of carbohydrate metabolism and systemic sugar homeostasis described in D. melanogaster, where E2F/Dp both regulates metabolic genes and is itself constrained by metabolic state (Zappia et al., 2023a; Zappia et al., 2021).

      The reciprocal regulation between Prm and E2F/Dp is indeed intriguing. Rather than a paradox, we interpret this as evidence that E2F/Dp couples metabolic genes and structural muscle genes within a shared module, and that key sarcomeric components (such as paramyosin) feed back on this transcriptional program. Similar cross‑talk between E2F‑controlled metabolic programs and tissue function has been documented in D. melanogaster muscle and fat body, where E2F loss in one tissue elicits systemic changes in the other (Zappia et al., 2021). For further confirmation of E2F-regulated Prm, we will perform EMSA on the Prm promoter with appropriate controls.

      We fully agree that amino‑acid metabolism is a critical missing piece. In the revised manuscript, we will quantify amino acid levels and include the results: “Amino acid levels changed differentially: cysteine, leucine, histidine, valine, and proline showed significant reductions, while isoleucine and lysine showed non-significant reductions upon trehalose metabolism perturbation. These results are consistent with previous reports by Tellis et al. (2024) and Shi et al. (2016)”. We will reframe our conclusions more cautiously as establishing a trehalose-E2F/Dp-muscle development association, while stating that “definitive causal links via amino‑acid metabolism remain to be demonstrated”.

      References:

      (1) Zappia, M. P., Kwon, Y.-J., Westacott, A., Liseth, I., Lee, H. M., Islam, A. B., Kim, J., & Frolov, M. V. (2023a). E2F regulation of the Phosphoglycerate kinase gene is functionally important in Drosophila development. Proceedings of the National Academy of Sciences, 120(15), e2220770120.

      (2) Zappia, M.P., Guarner, A., Kellie-Smith, N., Rogers, A., Morris, R., Nicolay, B., Boukhali, M., Haas, W., Dyson, N.J. and Frolov, M.V., 2021. E2F/Dp inactivation in fat body cells triggers systemic metabolic changes. elife, 10, p.e67753.

      (3) Tellis, M., Mohite, S. and Joshi, R., 2024. Trehalase inhibition in Helicoverpa armigera activates machinery for alternate energy acquisition. Journal of Biosciences, 49(3), p.74.

      (4) Shi, J.F., Xu, Q.Y., Sun, Q.K., Meng, Q.W., Mu, L.L., Guo, W.C. and Li, G.Q., 2016. Physiological roles of trehalose in Leptinotarsa larvae revealed by RNA interference of trehalose-6-phosphate synthase and trehalase genes. Insect Biochemistry and Molecular Biology, 77, pp.52-68.

      Author response image 1.

      The result section of the manuscript is quite concise, to my understanding (especially the initial few sections), which misses out on mentioning details that would help readers understand the paper better. While technical details of the methods should be in the Materials and Methods section, the overall experimental strategy for the experiments performed should be explained in adequate detail in the results section itself or in figure legends. I would request authors to include more details in the results section. As an extension of the comment above, many times, abbreviations have been used without introducing them. A thorough check of the manuscript is required regarding this.

      Thank you very much for pointing out this issue. We will revise the manuscript content according to these suggestions.

      The Spodoptera experiments appear ad hoc and are insufficient to support conservation beyond Helicoverpa. To substantiate this claim, please add a coherent, minimal set of Spodoptera experiments and present them in a dedicated subsection. Alternatively, consider removing these data and limiting the conclusions (and title) to H. armigera.

      We thank the reviewer for this helpful comment. We agree that, in this current version of the manuscript, the S. frugiperda experiments are not sufficiently systematic to support strong claims about conservation beyond H. armigera. Our primary focus in this study is indeed on H. armigera, and the addition of the S. frugiperda data was intended only as preliminary, supportive evidence rather than a central component of our conclusions. To avoid over‑interpretation and to keep the manuscript focused and coherent, we will remove all S. frugiperda data from the revised version, including the corresponding text and figures. We will also adjust the title, abstract, and conclusion to clearly state that our findings are limited to H. armigera.

      In order to check the effects of E2F/Dp, a dsRNA-mediated knockdown of Dp was performed. Why was the E2F protein, a primary target of the study, not chosen as a candidate? The authors should either provide justification for this or perform the suggested experiments to come to a conclusion. I would like to point out that such experiments were performed in Drosophila.

      Thank you for this thoughtful comment and the specific suggestion. We agree that directly targeting E2F would, in principle, be an informative complementary approach. In our study, however, we prioritized Dp knockdown for two main reasons. First, E2F is a large family, and E2F-Dp functions as an obligate heterodimer. Previous work in D. melanogaster has shown that depletion of Dp is sufficient to disrupt E2F-dependent transcription broadly, often with more efficient loss of complex activity than targeting individual E2F isoforms (Zappia et al., 2021; Zappia and Frolov, 2016). Second, in our preliminary trials, we performed a dsRNA feeding assay with dsHaE2F, dsHaDp, and combined dsHaE2F plus dsHaDp. In that assay, feeding dsHaE2F alone did not silence HaE2F. Because E2F is a large family, other E2F isoforms may compensate for silencing of the targeted HaE2F. However, HaE2F expression was significantly reduced upon feeding of dsHaDp and of combined dsHaE2F plus dsHaDp (Figure A), whereas HaDp expression was significantly reduced in all three conditions (Figure B). Because combined feeding of dsHaE2F and dsHaDp reduced expression of both HaE2F and HaDp, we further performed a rescue assay by exogenous feeding of trehalose and observed significant upregulation of HaE2F, HaDp, trehalose metabolic genes (HaTPS/TPP and HaTreh1), and myogenic genes (HaPrm and HaTm2) (Figure C). For these reasons, we focused on Dp silencing as a more reliable way to impair E2F/Dp complex function in H. armigera.

      Author response image 2.

      References:

      (1) Zappia, M.P. and Frolov, M.V., 2016. E2F function in muscle growth is necessary and sufficient for viability in Drosophila. Nature Communications, 7(1), p.10509.

      (2) Zappia, M.P., Guarner, A., Kellie-Smith, N., Rogers, A., Morris, R., Nicolay, B., Boukhali, M., Haas, W., Dyson, N.J. and Frolov, M.V., 2021. E2F/Dp inactivation in fat body cells triggers systemic metabolic changes. eLife, 10, p.e67753.

      Silencing of HaDp resulted in a significant decrease in HaE2F expression. I find this observation intriguing. DP is the cofactor of E2F, and they both heterodimerise and sit on the promoter of target genes to regulate them. I would request authors to revisit this result, as it contradicts the general understanding of how E2F/Dp functions in other organisms. If Dp indeed controls E2F expression, then further experiments should be conducted to come to a conclusion convincingly. Additionally, these results would need thorough discussion with citations of similar results observed for other transcription factor-cofactor complexes.

      Thank you for highlighting this point and for prompting us to examine these data more carefully. A reduction in HaE2F mRNA upon HaDp silencing is indeed unexpected if one considers only the canonical view of E2F/Dp as a heterodimer whose subunits co-occupy target promoters without strongly regulating each other's expression. However, several lines of work suggest that transcription factor-cofactor networks frequently include feedback loops in which cofactors influence the expression of their partner TFs. First, in multiple systems, transcription factors and their cofactors are known to regulate each other's transcription, forming positive or negative feedback loops. For example, in hematopoietic cells, the transcription factor Foxp3 controls the expression of many of its own cofactors, and some of these cofactors in turn facilitate or stabilize Foxp3 expression, forming an interconnected regulatory network rather than a simple one-way interaction (Rudra et al., 2012). Second, E2F/Dp complexes exhibit non-canonical regulatory mechanisms and can regulate broad sets of targets, including other transcriptional regulators. Several studies show that E2F/Dp proteins not only control classical cell-cycle genes but also participate in diverse processes such as DNA damage signaling, mitochondrial function, and differentiation (Guarner et al., 2017; Ambrus et al., 2013; Sánchez-Camargo et al., 2021). In D. melanogaster, complete loss of dDP alters the expression of direct E2F/DP targets, including dATM (Guarner et al., 2017).

      All these reports indicate that the E2F-Dp complex sits at the top of multi‑layer regulatory hierarchies. Such architectures make it plausible that Dp silencing in H. armigera could modulate HaE2F expression in a non-canonical way.

      References:

      (1) Rudra, D., DeRoos, P., Chaudhry, A., Niec, R.E., Arvey, A., Samstein, R.M., Leslie, C., Shaffer, S.A., Goodlett, D.R. and Rudensky, A.Y., 2012. Transcription factor Foxp3 and its protein partners form a complex regulatory network. Nature immunology, 13(10), pp.1010-1019.

      (2) Guarner, A., Morris, R., Korenjak, M., Boukhali, M., Zappia, M.P., Van Rechem, C., Whetstine, J.R., Ramaswamy, S., Zou, L., Frolov, M.V. and Haas, W., 2017. E2F/DP prevents cell-cycle progression in endocycling fat body cells by suppressing dATM expression. Developmental cell, 43(6), pp.689-703.

      (3) Ambrus, A.M., Islam, A.B., Holmes, K.B., Moon, N.S., Lopez-Bigas, N., Benevolenskaya, E.V. and Frolov, M.V., 2013. Loss of dE2F compromises mitochondrial function. Developmental cell, 27(4), pp.438-451.

      (4) Sánchez-Camargo, V.A., Romero-Rodríguez, S. and Vázquez-Ramos, J.M., 2021. Non-canonical functions of the E2F/DP pathway with emphasis in plants. Phyton, 90(2), p.307.

      I consider the overall bioinformatics analysis to remain very poorly described. What is specifically lacking is a clear statement of why each dry-lab experiment was conducted.

      We again thank the reviewer for advising us to give a biological context and motivation for every bioinformatics analysis performed. The bioinformatics analyses devised here aim to explain the systems-level perturbations caused by HaTPS/TPP silencing, to account for the observed phenotype, and to discover transcription factors potentially modulating the gene regulatory changes induced by HaTPS/TPP silencing.

      (1) Gene set enrichment analyses:

      Differential gene expression analyses of the bulk RNA sequencing data, followed by qRT-PCR, confirmed the transcriptional changes in myogenic genes and the gene expression alterations in metabolic and cell cycle-related genes. These perturbations merely confirmed the effect of HaTPS/TPP silencing on the genes where it was expected. We wanted to see whether an “unbiased” system-level statistical analysis such as gene set enrichment analysis (GSEA) could reveal both expected and novel biological processes that underlie HaTPS/TPP silencing. GSEA results revealed large-scale transcriptional changes in 11 enriched processes, including amino acid metabolism, energy metabolism, developmental regulatory processes, and motor protein activity. GSEA not only revealed the transcriptionally enriched pathways overall but also identified the genes undergoing synchronized pathway-level transcriptional change upon HaTPS/TPP silencing.

      (2) Gene regulatory network analysis:

      Although GSEA uncovered potential pathway-level changes, we were also interested in identifying the gene regulatory network associated with such large-scale process-level transcriptional perturbations. Interestingly, the biological processes undergoing perturbations were also heterogeneous (e.g., motor protein activity, energy metabolism, amino acid metabolism, etc.). We hypothesized that inferring a causal gene regulatory network for the genes in GSEA-enriched biological processes should predict core/master transcription factors that might synchronously regulate metabolic and non-metabolic processes related to HaTPS/TPP silencing, thereby providing a broad understanding of the perturbed phenotype. The gene regulatory network analysis statistically inferred an “active” gene regulatory network corresponding to the GSEA-enriched KEGG gene sets. When the transcription factors (TFs) were ranked by their number of outgoing connections (outdegree centrality) within the active gene regulatory network, E2F family TFs were the top-ranking, highly connected transcription factors associated with the transcriptionally enriched processes. This suggests that E2F family TFs are central to controlling the flow of regulatory information within this network. Intriguingly, E2F has been previously implicated in muscle development in insects (Zappia and Frolov, 2016). Further extracting the regulated targets of E2F family TFs within this network revealed the mechanistic connection with the 11 enriched processes. This GRN analysis was crucial in discovering and prioritizing E2F TFs as central transcription factors mediating HaTPS/TPP silencing effects, which was not apparent from simpler analyses such as differential gene expression analysis.
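The outdegree ranking described above can be illustrated with a minimal sketch. It assumes the inferred network is available as a list of (regulator, target) edges. The edge list, the gene names, and the function name `rank_by_outdegree` are hypothetical placeholders, not data or code from the study, and the authors' actual network-inference tooling is not specified here.

```python
from collections import defaultdict

# Hypothetical regulator -> target edges; names are illustrative only,
# not results from the study.
edges = [
    ("E2F", "geneA"),
    ("E2F", "geneB"),
    ("E2F", "geneC"),
    ("TF2", "geneA"),
    ("TF2", "geneD"),
    ("TF3", "geneB"),
]


def rank_by_outdegree(edge_list):
    """Return (regulator, outdegree) pairs sorted by descending outdegree.

    Outdegree is the number of distinct targets a regulator points to.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    targets = defaultdict(set)
    for regulator, target in edge_list:
        targets[regulator].add(target)  # a set ignores duplicate edges
    return sorted(
        ((tf, len(tgts)) for tf, tgts in targets.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


ranking = rank_by_outdegree(edges)
for tf, outdegree in ranking:
    print(tf, outdegree)
```

In this toy network the regulator with the most distinct targets ranks first. In a real analysis the raw count is often normalized by the number of other nodes, which changes the scale but not the order.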

      As per the reviewer’s suggestions, we will add these outlined points in the text of the manuscript (Results section) to further give context and clarity to the bioinformatics analyses conducted in this study.

      In my judgement, the EMSA analysis presented is technically poor in quality. It lacks positive and negative controls, does not show mutation analysis or super shifts. Also, it lacks any competition assays that are important to prove the binding beyond doubt. I am not sure why protein is not detected at all in lower concentrations. Overall, the EMSA assays need to be redone; I find the current results to be unacceptable.

      Thank you for pointing out this issue. We will repeat the EMSA analysis with appropriate controls. Although the gel image was not clear, a faint protein band (indicated by the white square) was observed in well No. 8, where we used 8 μg of E2F protein and 75 ng of HaTPS/TPP promoter, after staining the gel with SYPRO Ruby protein stain, suggesting weak HaTPS/TPP-E2F complex formation.

      GSEA studies clearly indicate enrichment of amino acid synthesis genes in TPP knockdown samples. This supports the plausible theory that a lack of trehalose means a lack of sufficient nutrients, so less is converted to amino acids and muscle development is compromised. Yet the authors make no effort to measure amino acid levels. While nutrients can be sensed through signalling pathways leading to shutdown of myogenic genes, a simple and direct correlation between less raw material and deformed muscle might also be possible.

      We quantified amino acid levels as per the suggestion, and we observed differential levels of amino acids upon trehalose metabolism perturbation.

      However, we observed that the insects were not rescued when fed a control chickpea-based artificial diet that contained the nutrients required for normal growth and development. Based on this observation, we conclude that trehalose deficiency is the only possible cause for the defect in muscle development.

      The authors are encouraged to stick to one color palette while demonstrating sequencing results. Choosing a different color palette for representing results from the same sequencing analysis confuses readers.

      Thank you for the comment. We will revise the color palette as per the suggestion.

      Expression of genes, as understood from sequencing analysis in Figure 1D, Figure 2F, and Figure 3D, appears to be binary in nature. This result is extremely surprising given that the qRT-PCR of these genes have revealed a checker and graded expression.

      Thank you for pointing out this issue. We will revise the scale range for these figures to get more insights about gene expression levels and include figures as per the suggestion.

      In several graphs, non-significant results have been interpreted as significant in the results section. In a few other cases, the reported changes are minimal, and the statistical support is unclear; please recheck the analyses and include exact statistics. In the results section, fold changes observed should be discussed, as well as the statistical significance of the observed change.

      We will revise the analyses and include exact statistics as per the suggestion.

      Finally, I would add that trehalose metabolism regulates cell cycle genes, and muscle development genes establish correlation and causation. The authors should ensure that any comments they make are backed by evidence.

      We thank the reviewer for this insightful comment. Although direct evidence in insects is currently lacking, multiple independent studies in yeast, plants, and mammalian systems support a regulatory link between trehalose metabolism and the cell cycle. In the budding yeast Saccharomyces cerevisiae, neutral trehalase (Nth1) is directly phosphorylated and activated by the major cyclin-dependent kinase Cdk1 at G1/S, routing stored trehalose into glycolysis to fuel DNA replication and mitosis (Ewald et al., 2016). CDK-dependent regulation of trehalase activity has also been reported in plants, where CDC28-mediated phosphorylation channels glucose into biosynthetic pathways necessary for cell proliferation (Lara-Núñez et al., 2025). Furthermore, budding yeast cells accumulate trehalose and glycogen upon entry into quiescence and subsequently mobilize these stores to generate a metabolic “finishing kick” that supports re-entry into the cell cycle (Silljé et al., 1999; Shi et al., 2010). Exogenous trehalose that perturbs the trehalose cycle impairs glycolysis, reduces ATP, and delays cell cycle progression in S. cerevisiae, highlighting a dose- and context-dependent control of growth versus arrest (Zhang, Zhang and Li, 2020). In mammalian systems, trehalose similarly modulates proliferation-differentiation decisions. In rat airway smooth muscle cells, low trehalose concentrations promote autophagy, whereas higher doses induce S/G2–M arrest, downregulate Cyclin A1/B1, and trigger apoptosis, indicating a shift from controlled growth to cell elimination at higher exposure (Xiao et al., 2021). In human iPSC-derived neural stem/progenitor cells, low-dose trehalose enhances neuronal differentiation and VEGF secretion, while higher doses are cytotoxic, again highlighting a tunable impact on cell-fate outcomes (Roose et al., 2025). In wheat, exogenous trehalose under heat stress reduces growth, lowers auxin, gibberellin, abscisic acid, and cytokinin levels, and represses CycD2 and CDC2 expression, suggesting that trehalose signalling integrates with hormone pathways and core cell-cycle regulators to restrain proliferation during stress (Luo, Liu, and Li, 2021). Together, these studies show the importance of trehalose metabolism in cell-cycle regulation and in determining whether cells and tissues proliferate, differentiate, or remain quiescent.

      With respect to muscle development, previous work has implicated glycolytic metabolism in myogenesis and muscle growth. Tixier et al. (2013) showed that loss of key glycolytic genes results in abnormally thin muscles, while Bawa et al. (2020) demonstrated that loss of TRIM32 decreases glycolytic flux and reduces muscle tissue size. These findings indicate that carbohydrate and energy metabolism pathways are important determinants of muscle structure and growth. However, there are no previous studies about the role of trehalose metabolism in muscle development, other than as an energy source, so here we specifically set out to establish the involvement of trehalose metabolism in muscle development.

      References:

      (1) Ewald, J.C. et al. (2016) “The yeast cyclin-dependent kinase routes carbon fluxes to fuel cell cycle progression,” Molecular cell, 62(4), pp. 532–545.

      (2) Lara-Núñez, A. et al. (2025) “The Cyclin-Dependent Kinase activity modulates the central carbon metabolism in maize during germination,” (January), pp. 1–16.

      (3) Silljé, H.H.W. et al. (1999) “Function of trehalose and glycogen in cell cycle progression and cell viability in Saccharomyces cerevisiae,” Journal of bacteriology, 181(2), pp. 396–400.

      (4) Shi, L. et al. (2010) “Trehalose Is a Key Determinant of the Quiescent Metabolic State That Fuels Cell Cycle Progression upon Return to Growth,” 21, pp. 1982–1990.

      (5) Zhang, X., Zhang, Y. and Li, H. (2020) “Regulation of trehalose, a typical stress protectant, on central metabolisms, cell growth and division of Saccharomyces cerevisiae CEN. PK113-7D,” Food Microbiology, 89, p. 103459.

      (6) Xiao, B. et al. (2021) “Trehalose inhibits proliferation while activates apoptosis and autophagy in rat airway smooth muscle cells,” Acta Histochemica, 123(8), p. 151810.

      (7) Roose, S.K. et al. (2025) “Trehalose enhances neuronal differentiation with VEGF secretion in human iPSC-derived neural stem / progenitor cells,” Regenerative Therapy, 30, pp. 268–277.

      (8) Luo, Y., Liu, X. and Li, W. (2021) “Exogenously-supplied trehalose inhibits the growth of wheat seedlings under high temperature by affecting plant hormone levels and cell cycle processes,” Plant Signaling & Behavior, 16(6).

      (9) Tixier, V., Bataillé, L., Etard, C., Jagla, T., Weger, M., DaPonte, J.P., Strähle, U., Dickmeis, T. and Jagla, K., 2013. Glycolysis supports embryonic muscle growth by promoting myoblast fusion. Proceedings of the National Academy of Sciences, 110(47), pp.18982-18987.

      (10) Bawa, S., Brooks, D.S., Neville, K.E., Tipping, M., Sagar, M.A., Kollhoff, J.A., Chawla, G., Geisbrecht, B.V., Tennessen, J.M., Eliceiri, K.W. and Geisbrecht, E.R., 2020. Drosophila TRIM32 cooperates with glycolytic enzymes to promote cell growth. eLife, 9, p.e52358.

      Finally, we appreciate the meticulous review of this manuscript and the constructive comments. We will perform the recommended experiments and data analysis and revise the manuscript accordingly.

    1. Supporting Socio-Emotional Skills in Young Children: Approaches and Tools

      Executive Summary

      This document summarizes the presentations by Sylvie Richard (Université de Genève / HP Valais) on supporting socio-emotional learning during the first years of schooling.

      Scientific research identifies two complementary levers: the direct approach (structured and teacher-led) and the indirect approach (developmental, centered on pretend play).

      The evidence, drawn in particular from meta-analyses covering more than one million students, shows that strengthening socio-emotional skills improves not only well-being and social behavior but also long-term academic outcomes.

      The shift toward a pedagogy that integrates guided play nevertheless requires in-depth teacher training (more than 20 hours) and reflective work by teachers on their own emotional skills.

      --------------------------------------------------------------------------------

      1. Conceptual Framework for Socio-Emotional Skills

      Socio-emotional skills are defined according to the model of the Casel organization, which groups three broad learning domains:

      Awareness of self and others: Identifying one's own emotions and understanding those of others.

      Managing emotions and relationships: Establishing and maintaining positive social relationships.

      Responsible decision-making: Learning to act ethically and constructively.

      --------------------------------------------------------------------------------

      2. The Direct Approach: Structured, Teacher-Led Programs

      The direct approach relies on planned activities in which the teacher targets specific knowledge through dedicated materials (board games, worksheets, readings).

      Evidence of Effectiveness and Research

      The international scientific literature (meta-analyses from 2022 and 2025) highlights major benefits:

      Academic impact: Significant improvement in academic results compared with students who do not take part in these programs.

      Behavioral impact: Reduction in problem behaviors and emotional distress.

      Long-term impact: Reduced drug use at the start of adulthood.

      Programs in French-Speaking Contexts

      Validated French-language programs are scarce compared with English-language models. Simple translation is considered insufficient; socio-cultural adaptation is necessary. Two tools stand out:

      | Program | Origin | Targeted Skills | Accessibility |
      | --- | --- | --- | --- |
      | Emotimat | France (Grenoble) | Identifying, understanding, and expressing emotions. | Free to access (online). |
      | Emoti | Switzerland (Geneva) | Emotion recognition, needs, and regulation. | Paid (cost of printing the cards). |

      --------------------------------------------------------------------------------

      3. The Indirect Approach: Teaching Through Pretend Play

      Pretend play is an activity in which objects, words, and actions stand for something other than their immediate reality. It is a high-level mental function that draws on imagination.

      Components of Mature Play

      For play to generate learning, it must move toward maturity, which is characterized by several elements:

      Object substitution: Using a stick to represent a rocket (inhibiting the object's real function).

      Role assignment: Taking on an identity (doctor, pirate) and keeping to the associated range of behavior.

      Meta-communication: Planning and negotiating the scenario with peers ("Let's say you were...").

      Hypothetical reasoning: Using "What if..." logic to explore possible worlds and cause-and-effect relationships.

      A Developmental Laboratory

      Pretend play allows the child:

      1. To self-regulate: By imposing on themselves rules of behavior tied to the chosen role.

      2. To experiment without risk: Testing complex social situations in a "make-believe" setting, with no performance stakes.

      3. To process reality: Acting out their understanding of the world (e.g., play related to the pandemic or to medical care) to regulate frustrations or fears.

      --------------------------------------------------------------------------------

      4. Role and Stance of the Teacher

      Moving from "free play" to "guided play" is crucial. The teacher should not be a mere spectator but an actor able to adopt several stances:

      Stage manager: Providing the necessary props and space.

      Co-player or player: Entering the scenario to enrich its content and introduce emotional challenges.

      Observer-assessor: Identifying the maturity level of the play in order to intervene at the right moment.

      The Importance of Training

      Research indicates that the effectiveness of these approaches depends on how well the adult is prepared:

      Technical training: A minimum of 20 hours of training is recommended to master the guiding of play and the socio-emotional concepts.

      Reflective dimension: Teachers must assess their own emotional skills and their ability to play, because they serve as models that young children imitate.

      --------------------------------------------------------------------------------

      5. Conclusions and Recommendations

      The current scientific literature rejects the idea that time devoted to socio-emotional development is "wasted time" taken away from academic learning. On the contrary:

      Complementarity: It is essential to combine structured sessions with periods of guided play.

      Public health issue: The decline in children's engagement in pretend play makes supporting it at school a priority for psychological development.

      Learning to play in order to play to learn: Pretend play is not innate at a mature level; it must be taught to become an effective learning tool.

    1. Reliable, storable, staple food supplies are a necessary precondition for long-term settlement and population growth – in other words the creation of cities. Like the Europeans, Africans, and Asians, once they had created a reliable food supply, many (not all) American natives built remarkable cities, especially in Central and South America. From present-day Mexico’s Yucatan Peninsula south through Guatemala, the Maya developed a complex society which reached its most intense flourishing from 250 CE to 900 CE. However, the Maya changed their social organization and by the time the Spanish arrived, they were living in more separated independent city-states, seemingly having abandoned some of their more impressive temples and structures such as Chichén Itzá in Yucatan. This led to an interpretation that the original society had suffered a partial collapse sometime around 900 CE due to ecological collapse and/or feuding among these separate cities. More recently, anthropologists have begun to suggest the Maya people may just have wanted to live a lifestyle with less centralized control.

      This shows that Native peoples in the Americas were able to build large, successful cities once they had steady food supplies, just like people in other parts of the world. The Maya are a good example: they created a complex society with impressive buildings and cities. Their way of life changed over time, but that doesn’t mean their civilization suddenly collapsed. Instead, they may have simply chosen to live differently. Overall, this shows that American societies were advanced and capable long before Europeans arrived.

    1. A POLITICO review of hundreds of cases brought by ICE detainees across the country shows judges increasingly furious and exhausted by the Trump administration’s tactics.

      General comment about the article as a whole:

      1. Cited sources are numerous, reliable, and relevant
      2. The article sticks almost exclusively to direct quotes and verifiable facts
      3. Descriptions of the judges are limited to which administration appointed them, which helps with transparency around potential political bias of the judges in the article.
      4. The article stays on topic all the way through.
    1. There are two types of formal outlines: the topic outline and the sentence outline. Format both types of formal outlines similarly. Place your introduction and thesis statement at the beginning, under roman numeral I. Use roman numerals (II, III, IV, V, etc.) to identify main points that develop the thesis statement. Use capital letters (A, B, C, D, etc.) to divide your main points into parts. Use arabic numerals (1, 2, 3, 4, 5, etc.) if you need to subdivide any As, Bs, or Cs into smaller parts. End with the final roman numeral expressing your idea for your conclusion. Here is what the skeleton of a traditional formal outline looks like. The indentation helps clarify how the ideas are related.

      There are two types of formal outlines: the topic outline and the sentence outline.
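The numbering scheme in the highlighted passage (roman numerals for main points, capital letters for their parts, arabic numerals for further subdivisions) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_roman` and `render_outline`, the nested-tuple input format, and the sample headings are all hypothetical and do not come from the source text.

```python
def to_roman(n):
    """Convert a positive integer (1 to 3999) to an uppercase roman numeral."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = ""
    for value, symbol in pairs:
        while n >= value:
            out += symbol
            n -= value
    return out


def render_outline(sections):
    """Render a list of (title, [(subpoint, [details])]) as a formal outline.

    Main points get roman numerals, subpoints get capital letters, and
    details get arabic numerals; indentation shows how the ideas relate.
    """
    lines = []
    for i, (title, subpoints) in enumerate(sections, start=1):
        lines.append(f"{to_roman(i)}. {title}")
        for j, (subpoint, details) in enumerate(subpoints):
            lines.append(f"    {chr(ord('A') + j)}. {subpoint}")
            for k, detail in enumerate(details, start=1):
                lines.append(f"        {k}. {detail}")
    return "\n".join(lines)


# Placeholder headings, chosen only to show the layout.
outline = [
    ("Introduction and thesis statement", []),
    ("First main point", [
        ("Supporting idea", ["Detail", "Detail"]),
        ("Supporting idea", []),
    ]),
    ("Conclusion", []),
]
text = render_outline(outline)
print(text)
```

The same structure serves both outline types: a topic outline would fill the slots with short phrases, and a sentence outline with full sentences.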

    2. Chronological: to tell a story or relate an experience; to explain the history of an event or a topic; to introduce the steps in a process. Spatial: to help readers visualize something as you want them to see it; to create a main impression using the senses (sight, touch, taste, smell, and sound). Order of Importance: to persuade or convince; to rank items by their importance, benefit, or significance. Organizing Your Writing: Descriptive writing is most effective when it is organized well. Use the following information to decide what organization best fits your goals. Chronological order → best for describing events. Spatial order → best for describing places. Order of importance → best for describing objects and people. Types of Outlines: A formal outline is a detailed guide that shows how all your supporting ideas relate to each other. This outline helps you distinguish between ideas that are equally important and ones that are less important. You can build your paper based on the framework you created in the outline. There are two types of formal outlines: the topic outline and the sentence outline. Format both types of formal outlines similarly. Place your introduction and thesis statement at the beginning, under roman numeral I. Use roman numerals (II, III, IV, V, etc.) to identify main points that develop the thesis statement. Use capital letters (A, B, C, D, etc.) to divide your main points into parts. Use arabic numerals (1, 2, 3, 4, 5, etc.) if you need to subdivide any As, Bs, or Cs into smaller parts. End with the final roman numeral expressing your idea for your conclusion. Here is what the skeleton of a traditional formal outline looks like. The indentation helps clarify how the ideas are related. Outlining a Paper. Quick Guide to Topic Outlines. Adapted from “Chapter Seven” of English for Business Success, 2012, used according to Creative Commons CC BY-NC-SA 3.0 License

      Three common ways to structure your writing with information for each step.

    1. Colleges and universities usually require students to select a major by the time they’ve completed 30 total credits.

      Can someone help? I don't really get what is meant here by selecting a major. How is it done?

    2. If you intend to transfer upon graduation: Is your college regionally accredited? Does your college have any special transfer agreements for guaranteed transfer of credits or perhaps for discounted tuition? Does your state have special transfer agreements or requirements that make it easier to transfer to colleges or universities within the same state?

      I think I have to consider this. This is related to me.

    3. Fieldwork and internships provide students with opportunities to practice the skills they’ve learned in the classroom while also introducing them to the values and culture of the organizations and communities in which they hope to be employed.

      This is a very good experience to allow the student to see the "real world" aspect of their career field.

    4. Transfer-focused associate’s degrees may be called Associate of Arts (AA) or Associate of Science (AS), or other titles, depending on the focus of study.

      What do they mean by "transfer-focused"?

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Thompson et al. investigate the impact of prior ATP exposure on later macrophage functions as a mechanism of immune training. They describe that ATP training enhances bactericidal functions which they connect to the P2x7 ATP receptor, Nlrp3 inflammasome activation, and TWIK2 K+ movement at the cell surface and subsequently at phagosomes during bacterial engulfment. This is an incremental addition to existing literature, which has previously explored how ATP alters TWIK2 and K+, and linked it to Nlrp3 activation. The novelty here is in discovering the persistence of TWIK2 change and exploring the impact this biology may have on bacterial clearance. Additional experiments could strengthen their hypothesis that the in vivo protective effect of ATP-training on bacterial clearance is mediated by alveolar macrophages.

      Strengths:

      The authors demonstrate three novel findings: 1) prolonged persistence of TWIK2 at the macrophage plasma membrane following ATP that can translocate to the phagosome during particle engulfment, 2) a persistent impact of ATP exposure on remodeling chromatin around nlrp3, and 3) administering mice intra-nasal ATP to 'train' lungs protects mice from otherwise fatal bacterial infection.

      Weaknesses:

      (1) Some methods remain unclear including the timing and method by which lung cellularity was assessed in Figure 2. It is also difficult to understand how many mice were used in experiments 1, 2 and 6 and thus how rigorous the design was. A specific number is only provided for 1D and the number of mice stated in legend and methods do not match.

      (2) The study design is not entirely ideal for the authors' in vivo question. Overall, the discussion would benefit from a clear summary of study caveats, which are primarily that 1) in vitro studies attributing ATP training-mediated bacterial killing to persistent TWIK2 relocation, K+ influx, a glycolytic metabolic shift, and epigenetic nlrp3 reprogramming were performed in BMDM or RAW cells and not primary AMs, 2) data does not eliminate the possibility that non-AM immune or non-immune cells in the lung are "trained" and responsible for ATP-mediated protection in vivo; flow data examined total lung digest which may obscure important changes in alveolar recruitment, and 3) in vivo work shows data on acute bacterial clearance but does not explore potential risks that "training" for a more responsive inflammasome may have for the severity of lung injury during infection.

      (3) There is some lack of transparency on data and rigor of methods. Clear data is not provided regarding the RNA-sequencing results. Specific identities of DEGs are not provided, only one high-level pathway enrichment figure. It would also be ideal if controls were included for subcellular fractionating to confirm pure fractions and for dye microscopy to show negative background.

      (4) In results describing 5A, the text states that "ATP-induced macrophage training effects, as measured by augmented bactericidal activity, were diminished in macrophages treated with protease inhibitors". However, these data are not identified as significant in the figure; protease dependence can be described as a trend that supports the authors' hypothesis but should not be stated as significant data in text.

      In summary, this work contains some useful data showing how ATP can train macrophages via TWIK2/Nlrp3. Revisions have significantly improved methods reporting, added some data to strengthen the conclusions, and toned down overstatements to bring conclusions more in line with data presented. The title still overstates what the authors have actually tested, since no macrophage-specific targeting in vivo (no conditional gene deletion, macrophage depletion etc) was performed in infection studies. However, in vitro data provide clear evidence that macrophages can be trained by ATP, and though caveats remain, it is plausible that macrophage training is a key mechanism for the protection observed here in the lung.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      (1) First, the concept of training or trained immunity refers to long-term epigenetic reprogramming in innate immune cells, resulting in a modified response upon exposure to a heterologous challenge. The investigations presented demonstrate phenotypic alterations in AMs seven days after ATP exposure; however, they do not assess whether persistent epigenetic remodeling occurs with lasting functional consequences. Therefore, a more cautious and semantically precise interpretation of the findings would be appropriate.

      In response, we have performed epigenetic analysis (ATAC seq analysis) as requested (Supp Fig. 1).

      (2) Furthermore, the in vivo data should be strengthened by additional analyses to support the authors' conclusions. The authors claim that susceptibility to Pseudomonas aeruginosa infection differs depending on the ATP-induced training effect. Statistical analyses should be provided for the survival curves, as well as additional weight curves or clinical assessments. Moreover, it would be appropriate to complement this clinical characterization with additional measurements, such as immune cell infiltration analysis (by flow cytometry), and quantification of pro-inflammatory cytokines in bronchoalveolar lavage fluid and/or lung homogenates.

      We have added the statistical analyses provided for the survival curves (new Fig. 1D), immune cell infiltration analysis, and quantification of pro-inflammatory cytokines in the lung (new Figs. 1, 2).

      (3) Moreover, the authors attribute the differences in resistance to P. aeruginosa infection to the ATP-induced training effect on AMs, based on a correlation between in vivo survival curves and differences in bacterial killing capacity measured in vitro. These are correlative findings that do not establish a causal role for AMs in the in vivo phenotype. ATP-mediated effects on other (i.e., non-AM) cell populations are omitted, and the possibility that other cells could be affected should be, at least, discussed. Adoptive transfer experiments using AMs would be a suitable approach to directly address this question.

      We have performed additional experiments and found that the numbers of lung macrophages were not significantly altered before and after ATP training (new Fig. 2), indicating the training effects are focused on lung resident macrophages.

      Reviewer #2 (Public review):

      (1) Missing details from methods/reported data: Substantial sections of key methods have not been disclosed (including anything about animal infection models, RNA-sequencing, and western blotting), and the statistical methods, as written, only address two-way comparisons, which would mean analysis was improperly performed. In addition, there is a general lack of transparency - the methods state that only representative data is included in the manuscript, and individual data points are not shown for assays.

      We have revised the methods and statistical analysis.

      (2) Poor experimental design including missing controls: Particularly problematic are the Seahorse assay data (requires normalization to cell numbers to interpret this bulk assay - differences in cell growth/loss between conditions would confound data interpretation) and bacterial killing assays (as written, this method would be heavily biased by bacterial initial binding/phagocytosis which would confound assessment of killing). Controls need to be included for subcellular fractionating to confirm pure fractions and for dye microscopy to show a negative background. Conclusions from these assays may be incorrect, and in some cases, the whole experiment may be uninterpretable.

      Seahorse assay methodology was updated to confirm the order of cell counting, time at seeding and cell counts. Methods were also updated to address the distinction between bacterial killing (Fig. 1B) and overall decrease in bacterial load.

      (3) The conclusions overstate what was tested in the experiments: Conceptually, there are multiple places where the authors draw conclusions or frame arguments in ways that do not match the experiments used. Particularly:

      (a) The authors discuss their findings in the context of importance for AM biology during respiratory infection but in vitro work uses cells that are well-established to be poor mimics of resident AMs (BMDM, RAW), particularly in terms of glycolytic metabolism.

      We have adjusted the text to reflect that the metabolic assay was performed on BMDMs. AMs are fragile for certain manipulations in vitro. We expect that the metabolic change is similar across several macrophage systems as well as the bacterial load reduction.

      (b) In vivo work does not address whether immune cell recruitment is triggered during training.

      We have performed immune cell infiltration analysis (new Fig. 2).

      (c) Figure 3 is used to draw conclusions about K+ in response to bacterial engulfment, but actually assesses fungal zymosan particles.

      We have corrected this in the manuscript.

      (d) Figure 5 is framed in bacterial susceptibility post-viral infection, but the model used is bacterial post-bacterial.

      We have corrected this in the manuscript.

      (e) In their discussion, the authors propose to have shown TWIK2-mediated inflammasome activation. They link these separately to ATP, but their studies do not test if loss of TWIK2 prevents inflammasome activation in response to ATP (Figure 4E does not use TWIK2 KO).

      We have now added the TWIK2 KO results (new Fig. 5E).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      As noted in the public review, it would be advisable to further characterize the in vivo phenotype in order to strengthen the conclusions. Specifically, it would be useful to quantify the bacterial load in the bronchoalveolar lavage fluid and lung homogenates, as well as to measure cytokine levels both in the respiratory compartment and systemically. Additionally, a broader characterization of the immune response in the presence or absence of ATP-induced training would be valuable. In the absence of direct evidence demonstrating that trained AMs mediate the observed phenotype, the authors should adopt a more cautious interpretation of their results. Moreover, careful attention to semantic accuracy is recommended. The concept of trained immunity refers specifically to long-term epigenetic reprogramming that leads to an altered response of target cells upon a secondary challenge, distant from the initial stress. The data presented do not fully demonstrate this phenomenon, and the interpretations should remain aligned with the evidence provided.

      Bacterial load has been quantified (see the Methods section for details). We also measured immune cell infiltration, quantified pro-inflammatory cytokines in the lung (new Figs. 1, 2), and performed an epigenetic evaluation of vehicle- and ATP-treated cells (Supp. Fig. 1).

      Reviewer #2 (Recommendations for the authors):

      (1) It cannot be overstated how lacking the methods are. This includes no discussion of IACUC approval for animal procedures, which must be included as part of research ethics. It also needs to be made clear where raw data is being archived. This notably includes an accession for deposited RNA-sequencing data, although unmanipulated microscopy and western blot images should also be shown. Methods should discuss any pre-processing that occurred with images.

      We have revised the methods in the manuscript.

      (2) Per statistics, in addition to generally providing more detail and adjusting analyses if they have not been correctly performed, please disclose if SD or SEM is shown. Reporting aggregate data versus representative data would provide more rigor. Perhaps replicate experiments could be included in the supplemental if they cannot, for some reason, be aggregated. Detailed statistical methods for RNA-seq analysis also need to be included.

      More details have been provided in the methods section.

      (3) It is unclear whether bacterial killing assays were correctly designed and can be interpreted. What does cells collected mean? If the assay was focused on intracellular macrophage bacterial load, it is critical to assess and report phagocytosis since different input loads would confound the assessment of killing. A rigorous wash or an antibiotic to eliminate extracellular bacteria should also have been performed and be described in this case. If the total bacterial burden was assessed, that would use cells+media and also needs to be clear and described. With the information provided, it is unclear whether the assays performed are sufficiently rigorous to assess bacterial killing. In addition, Figure 1B reports using an MOI of 50-100, but all data is compiled in one graph - data from different levels of infection should be separated. Figure 5A shows a model with E.coli followed by PA, but that does not appear to be how the assay was structured in B or C. This also does not match how the experiment is written in the results section, which references S. aureus. It is unclear what tissue (or cells) were assessed in Figure 5. Whole lung? BAL? As written, no data provided regarding bacterial killing is of sufficient quality to be considered valid.

      We have rewritten the bacterial killing assay section in the manuscript. The methodology was corrected to distinguish bacterial killing from a decrease in bacterial load and to describe the procedure more accurately.

      (4) The in vitro data provide reasonable evidence that BMDM/RAW macrophage training can occur in response to ATP exposure. However, it is unclear whether training is an important mechanism for resident AM in vivo, or whether, in vivo, a broader inflammatory response is generated, recruiting additional immune cells that persist and change infection susceptibility. The authors argue for resident AM immune training, but do not provide sufficient evidence to counter the latter possibility (resident AM are never themselves directly assessed, and the presence of other immune cells in vivo is not excluded). See Iliakis et al 2023 (PMID 37640788) for discussion of how this issue continues to drive uncertainty in the field. For this study, at least providing flow cytometry data quantifying myeloid and lymphoid immune populations in BALF before and after various treatments would help address this caveat. Without knowing this, it also confounds the interpretation of Figure 1B; if BAL is not pure AM after training, perhaps 1B could be repeated with ex vivo training or resident AM could be purified?

      We have performed immune cell infiltration analysis in the lung (both to BALF and in-tissue, new Fig. 2).

      (5) Figure 3A appears to show that fewer than 50% of cells express GFP. Is it expected that only a fraction of RAW cells express TWIK2-GFP? How was this addressed in the analyses for Figure 3? Were cells not appearing to express any significant GFP, included in phagosomal-negative or excluded from analysis? Please include in the methods.

      The RAW cells were transfected with TWIK2-GFP and variable GFP expression was expected. These cells were expressing a non-integrated transgene, which has been added to the methods as well as the consideration of cells for the analysis. Cells without visible GFP expression were excluded.

      (6) Why are many data points in Figure 3D negative? This suggests that settings were not optimized for microscopy - perhaps there is a very high background signal and the ION stain is barely above it. This is concerning for the quality of data. Further, is it expected that only some cells are positive for ION K+? The images shown clearly differentiate phagosomal K with ATP versus the absence of K without, but it is surprising that some cells appear not to contain any ION K+ signal (not completely clear given lack of brightfield or other cell staining) - this may again point to issues with imaging settings that confound data interpretation. This analysis should be carefully assessed.

      This has been updated in the methodology. In old Fig. 3D (new Fig. 4D), the presented data is the net intensity of the phagosome, obtained by subtracting the average cytoplasmic MFI from that of the area corresponding to an engulfed zymosan-af594 bead. Thus, a negative value indicates that the cytoplasmic IonK signal is higher than that of the phagosome.
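
The background subtraction described in this response can be sketched as follows. This is a minimal illustration only: the function name and the pixel intensities are hypothetical and are not taken from the manuscript, and the authors' actual image-analysis pipeline may differ.

```python
from statistics import mean


def net_phagosome_intensity(phagosome_pixels, cytoplasm_pixels):
    """Return mean phagosome intensity minus mean cytoplasmic intensity.

    A negative result means the cytoplasm is brighter than the phagosome.
    """
    return mean(phagosome_pixels) - mean(cytoplasm_pixels)


# Hypothetical intensities (arbitrary units), for illustration only.
cytoplasm = [80, 90, 85]
brighter_phagosome = net_phagosome_intensity([120, 130, 125], cytoplasm)
dimmer_phagosome = net_phagosome_intensity([60, 70, 65], cytoplasm)

print(brighter_phagosome)  # positive: phagosome signal above cytoplasm
print(dimmer_phagosome)    # negative: cytoplasm signal above phagosome
```

Under this definition, negative data points arise whenever the cytoplasmic mean exceeds the phagosomal mean, so they do not by themselves indicate a measurement error.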

      (7) The Discussion states that it will be interesting to test whether ATP-TWIK2 is a common mechanism of training and specifically references LPS as an ATP-generating signal. However, Figure 2D data show that LPS induces only transient TWIK2 translocation; the authors have data suggesting that, in the context of LPS, TWIK2 'training' will not be engaged. This line of discussion shows incomplete consideration of the data.

      We have further limited this language in the text such that this may require differential sensitivity/damage sustained by macrophages as compared to that of epi/endothelial cells in response to bacterial endotoxin.

      (8) For RNA-sequencing, plots of the actual genes changed for the mitochondrial pathways of interest would be helpful information for readers, as would a heat map showing sample purity between groups for macrophage markers versus possible contaminant cells, which can also be generated from precursors in BMDM cultures. In general, information in Methods regarding how the analyses in Figure 4B were run is necessary, per cutoffs used to determine DEGs, number of samples in each group, sex of samples used, etc. Greater transparency of data would be appreciated, so plots that show variation between replicates, such as heat maps, would be ideal. Supplemental tables would also be nice.

      We have added to the methodology of the RNA sequencing analysis.

      (9) The use of alternate DAMPs is a positive addition to the experimental design, but no data is given regarding the concentrations used. Ideally, positive controls showing that histones/NAD are used at acutely activating concentrations could be included, but at least references supporting the doses chosen or information about how doses were selected should be given. It is easy to find substantial literature on histones as a DAMP, but it was unclear why/how NAD was selected.

      We have added these concentrations and corresponding references.

      (10) The E. coli CFU reported in Figure 5B are extraordinarily low. In addition, CFU are generally shown on a log scale, but this appears to be linear. Please confirm that these data are correct. Perhaps improved methods might explain why? Is the second hit a low dose?

      These have been corrected in the new Fig. 6B.

      (11) Given that loss of either TWIK2 or Nlrp3 ablates bacterial protection, a link should be tested - experiments should test whether loss of TWIK2 prevents inflammasome activation in response to ATP (TWIK2 KO in 4E) and if loss of Nlrp3 changes TWIK2 translocation (Nlrp3 KO in at least some experiments of Figures 2/3).

      We have now added the TWIK2 KO results (new Fig. 5E).

      (12) One of the most striking data pieces is Figure 1D. It would, therefore, strengthen the paper to repeat those experiments (even just with the high-dose ATP) using TWIK2/P2X7/NLRP3 KO mice and really show the importance of these pathways in vivo. This is conceptually Figure 5, but the survival data of Figure 1 is far more convincing than the relatively weak bacterial load data of Figure 5.

      Unfortunately, our previous laboratory has been closed and we have trouble acquiring enough mice for additional survival data during the transition period. However, the bacterial load data has been adjusted to the same bacterial counts per 5 mg lung tissue instead of per individual sampling, giving a more contextual interpretation of the data.
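
The normalization mentioned in this response (counts expressed per 5 mg of lung tissue rather than per sample) can be sketched as a simple rescaling. The function name, argument names, and example values below are hypothetical, and the authors' exact procedure is not described in the response.

```python
def cfu_per_reference_mass(cfu_count, tissue_mass_mg, reference_mass_mg=5.0):
    """Rescale a colony count to a common tissue mass (default: 5 mg)."""
    if tissue_mass_mg <= 0:
        raise ValueError("tissue mass must be positive")
    return cfu_count / tissue_mass_mg * reference_mass_mg


# Hypothetical samples: (colonies counted, mg of lung tissue plated).
samples = [(200, 10.0), (90, 4.5), (300, 20.0)]
normalized = [cfu_per_reference_mass(cfu, mass) for cfu, mass in samples]
print(normalized)
```

Rescaling to a shared tissue mass makes samples of different sizes directly comparable, which is the stated motivation for the change.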

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public reviews):

      (1) The absence of replicate paired-end datasets limits confidence in peak localization.

      The reviewer was under the impression that we did not perform biological replicates of our ChIP-seq experiments. All ChIP-seq (and ATAC-seq) experiments were performed with biological replicates and the Pearson’s correlations (all >0.9) between replicates were provided in Supplementary Table 1. We had indicated this in the text and methods but will try to make this even clearer.
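
For readers unfamiliar with the replicate check referred to here, the Pearson correlation between two replicates can be sketched as below. The binned signal values are hypothetical, and the sketch makes no claim about how the authors binned or normalized their ChIP-seq data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical read counts in the same five genomic bins for two replicates.
replicate_1 = [10, 20, 30, 40, 50]
replicate_2 = [12, 19, 33, 38, 52]
r = pearson_r(replicate_1, replicate_2)
print(round(r, 3))
```

A coefficient above 0.9, as reported in Supplementary Table 1, indicates that the two replicates rise and fall together across bins; it does not by itself address peak-calling accuracy.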

      (2) The analyses are primarily correlative, making it difficult to fully assess robustness or to support strong mechanistic conclusions.

      Histone modifications are difficult to alter genetically because of the high copy number of histone genes and inhibition of HATs/HDACs in general leads to alterations in other histone modifications. It is an inherent challenge in establishing causality of histone modifications, especially histone acetylation marks.

      (3) Some claims (e.g., specificity for CpG islands, "dynamic" regulation during differentiation) are not fully supported by the analyses as presented.

      We have modified the text in response to this point. The new text reads: “Non-CGI promoters have lower overall levels of transcription compared to CGI promoters, and for this promoter class H3K115ac enrichment detected by ChIP is only really seen for the highest quartile of transcription (4SU) quartile of expression (Figure 1G). CGI promoters on the other hand, exhibit significant levels of detected H3K115ac even for the lowest quartile of expression. These results suggest a special link between CGI promoters and H3K115ac”.

      (4) Overall, the study introduces an intriguing new angle on globular PTMs, but additional rigor and mechanistic evidence are needed to substantiate the conclusions.

      We agree that the paper does not provide mechanistic details or solid causality of H3K115ac. We have only emphasized the potential role of H3K115ac in nucleosome fragility based on our in vivo data and previously published in vitro experiments (Manohar et al., 2009; Chatterjee et al., 2015). We do provide evidence that H3K115ac is enriched on subnucleosomal particles via sucrose gradient sedimentation of MNase-digested chromatin (Figure 3C-D).

      Reviewer #2 (Public review):

      (1) I am not fully convinced about the specificity of the antibody. Although the experiment in Figure S1A shows a specific binding to H3K115ac-modified peptides compared to unmodified peptides, the authors do not show any experiment that shows that the antibody does not bind to unrelated proteins. Thus, a Western of a nuclear extract or the chromatin fraction would be critical to show. Also, peptide competition using the H3K115ac peptide to block the antibody may be good to further support the specificity of the antibody. Also, I don't understand the experiment in Figure S1B. What does it tell us when the H3K115ac histone mark itself is missing? The KLF4 promoter does not appear to be a suitable positive control, given that hundreds of proteins/histone modifications are likely present at this region. It is important to clearly demonstrate that the antibody exclusively recognizes H3K115ac, given that the conclusion of the manuscript strongly depends on the reliability of the obtained ChIP-Seq data.

      ChIP-qPCR in S1B includes competition from native chromatin and shows high specificity to its target. We have provided antibody validation in three ways:

      - Western blot with dot-blot of synthetic peptides (Figure S1A).

      - Western blots with whole-cell extracts (Figure 4D).

      - ChIP-qPCR on native chromatin spiked with a cocktail of synthetic mono-nucleosomes, each carrying a single acetylation and a specific barcode (SNAP-ChIP K-AcylStat Panel).

      We could not include H3K115ac-marked nucleosomes as they are not available in the panel. Figure S1B shows that the H3K115ac antibody exhibits negligible binding to known K-acyl marks, comparable to an unmodified nucleosome. Because of the absence of an H3K115ac-modified barcoded nucleosome, we used the KLF4 promoter from mESCs as a positive control. In agreement with the ChIP-seq signal shown in the genome browser profile (Figure 1E), the KLF4 promoter shows a significantly higher signal than the gene body.

      (2) The association of H3K115ac with fragile nucleosomes is based on MNase-sensitivity and fragment length, which are indirect methods and can have technical bias. Experiments that support that the H3K115ac modified nucleosomes are indeed more fragile are missing.

      We have performed ChIP-seq on MNase digested mESC chromatin fractionated on sucrose gradients and this shows that H3K115ac is enriched in fractions containing sub-nucleosomal and fragile nucleosomes but depleted in fractions containing stable nucleosomes (Figure 3D).

      (3) The comparison of H3K115ac with H3K122ac and H3K64ac relies on publicly available datasets. Since the authors argue that these marks are distinct, data generated under identical experimental conditions would be more convincing. At a minimum, the limitations of using external datasets should be discussed.

      H3K64ac and H3K122ac datasets were generated by us in a previous publication (Pradeepa et al., 2016) using the same native MNase ChIP protocol as used here. The ChIP-seq datasets for H3K122ac and H3K27ac are processed in an identical manner, with the same computational pipelines, to the H3K115ac data sets generated in this paper.

      (4) The enrichment of H3K115ac at enhancers and CTCF binding sites is notable but remains descriptive. It would be interesting to clarify whether H3K115ac actively influences transcription factor/CTCF binding or is a downstream correlate.

      We agree with the reviewer’s comment, but we have not claimed causality.

      (5) No information is provided about how H3K115ac may be deposited/removed. Without this information, it is difficult to place this modification into established chromatin regulatory pathways.

      Due to broad target specificity, redundancies and crosstalk among different classes of HATs and HDACs, it is not tractable to answer this question in the current manuscript.

      Reviewer #3 (Public reviews):

      Reviewer 3 is mistaken in thinking our ChIP experiments are performed under cross-linked conditions. As clearly stated in the main text and methods, all our ChIP-seq for histone modifications is done on native MNase-digested chromatin – with no cross-linking. This includes the spike-in experiment shown in Fig S1B to test H3K115ac antibody specificity against the bar-coded SNAP-ChIP® K-AcylStat Panel from Epicypher. We could not include H3K115ac bar-coded nucleosomes in that experiment since they are not available in the panel.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) I have two primary concerns that resound through the entire paper:

      (a) Overall, the manuscript is making strong claims based on entirely correlative datasets. No quantitative analyses are performed to demonstrate co-occupancy/localization. Please see more detailed descriptions below.

      Our responses to specific points are provided against each comment below.

      (b) Lack of paired-end replicates for H3K115ac ChIP-seq. While the reviewer token for the deposited data was not made accessible to me, looking at Supplementary Table 1, it appears there are two H3K115ac ChIP-seq datasets. One is paired-end and one is single-read. So are peaks called with only one replicate of PE? Or are inaccurate peaks called with SR datasets? Either way, this is not a rigorous way to evaluate H3K115ac localization.

      We are sorry that this reviewer was not able to access the data; the token for the GEO accession was provided for reviewers at the journal’s request. All ChIP-seq (and ATAC-seq) experiments (paired and single-end) were performed with two biological replicates and the Pearson’s correlations (all >0.9) between replicates were provided in Supplementary Table 1. This was indicated in both the main text and in the methods. In the revised manuscript we have tried to make this even clearer and have put the relevant Pearson’s coefficient (r) into the text at the appropriate places. For the reviewer’s information, here is the complete list of data samples in the GEO Accession:

      Author response image 1.

      While I agree that H3K115ac occupancy is high at +CGIs, the authors downplay that H3K122ac and H3K27ac are also more highly enriched at these locations (page 7, last sentence of first paragraph). I imagine this is all due to the more highly transcribed nature of these genes. Sub-stratifying the K27ac and K122ac by transcription (as in Figure 1G) would help to demonstrate a unique nature of H3K115ac. But even better would be to do an analysis that plots H3K115ac enrichment vs transcription for every individual gene rather than aggregate analyses that are biased by single locations. For example, make an XY scatterplot of RNAPII occupancy or 4SU-seq signal vs H3K115ac level, where each point represents a single gene. This matters because the interpretation that the effect is CGI-based and not transcription-based is confounded by the fact that -CGI promoters are more lowly transcribed. So, looking at Figure 1G, even the -CGI occupancy of H3K115ac is correlated with transcription, but these promoters are just more lowly transcribed.

      We thank the reviewer for these suggestions but point out that Figure 1G shows H3K115ac signal for CGI+ and CGI– TSS that are matched for expression levels (quartiles of 4SU-seq). Fig 1F shows that H3K115ac is much more of a discriminator between CGI+ and CGI– than H3K27ac or H3K122ac.

      (2) H3K115ac, H3K27ac, and H3K122ac are all more enriched (in aggregate) at +CGI locations (Fig 1F); so do these locations just have more positioned nucleosomes? More H3.3? So that these PTMs are just more enriched due to the opportunity?

      Positioned nucleosomes are generally found downstream of the TSS of active CpG island promoters, so what the reviewer suggests may well account for the relative enrichment of H3K27ac and H3K122ac at CGI+ vs CGI- promoters in Fig. 1F. But H3K115ac localisation is distinct, with the peak at the nucleosome-depleted region, not the +1 nucleosome. This is also confirmed by the contour plots in Fig 3. Our observation is also not explained by an enrichment of H3.3 at CGI promoters, since we show that H3K115ac is not specific to H3.3 (Fig 4D).

      (3) The authors note in paragraph 2 of page 7 that "H3K115ac does not scale linearly with gene expression..." but the authors never show a quantification of this; stratification in four clusters is not able to make a linear correlation. Furthermore, in the second line of page 7, the authors state that the levels do generally correlate with transcription. To claim it is a specific CGI link and not transcription is tricky, but I encourage the authors to consider more quantifiable ways, rather than correlations, to demonstrate this point, if it is observed.

      We thank the reviewer for this comment, and taking it into consideration, we have decided to re-phrase this paragraph. The new text reads: “Non-CGI promoters have lower overall levels of transcription compared to CGI promoters, and for this promoter class H3K115ac enrichment detected by ChIP is only really seen for the highest quartile of expression (4SU; Figure 1G). CGI promoters, on the other hand, exhibit significant levels of detected H3K115ac even for the lowest quartile of expression. These results suggest a special link between CGI promoters and H3K115ac”.

      (4) The authors claim on page 7 that "on average, transcription increased from TSS that also gained H3K115ac but to a modest extent, compared with the more substantial loss of H3K115ac from downregulated TSS". However, both upregulated and downregulated are significant; the difference in magnitude could simply be due to more highly or more lowly transcribed locations, meaning that fold change could be more robustly detected. I caution the authors to substantiate claims like this rather than stating a correlation.

      We thank the reviewer for this comment, which relates to the data in Fig 2A. Fig. 2B shows that the association of H3K115ac loss with downregulation is statistically stronger than that of H3K115ac gain with upregulation, but only for CGI promoters. With regard to the text on the original pg 7 that is referred to, we have now reworded this to read “Average levels of transcription increased from TSS that also gained H3K115ac, and there was loss of H3K115ac from downregulated TSS (Figure 2A).”

      (5) For Figure 2C, the authors argue that H3K115ac correlate with bivalent locations. So this is all qualitative and aggregate localization; please quantitatively demonstrate this claim.

      Figure S2D provides statistics for this (observed/expected and Fisher's exact test).
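
      For readers unfamiliar with these statistics, an observed/expected ratio and a one-sided Fisher's exact test for the overlap between two promoter sets can be sketched as below. This is a minimal illustration and not the authors' analysis code. The function name, argument names, and toy counts are assumptions.

```python
from math import comb

def overlap_enrichment(n_both, n_set_a, n_set_b, n_total):
    """Observed/expected overlap and one-sided Fisher's exact p-value.

    n_both:  promoters present in both sets
    n_set_a: size of the first set (e.g. promoters carrying the mark)
    n_set_b: size of the second set (e.g. bivalent promoters)
    n_total: size of the background (all promoters considered)
    """
    # Expected overlap if membership in the two sets were independent.
    expected = n_set_a * n_set_b / n_total
    # Hypergeometric upper tail: chance of an overlap of at least n_both.
    upper = min(n_set_a, n_set_b)
    tail = sum(
        comb(n_set_a, k) * comb(n_total - n_set_a, n_set_b - k)
        for k in range(n_both, upper + 1)
    )
    p_value = tail / comb(n_total, n_set_b)
    return n_both / expected, p_value

# Hypothetical toy counts for illustration only.
ratio, p = overlap_enrichment(n_both=4, n_set_a=5, n_set_b=5, n_total=20)
print(ratio, p)
```

      The one-sided form tests enrichment only. A depletion test would sum the lower tail instead.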

      (6) The authors claim in Figure 2 that H3K115ac is dynamic during differentiation (title of Figure 2). However, there are locations that gain and lose, or maintain H3K115ac. In fact, the most discussed locations are H3K115ac with no change (2C), which means it is NOT dynamic during differentiation. So what is the message for the role during differentiation? From Supplemental Table 1, it appears there is a single ChIP experiment for H3K115ac in NPC, and it is a single read. So this is also a difficult claim with one replicate. Related to this, in S2A, the authors show K115ac where there is no change in transcription; so what is the role of H3K115ac at TSSs relevant to differentiation - it is at both locations changed and unchanged in transcription, but H3K115ac levels themselves do not change at these subsets. So, how is this dynamic? This is very confusing, and clearer analyses and descriptions are necessary to deconvolute these data.

      We apologise for the misleading title for Figure 2. This has now been amended to “Changes in H3K115ac during differentiation”. The message of this figure is that whilst changes in H3K115ac at TSS are small (panels A-C), at enhancers the changes are much more dramatic (panel D). The reviewer is incorrect about the number of replicates for NPCs – there are two biological replicates (see response to point 1b).

      (7) The authors go on to examine H3K115ac enrichment on fragile nucleosomes through sucrose gradient sedimentation. A control for H3K27ac or H3K122ac would be nice for comparison.

      We do not have the material available to perform these experiments.

      (8) When discussing Figures 3 and SF3, the authors mention performing a different MNase for a second ChIP. Showing the MNase distribution for both the more highly digested and the lowly digested would be nice. a) Related to the above, the authors show input in SF3E to argue that the difference in H3K115ac vs H3K27ac is not due to the library, but they do not show the MNase digestion patterns, which is more important for this argument.

      Input libraries (first two graphs of Fig S3E) are the MNase-digested chromatin. Comparison of nucleotide frequencies from millions of reads is a more robust method than the fragment length patterns.

      (9) The authors move on to examine H3K115ac at enhancers. Just out of curiosity, given what was found at promoters, is H3K115ac enriched at +CGI enhancers? And what is the correlation with enhancer transcription?

      This is an interesting point, but the number of enhancers associated with CGI is not very high and so we did not focus on this. We have not analysed a correlation with eRNAs in this paper.

      (10) The authors state on page 14 that the most frequent changes in H3K115ac during differentiation are at these enhancers. So do these changes connect with differentiation-specific genes, and/or genes that have altered transcription during differentiation? Just trying to understand the functional role.

      Given the challenges of connecting enhancers with target genes, we have not addressed this question quantitatively. However, we draw the reviewer’s attention to the Genome Browser shots in Figures 2D and S2C, which show clear gain of H3K115ac (and ATAC-seq peaks) at intra and intergenic regions close to genes whose transcription is activated during the differentiation to NPCs.

      (11) Related, at the end of page 14, the authors state that the changes in H3K115ac correlate with changes in ATAC-seq; I imagine this dynamic is not unique for H3K115ac and this is observed for other PTMs (H3K27ac), so assessing and clarifying this, to again get to the specific interest of H3K115ac, would be ideal.

      We have not claimed that chromatin accessibility is unique to H3K115ac. It is the location of H3K115ac which is found inside the ATAC-seq peak region while H3K27ac is found only upstream/downstream of the ATAC peak that is so striking. This is apparent in Fig 4C.

      (12) The authors examine levels of H3K115ac in H3.3 KO cell lines via western blot (Figure 4D), but no replicates and/or quantification are shown.

      We now provide a biological replicate for the Western blot (new Fig S4H) together with an image of the whole gel for the data in Fig 4D.

      (13) In Figure S4 and at the end of page 17, the authors are arguing that there is a link to pioneer TF complexes, based on Oct4 binding. First, while Oct4 has pioneering activity, not all Oct4 sites (or motifs) are pioneering; this has been established. So if you want to use Oct4, substratifying by pioneer vs no pioneer is necessary. Second, demonstrating this is unique to pioneer and not to non-pioneer TFs would be an important control.

      In response to the reviewer’s comment, we have removed the term “pioneer” from the manuscript.

      (14) Minor point: Figure 4 A and B, there are some formatting issues with the scale bars.

      We thank the reviewer for pointing this out, and the errors have been corrected in the revised figure.

      (15) Minor point is that it should be clear when single replicates of data are used and when PE/SR sequences are combined or which one is used in each analysis, as this was hard to discern when reading the paper and figure legends.

      We have clearly stated in the text that, after Figure 2, we repeated all experiments in paired-end mode. All processing steps are defined separately for single-end and paired-end datasets in the Methods section. Details of biological replicates are provided in Sup. Table 1. These concerns are also addressed in our response to the Reviewer's public comment 1.

      (16) Minor point: it is surprising that different MNase and different units were used in the ChIP vs sucrose sedimentation. Could the authors clarify why?

      Chromatin preps for sucrose gradients were done on a much larger scale than for ChIP-seq and required different setups to obtain the right level of MNase digestion.

      (17) The authors note that fragile nucleosomes contain H2A.Z and H3.3, but they never perform an analysis of available data to demonstrate a correlation (or better a quantifiable correlation) between H3K115ac occupancy and these marks at the locations they identify H3K115ac.

      Since we have shown (Fig. 4) that depletion of H3.3 does not affect overall levels of H3K115ac, we do not think there is value in further quantitative correlative analyses of H3K115ac and variant histones.

      (18) Minor point: What is the overlap in peaks for H3K115ac, H3K122ac, and H3K27ac (Figure 1C)?

      Nearly all H3K115ac peaks overlap with H3K122ac and/or H3K27ac. Its most distinct properties are its association with CGI promoters, fragile nucleosomes and its unique localisation within the NDRs, three points that the manuscript is focussed on.

      Reviewer #3 (Recommendations for the authors):

      (1) The western blot results in Figure 4D probing for H3, H3.3, and H3K115ac use Ponceau S staining, presumably of an area of the membrane where histones might be expected to migrate, as a measure of loading. However, the Ponceau S bands appear uniformly weaker in the H3.3KO lanes, yet despite this, blotting with H3.3 antibody detects a band in H3.3 knockout ESCs, suggesting that the antibody does not have a high degree of specificity. Again, a blocking experiment with appropriate peptides would instill more confidence in the specificity of these reagents, and/or the authors could provide independent validation of the knockout model to differentiate between a partial knockout or antibody cross-reactivity (e.g., by Sanger sequencing).

      In a revised Fig. S4H we now show the whole gel corresponding to this blot but including co-staining with an antibody for H4 to provide a better loading control. We also provide a biological replicate of this Western blot in the lower panel of Fig. S4H.

      (2) The manuscript would benefit from in vitro follow-up and validation, but if the authors intend to keep the manuscript primarily in silico, I suggest dedicating a few lines in each section to explain the plots, their axes, and their purpose, as well as to assist with interpretation, rather than directly discussing the results. This would make the manuscript more accessible and understandable for a broader audience in the field of epigenetics.

      In the revised version, we have tried to improve the text to make the data more accessible to a broad audience.

    1. Reviewer #2 (Public Review):

      In this manuscript, Jong et al. provide and validate a very useful resource for performing CRISPR screenings to study neutrophil differentiation and function. The major strength of the paper lies in its careful validation of many aspects of the Hoxb8-immortalized progenitor cells, including their differentiation capacity, their ability to clear bacteria, and their capacity to differentiate in vivo. The authors succeed at this, with results correctly supporting their conclusions. The major weaknesses are its presentation and writing, some of which are poorly organized. Finally, while the potential impact of this resource in the field could be very large, the CRISPR screening results appear half-baked, almost preliminary, and could be better validated, or at least presented. A few other points that warrant revision are included below:

      • The introduction should be better constructed and organized. It should be written with more connectors to present facts in a stream that flows naturally, from the broad general facts to the experimental details implemented in the manuscript. It should also discuss other similar approaches used in the literature, such as LaFleur et al. 2019, and relate in which ways these presented methods could be better.

      • The scheme in Figure 4A should more clearly indicate the timings, doublings, numbers of cells, and other aspects of the experimental design.

      • The volcano plot in Figure 4B is poorly informative and almost redundant. What does one make of it?

      • The representation (normalized reads) of each sgRNA in the library and across multiple experiments, including their correlation, should be checked and plotted, to visualize how robust these replicates are.
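
      One common way to perform the check requested in this point is to scale each replicate's sgRNA counts to counts per million, log-transform them, and correlate the replicates. The sketch below is an illustration under those assumptions. The function names, the pseudocount, and the toy counts are hypothetical and are not taken from the manuscript or from MAGeCK.

```python
from math import log2, sqrt

def log2_cpm(counts, pseudocount=1.0):
    """Scale raw sgRNA read counts to counts per million, then log2-transform."""
    total = sum(counts)
    return [log2(c / total * 1e6 + pseudocount) for c in counts]

def pearson(x, y):
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def replicate_correlation(rep_a, rep_b):
    """Correlation of normalized sgRNA abundance between two replicates.

    rep_a and rep_b hold raw counts for the same sgRNAs in the same order.
    """
    return pearson(log2_cpm(rep_a), log2_cpm(rep_b))

# Hypothetical toy counts: the second replicate is sequenced twice as deeply.
rep_1 = [10, 200, 3000, 50]
rep_2 = [20, 400, 6000, 100]
print(replicate_correlation(rep_1, rep_2))
```

      Normalizing to library size before correlating means that a difference in sequencing depth alone does not lower the apparent agreement between replicates.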

      • In Figure 4E, the distribution of the hit sgRNAs should be compared to all other targeting guides (instead of just to non-targeting controls). Linear density distribution plots or scatter plots of all guides are usually the best way, but there are others (for example, see Figure 4 of LaFleur et al. 2019). Ideally, each independent sgRNA for each gene in the library, as well as biological replicates, should be separately shown, with hits clearly highlighted.

      • While in vivo differentiation is shown as possible with these cell lines, it is unclear whether CRISPR screenings could be performed in vivo too. Would sgRNA representation suffice for genome-wide? At least some of the new hits could be validated by testing differentiation in vivo (i.e. WASH complex).

      • In the methods section, the RNA-seq analysis pipeline details are missing (versions, software for alignment, quantification, differential gene expression, and enrichment). Also, parameters for MAGeCK and MAGeCKFlute should be explicit and detailed.

      • The discussion is mostly a summary of the results. It is lacking in detail and thoughtful discussion regarding novelty and impact beyond the validation of the cell line. What about potential applications? What about extending screenings to test bacterial-killing, as suggested in Figure 2? What about limitations compared to other similar methods out there? There is little discussion of such important potential matters. Also, a large part of the discussion is dedicated to discussing details about Cebpe that are all well known in the literature and add little value.

      • Figure legends are typically too succinct and hard to interpret, especially for non-experts. The text should enable the figure reader to correctly interpret what is shown in each panel.

    1. Reviewer #4 (Public review):

      I maintain that the images in Figure 12 (new Figure 14) do not support the authors' interpretation that 2-cell embryos resulted from in vitro fertilization (IVF) of Armc2-/- rescued sperm. They are clearly not normal 2-cell embryos and instead look very much like fragmented eggs that can be seen occasionally following in vitro fertilization procedures even when that is done with wild type eggs and sperm. The only portion of current Figure 14 that has normal looking 2-cell embryos is in panel 14A4, where wild type B6D2 sperm were used. Even in that panel, there are some fragmented eggs that the authors identify as 2-cell embryos.

      The authors offer the explanation that CD1 eggs fertilized by B6D2F1 hybrid male sperm do not develop beyond the 2-cell stage, citing a 2008 paper published in Biology of Reproduction by Fernandez-Gonzalez et al. I read through that paper very carefully and even had a colleague read through it in case I missed something, but that paper says nothing at all about strain incompatibilities, much less 2-cell arrest due to them. The only crosses done in that paper are CD1 eggs x CD1 sperm and B6D2 eggs x B6D2 sperm, all by intracytoplasmic sperm injection and not standard in vitro fertilization. [Note that the paper does mention performing in vitro fertilization but says nothing about how it was done or what mouse strains were used.] I even searched the literature for information regarding incompatibility between these strains and could find nothing relevant. But even if the authors are correct and there happens to be a strain incompatibility and 2-cell arrest is expected, what the authors are calling 2-cell embryos are clearly not.

      A second explanation offered by the authors is that they used collagenase to remove the cumulus cells and that this may have affected the appearance of the embryos. This technique is actually used to remove both the cumulus cells and the zona pellucida and has been described as a gentler way to do so than other standard methods (hyaluronidase treatment followed by acid Tyrodes to remove the zona pellucida) (Yamatoya et al., Reprod Med Biol 2011, DOI 10.1007/s12522-011-0075-8). I think it is highly relevant to the current study that the method they used to remove cumulus cells also dissolves the zona, either partially or completely. Given that many of the eggs, fragmented eggs, and 2-cell embryos (from the WT sperm) in Figure 14A are lacking a zona pellucida, it seems very likely that many of the eggs were either zona-free or had partial zona dissolution from the start. In fact, the authors state in the Methods section that "Cumulus-free and zona-free eggs were collected..." for how IVF was done. Partial zona dissolution is standard in some protocols for performing IVF using frozen mouse sperm, which usually have much lower motility and overall efficacy than fresh sperm. In any case, it would improve transparency if the manuscript made clear somewhere other than buried in the Methods that the IVF procedure was done on eggs with partially or fully removed zonas, to allow proper interpretation.

      In the rebuttal, the authors go on to state: "To provide additional functional evidence, we complemented the IVF experiments with ICSI using rescued Armc2-/- sperm and B6D2 oocytes, which allowed embryos to develop to the blastocyst stage. In these experiments, 25% of injected oocytes reached the blastocyst stage with rescued sperm compared to 13% for untreated Armc2-/- sperm (Supplementary Fig. 9). These results support the functional competence of rescued sperm and demonstrate partial recovery of fertilization ability following Armc2 mRNA electroporation."

      Their conclusion that the data support partial recovery of fertilization ability following Armc2 mRNA electroporation in my opinion has no basis. This experiment was done only once, and no information is provided regarding how many eggs underwent ICSI or how many reached the blastocyst stage. The authors claim that the rescued sperm were better than the Armc2-/- sperm in producing blastocysts, but this is based on a simple percentage report of 25% vs 13% without any statistical analysis, even on the results from the single experiment presented.
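
      To illustrate why bare percentages are hard to interpret without the underlying counts, the sketch below computes an approximate 95% Wilson score interval for each proportion. The oocyte counts used here are hypothetical, chosen only to reproduce 25% and roughly 13%, because the actual numbers are not reported. The function name is likewise an assumption.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Hypothetical counts for illustration only; the real numbers of injected
# oocytes and resulting blastocysts are not given in the manuscript.
rescued = wilson_interval(5, 20)     # 5 of 20 = 25%
untreated = wilson_interval(3, 23)   # 3 of 23 = ~13%
print(rescued, untreated)
```

      With counts this small the two intervals overlap substantially, so a 25% versus 13% difference would not by itself indicate a real effect. With much larger counts the same percentages could be clearly distinguishable, which is why the counts and a formal test are needed.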

      Overall, the paper shows rescue of some sperm motility by the new method they use, and the new title is therefore appropriate. The authors have also dealt reasonably with many of the original concerns regarding documenting that their methodology was effective in producing protein (at least the GFP marker) in spermatogenic cells. In my view the authors have, however, not shown any indication of functional recovery over what is already known for the knockout sperm, that ICSI can support blastocyst stage embryo development. They also have not, in my view, justified the claims at the end of the abstract "These motile sperm were able to produce embryos by IVF..." and that "...mRNA electroporation can restore...partially fertilizing ability..."

    2. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      The authors assess the effectiveness of electroporating mRNA into male germ cells to rescue the expression of proteins required for spermatogenesis progression in individuals where these proteins are mutated or depleted. To set up the methodology, they first evaluated the expression of reporter proteins in wild-type mice, which showed expression in germ cells for over two weeks. Then, they attempted to recover fertility in a model of late spermatogenesis arrest that produces immotile sperm. By electroporating the mutated protein, the authors recovered the motility of ~5% of the sperm; although the sperm regenerated was not able to produce offspring using IVF, the embryos reached the 2-cell state (in contrast to controls that did not progress past the zygote state).

      This is a comprehensive evaluation of the mRNA methodology with multiple strengths. First, the authors show that naked synthetic RNA, purchased from a commercial source or generated in the laboratory with simple methods, is enough to express exogenous proteins in testicular germ cells. The authors compared RNA to DNA electroporation and found that germ cells are efficiently electroporated with RNA, but not DNA. The differences between these constructs were evaluated using in vivo imaging to track the reporter signal in individual animals through time. To understand how the reporter proteins affect the results of the experiments, the authors used different reporters: two fluorescent (eGFP and mCherry) and one bioluminescent (Luciferase). Although they observed differences among reporters, in every case expression lasted for at least two weeks. The authors used a relevant system to study the therapeutic potential of RNA electroporation. The ARMC2-deficient animals have an impaired sperm motility phenotype that affects only the later stages of spermatogenesis. The authors showed that sperm motility was recovered to ~5%, which is remarkable due to the small fraction of germ cells electroporated with RNA with the current protocol. The sperm motility parameters were thoroughly assessed by CASA. The 3D reconstruction of an electroporated testis using state-of-the-art methods to show the electroporated regions is compelling.

      The main weakness of the manuscript is that although the authors manage to recover motility in a small fraction of the sperm population, it is unclear whether the increased sperm quality is substantial to improve assisted reproduction outcomes. The authors found that the rescued sperm could be used to obtain 2-cell embryos via IVF, but no evidence for more advanced stages of embryo differentiation was provided. The motile rescued sperm was also successfully used to generate blastocyst by ICSI, but the statistical significance of the rate of blastocyst production compared to non-rescued sperm remains unclear. The title is thus an overstatement since fertility was never restored for IVF, and the mutant sperm was already able to produce blastocysts without the electroporation intervention.

      Overall, the authors clearly show that electroporating mRNA can improve spermatogenesis as demonstrated by the generation of motile sperm in the ARMC2 KO mouse model.

      We thank the reviewer for this thoughtful and constructive comment. We agree that our study demonstrates a partial functional recovery of spermatogenesis rather than a complete restoration of fertility. Our main objective was to establish and validate a proof-of-concept approach showing that mRNA electroporation can rescue the expression of a missing or mutated protein in post-meiotic germ cells and result in the production of motile sperm.

      To address the reviewer’s concern, we have revised the title and discussion to more accurately reflect the scope of our findings. The new title reads:

      “Sperm motility in mice with oligo-astheno-teratozoospermia restored by in vivo injection and electroporation of naked mRNA”

      In the manuscript, we now emphasize that while motility recovery was significant, complete fertility restoration was not achieved. We have also clarified that:

      The 5% recovery in motile sperm represents a substantial improvement considering the small population of germ cells reached by the current electroporation method.

      The 2-cell embryo formation observed after IVF serves as a strong indication of partial functional recovery.

      Finally, we now explicitly state in the Discussion that this approach should be considered a therapeutic proof-of-concept, demonstrating feasibility and potential, rather than a fully curative intervention.

      Reviewer #2 (Public review):

      The authors inject, into the rete testes, mRNA and plasmids encoding mRNAs for GFP and then ARMC2 (into infertile Armc2 KO mice) in a gene therapy approach to express exogenous proteins in male germ cells. They do show GFP epifluorescence and ARMC2 protein in KO tissues, although the evidence presented is weak. Overall, the data do not necessarily make sense given the biology of spermatogenesis and more rigorous testing of this model is required to fully support the conclusions, that gene therapy can be used to rescue male infertility.

      In this revision, the authors attempt to respond to the critiques from the first round of reviews. While they did address many of the minor concerns, there are still a number to be addressed. With that said, the data still do not support the conclusions of the manuscript.

      We thank the reviewer for their careful and detailed assessment of our manuscript. We appreciate the concerns raised regarding mRNA stability, GFP localization, and the interpretation of spermatogenesis stages, and we have addressed these points in the manuscript and in the responses below.

      (1) The authors have not satisfactorily provided an explanation for how a naked mRNA can persist and direct expression of GFP or luciferase for ~3 weeks. The most stable mRNAs in mammalian cells have half-lives of ~24-60 hours. The stability of the injected mRNAs should be evaluated and reported using cell lines. GFP protein's half-life is ~26 hours, and luciferase protein's half-life is ~2 hours.
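
      The arithmetic behind this concern can be made explicit with simple first-order decay, in which the fraction of molecules remaining after time t is 0.5 raised to the power t divided by the half-life. The sketch below applies this to the half-lives quoted above. It assumes no ongoing synthesis and a single constant half-life, which is a simplification, and the function name is hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction left after first-order decay with a constant half-life."""
    return 0.5 ** (hours / half_life_hours)

three_weeks = 21 * 24  # hours

# Half-lives as quoted in the reviewer's comment.
for label, half_life in [
    ("mRNA, upper estimate (60 h)", 60.0),
    ("GFP protein (26 h)", 26.0),
    ("luciferase protein (2 h)", 2.0),
]:
    print(label, fraction_remaining(three_weeks, half_life))
```

      Even with a 60-hour half-life, less than 1% of the starting material would remain after three weeks under these assumptions, which is the basis for asking how the signal persists.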

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-QPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5A). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells and the slow protein turnover that is typical of the spermatogenic lineage.

      (2) There is no convincing data shown in Figs. 1-8 that the GFP is even expressed in germ cells, which is obviously a prerequisite for the Armc2 KO rescue experiment shown in the later figures! In fact, to this reviewer the GFP appears to be in Sertoli cell cytoplasm, which spans the epithelium and surrounds germ cells - thus, it can be oft-confused with germ cells. In addition, if it is in germ cells, then the authors should be able to show, on subsequent days, that it is present in clones of germ cells that are maturing. Due to intracellular bridges, a molecule like GFP has been shown to diffuse readily and rapidly (in a matter of minutes) between adjacent germ cells. To clarify, the authors must generate single cell suspensions and immunostain for GFP using any of a number of excellent commercially-available antibodies to verify it is present in germ cells. It should also be present in sperm, if it is indeed in the germline.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig 10 A-), round spermatids (25%) (Fig 10 B-E), and elongated spermatids (11.4%) (Fig 10 C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig 10 F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA transfection and translation in germ cells across multiple stages of spermatogenesis. Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-QPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      Other comments:

      70-1 This is an incorrect interpretation of the findings from Ref 5 - that review stated there were ~2,000 testis-enriched genes, but that does not mean "the whole process involves around two thousand of genes"

      We thank the reviewer for this helpful comment. We agree that our previous phrasing was imprecise. We have revised the sentence to clarify that approximately 2,000 genes show testis-enriched expression, rather than implying that the entire spermatogenic process is limited to these genes. The corrected sentence now reads:

      “Spermatogenesis involves the coordinated expression of a large number of genes, with approximately 2,000 showing testis-enriched expression, about 60% of which are expressed exclusively in the testes”

      74 would specify 'male':

      We have now specified it as you suggested.

      79-84 Are the concerns with ICSI due to the procedure itself, or the fact that it's often used when there is likely to be a genetic issue with the male whose sperm was used? This should be clarified if possible, using references from the literature, as this reviewer imagines this could be a rather contentious issue with clinicians who routinely use this procedure, even in cases where IVF would very likely have worked:

      We thank the reviewer for this important comment. Concerns about ICSI outcomes indeed reflect two partly overlapping causes: the procedure itself (direct sperm injection and associated laboratory manipulations) and the clinical/genetic background of couples undergoing ICSI (especially men with severe male-factor infertility). Large reviews and meta-analyses report a small increase in some perinatal and congenital risks after ART/ICSI, but these studies conclude that it is difficult to fully disentangle procedural effects from parental factors. Importantly, genetic or epigenetic abnormalities in the male (which motivate use of ICSI) likely contribute to adverse outcomes in offspring, while some studies also suggest that ICSI-specific manipulations may alter epigenetic marks in embryos. For these reasons, professional bodies recommend reserving ICSI for appropriate male-factor indications rather than as routine insemination for non-male-factor cases.

      We have revised the text accordingly to clarify this distinction:

      “ICSI can efficiently overcome the problems faced. Nevertheless, concerns persist regarding the potential risks associated with this technique, including blastogenesis defect, cardiovascular defect, gastrointestinal defect, musculoskeletal defect, orofacial defect, leukemia, central nervous system tumors, and solid tumors [1-4]. Statistical analyses of birth records have demonstrated an elevated risk of birth defects, with a 30-40% increased likelihood in cases involving ICSI [1-4], and a prevalence of birth defects between 1% and 4% [3]. It is important to note, however, that the origin of these risks remains debated. Several large epidemiological and mechanistic studies indicate that both the procedure itself (direct microinjection and in vitro manipulation) and the underlying genetic or epigenetic abnormalities often present in men requiring ICSI contribute to the observed outcomes [1, 3, 5, 6]. To overcome these drawbacks, a number of experimental strategies have been proposed to bypass ARTs and restore spermatogenesis and fertility, including gene therapy [7-10].”

      199 Codon optimization improvement of mRNA stability needs a reference;

      We have added the references accordingly: [11-15]

      In one study using yeast transcripts, optimization improved RNA stability on the order of minutes (e.g., from ~5 minutes to ~17 minutes); is there some evidence that it could be increased dramatically to days or weeks?

      We agree with the reviewer that codon optimization can enhance mRNA stability, but available evidence indicates that this effect is moderate. In Saccharomyces cerevisiae, Presnyak et al. (2015) [16] showed that codon optimization increased mRNA half-life from approximately 5 minutes to ~17 minutes, representing a several-fold improvement rather than a shift to days or weeks. Similar codon-dependent stabilization has been observed in mammalian systems, where transcripts enriched in optimal codons exhibit longer half-lives and enhanced translation efficiency ([11]; [17]). However, these studies consistently report effects on the scale of minutes to hours. In mammalian cells, the prolonged stability of therapeutic or vaccine mRNAs, lasting for days, is primarily achieved through additional features such as optimized untranslated regions, chemical nucleotide modifications (e.g., N¹-methylpseudouridine), and protective delivery systems, rather than codon usage alone ([18]; [19]).
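
To make the scale of this effect concrete, the following is a minimal sketch that assumes simple first-order (exponential) decay of a transcript. The 5-minute and 17-minute half-lives are the yeast values quoted above; the single-exponential model, the function names, and the 1% threshold are illustrative assumptions and are not taken from the manuscript.

```python
import math


def fraction_remaining(t_min: float, half_life_min: float) -> float:
    """Fraction of transcript left after t_min minutes of first-order decay."""
    return 0.5 ** (t_min / half_life_min)


def time_to_fraction(fraction: float, half_life_min: float) -> float:
    """Minutes until only `fraction` of the starting transcript remains."""
    return half_life_min * math.log2(1.0 / fraction)


# Half-lives quoted in the response (yeast, before and after codon optimization).
for half_life in (5.0, 17.0):
    after_one_hour = fraction_remaining(60.0, half_life)
    to_one_percent = time_to_fraction(0.01, half_life)
    print(
        f"t1/2 = {half_life:g} min: {after_one_hour:.2e} of transcript left "
        f"after 1 h; 1% left after {to_one_percent:.0f} min"
    )
```

Under this simplified model, even the longer 17-minute half-life leaves about 1% of the transcript after less than two hours, which illustrates why codon optimization alone would not be expected to account for persistence over days.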

      Other molecular optimizations that improve in vivo mRNA stability and translation include a poly(A) tail, which binds poly(A)-binding proteins to protect the transcript from 3′ exonuclease degradation and promotes ribosome recycling, and a CleanCap structure at the 5′ end, which mimics the natural Cap 1 configuration, protects against 5′ exonuclease attack, and enhances translational initiation [11-15]. Together, these modifications act synergistically to stabilize the transcript and support efficient translation.

      472-3 The reported half-life of EGFP is ~36 hours - so, if the mRNA is unstable (and not measured, but certainly could be estimated by qRT-PCR detection of the transcript on subsequent days after injection) and EGFP is comparatively more stable (but still hours), how does EGFP persist for 21 days after injection of naked mRNA??

      We thank the reviewer for this important comment. The stability of mRNA-GFP was assessed by RT-qPCR in HEK cells and seminiferous tubule cells (Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells (Fig. 5). Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability, efficient translation within germ cells, and the slow protein turnover that is typical of the spermatogenic lineage.

      Curious why the authors were unable to get anti-GFP to work in immunostaining?

      We appreciate the reviewer’s question. We attempted to detect GFP using several commercially available anti-GFP antibodies under various standard immunostaining conditions. However, in our hands, these antibodies consistently produced either no signal or high background staining, making the results unreliable. We therefore relied on direct detection of GFP fluorescence, which provides a more accurate and specific readout of protein expression in our system.

      In Fig. 3-4, the GFP signals are unremarkable, in that they cannot be fairly attributed to any structure or cell type - they just look like blobs; and why, in Fig. 4D-E, does the GFP signal appear stronger at 21 days than 15 days? And why is it completely gone by 28 days? This data is unconvincing.

      We would like to thank the reviewer for their comments. Figures 3–4 provide a global overview of GFP expression on the surface of the testis. The entire testis was imaged using an inverted epifluorescence microscope, and the GFP signal represents a composite of multiple seminiferous tubules across the tissue surface. Due to this whole-organ imaging approach, it is not possible to resolve individual structures such as the basement membrane or lumen, which is why the signals may appear as diffuse “blobs.”

      Regarding the time-course in Figure 4D–E, the apparent increase in GFP signal at 21 days compared with 15 days likely reflects accumulation and translation of the delivered mRNA in germ cells over time, whereas the absence of signal at 28 days corresponds to the natural turnover and degradation of GFP protein and mRNA in the tissue. We hope this explanation clarifies the observed patterns of fluorescence.

      If the authors did a single cell suspension, what types or percentage of cells would be GFP+? Since germ cells are not adherent in culture, a simple experiment could be done whereby a single cell suspension could be made, cultured for 4-6 hours, and non-adherent cells "shaken off" and imaged vs adherent cells. Cells could also be fixed and immunostained for GFP, which has worked in many other labs using anti-GFP.

      We thank the reviewer for this insightful comment. To directly address the concern, we performed additional experiments to assess GFP expression in germ cells following in vivo mRNA delivery. GFP-encoding mRNA was injected and electroporated into the testes on day 0. On day 1, testes were collected, enzymatically dissociated, and the resulting seminiferous tubule cell suspensions were cultured for 12 hours. Live cells were then analyzed by fluorescence microscopy (Fig. 10).

      We observed GFP expression in various germ cell types, including pachytene spermatocytes (53.4%) (Fig. 10 A-), round spermatids (25%) (Fig. 10 B-E), and elongated spermatids (11.4%) (Fig. 10 C-E). The identification of these cell types was based on DAPI nuclear staining patterns, cell size (Fig. 10 F), non-adherent characteristics, and the use of an enzymatic dissociation protocol.

      Fluorescence imaging revealed strong cytoplasmic GFP signals in each of these populations, confirming efficient transfection and translation of the delivered mRNA. These results demonstrate that the in vivo injection and electroporation protocol enables effective mRNA transfection across multiple stages of spermatogenesis.

      These results confirm that the injected mRNA is efficiently translated in germ cells at various stages of spermatogenesis. Together, these data validate the germ cell-specific nature of the GFP signal, supporting the Armc2 KO rescue experiments.

      As mentioned previously, we assessed the stability of mRNA-GFP using RT-qPCR in HEK cells and seminiferous tubule cells (see Fig. 5). mRNA-GFP was detected for up to 60 hours in HEK cells and for up to two weeks in seminiferous tubule cells. Together, these results suggest that the long-lasting fluorescence observed in our experiments reflects a combination of transcript stability and local translation within germ cells, as well as the slow protein turnover typical of the spermatogenic lineage.

      In Fig. 5, what is the half-life of luciferase? From this reviewer's search of the literature, it appears to be ~2-3 h in mammalian cells. With this said, how do the authors envision detectable protein for up to 20 days from a naked mRNA? The stability of the injected mRNAs should be shown in a mammalian cell line - perhaps this mRNA has an incredibly long half-life, which might help explain these results. However, even the most stable endogenous mRNAs (e.g., globin) are ~24-60 hrs.

      We did not directly assess the stability of luciferase mRNA, but we evaluated the persistence of GFP mRNA, which was synthesized and optimized using the same sequence optimization and chemical modification strategy as the luciferase mRNA. In these experiments, mRNA-GFP was detectable in seminiferous tubule cells for up to two weeks after injection. We therefore expect a similar stability profile for the luciferase mRNA. These findings suggest that the prolonged fluorescence or bioluminescence observed in our study likely reflects a combination of factors, including enhanced transcript stability, local translation within germ cells, and the inherently slow protein turnover characteristic of the spermatogenic lineage.

      527-8 The Sertoli cell cytoplasm is not just present along the basement membrane as stated, but also projects all the way to the lumina:

      We clarified the sentence: "Sertoli cells have an oval to elongated nucleus and the cytoplasm presents a complex shape (“tombstone” pattern) along the basement membrane, with long projections that extend toward the lumen."

      529-30 This is incorrect, as round spermatids are never "localized between the spermatocytes and elongated spermatids" - if elongated spermatids are present, rounds are not - they are never coincident in the same testis section:

      We thank the reviewer for this important comment and for drawing attention to the detailed staging of the seminiferous epithelium. We agree that the spatial organization of germ cells varies depending on the stage of spermatogenesis. While round spermatids (steps 1–8) and elongated spermatids (steps 9–16) are typically associated with distinct stages, transitional stages of the seminiferous epithelium can contain both cell types in close proximity, reflecting the continuous and overlapping nature of spermatid differentiation (Meistrich, 2013, Methods Mol. Biol. 927:299–307). We have revised the text to clarify this point, indicating that the relative positioning of germ cell types depends on the stage of the seminiferous cycle rather than implying their constant coexistence within the same tubule section.

      Fig. 7. To this reviewer, all of the GFP appears to be in Sertoli cell cytoplasm. In Figs 1-8 there is no convincing evidence presented that GFP is expressed in germ cells! In fact, it appears to be in Sertoli cells.

      We thank the reviewer for their observation. As previously mentioned, we have included an additional experiment specifically demonstrating GFP expression in germ cells (Fig. 10). These new data provide clear evidence that the GFP signal is not restricted to Sertoli cells and confirm successful uptake and translation of GFP mRNA in germ cells.

      Fig. 9 - alpha-tubuline?

      We corrected the figure.

      Fig. 11 - how was sperm morphology/motility not rescued on "days 3, 6, 10, 15, or 28 after surgery", but it was in some at 21 and 35? How does this make sense, given the known kinetics of male germ cell development??

      We note the reviewer’s concern regarding the timing of motile sperm appearance. Variability among treated mice is expected because transfection efficiency differed between spermatogonia and spermatids. Full spermiogenesis requires ~15 days, and epididymal transit adds ~8 days, consistent with motile sperm appearing around 21 days post-injection in some mice.

      And at least one of the sperm in the KO in Fig. B5 looks relatively normal, and the flagellum may be out-of-focus in the image? With only a few sperm for reviewers to see, how can we know these represent the population?

      We thank the reviewer for their comment. Upon closer examination of the image, the flagellum of the spermatozoon in question is clearly abnormally short and this is not due to being out of focus. Furthermore, the supplementary figure shows that the KO consistently lacks normal spermatozoa. These defects are consistent with previous findings from our laboratory [22], confirming that the observed phenotype is representative of the KO population rather than an isolated occurrence.

      Reviewer #3 (Public review):

      Summary:

      The authors used a novel technique to treat male infertility. In a proof-of-concept study, the authors were able to rescue the phenotype of a knockout mouse model with immotile sperm using this technique. This could also be a promising treatment option for infertile men.

      Strengths:

      In their proof-of-concept study, the authors were able to show that the novel technique rescues the infertility phenotype of Armc2 knockout spermatozoa. In the current version of the manuscript, the authors have added data on in vitro fertilisation experiments with Armc2 mRNA-rescued sperm. The authors show that Armc2 mRNA-rescued sperm can successfully fertilise oocytes that develop to the blastocyst stage. This adds another level of reliability to the data.

      Weaknesses:

      Some minor weaknesses identified in my previous report have already been fixed. The technique is new and may not yet be fully established for all issues. Nevertheless, the data presented in this manuscript opens the way for several approaches to immotile spermatozoa to ensure successful fertilisation of oocytes and subsequent appropriate embryo development.

      [Editors' note: The images in Figure 12 do not support the authors' interpretation that 2-cell embryos resulted from in vitro fertilization. Instead, the cells shown appear to be fragmented, unfertilized eggs. Combined with the lack of further development, it seems highly unlikely that fertilization was successful.]

      We thank the reviewer for their careful evaluation and constructive feedback. We appreciate the acknowledgment of the strengths of our study, particularly the proof-of-concept demonstration that Armc2-mRNA electroporation can rescue sperm motility in Armc2 knockout mice.

      Regarding the concern raised by the editor about Figure 12, we would like to clarify two technical points. First, the IVF experiments were performed using CD1 oocytes and B6D2 sperm. Due to strain-specific incompatibilities, fertilization of CD1 oocytes by B6D2 sperm typically does not progress beyond the two-cell stage (Fernández-González et al., 2008, Biology of Reproduction [23]). Therefore, the observation of two-cell embryos represents the expected limit of development in this cross and serves as a strong indication of successful fertilization, even though further development is not possible. Second, the oocytes used in these experiments were treated with collagenase to remove cumulus cells. This enzymatic treatment can sometimes affect the morphology of early embryos, which may explain why the two-cell embryos in Figure 12 appear less regular or somewhat fragmented. We also included a control showing embryos from B6D2 sperm with the same collagenase treatment on CD1 oocytes, which yielded similar appearances (Fig. 14 A4).

      To provide additional functional evidence, we complemented the IVF experiments with ICSI using rescued Armc2–/– sperm and B6D2 oocytes, which allowed embryos to develop to the blastocyst stage. In these experiments, 25% of injected oocytes reached the blastocyst stage with rescued sperm compared to 13% for untreated Armc2–/– sperm (Supplementary Fig. 9). These results support the functional competence of rescued sperm and demonstrate partial recovery of fertilization ability following Armc2 mRNA electroporation.

      We have clarified these points in the revised Results and Discussion sections to emphasize that the IVF data indicate partial functional recovery of rescued sperm rather than full fertility restoration. These clarifications address the editor’s concern while accurately representing the technical limitations of the strain combination used in our experiments.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Fig 12 and Supplementary Fig 9 are mislabeled in the text and rebuttal.

      We thank the reviewer for pointing this out. We have carefully checked the manuscript and the rebuttal text, and corrected all references to Figure 12 and Supplementary Figure 9 to ensure they are accurately labeled and consistent throughout the text.

      Reviewer #3 (Recommendations for the authors):

      The contribution of the newly added authors should be clarified. All other aspects of inadequacy raised in my previous report have been adequately addressed.

      No further comments.

      We thank the reviewer for noting this. The contributions of the newly added authors have been clarified in the Author Contributions section of the revised manuscript. All other points raised in the previous review have been addressed as indicated.

      References

      (1) Hansen, M., et al., Assisted reproductive technologies and the risk of birth defects--a systematic review. Hum Reprod, 2005. 20(2): p. 328-38.

      (2) Halliday, J.L., et al., Increased risk of blastogenesis birth defects, arising in the first 4 weeks of pregnancy, after assisted reproductive technologies. Hum Reprod, 2010. 25(1): p. 59-65.

      (3) Davies, M.J., et al., Reproductive technologies and the risk of birth defects. N Engl J Med, 2012. 366(19): p. 1803-13.

      (4) Kurinczuk, J.J., M. Hansen, and C. Bower, The risk of birth defects in children born after assisted reproductive technologies. Curr Opin Obstet Gynecol, 2004. 16(3): p. 201-9.

      (5) Graham, M.E., et al., Assisted reproductive technology: Short- and long-term outcomes. Dev Med Child Neurol, 2023. 65(1): p. 38-49.

      (6) Palermo, G.D., et al., Intracytoplasmic sperm injection: state of the art in humans. Reproduction, 2017. 154(6): p. F93-F110.

      (7) Usmani, A., et al., A non-surgical approach for male germ cell mediated gene transmission through transgenesis. Sci Rep, 2013. 3: p. 3430.

      (8) Raina, A., et al., Testis mediated gene transfer: in vitro transfection in goat testis by electroporation. Gene, 2015. 554(1): p. 96-100.

      (9) Michaelis, M., A. Sobczak, and J.M. Weitzel, In vivo microinjection and electroporation of mouse testis. J Vis Exp, 2014(90).

      (10) Wang, L., et al., Testis electroporation coupled with autophagy inhibitor to treat non-obstructive azoospermia. Mol Ther Nucleic Acids, 2022. 30: p. 451-464.

      (11) Wu, Q., et al., Translation affects mRNA stability in a codon-dependent manner in human cells. eLife, 2019. 8: p. e45396.

      (12) Gallie, D.R., The cap and poly(A) tail function synergistically to regulate mRNA translational efficiency. Genes & Development, 1991. 5(11): p. 2108-2116.

      (13) Henderson, J.M., et al., Cap 1 messenger RNA synthesis with co-transcriptional CleanCap® analog improves protein expression in mammalian cells. Nucleic Acids Research, 2021. 49(8): p. e42.

      (14) Stepinski, J., et al., Synthesis and properties of mRNAs containing novel “anti-reverse” cap analogs. RNA, 2001. 7(10): p. 1486-1495.

      (15) Sachs, A.B., P. Sarnow, and M.W. Hentze, Starting at the beginning, middle, and end: translation initiation in eukaryotes. Cell, 1997. 89(6): p. 831-838.

      (16) Presnyak, V., et al., Codon optimality is a major determinant of mRNA stability. Cell, 2015. 160(6): p. 1111-24.

      (17) Cao, D., et al., Unlock the sustained therapeutic efficacy of mRNA. J Control Release, 2025. 383: p. 113837.

      (18) Karikó, K., et al., Incorporation of pseudouridine into mRNA yields superior nonimmunogenic vector with increased translational capacity and biological stability. Mol Ther, 2008. 16(11): p. 1833-40.

      (19) Pardi, N., et al., mRNA vaccines — a new era in vaccinology. Nature Reviews Drug Discovery, 2018. 17(4): p. 261-279.

      (20) Meistrich, M.L. and R.A. Hess, Assessment of Spermatogenesis Through Staging of Seminiferous Tubules, in Spermatogenesis: Methods and Protocols, D.T. Carrell and K.I. Aston, Editors. 2013, Humana Press: Totowa, NJ. p. 299-307.

      (21) Mäkelä, J.-A., et al., JoVE, 2020(164): p. e61800.

      (22) Coutton, C., et al., Bi-allelic Mutations in ARMC2 Lead to Severe Astheno-Teratozoospermia Due to Sperm Flagellum Malformations in Humans and Mice. Am J Hum Genet, 2019. 104(2): p. 331-340.

      (23) Fernández-González, R., et al., Long-term effects of mouse intracytoplasmic sperm injection with DNA-fragmented sperm on health and behavior of adult offspring. Biol Reprod, 2008. 78(4): p. 761-72.

    1. Author response:

      Reviewer #1 (Public Review):

      The heterogeneity within the neutrophil population is becoming clear. However, it was not clear if neutrophil progenitors are also heterogeneous. Because neutrophils are short-lived, it is technically challenging to tackle the question. This study used a system to isolate and expand clonal neutrophil progenitors (granulocyte-monocyte progenitors; GMPs) to achieve molecular and functional profiling. In the study, transcriptional profiling was performed by RNAseq and ATACseq. Functional assays were performed ex vivo to examine phagocytosis, ROS production, NET formation, and neutrophil swarming using Candida albicans, as well as C. glabrata and C. auris. The strengths of this study include the use of the neutrophil clone system to track GMPs developing into neutrophils. The clone-based approach made it possible to evaluate the functions of multiple neutrophil subpopulations. Limitations of this study include the dependency on ex vivo approaches and the modest degree of heterogeneity within presented neutrophils. Nevertheless, the finding - the heterogeneity of neutrophils can be traced back to the GMP stage - is significant.

      Reviewer #2 (Public Review):

      The stated goal of the authors is to establish and characterize an experimental system to study neutrophil heterogeneity in a manner that allows for functional outcomes to be probed. To do so, they start with murine GMPs that are conditionally immortalized by ER-HoxB8 expression and make single-cell clonal populations to ask whether those GMPs or neutrophils derived by differentiating such clonal GMPs harbor heterogeneity. At a conceptual level, this is an innovative approach that could shed light on mechanisms of neutrophil heterogeneity that have been described in both health and disease. They perform bulk multi-omics and functional analyses of both the clonal GMPs and neutrophil-like cells, including transcriptional and epigenetic profiling. However, the major weakness of the study is that the authors do not provide rigorous or convincing data that the cells they derive are truly mature neutrophils. To the contrary, the neutrophil-like cells lack Ly6G expression and so the authors fall back on using CD11b as the primary marker for delineating neutrophils; however, CD11b is expressed by both myeloid progenitors and some premature and mature myeloid lineages that are not neutrophils. They acknowledge this shortcoming, but they make an assumption that Ly6G expression is the only way in which the cells they derive are different from primary neutrophils without presenting any evidence indicating such. The authors use only SCF during the maturation of ER-HoxB8 GMPs into leukocytes, rather than including other cytokines such as G-CSF (or use in vivo maturation) that could have better-induced differentiation and maturation into granulocytes/neutrophils.

      Thank you. Of note, reviewer #1 also commented on the question of including other cytokines during the neutrophil differentiation process. We have included our response to reviewer #1 below, which includes the use of GM-CSF and IL-4.

      “We have now demonstrated enhanced Ly6G expression with GM-CSF and IL-4 treatment in a new Supplementary Figure 1.

      GMPs were washed out of estradiol-containing media and placed in fresh media containing 10 ng/ml GM-CSF and/or 1 ng/ml IL-4 for four days. Cells were collected and stained with CD117 (APC), F4/80 (AlexaFluor 488), Ly6G (PE), and CD11b (BV421). Neutrophil clones were run in biological triplicates, and undifferentiated GMPs were included as a negative control.

      GMPs stain as CD117POS / F4/80NEG / Ly6GNEG / CD11bNEG, indicating they are immature. The clones removed from estradiol differentiate and lose their CD117 expression. The mature cells remain F4/80NEG, as expected for mature neutrophils.

      The addition of GM-CSF to the media led to a significant increase in the expression of Ly6G. The addition of both GM-CSF + IL-4 did not further increase the proportion of Ly6G+ cells, and we have altered our statement slightly in the main text to reflect this finding (line 139).”

      The authors did not use their transcriptional analyses to further establish that the cells they derive from ER-HoxB8 GMPs are similar/different from primary murine neutrophils. Unfortunately, this shortcoming means that all of the analyses of neutrophil-like cells derived from clonal GMPs may or may not represent the transcriptional, epigenetic, etc. profile of a true mature neutrophil.

      Thank you. The ER-Hoxb8 system has been well characterized by many authors at the functional and transcriptional levels, confirming that the cells closely reflect the same gene expression pattern as mature neutrophils. This was recently reviewed by Lail et al. (Traffic, 2022, PMID: 36117140). In terms of our analysis, we used transcriptional profiling to examine heterogeneity between different single-cell clones and not to re-validate the similarity with primary neutrophils.

      It is also not rigorously addressed whether what they call PMNs derived from clonal GMPs are a transcriptionally uniform population or if they harbor heterogeneity within the bulk population.

      Thank you. The reviewer poses an interesting, albeit challenging, question of whether even a single GMP clone can differentiate and result in mature neutrophil heterogeneity. Addressing this would require single-cell sequencing of the resulting cells, which we did not perform. We relied on single-cell subcloning of the immature granulocyte-monocyte progenitors to ensure a genetically identical clonal population. This was then additionally confirmed by the retroviral insertional analysis. These analyses confirmed the clonal nature of our starting population, from which we posed the question of whether the neutrophils derived from these clonal GMPs resulted in mature cells with consistent functional heterogeneity, which was indeed the case.

      Overall, while conceptually intriguing and in pursuit of an experimental system that would be impactful for the field, the study as performed has critical flaws.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1:

      Summary:

      In their study, the authors investigated the F. graminearum homologue of the Drosophila Misato-Like Protein DML1 for a function in secondary metabolism and sensitivity to fungicides.

      Strengths:

      Generally, the topic of the study is interesting and timely, and the manuscript is well written, albeit in some cases, details on methods or controls are missing.

      Weaknesses:

      However, a major problem I see is with the core result of the study, the decrease in the DON content associated with the deletion of FgDML1. Although some growth data are shown in Figure 6, indicating a severe growth defect, the DON production presented in Figure 3 is not related to biomass. Also, the method and conditions for measuring DON are not described. Consequently, it could well be concluded that the decreased amount of DON detected is simply due to decreased growth, and the specific DON production of the mutant remains more or less the same.

      To alleviate this concern, it is crucial to show the details on the DON measurement and growth conditions and to relate the biomass formation under the same conditions to the DON amount detected. Only then can a conclusion as to an altered production in the mutant strains be drawn.

      We greatly appreciate the time you spent on our paper and your helpful suggestions. We have revised the manuscript accordingly, and our point-by-point responses to the reviewer’s comments are listed below. Our method for DON quantification was based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, the concentration of DON was calculated according to a standard curve and a formula, then divided by the dry weight of the mycelium to obtain the DON content per unit of mycelium, with the results finally expressed in µg/g.
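
The normalization described in this response can be sketched as follows. This is only an illustration under stated assumptions: the standard-curve points, extract volume, and dry weight are invented numbers, the function names are hypothetical, and linear interpolation stands in for whatever curve fit the ELISA kit prescribes (such kits commonly use a nonlinear fit, for example a four-parameter logistic). It is not the authors' actual calculation.

```python
def interpolate_concentration(absorbance, standards):
    """Linearly interpolate a concentration from (absorbance, conc) standards.

    `standards` is a list of (absorbance, concentration_ug_per_ml) pairs with
    distinct absorbance values. Sorting by absorbance lets the same code
    handle a competitive assay, where signal falls as concentration rises.
    """
    points = sorted(standards)
    if not points[0][0] <= absorbance <= points[-1][0]:
        raise ValueError("absorbance outside the standard curve range")
    for (a0, c0), (a1, c1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("no bracketing standards found")


def don_per_gram(absorbance, standards, extract_volume_ml, dry_weight_g):
    """DON content per unit biomass (ug per g dry mycelium)."""
    conc_ug_per_ml = interpolate_concentration(absorbance, standards)
    total_ug = conc_ug_per_ml * extract_volume_ml
    return total_ug / dry_weight_g


# Hypothetical standards: (absorbance, DON concentration in ug/mL).
STANDARDS = [(1.8, 0.0), (1.2, 0.5), (0.6, 2.0)]

# Hypothetical sample: absorbance 0.9, 10 mL of extract, 0.5 g dry mycelium.
content = don_per_gram(0.9, STANDARDS, extract_volume_ml=10.0, dry_weight_g=0.5)
print(f"DON content: {content:.1f} ug/g dry weight")
```

Because the result is divided by dry weight, two cultures that differ only in how much mycelium they produced would give the same value, which is the property needed to separate a growth defect from a change in specific DON production.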

      (1) Line 139f

      ... FgDML1 is a critical positive regulator of virulence ....

      Clearly, the deletion of FgDML1 impacts virulence, but it is too much of a general effect to say it is a regulator. DML1 acts high up in the cascade, impacting numerous processes, one of which is virulence. Generally, it has to be considered that deletion of DML1 causes a severe growth defect, which in turn is likely to lead to a plethora of effects. Besides discussing this fact, please also revise the manuscript to avoid references to "direct effects" or "regulator".

      Thank you very much for your advice. Our method for determining DON is based on the amount per unit of mycelium. After obtaining the absorbance value from the ELISA reaction, we calculate the concentration of DON according to the established standard curve and formula. We then divide it by the dry weight of the mycelium to obtain the DON content per unit of mycelium, and present the results in µg/g. In summary, we conclude that the decrease in DON production by ΔFgDML1 is not due to slower hyphal growth, but rather to a reduced ability of each unit of hyphae to produce DON compared with the wild type. Given the decrease in DON synthesis caused by FgDML1 deficiency, we believe that using the term "regulator" is reasonable.

      (2) Line 143

      Please define "toxin-producing conditions".

      Thank you very much for your advice. We have defined the toxin-producing conditions in the manuscript: 'toxin-inducing conditions (28°C, 145 ×g, 7 days incubation)' (in L163-164)

      (3) Line 149

      A brief intro on toxisomes should be provided in the introduction to better integrate this into the manuscript's results.

      Thank you very much for your advice. We have added corresponding content on toxisomes to the Introduction: 'The biosynthesis of DON entails a reorganization of the endoplasmic reticulum into a specialized compartment termed the "toxisome" (Tang et al., 2018). The assembly of the toxisome coincides with the aggregation of key biosynthetic enzymes, which in turn enhances the efficiency of DON production. Concurrently, this compartmentalization serves as a self-defense mechanism, protecting the fungus from the autotoxicity of TRI pathway intermediates (Boenisch et al., 2017). The proteins TRI1, TRI4, TRI14, and Hmr1 are confirmed constituents of this structure (Kistler and Broz, 2015; Menke et al., 2013).' (in L86-93)

      (4) Line 153

      DON production decreases by about 80 %, but not to 0. Consequently, DML1 is important, but NOT essential for DON production.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'FgDML1 is essential for the biosynthesis of the DON toxin. '(in L161)

      (5) Line 168ff

      Please provide a reference for FgDnm1 being critical for mitochondrial fission and state whether such an interaction has been shown in other organisms.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions: 'FgDnm1 is a key dynamin-related protein mediating mitochondrial fission (Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi.' (in L277-283)

      (6) Line 178

      Please specify whether Complex III activity was related to biomass and provide a p-value or standard deviation for the value.

      Thank you very much for your question. Complex III activity was determined using a complex III enzyme activity kit (Solarbio, Beijing, China) (Li et al., 2022; Wang et al., 2022). A 0.1 g sample of standardized mycelium was used for each measurement. Because the mycelium was homogenized, we believe that the measured complex III activity does not depend on biomass. We have also refined the description of the measurement steps in the article: 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,100 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. Each treatment group contains three biological replicates and three technical replicates.' (in L511-517)

      Li C, et al. Amino acid catabolism regulates hematopoietic stem cell proteostasis via a GCN2-eIF2 axis. Cell Stem Cell. 2022 Jul 7; 29(7):1119-1134.e7. doi: 10.1016/j.stem.2022.06.004. PMID: 35803229.

      Wang K, et al. Locally organised and activated Fth1hi neutrophils aggravate inflammation of acute lung injury in an IL-10-dependent manner. Nat Commun. 2022 Dec 13;13(1):7703. doi: 10.1038/s41467-022-35492-y. PMID: 36513690; PMCID: PMC9745290

      (7) Line 185

      Albeit this headline is a reasonable hypothesis, you actually did not show that the conformation is altered. Please reword accordingly.

      Please also add references for cyazofamid acting on the QI site versus other fungicides acting on the QO site.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'Overexpression of FgQCR2, FgQCR8, and FgQCR9 may alter the conformation of the QI site, resulting in reduced sensitivity to cyazofamid.' (in L212-213). For fungicides targeting the Qi and Qo sites, we have added corresponding descriptions in the respective sections: 'Numerous fungicides have been developed to inhibit the Qo site (e.g., pyraclostrobin, azoxystrobin) (Nuwamanya et al., 2022; Peng et al., 2022) and the Qi site (e.g., cyazofamid) (Mitani et al., 2001) of the cytochrome bc1 complex.' (in L327-329)

      (8) Line 200

      This section on growth should be moved up right after introducing the mutant strain.

      Thank you very much for your advice. We have moved the section on vegetative growth and sexual and asexual development ahead of the DON toxin section to improve readability. We had originally arranged the sections in the previous order to emphasize the new finding linking mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in toxisome formation, and downregulation of FgTRIs at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does indeed lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in DON toxin synthesis. In addition, our measurements of DON toxin, acetyl-CoA, ATP and other indicators are all expressed per unit of hyphae, which excludes differences caused by hyphal biomass or growth. We have further refined the materials and methods to facilitate better reading and understanding.

      (9) Line 203

      "... significantly reduced growth rates ..."

      This is not what was measured here. Figure 6A shows a plate assay that can be used to assess hyphal extension. In the figure, it is also visible that the mycelium of the deletion mutant is much denser, maybe due to increased hyphal branching. Please reword.

      Additionally, it is important to include a biomass measurement here under the conditions used for DON assessment. Hyphal extension measurements cannot be used instead of biomass.

      Thank you very much for your advice. We have made changes to the wording of the corresponding sections based on your suggestions. 'The ΔFgDML1 strain displayed a distinct growth phenotype characterized by retardation in radial growth and the formation of more compact, denser hyphal networks on all tested media compared to the PH-1 and ΔFgDML-C strains. '(in L136-138).

      (10) Line 217

      Please include information on how long the cultures were monitored. Given the very slow growth of the mutant, perithecia formation may be considerably delayed beyond 14 days.

      Thank you very much for your advice. Based on your suggestion, we have extended the incubation time for sexual reproduction to 21 days to evaluate sexual reproduction ability more accurately. Our results show that even after 21 days, ΔFgDML1 still cannot produce ascospores, which indicates that the absence of FgDML1 does indeed cause sexual reproduction defects in F. graminearum.

      Author response image 1.

      Discussion

      (11) Please mention your summary Figure 8 early on in the discussion, and explain conclusions with this figure in mind. Please avoid repetition of the results section as much as possible.

      Also, please state clearly what was already known from previous research and is in agreement with your results, and what is new (in fungi or generally).

      Thank you very much for your advice. Based on your suggestion, we now refer to Fig. 8 in the first half of the discussion and use it to guide the text that follows. We have also expanded the discussion by analyzing our results and comparing them with previous studies. 'Our study defines a novel mechanism through which FgDML1 governs mitochondrial homeostasis. We demonstrate that FgDML1 directly interacts with the key mitochondrial fission regulator FgDnm1 and positively modulates cellular bioenergetic metabolism, as evidenced by elevated ATP and acetyl-CoA levels (Fig. 8).' (in L250-253). 'The Misato/DML1 protein family is evolutionarily conserved from yeast to humans and plays a critical role in mitochondrial regulation. In S. cerevisiae, DML1 is an essential gene; its deletion is lethal, while its overexpression results in fragmented mitochondrial networks and aberrant cellular morphology, underscoring its necessity for normal mitochondrial function (Gurvitz et al., 2002). Similarly, in Homo sapiens, the homolog Misato localizes to the mitochondrial outer membrane, and both its depletion and overexpression are sufficient to disrupt mitochondrial morphology and distribution (Kimura and Okano, 2007).' (in L241-244).

      (12) Line 262ff

      Please specify if this interaction was shown previously in other organisms and provide references.

      Thank you very much for your advice. We have clearly stated in the corresponding section that the interaction between FgDML1 and FgDnm1 is reported here for the first time and that, to our knowledge, no relevant reports exist for other species so far. 'Notably, FgDML1 was found to interact with FgDnm1 (Fig. 5E). FgDnm1 is a key dynamin-related protein mediating mitochondrial fission (Griffin et al., 2005; Kang et al., 2023), suggesting that FgDML1 may form a complex with FgDnm1 to regulate mitochondrial fission and fusion processes. To our knowledge, this is the first report documenting an interaction between DML1 and Dnm in any fungal species, including model organisms such as S. cerevisiae. This novel finding provides new insights into the molecular mechanisms underlying mitochondrial dynamics in filamentous fungi.' (in L276-283)

      (13) Line 287ff

      There is no result that would justify this speculation. Please remove.

      Thank you very much for your advice. We have modified the corresponding wording in the corresponding section. 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355)

      Materials and methods

      (14) A table with all primer sequences used in the study and their purpose is missing. For every experiment, the number of technical and biological replicates needs to be stated.

      Thank you very much for your advice. We have presented all the primers used in this study in Supplementary Table 1 (in Table S1). We added the number of technical and biological replicates to the materials and methods description of each experiment. 'For each sample, a total of 200 conidia were counted. The experiment included three biological replicates with three technical replicates each.' (in L434-436). 'Each treatment group contains three biological replicates.' (in L444-445). 'Each treatment group contains three biological replicates and three technical replicates.' (in L463-464). 'Each treatment group contains three biological replicates and three technical replicates.' (in L474-475). 'Each treatment group contains three biological replicates.' (in L483). 'Each treatment group contains three biological replicates and three technical replicates.' (in L501-502). 'Each treatment group contains three biological replicates and three technical replicates.' (in L516-517). 'The experiment was independently repeated three times.' (in L533-534).

      (15) Line 369ff

      Please provide final concentrations used for assays here.

      Thank you very much for your advice. The final concentrations are shown in the figures (in Fig. 6A, B) (in Fig. S3). We have also provided Supplementary Table 2 to present the concentrations more clearly (in Table S2).

      (16) Line 383

      Please provide a reference or data on the use of F2du for transformant selection and explain the abbreviation.

      Thank you very much for your advice. Based on your suggestion, we have provided the full name and references of F2du. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China)(Zhao et al., 2022). '(in L405-407).

      (17) Line 407

      Please provide a reference for the method and at least a brief description.

      Thank you very much for your advice. Based on your suggestion, we have added references and provided a brief introduction to the method. 'As previously described (Tang et al., 2020; Wang et al., 2025), Specifically, coleoptiles were inoculated with conidial suspensions and incubated for 14 days, while leaves were inoculated with fresh mycelial plugs and incubated for 5 days, followed by observation and quantification of disease symptoms. DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018). '(in L466-471)

      (18) Line 414ff

      Also, here, the amount of biomass has to be considered for the measurement to be able to distinguish if actually less of the compounds were produced or if the effect seen was merely due to an altered amount of biomass present.

      Thank you very much for your advice. We did not measure biomass as a separate indicator, because all values were measured and calculated per unit of hyphae. We therefore consider that experimental bias caused by a decrease in biomass has been ruled out.

      RNA and RT-qPCR

      (19) Line 461

      When the strains were transferred to AEA medium, was the biomass measured, at least wet weight, and in which culture volume was it done? It makes a big difference if the amount of (wet) biomass dilutes a small amount of fungicide-containing culture or if biomass is added in at least roughly equal amounts in sufficient growth medium to ensure equal conditions.

      Thank you very much for your question. During sample processing we controlled the wet weight of the samples before dosing, so that the wet weight of the mycelium in each sample was 0.2 g. The mycelium was obtained from AEA cultures with a volume of 100 mL. This ensured consistent initial biomass between groups before dosing, as well as an accurate drug concentration.

      (20) Line 466

      Please provide the name and supplier of the kit.

      Thank you very much for your advice. We have added corresponding content in the corresponding location. 'Mycelium was collected and total RNA was extracted following the instructions provided by the Total RNA Extraction Kit (Tiangen, Beijing, China).' (in L523-524).

      (21) All primer sequences must be provided in a table.

      Thank you very much for your advice. We have presented all the primers used in this study in Supplementary Table 1. (in Table S1).

      (22) For RT qPCR it is essential to check the RNA quality to be sure that the obtained results are not artifacts due to varying quality, which may exceed differences. Please state how quality control was done and which threshold was applied for high-quality RNA to be used in RTqPCR (like RIN factor, etc).

      Thank you very much for your question. We performed stringent quality control on the extracted total RNA. First, a micro-spectrophotometer was used to measure RNA concentration and purity, confirming that the A260/A280 ratio was between 1.8 and 2.0 and the A260/A230 ratio was greater than 2.0, indicating good RNA purity without significant protein or organic solvent contamination. Subsequently, verification by agarose gel electrophoresis revealed distinct 28S and 18S rRNA bands, demonstrating good RNA integrity and absence of degradation.
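
      The purity thresholds stated above can be expressed as a simple check. The following is only a minimal sketch under stated assumptions: the `RnaReading` type, the function name, and the absorbance values in the usage note are hypothetical and do not come from the manuscript; only the two ratio thresholds (A260/A280 between 1.8 and 2.0, A260/A230 greater than 2.0) are taken from the response.

```python
from dataclasses import dataclass


@dataclass
class RnaReading:
    """Absorbance readings for one RNA sample (hypothetical container)."""
    sample: str
    a260: float
    a280: float
    a230: float


def passes_purity_check(reading: RnaReading) -> bool:
    """Apply the purity thresholds stated in the response.

    A260/A280 must lie between 1.8 and 2.0 (inclusive here, an assumption)
    and A260/A230 must be greater than 2.0.
    """
    ratio_280 = reading.a260 / reading.a280
    ratio_230 = reading.a260 / reading.a230
    return 1.8 <= ratio_280 <= 2.0 and ratio_230 > 2.0
```

      For example, a hypothetical reading of A260 = 1.0, A280 = 0.52, A230 = 0.45 would pass, whereas A280 = 0.65 would fail the A260/A280 criterion. A check of this kind covers purity only; integrity was assessed separately by gel electrophoresis, as described above.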

      Author response image 2.

      (B): Minor Comments:

      (1) Please increase the font size of the labels and annotations of the figures; it is hard to read as it is now.

      Thank you very much for your advice. We have increased the size of annotations or numerical labels in the corresponding images for better reading.

      (2) Throughout the manuscript: Please check that all abbreviations are explained at first use.

      Thank you very much for your advice. We have checked the entire text to ensure that abbreviations have their full names when they first appear.

      (3) I do hope that the authors can clarify all concerns and provide an amended manuscript of this interesting story.

      Thank you very much for your advice. Sincerely thank you for your suggestions and questions, which have been very helpful to us.

      Reviewer #2:

      The manuscript entitled "Mitochondrial Protein FgDML1 Regulates DON Toxin Biosynthesis and Cyazofamid Sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" identified the regulatory effect of FgDML1 in DON toxin biosynthesis and sensitivity of Fusarium graminearum to cyazofamid. The manuscript provides a theoretical framework for understanding the regulatory mechanisms of DON toxin biosynthesis in F. graminearum and identifies potential molecular targets for Fusarium head blight control. The paper is innovative, but there are issues in the writing that need to be addressed and corrected.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript accordingly. Revisions made in response to your suggestions are marked in red. In the responses below, the positions of the revised passages in the manuscript are indicated by red line numbers. The point-by-point responses to the reviewer’s comments are listed below.

      Weaknesses:

      (1) The authors speculate that cyazofamid treatment caused upregulation of the assembly factors, leading to a change in the conformation of the Qi protein, thus restoring the enzyme activity of complex III. But no speculation was given in the discussion as to why this would lead to the upregulation of assembly factors, and how the upregulation of assembly factors would change the protein conformation, and is there any literature reporting a similar phenomenon? I would suggest adding this to the discussion.

      Thank you very much for your advice. Based on your suggestion, we have added content related to the assembly factor of complex III in the discussion section and made modifications to the corresponding wording. 'Previous studies have reported that mutations in the Complex III assembly factors TTC19, UQCC2, and UQCC3 impair the assembly and activity of Complex III (Feichtinger et al., 2017; Wanschers et al., 2014). '(in L345-347). 'In conclusion, our findings suggest that the overexpression of assembly factors FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 potentially modifies the conformation of the Qi site, which specifically modulates the sensitivity of F. graminearum to cyazofamid. '(in L352-355).

      (2) Would increased sensitivity of the mutant to cell wall stress be responsible for the excessive curvature of the mycelium?

      Thank you very much for your question. We believe that the sensitivity of ΔFgDML1 to osmotic stress is reduced, which may not be related to hyphal bending, as shown in Author response image 3. At the conidial stage, ΔFgDML1 cannot germinate in YEPD, whereas the application of 1 M sorbitol promotes its germination. The underlying mechanism is still unknown and will be the focus of our future research.

      Author response image 3.

      (3) The vertical coordinates of Figure 7B need to be modified with positive inhibition rates for the mutants.

      Thank you very much for your advice. Figure 7B accurately reflects the inhibition rate. In the ΔFgDML1 mutant, the inhibition rate under osmotic stress treatment becomes negative, indicating that colony growth is greater than that of the control (CK). Therefore, a negative inhibition rate is shown in Figure 7B.
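
      To illustrate why the inhibition rate can be negative, the following minimal sketch uses the commonly applied definition, inhibition (%) = (control growth − treated growth) / control growth × 100. This formula is an assumption: the response does not state the exact calculation used for Figure 7B, and the function name and the colony diameters in the usage note are hypothetical.

```python
def inhibition_rate(control_growth: float, treated_growth: float) -> float:
    """Percent growth inhibition relative to the untreated control (CK).

    Assumed formula (not stated in the manuscript):
        (control - treated) / control * 100
    The result is negative whenever the treated colony grows more than
    the control.
    """
    if control_growth <= 0:
        raise ValueError("control growth must be positive")
    return (control_growth - treated_growth) / control_growth * 100.0
```

      With hypothetical colony diameters of 40 mm for the control and 50 mm under stress treatment, this definition gives a negative value, which matches the situation described for the mutant above.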

      (1) In Figure 1B, Figure 3C, and Figure 6C, the scale below the picture is not clear. In Figure 5D, the histogram is unclear, and it is recommended to redraw the graph.

      Thank you very much for your advice. The issue with the above images may be due to Word compression. We have changed the settings and enlarged the images as much as possible to better display them.

      (2) The full Latin name of the strain should be used in the title of figures and tables.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains appearing in the title of figures and tables.

      (3) Proteins in line 117 should be abbreviated.

      Thank you very much for your advice. Based on your suggestion, we have abbreviated the corresponding positions. 'The DML1 protein from S. cerevisiae was used as a query for a BLAST search against the Fusarium genome database, resulting in the identification of the putative DML1 gene FgDML1 (FGSG_05390) in F. graminearum. '(in L118-120).

      (4) The sentence in lines 187-189, which is supposed to introduce why the test is sensitive to the three drugs, is currently illogical.

      Thank you very much for your advice. Based on your suggestion, we have made modifications to the corresponding sections. 'Since Complex III is involved in the action of both cyazofamid (targeting the QI site) and pyraclostrobin (targeting the QO site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated. ' (in L214-216).

      (5) The expression of FgQCR2, FgQCR7, and FgQCR8 was significantly upregulated in ΔFgDML1 at transcription levels. Do FgQCR2, FgQCR8, and FgQCR9 show upregulated expression at the protein level?

      Thank you very much for your question. Based on your suggestion, we evaluated the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in PH-1 and ΔFgDML1, and we found that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1. (in Fig. 6F).

      (6) In Figure 7B, it is recommended to adjust the position of the horizontal axis labels in the histogram.

      Thank you very much for your advice. Based on your suggestion, we have made modifications to the corresponding sections (in Fig. 7B).

      (7) There are numerous errors in the writing of gene names in the text. Please check the full text and change the writing of gene names and mutant names to italic.

      Thank you very much for your advice. We have checked the entire text to ensure that all genes have been italicized.

      (8) All acronyms should be spelled out in figure and table captions. e.g., F. graminearum.

      Thank you very much for your advice. Based on your suggestion, we have used the full names of the strains appearing in the title of figures and tables.

      (9) In line 492, P should be lowercase and italic.

      Thank you very much for your advice. Based on your suggestion, we have made adjustments to the corresponding content.

      Reviewer #3:

      Summary:

      The manuscript "Mitochondrial protein FgDML1 regulates DON toxin biosynthesis and cyazofamid sensitivity in Fusarium graminearum by affecting mitochondrial homeostasis" describes the construction of a null mutant for the FgDML1 gene in F. graminearum and assays characterising the effects of this mutation on the pathogen's infection process and lifecycle. While FgDML1 remains underexplored with an unclear role in the biology of filamentous fungi, and although the authors performed several experiments, there are fundamental issues with the experimental design and execution, and interpretation of the results.

      Strengths:

      FgDML1 is an interesting target, and there are novel aspects in this manuscript. Studies in other organisms have shown that this protein plays important roles in mitochondrial DNA (mtDNA) inheritance, mitochondrial compartmentalisation, chromosome segregation, mitochondrial distribution, mitochondrial fusion, and overall mitochondrial dynamics. Indeed, in Saccharomyces cerevisiae, the mutation is lethal. The authors have carried out multi-faceted experiments to characterise the mutants.

      Weaknesses:

      However, I have concerns about how the study was conceived. Given the fundamental importance of mitochondrial function in eukaryotic cells and how the absence of this protein impacts these processes, it is unsurprising that deletion of this gene in F. graminearum profoundly affects fungal biology. Therefore, it is misleading to claim a direct link between FgDML1 and DON toxin biosynthesis (and virulence), as the observed effects are likely indirect consequences of compromised mitochondrial function. In fact, it is reasonable to assume that the production of all secondary metabolites is affected to some extent in the mutant strains and that such a strain would not be competitive at all under non-laboratory conditions. The order in which the authors present the results can be misleading, too. The results on vegetative growth rate appeared much later in the manuscript, which should have come first, as the FgDML1 mutant exhibited significant growth defects, and subsequent results should be discussed in that context. Moreover, the methodologies are not described properly, making the manuscript hard to follow and difficult to replicate.

      We greatly appreciate the time you spent on our paper and your helpful suggestions, and we have done our best to revise the manuscript accordingly. Revisions made in response to your suggestions are marked in red. In the responses below, the positions of the revised passages in the manuscript are indicated by red line numbers. The point-by-point responses to the reviewer’s comments are listed below.

      Regarding the weaknesses, we had arranged the sections in this order to emphasize the novel finding linking mitochondria and DON toxin. We found a significant decrease in DON toxin in ΔFgDML1, defects in toxisome formation, and downregulation of FgTRIs at both the gene and protein levels. In summary, we believe that the absence of FgDML1 does indeed lead to a decrease in DON toxin content, and that FgDML1 plays a regulatory role in DON toxin synthesis. In addition, our measurements of DON toxin, acetyl-CoA, ATP and other indicators are all expressed per unit of hyphae, which excludes differences caused by hyphal biomass or growth. We have further refined the materials and methods to facilitate better reading and understanding.

      (1) Lines 37-39: The disease itself does not produce toxins; it is the fungus that causes the disease that produces toxins. Moreover, the disease symptoms observed are likely caused by the toxins produced by the fungus.

      Thank you very much for your advice. We have made modifications to the wording of the corresponding sections. 'Studies have shown that increased DON levels are positively correlated with the pathogenicity rate of F. graminearum.'(in L36-37).

      (2) Lines 82-87: While it is challenging to summarise the role of ATP in just a few words, this section needs improvement for clarity and accuracy. Additionally, I do not believe that drawing a direct link between mitochondrial defects and toxin production is an appropriate strategy in this case.

      Thank you very much for your advice. Based on your suggestion, we have added corresponding descriptions in the corresponding positions to provide more information on the relationship between ATP and toxins, in order to better prepare for the following text. 'Pathogen-intrinsic ATP homeostasis is recognized as a critical, rate-limiting determinant for toxin biosynthesis. Previous studies indicate that dual-target inhibition of ATP synthase (AtpA) and adenine deaminase (Ade) by a specific small-molecule probe effectively depletes intracellular ATP, consequently suppressing the synthesis of key virulence factors TcdA and TcdB transcriptionally and translationally(Marreddy et al., 2024). The systemic toxicity of Anthrax Edema Toxin (ET) is primarily attributed to its catalytic activity, which depletes the host cell's ATP reservoir, thereby triggering a bioenergetic collapse that culminates in cell lysis and death(Liu et al., 2025). '(in L78-86).

      (3) Lines 125-126: The manuscript does not clearly describe how subcellular localisation was determined. This methodology needs to be properly detailed.

      Thank you very much for your advice. The subcellular localization was validated through co-localization analysis with MitoTracker Red CMXRos, a mitochondrial-specific dye. The observed overlap between the FgDML1-GFP signal and the mitochondrial marker confirmed mitochondrial localization. Based on these results, we determined that FgDML1 is definitively localized to the mitochondria. We have incorporated this description in the appropriate section of the manuscript. 'Furthermore, subcellular localization studies confirmed that FgDML1 localizes to mitochondria, as demonstrated by colocalization with a mitochondria-specific dye MitoTracker Red CMXRos (Fig. 1B).' (in L125-127).

      (4) Regarding the organisation of the Results section, it needs to be revised. While I understand the authors' intention to emphasise the impact on virulence, the results showing how FgDML1 deletion affects vegetative growth, asexual and sexual reproduction, and sensitivity to stressors should be presented before the virulence assays and effects on DON production. Additionally, the authors do not provide any clear evidence that FgDML1 directly interacts with proteins involved in asexual or sexual reproduction, stress responses, or virulence. Therefore, it is misleading to suggest that FgDML1 directly regulates these processes. The observed phenotypes are, rather, a consequence of severely impaired mitochondrial function. Without functional mitochondria, the cell cannot operate properly, leading to widespread physiological defects. In this regard, statements such as those in lines 139-140 and 343-344 are misleading.

      Thank you very much for your advice. We have adjusted the order of the figures based on your suggestion, placing the characterization of ΔFgDML1 in vegetative growth, sexual reproduction, and other aspects before the DON toxin results. We have also adjusted the corresponding statements. 'These findings demonstrate that FgDML1 is a positive regulator of virulence in F. graminearum.' (in L140-141).

      (5) Lines 185-186: The authors do not provide sufficient evidence to support the claim that FgQCR2, FgQCR8, and FgQCR9 overexpression is the main cause of reduced cyazofamid sensitivity. Although expression of these genes is altered, reduced sensitivity may result from changes in other proteins or pathways. To strengthen this claim, overexpression of FgQCR2, 8, and 9 in the wild-type background, followed by assessment of cyazofamid resistance, would be necessary. As it stands, there is no support for the claim presented in lines 329-332.

      Thank you very much for your advice. To establish a causal link between the overexpression of FgQCR2, FgQCR7, and FgQCR8 and the observed reduction in cyazofamid sensitivity, we first quantified the protein levels of these assembly factors. Western blot analysis confirmed their elevated expression in the ΔFgDML1 mutant compared to the wild-type PH-1. We further generated individual overexpression strains for FgQCR2, FgQCR7, and FgQCR8 in the wild-type PH-1 background. Fungicide sensitivity assays revealed that all three overexpression mutants displayed significantly reduced sensitivity to cyazofamid compared to the parental strain. These overexpression experiments confirm that upregulation of FgQCR2, FgQCR7, and FgQCR8 is sufficient to confer reduced cyazofamid sensitivity. We have incorporated these explanations and provided supporting images in the appropriate section of the manuscript. 'To further clarify whether the upregulated expression of FgQCR2, FgQCR7, and FgQCR8 genes affects their protein expression levels, we measured the protein levels. The results showed that the protein expression levels of FgQCR2, FgQCR7, and FgQCR8 in ΔFgDML1 were higher than those in PH-1 (Fig. 6F). Subsequently, we overexpressed FgQCR2, FgQCR7, and FgQCR8 in the wild-type background, and the corresponding overexpression mutants exhibited reduced sensitivity to cyazofamid (Fig. 6E).' (in L205-211) (in Fig. 6E, F)

      (6) Lines 187-190: This segment is confusing and difficult to follow. It requires rewriting for clarity.

      Thank you very much for your advice. Based on your suggestion, we have made the corresponding modifications. 'Since Complex III is involved in the action of both cyazofamid (targeting the QI site) and pyraclostrobin (targeting the QO site), the sensitivity of ΔFgDML1 to cyazofamid and pyraclostrobin was investigated.' (in L214-216)

      (7) Lines 345-346: The authors state that in this study, FgDML1 is localised in mitochondria, which implies that in other studies, its localisation was different. Is this accurate? Clarification is needed.

      Thank you very much for your question. In previous studies, whether in yeast or in Drosophila melanogaster, the localization of this protein was not clearly defined, and only its functional relationship to mitochondria was emphasized (Miklos et al., 1997; Gurvitz et al., 2002).

      Miklos GLG, Yamamoto M-T, Burns RG, Maleszka R. 1997. An essential cell division gene of Drosophila, absent from Saccharomyces, encodes an unusual protein with tubulin-like and myosin-like peptide motifs. Proc Natl Acad Sci 94:5189–5194. doi:10.1073/pnas.94.10.5189

      Gurvitz A, Hartig A, Ruis H, Hamilton B, de Couet HG. 2002. Preliminary characterisation of DML1, an essential Saccharomyces cerevisiae gene related to misato of Drosophila melanogaster. FEMS Yeast Res 2:123–135. doi:10.1016/S1567-1356(02)00083-1

      Material and Methods Section

      (8) In general, the methods require more detailed descriptions, including the brands and catalog numbers of reagents and kits used. Simply stating that procedures were performed according to manufacturers' instructions is insufficient, particularly when the specific brand or kit is not identified.

      Thank you very much for your advice. Following your suggestion, we have added the reagent brands and complete product names. 'Transformants were selected on PDA plates containing either 100 μg/mL Hygromycin B (Yeasen, Shanghai, China) or 0.2 μmol/mL 5-Fluorouracil 2'-deoxyriboside (F2du) (Solarbio, Beijing, China) (Zhao et al., 2022).' (in L405-407). 'DON toxin was measured using a Wise Science ELISA-based kit (Wise Science, Jiangsu, China) (Li et al., 2019; Zheng et al., 2018).' (in L469-471)

      (9) Line 364: What do CM and MM stand for? Please define.

      Thank you very much for your advice. Based on your suggestion, we have made modifications in the corresponding locations. 'To evaluate vegetative growth, complete medium (CM), minimal medium (MM), and V8 Juice Agar (V8) media were prepared as described previously(Tang et al., 2020). '(in L385-387)

      Generation of Deletion and Complemented Mutants:

      (10) This section lacks detail. For example, were PCR products used directly for PEG-mediated transformation, or were the fragments cloned into a plasmid?

      Thank you very much for your question. We used the fused fragments directly for protoplast transformation after sequencing confirmation. We have now stated the form of the fragment used for transformation at the corresponding location. 'The resulting fusion fragment was transformed into the wild-type F. graminearum PH-1 strain via polyethylene glycol (PEG)-mediated protoplast transformation.' (in L403-405).

      (11) PCR and Southern blot validation results should be included as supplementary material, along with clear interpretations of these results.

      Thank you very much for your advice. In the supplementary material we submitted, Supplementary Figure 2 already includes the results of PCR and Southern blot validation.(in Fig. S2)

      (12) There is almost no description of how the mutants mentioned in lines 388-390 were generated.

      Thank you very much for your advice. Based on your suggestions, we have added relevant content in the appropriate sections to describe the experimental process more comprehensively and clearly. 'Specifically, FgDML1, including its native promoter region and open reading frame (ORF) (excluding the stop codon), was amplified. The PCR product was then fused with the XhoI-digested pYF11 vector. After transformation into E. coli and sequence verification, the plasmid was extracted and subsequently introduced into PH-1 protoplasts. For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts. The overexpression mutant was constructed according to a previously described method. Specifically, the ORF of FgDML1 was amplified and the PCR product was ligated into the SacII-digested pSXS overexpression vector. The resulting plasmid was then transformed into PH-1 protoplasts (Shi et al., 2023). For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively.' (in L413-426).

      Vegetative Growth and Conidiation Assays:

      (13) There is no information about how long the plates were incubated before photos were taken. Judging by the images, it appears that different incubation times may have been used.

      Thank you very much for your advice. Due to the slower growth of ΔFgDML1, we adopted different incubation periods and have supplemented the relevant content in the corresponding section. 'All strains were incubated at 25°C in darkness; however, due to the slower growth of ΔFgDML1, the mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C.' (in L490-493).

      (14) There is no description of the MBL medium.

      Thank you very much for your advice. Based on your suggestion, we have supplemented the corresponding content in the corresponding positions. 'Mung bean liquid (MBL) medium was used for conidial production, while carrot agar (CA) medium was utilized to assess sexual reproduction(Wang et al., 2011). '(in L387-389).

      DON Production and Pathogenicity Assays:

      (15) Were DON levels normalised to mycelial biomass? The vegetative growth assays show that FgDML1 null mutants exhibit reduced growth on all tested media. If mutant and wild-type strains were incubated for the same period under the same conditions, it is reasonable to assume that the mutants accumulated significantly less biomass. Therefore, results related to DON production, as well as acetyl-CoA and ATP levels, must be normalised to biomass.

      Thank you very much for your question. We have taken into account the differences in mycelial biomass. Therefore, when measuring DON, acetyl-CoA, and ATP levels, all data were normalized to mycelial mass and calculated as amounts per unit of mycelium, thereby avoiding discrepancies arising from variations in biomass.
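
      As a minimal sketch of the normalization described above, the calculation amounts to dividing each measured quantity by the mycelial mass of the same sample. The function name and the numbers below are hypothetical and serve only as an illustration; they are not data from the study.

```python
def per_unit_biomass(total_amount: float, mycelial_mass_g: float) -> float:
    """Return the measured amount per gram of mycelium.

    Hypothetical helper: the unit of the result is the unit of
    `total_amount` divided by grams.
    """
    if mycelial_mass_g <= 0:
        raise ValueError("mycelial mass must be positive")
    return total_amount / mycelial_mass_g


# Illustrative values only: a slow-growing mutant yields less total
# metabolite, but the per-gram value can still match the wild type.
wild_type = per_unit_biomass(total_amount=12.0, mycelial_mass_g=0.25)
mutant = per_unit_biomass(total_amount=6.0, mycelial_mass_g=0.125)
```

      In this made-up case both strains give the same value per gram, which is the kind of comparison the normalization is meant to make possible when biomass differs between strains.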

      Sensitivity Assays:

      (16) While the authors mention that gradient concentrations were used, the specific concentrations and ranges are not provided. Importantly, have the plates shown in Figure 5 been grown for different periods or lengths? Given the significantly reduced growth rate shown in Figure 6A, the mutants should not have grown to the same size as the WT (PH-1) as shown in Figures 5A and 5B unless the pictures have been taken on different days. This needs to be explained.

      Thank you very much for your question. Due to the slower growth of ΔFgDML1, we adopted different incubation periods and have supplemented the relevant content in the corresponding section. 'All strains were incubated at 25°C in darkness; however, due to the slower growth of ΔFgDML1, the mutant required a 5-day incubation period compared to the 3 days used for PH-1 and ΔFgDML1-C.' (in L490-493).

      (17) Additionally, was inhibition measured similarly for both stress agents and fungicides? This should be clarified.

      Thank you very much for your question. We have supplemented the specific concentration gradient of fungicides. 'The concentration gradients for each fungicide in the sensitivity assays were set up according to Supplementary Table S2. '(in L493-494)(in Table. S2).

      Complex III Enzyme Activity:

      (18) A more detailed description of how this assay was performed is needed.

      Thank you very much for your advice. We have provided further detailed descriptions of the corresponding sections. 'Briefly, 0.1 g of mycelia was homogenized with 1 mL of extraction buffer in an ice bath. The homogenate was centrifuged at 600 ×g for 10 min at 4°C. The resulting supernatant was then subjected to a second centrifugation at 11,000 ×g for 10 min at 4°C. The pellet was resuspended in 200 μL of extraction buffer and disrupted by ultrasonication (200 W, 5 s pulses with 10 s intervals, 15 cycles). Complex III enzyme activity was finally measured by adding the working solution as per the manufacturer's protocol. '(in L511-517)

      (19) Were protein concentrations standardised prior to the assay?

      Thank you very much for your question. Protein concentrations for all Western blot samples were quantified using a BCA assay kit to ensure equal loading.

      (20) Line 448: Are ΔFgDML1::Tri1+GFP and ΔFgDML1+GFP the same strain? ΔFgDML1::Tri1+GFP has not been previously described.

      Thank you very much for your question. These two strains are not the same strain, and we have supplemented their construction process in the corresponding section. 'For the construction of PH-1::FgTri1+GFP and ΔFgDML1::FgTri1+GFP, the ORF of FgTri1 was amplified and ligated into the XhoI-digested pYF11 vector as described above. The resulting vectors were then transformed into protoplasts of PH-1 or ΔFgDML1, respectively. '(in L423-426)

      (21) Lines 460 and 468: Please adopt a consistent nomenclature, either RT-qPCR or qRT-PCR.

      Thank you very much for your advice. We have unified the nomenclature and modified the corresponding sections. 'Reverse Transcription Quantitative Polymerase Chain Reaction (RT-qPCR) was carried out using the QuantStudio 6 Flex real-time PCR system (Thermo Fisher Scientific, USA) to assess the relative expression of three subunits of Complex III (FgCytb, FgCytc1, FgISP), five assembly factors (FgQCR2, FgQCR6, FgQCR7, FgQCR8, FgQCR9), and DON biosynthesis-related genes (FgTri5 and FgTri6).' (in L526-531)

      (22) Lines 472-473: Why was FgCox1 used as a reference for FgCytb? Clarification is needed.

      Thank you very much for your question. FgCytb (cytochrome b) and FgCOX1 (cytochrome c oxidase subunit I) are both encoded by the mitochondrial genome and serve as core components of the oxidative phosphorylation system (Complex III and Complex IV, respectively). Their transcription is co-regulated by mitochondrial-specific mechanisms in response to cellular energy status. Consequently, under experimental conditions that perturb energy homeostasis, FgCOX1 expression exhibits relative, context-dependent stability with FgCytb, or at least co-varies directionally, making it a superior reference for normalizing target gene expression. In contrast, FgGapdh operates within a distinct genetic and regulatory system. Using FgCOX1 ensures that both reference and target genes reside within the same mitochondrial compartment and functional module, thereby preventing normalization artifacts arising from independent variation across disparate pathways.
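
      To illustrate how the choice of reference gene enters the calculation, the sketch below uses the widely used 2^-ΔΔCt approach. This is an assumption: the response does not state which quantification method was applied, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change by the 2^-ddCt method.

    Assumes (not stated in the response) roughly 100% amplification
    efficiency for both the target and the reference gene.
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Illustrative Ct values only: FgCytb as target, FgCOX1 as reference.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=20.0,    # mutant
    ct_target_control=24.0, ct_ref_control=21.0,  # wild type
)
```

      Because the reference Ct is subtracted from the target Ct in both samples, any change in the reference gene itself shifts the result directly, which is why the stability of the reference under the tested conditions matters.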

      (23) Lines 476-477: This step requires a clearer and more detailed explanation.

      Thank you very much for your advice. We provided detailed descriptions of them in their respective positions. 'For FgDnm1-3×Flag, the 3×Flag tag was added to the C-terminus of FgDnm1 by PCR, fused with the hygromycin resistance gene and the FgDnm1 downstream arm, and then introduced into PH-1 protoplasts. '(in L417-419). 'The FgDnm1-3×Flag fragment was introduced into PH-1 and FgDML1+GFP protoplasts, respectively, to obtain single-tagged and double-tagged strains. '(in L541-543)

      Western blotting:

      (24) Uncropped Western blot images should be provided as supplementary material.

      Thank you very much for your advice. All Western blot images will be submitted to the supplementary material package.

      (25) Lines 485-489: A more thorough description of the antibodies used (including source, catalogue number, and dilution) is necessary.

      Thank you very much for your advice. The antibodies used are clearly stated in terms of brand, catalog number, and dilution. We have added the dilution ratio. 'All antibodies were diluted as follows: primary antibodies at 1:1000 and secondary antibodies at 1:10000. '(in L550-551)

      (26) The Western blot shown in Figure 3D appears problematic, particularly the anti-GAPDH band for FgDML1::FgTri1+GFP. Are both anti-GAPDH bands derived from the same gel?

      Thank you very much for your advice. We are unequivocally certain that these data derive from the same gel. Therefore, we are providing the original image for your inspection.

      Author response image 4.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum - an organism that unusually has an acristate mitochondrion during the asexual part of its life cycle, which then develops cristae as the parasite enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins and look for timing of expression during the parasite life cycle and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes singly and in parallel and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites); however, there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      We would like to thank the reviewer for taking the time to review our manuscript. We are happy to hear the reviewer thinks the manuscript is interesting and thank the reviewer for their constructive feedback.

      To clarify the statistical analyses used, we included a new supplementary dataset with all statistical analyses and p-values indicated per graph. Furthermore, figure legends now include the information on the exact statistical test used in each case.

      Regarding the mosquito experiments, while we indeed reported a reduction in transmission and oocyst numbers, we are aware that this effect might be due to the high variability in mosquito feeding assays. To highlight this point, we deleted the sentence "with the transmission reduction of [numbers]...." and included the sentence "The high variability encountered in the standard membrane feeding assays, though, partially obstructs a clear conclusion on the biological relevance of the observed reduction in oocyst numbers".

      More specific comments to address: Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      We added the information "high molecular mass gels with lower acrylamide percentage" to clarify methodology in the text. Furthermore, we extended the figure legend to include all relevant information. Further experimental details can be found in the study cited in this context, where the dataset originates from (Evers et al., 2021).

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but in the supplementary figure 2 it shows mitotracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used to view in regular IFAs (Fig S2C) and the MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      We thank the reviewer for pointing this out - this was indeed incorrectly annotated. We used the endogenous mito-mScarlet signal in IFA and MitoTracker in U-ExM. The figure annotation has now been corrected.

      Figure 2C - what is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this?)?

      The statistical test is now included in the Materials and Methods section with the sentence "The fitted model was used to obtain estimated means and contrasts and were evaluated using Wald Statistics". The test is now also mentioned in the figure legend.
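
      For readers unfamiliar with the term, a Wald statistic is the estimated contrast divided by its standard error, compared against a standard normal distribution. In a negative binomial model with a log link, the contrast is a log rate ratio. The sketch below shows only this last step; the function name and numbers are hypothetical, and the software used to fit the mixed model is not specified in this response.

```python
import math


def wald_test(estimate: float, std_error: float) -> tuple[float, float]:
    """Two-sided Wald z-test of the null hypothesis estimate == 0."""
    z = estimate / std_error
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Illustrative numbers only: a log rate ratio (knockout vs wild type)
# and its standard error, as a fitted model might report them.
log_rate_ratio, std_error = -0.9, 0.45
z, p = wald_test(log_rate_ratio, std_error)
rate_ratio = math.exp(log_rate_ratio)  # back-transform to the count scale
```

      With these made-up inputs the contrast lies two standard errors below zero, which corresponds to a two-sided p-value just under 0.05.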

      Also the choice of a log10 scale for oocyst intensity is an unusual choice - how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito which would be impossible).

      As the data spans three orders of magnitude with low values being biologically meaningful, we decided that a log scale would best facilitate readability of the graph. As the 0 values are also important to show, we went with a standard approach to handle 0s in log transformed data and substituted the 0s with a small value (0.001). We apologize for not mentioning this transformation in the manuscript. To make this transformation transparent, we added a break at the lower end of the log‑scaled y‑axis and relabelled the lowest tick as '0'. This ensures that mosquitoes with zero oocysts are shown along the x‑axis without being assigned an artificial value on the log scale. We would furthermore like to highlight that for statistics we used the true value 0 and not 0.001.
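
      A minimal sketch of the approach described above: zeros are replaced by a small floor value only for drawing on the log axis, while summary statistics use the true counts. The function name and the example counts are hypothetical; the floor of 0.001 is the value quoted in the response.

```python
def display_values(counts: list[int], zero_floor: float = 0.001) -> list[float]:
    """Return values suitable for a log-scaled axis.

    Zeros cannot be drawn on a log scale, so they are placed at
    `zero_floor`; positive counts are returned unchanged.
    """
    if zero_floor <= 0:
        raise ValueError("zero_floor must be positive")
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    return [c if c > 0 else zero_floor for c in counts]


# Illustrative oocyst counts per midgut (not data from the study).
oocysts = [0, 0, 3, 12, 150]
plotted = display_values(oocysts)              # used only for drawing
mean_for_stats = sum(oocysts) / len(oocysts)   # uses the true zeros
```

      Keeping the two lists separate makes it hard to feed the substituted values into the statistical model by accident.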

      Figure 2D - it is great that the data from all feeding replicates has been shared; however, it is difficult to conclude any meaningful impact on transmission with the knock-out lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but then Exp2 does not really show as great an effect. Similarly, why does the double knock out have better transmission than the single knockouts? Surely there would be a greater effect?

      We agree with the reviewer and with the new sentence added, as per major point, we hope we clarified the concept. Note that original Figure 2D has been moved to the supplementary information, as per minor comment of another reviewer.

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Done

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Done. Regarding replicates, note that while we measured over 100 cristae from over 30 mitochondria, these all stem from the same parasite culture.

      Figure 5C - the 3D reconstructions are very nice, but what does the red and yellow coloring show?

      Indeed, the information was missing. We added it to the figure legend.

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?

      Our original sentence was reductive. What we wanted to state was related to the functional relevance of crista architecture and overall mitochondrial morphology rather than the general functional relevance of the mitochondria. We changed the sentence accordingly.

      Furthermore, even though we do not discuss this in the article, we are aware of mitochondria-targeting drugs that are known to block mosquito transmission. We want to point out that it is difficult to distinguish the effect of ETC disruption on energy conversion from its effect on the essential pyrimidine synthesis pathway, which is highly relevant in microgamete formation. Still, a recent paper from Sparkes et al. 2024 showed the essentiality of mitochondrial ATP synthesis during gametogenesis, so it is very likely that mitochondrial energy conversion is highly relevant for transmission to the mosquito.

      Reviewer #1 (Significance (Required)):

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis/better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research. My expertise is in Plasmodium cell biology.

      We thank the reviewer for the praise.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Major comments: 1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      We thank the reviewer for taking the time to review our manuscript.

      Based on the reviewers' interpretation we conclude the title does not come across as intended. We have changed the title to: "The role of MICOS in organizing mitochondrial cristae in malaria parasites"

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate on the basis of such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      We do agree with the reviewer's notion that we did not address complex stability, and our wording did not make this sufficiently clear. We shortened and rephrased the paragraph in question.

      The following paragraphs (line 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      We shortened this paragraph.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and not resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      Interesting suggestion. As our staining and imaging conditions are suitable for such analysis (as demonstrated by Sarazin et al., 2025, https://www.biorxiv.org/content/10.1101/2025.11.27.690934v1), we performed the measurements on the same dataset which we collected for Figure 3. We did, however, not detect any difference in mitotracker intensity between the different lines. The result of this analysis is included in the new version of Supplementary figure S6.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      While theoretically plausible and informative, we currently do not know the relevance of mitochondrial energy conversion for general sporozoite biology or specifically features of sporozoite movement. Given the required resources and time to set this experiment up and the uncertainty whether it is a relevant proxy for mitochondrial functioning, we argue it is out of scope for this manuscript.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      While this experiment could potentially further our understanding of the interaction between MICOS and levels of OXPHOS complex subunits we argue that the indirect nature of the evidence does not justify the required investments.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also conclude that complexome/shotgun proteomics may be a useful tool also for identifying other putative MICOS subunits by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      We acknowledge that we cannot demonstrate the absolute absence of any membrane irregularities along the inner mitochondrial membrane. At the same time, if such structures were present, they would be extremely small and unlikely to contain the full set of proteins characteristic of mature cristae. For this reason, we consider it appropriate to classify ABS mitochondria as acristate. To reflect the reviewer's point while maintaining clarity for readers, we have slightly adjusted our wording in the manuscript, changing 'fully acristate' to 'acristate'.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      We agree with the reviewer that the absence of a detectable epitope‑tag signal does not definitively exclude low‑level expression, and we have therefore replaced the term 'absent' with 'undetectable' throughout the manuscript. In context with previous findings of low-level transcripts of the proteins in a study by Lopez-Berragan et al. and Otto et al., we also added the sentence "The apparent absence could indicate that transcripts are not translated in ABS or that the proteins' expression was below detection limits of western blot analysis." to the discussion. _At the same time, we would like to clarify that transcript levels for both genes fall within the

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      We appreciate the reviewer's suggestion. As noted in the Discussion, existing transcriptomic datasets already show detectable MIC19 and MIC60 mRNAs in ABS. For this reason, we expect RT-qPCR to reveal low (but not absent) levels of both transcripts, unlike the true loss expected to be observed in the dKO. Because such residual signals have been reported previously and their biological relevance remains uncertain, we do not believe transcript levels alone can serve as a definitive indicator of cristae absence in ABS.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Searching for the CX9C motifs is a valuable suggestion. In response to the reviewer's suggestion we analysed the conservation of the motif in PfMIC19 and included this in a new figure panel (Figure 1 F).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      In response to this comment we made Figure 1 F, where we show conserved residues within the CHCH domains of a broad range of MIC19 annotated sequences across the opisthokonts, and show that the Cx9C motifs are conserved also in PfMIC19. Outside the CHCH domain, we did not find any meaningful conservation, as PfMIC19 heavily diverges from opisthokont MIC19.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, e.g. Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?

      The graphs in figures 3, 4 and 5 have been redrawn as violin plots on a linear scale (also following a suggestion further down in the reviewer's comments). We believe that this improves interpretability. We kept ANOVA as the statistical test to ensure correction for multiple comparisons, which a standard t-test does not provide. A full overview of the statistics and exact p-values can also be found in the newly added supplementary information 2.
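
      The multiple-comparison concern can be illustrated with a short calculation (a sketch under stated assumptions, not a reproduction of the authors' statistics). It assumes four genotypes (WT, two single knockouts and the dKO), a per-test threshold of 0.05, and independent tests; the independence assumption is a simplification, and the function names are hypothetical.

```python
from math import comb


def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


n_groups = 4                  # assumption: WT, two single KOs, dKO
n_pairs = comb(n_groups, 2)   # number of pairwise comparisons among the groups
fwer = familywise_error_rate(0.05, n_pairs)
print(f"{n_pairs} pairwise tests: uncorrected familywise error rate = {fwer:.3f}")
print(f"Bonferroni per-test threshold = {bonferroni_alpha(0.05, n_pairs):.4f}")
```

      Under these assumptions, six uncorrected pairwise t-tests carry roughly a one-in-four chance of at least one false positive, which is why a procedure with built-in correction for multiple comparisons is the more conservative choice.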

      Minor comments: Line 33. Anaerobes (eg Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria

      We acknowledge that producing ATP via OXPHOS is not a characteristic of all mitochondria-like organelles (e.g. mitosomes), which is why these are typically classified separately from canonical mitochondria. When not considering mitochondria-like organelles, energy conversion is the function that the mitochondrion is most well-known for and the one associated with cristae.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      To clarify we changed this to "yeast or human" model of mitochondria.

      Lines 75-76: This applies to Mic10 only

      We removed the "high degree of conservation in other cristate eukaryotes" statement.

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Done

      Fig 2D: I find this table difficult to read. If authors keep table format, at least get rid of 'mean' column as this data is better depicted in 2C. I suggest depicting this data either like in 3B depicting portion of infected vs unaffected flies in all experiments, then move modified Table to supplement. Important to point out experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      To clarify: the mean reported in the table indicates the mean per replicate while the mean reported in figure 2C is the overall mean for a given genotype that corrects for variability within experiments. We agree that moving the table to the supplementary data is a good idea. We decided to not include a graph for infected and non-infected mosquitoes as this information would be partially misleading, highlighting a phenotype we argue to be influenced by the strong variability.

      Fig. 3C-G: I feel like these data repeatedly lead to same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to supplement and replaced by the beautiful images.

      Thank you for the nice comment on our images. We have now moved part of the graphs to supplementary figure 6 and only kept the Relative Frequency, Sphericity and total mitochondria volume per cell in the main figure.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      We have now specified the exact tubulin isoform used as the male gametocyte marker, both in the main text and in Supplementary Fig. S6. This is a commercial antibody previously known to work as an effective male marker, which is why we selected it for this experiment. This is now clearly stated in the manuscript.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      To clarify the biological effect that we can conclude from the measurement, we added an explanation about it in the respective section of the results, and we decided to replace the raw results of the plug-in readout with the deduced relative dispersion.

      Line 222: Report male/female crista measurements

      We added Supplementary information 2, which contains the exact statistical tests and outcomes for all presented quantifications as well as a per-sex statistical analysis of the data from figure 4. Correspondingly, we extended supplementary information 2 by a per-sex colour code for the thin section TEM data.

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      We changed this accordingly.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      This has been changed accordingly.

      Line 320: incorrect citation. Related to point 1 above.

      Correct citation is now included in the text.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      This has been changed accordingly.

      Line 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). A fragmented mitochondrion is often a physiological response to stress. I would just rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least this could be one of several possibilities.

      The sentence has been replaced as the reviewer indicated, although we still include the data from human cells, as this has also been shown in Stephens et al. 2020.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupts MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Done

      Line 376-377: 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      Other suggestions for added value

      We removed the sentence. Also, the entire paragraph has been shortened, restructured and wording was changed to address major point 1.

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)

      While we did identify SAMM50 in our BN PAGE, the protein does not co-migrate with the MICOS components but instead co-migrates with other components of a putative sorting and assembly machinery (SAM) complex. As SAMM50, the SAM complex and the overarching putative mitochondrial intermembrane space bridging (MIB) complex are not mentioned in the manuscript, we decided to not include the information in the figure.

      Reviewer #2 (Significance (Required)):

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors. In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of plasmodium itself than anything about the essentiality of Mic60, ie plasmodium life cycle progression tolerates defects to mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in examined conditions does differ with its essentiality in other eukaryotes. But fitness costs were not assayed (eg by competition between mutants and WT in infection of mosquitoes)

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact. For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of plasmodium MICOS via the epitope tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in plasmodium.

      Another missed opportunity was to assay the impact of MICOS-depletion on OXPHOS in plasmodium. This is a salient issue as maybe crista morphology is decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of mitochondrial morphology in cell growth and differentiation. I suggested in section A experiments to address this deficit.

      Finally, the authors could assay fitness costs of MICOS-ablation and associated phenotypes by assaying whether mosquito infectivity is reduced in the mutants when they are directly competing with WT plasmodium. Like the authors, I am also surprised that MICOS mutants can pass population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation may contribute to plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study aside from overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

      We thank the reviewer for their extensive analysis of the significance of our findings, including the compliments on our microscopy images and the sophisticated experimental approaches. We hope we have convincingly argued why we could or could not include some of the additional analyses suggested by the reviewer in section 1 above.

      With regard to the significance statement, we want to point out that our finding that PfMICOS is not needed for initial formation of cristae (as opposed to organization thereof), is a confirmation of something that has been assumed by the field, without being the actual focus of studies. We argue that the distinction between formation and organization of cristae is important and deserves some attention within the manuscript. The result of MICOS not being involved in the initial formation of cristae, we argue to be relevant in Plasmodium biology and beyond. As for the insights into how MICOS works in Plasmodium we have confirmed that the previously annotated PfMIC60 is indeed involved in the organization of cristae. Furthermore, we have identified and characterized PfMIC19. These findings, we argue, are indeed meaningful insights into PfMICOS.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      We thank the reviewer for their time and compliment.

      Major comments:

      1) The authors should improve how they present their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      We extended the introduction to include this information.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      To clarify we rephrased the sentence to: "Although MICOS has been described as an organizer of crista junctions, its role during the initial formation of nascent cristae has not been investigated."

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known - a 120kDa size is shown for the P. falciparum but this is not compared to the expected length or the size in S. cerevisiae.

      To solve the reference issue, we added the UniProt IDs that we compared to establish that the annotated ORF is bigger in Plasmodium. We also changed the comparison to yeast instead of human, because we realized it is confusing to compare to yeast throughout the figure but then refer to human in this specific sentence.

      Regarding whether the true N-terminus is known. Short answer: No, not exactly.

      However, we do know that the Pf version is about double the size of the yeast protein.

      As the reviewer correctly states, we show the size of 120kDa for the tagged protein in Figure 1G. Considering that we tagged the protein C-terminally, and observed a 120kDa product on western blot, it is safe to conclude that the true N-terminus does not deviate massively from the annotated ORF, and hence, that there is a considerable extension of the protein beyond a 60kDa protein. We do not directly compare to yeast MIC60 on our western blots, however, that comparison can be drawn from literature: Tarasenko et al., 2017 showed that purified MIC60 running at ~60kDa on SDS-PAGE actively bends membranes, suggesting that in its active form, the monomer of yeast MIC60 is indeed 60kDa in size.

      To clarify, we now emphasize that we ran the AlphaFold prediction on the annotated open reading frame (annotated and sequenced by Bohme et al. and Chapell et al., now cited in the manuscript), and revised the wording to make clear what we are comparing in which sentence.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      As a reply to this and other comments from the reviewers we added the multiple testing within all samples. In addition, to clarify statistics used we included a supplementary dataset with all p-values and statistical tests used.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      We deleted this statement.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      This sentence has been removed.

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      This sentence has been deleted in the revised version of the manuscript.

      Minor comments:

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      Title is changed accordingly

      Minor comments:

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.

      Done, the paper is now cited

      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).

      Done

      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.

      The paper and concept have been added to the manuscript, though the sentence has been moved up in the introduction, when crista junctions are first introduced.

      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.

      We were referring to the poor confidence score. To address this comment as well as major point 2, we rewrote the respective paragraph. It now clearly states that confidence of the prediction is low, and we mention the tool that was used to identify conserved domains (Topology-based Evolutionary Domains).

      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".

      We adapted the domain description to "a stack of two parallel beta-sheets" and replaced the statement on unknown function by the statement "Because this domain is predicted solely from computational analysis, both its actual existence in the native protein and its biological function remain unknown."

      Fig 1B: The authors show two alphafold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is however an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible

      We appreciate the reviewer's suggestion and note that the available structural data indeed provides valuable insight into how MIC60 and MIC19 interact. However, these structures represent fusion constructs of limited protein fragments and therefore capture only a small portion of each protein, specifically the interaction interface. Because our aim in Fig. 1B is to compare the overall domain architecture of the full‑length proteins, we believe that including fragment‑based structures would be less informative in this context.

      Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods - how well was the antibody working in their hands and how do they interpret the absence of a WB band compared to transcriptomics data?

      The HA antibody used in our experiments is a standard commercial reagent that performs reliably in both WB and IFA, although it shows a low background signal in gametocytes. We agree that the sensitivity of the method and the interpretation of weak or absent bands should be addressed explicitly. Transcript levels for both PfMIC19 and PfMIC60 in asexual blood stages fall within the

      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.

      Considering the nature of the investigated proteins (embedded in the IMM and spread throughout the mitochondria), difficulties in achieving a clear signal in IFA or U-ExM are not very surprising. While epitopes may remain buried in IFA, U-ExM usually increases accessibility for the antibodies. However, U-ExM comes at the cost of being prone to dotty background signals, therefore potentially hiding low abundance, naturally dotty signals such as the signal of MICOS proteins that localize to distinct foci (at the CJ) along the mitochondrion. Current literature suggests that, in both human and yeast, STED is the preferred method for accurate spatial resolution of MICOS proteins (https://www.ncbi.nlm.nih.gov/pubmed/32567732, https://www.ncbi.nlm.nih.gov/pubmed/32067344). Unfortunately, we do not have experience with, nor access to, this particular technique/method.

      Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).

      The limitations of other methods are described in the respective results section.

      We added a clarifying sentence in the results section of Figure 4:

      "Note that such measurements do not indicate the true total length or width of cristae, as the data is two-dimensional. The recorded values are to be considered indicative of possible trends, rather than absolute dimensions of cristae."

      This statement refers to the length/width measurements of cristae.

      In the context of Figure 4 D we mention the following (see preprint lines 229 - 230): "We expect this effect to translate into the third dimension and thus conclude that the mean crista volume increases with the loss of either PfMIC19, PfMIC60, or both."

      For Figure 5, we included a clarifying statement in the results section of the preprint (lines 269 - 273): "Note that these mitochondrial volumes are not full mitochondria, but large segments thereof. As a result of the incompleteness of the mitochondria within the section, and the tomography specific artefact of the missing wedge, we were unable to confirm whether cristae were in fact fully detached from the boundary membrane, or just too long to fit within the observable z-range."

      Line 404: perhaps undetected or similar would be a better description than "hidden"?

      The sentence does not exist in the revised manuscript

      Reviewer #3 (Significance (Required)):

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle and defects in infection effectivity are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism. The limitation of the study stems from what is already known about MICOS and its subunits in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis. Exploring the role of MICOS in an early-divergent organism and human parasite is however important given the divergence found in mitochondrial biology and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      As suggested by Reviewer 2, we examined mitochondrial membrane potential in gametocytes using MitoTracker staining and did not observe any obvious differences associated with the morphological defects. At present, additional assays to probe mitochondrial function in P. falciparum gametocytes are not sufficiently established, and developing and validating such methods would require substantial work before they could be applied to our mutant lines. For these reasons, a more detailed mechanistic link between the observed morphological changes and the reduced infection efficiency is currently beyond reach.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to a different model organism. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      MICOS is a conserved mitochondrial protein complex responsible for organising the mitochondrial inner membrane and the maintenance of cristae junctions. This study sheds first light on the role of two MICOS subunits (Mic60 and the newly annotated Mic19) in the malaria parasite Plasmodium falciparum, which forms cristae de novo during sexual development, as demonstrated by EM of thin section and electron tomography. By generating knockout lines (including a double knockout), the authors demonstrate that knockout of both MICOS subunits leads to defects in cristae morphology and a partial loss of cristae junctions. With a formidable set of parasitological assays, the authors show that despite the metabolically important role of mitochondria for gametocytes, the knockout lines can progress through the life stages and form sporozoites, albeit with diminished infection efficiency.

      Major comments:

      1) The authors should improve how they present their findings in the right context, in particular by:

      (i) giving a clearer description in the introduction of what is already known about the role of MICOS. This starts in the introduction, where one main finding is missing: loss of MICOS leads to loss of cristae junctions and the detachment of cristae membranes, which are nevertheless formed, but become membrane vesicles. This needs to be clearly stated in the introduction to allow the reader to understand the consistency of the authors' findings in P. falciparum with previous reports in the literature.

      (ii) at the end of the introduction, the motivating hypothesis is formulated ad hoc: "conclusive evidence about its involvement in the initial formation of cristae is still lacking" (line 83). If there is evidence in the literature that MICOS is strictly required for cristae formation in any organism, then this should be explained, because the bona fide role of MICOS is maintenance of cristae junctions (the hypothesis is still plausible and its testing important).

      2) Line 96-97: "Interestingly, PfMIC60 is much larger than the human MICOS counterpart, with a large, poorly predicted N-terminal extension." This statement is lacking a reference and presumably refers to annotated ORFs. The authors should clarify if the true N-terminus is definitely known: a 120 kDa size is shown for the P. falciparum protein, but this is not compared to the expected length or to the size in S. cerevisiae.

      3) lines 244-245: "Furthermore, our data indicates the effect size increases with simultaneous ablation of both proteins?". The authors should explain which data they are referring to, as some of the data in Fig 3 and 4 look similar and all significance tests relate to the wild type, not between the different mutants, so it is not clear if any observed differences are significant. The authors repeat this claim in the discussion in lines 368-369 without referring to a specific significance test. This needs to be clarified.

      4) lines 304-306: "Though well established as the cristae organizing system, the role of MICOS in initial formation of cristae remains hidden in model organisms that constitutively display cristae.". This sentence is misleading since even in organisms that display numerous cristae throughout their life cycle, new cristae are being formed as the cells proliferate. Thus, failure to produce cristae in MICOS knockout lines would have been observable but has apparently not been reported in the literature. Thus, the concerted process in P. falciparum makes it a great model organism, but not fundamentally different to what has been studied before in other organisms.

      5) lines 373-378. "where ablation of just MIC60 is sufficient to deplete functionality of the entire MICOS (11, 15),". The authors' claim appears to be contrary to what is actually stated in ref 15, which they cite:

      "MICOS subunits have non-redundant functions as the absence of both MICOS subcomplexes results in more severe morphological and respiratory growth defects than deletion of single MICOS subunits or subcomplexes."

      This seems in line with what the authors show, rather than "different".

      6) lines 380-385: "... thus suggesting that membrane invaginations still arise, but are not properly arranged in these knockout lines. This suggests that MICOS either isn't fully depleted,...". These conclusions are incompatible with findings from ref. 15, which the authors cite. In that study, the authors generated a ∆MICOS line which still forms membrane invaginations, showing that MICOS is not required at all for this process in yeast. Hence the authors' implication that MICOS needs to be fully depleted before membrane invaginations cease to occur is not supported by the literature.

      7) The authors should consider if the first part of their title could be seen as misleading: It suggests that MICOS is "the architect" in cristae formation, but this is not consistent with the literature nor their own findings.

      Minor comments:

      • Line 43, of the three seminal papers describing the discovery of MICOS in 2011, the authors only cite two (refs 6 and 7), but miss the third paper, Hoppins et al, PMID: 21987634, which should probably be corrected.
      • Page 2, line 58: for a more complete picture the authors should also cite the work of others here which shows that although at very low levels, e.g. complex III (a drug target) and ATP synthase do assemble (Nina et al, 2011, JBC).
      • Page 3, line 80: "Irrespective of the shape of an organism's cristae, the crista junctions have been described as tubular channels that connect the cristae membrane to the inner boundary membrane (22, 24)." This omits the slit-shaped cristae junctions found in yeast (Davies et al, 2011, PNAS), which the authors should include.
      • Line 97: "poorly predicted N-terminal extension", as there is no experimental structure, we don't know if the prediction is poor. Presumably the authors mean either poorly ordered or the absence of secondary structure elements, or the poor confidence score for that region in the prediction? This should be clarified or corrected.
      • Line 98: "an antiparallel array of ten β-sheets". They are actually two parallel beta-sheets stacked together. The authors could find out the name of this fold, but the confidence of the prediction is marked as low/very low. So, its existence is unknown, not just its "function".
      • Fig 1B: The authors show two AlphaFold predictions of S. cerevisiae and P. falciparum Mic60 structures. There is, however, an experimental Mic60/19 (fragment) structure from the former organism (PMID: 36044574), which should be included if possible.
      • Lines 318-321: "The same trend was observed for PfMIC19 and PfMIC60. Although transcriptomic data suggested that low-level transcripts of PfMIC19 and PfMIC60 are present in ABS (38), we did not detect either of the proteins in ABS by western blot analysis." While this statement is true, the authors should comment on the sensitivity of the respective methods: how well was the antibody working in their hands, and how do they interpret the absence of a WB band compared to the transcriptomics data?
      • Lines 322-323: would the authors not typically have expected an IFA signal given the strength of the band in Western blot? If possible, the authors should comment if the negative fluorescence outcome can indeed be explained with the low abundance or if technical challenges are an equally good explanation.
      • Lines 357-365: the authors describe limitations of the applied methods adequately. Perhaps it would be helpful to make a similar statement about the analysis of 3D objects like mitochondria and cristae from 2D sections. E.g. the apparent cristae length depends on whether cristae are straight (e.g. coiled structures do not display long cross sections despite their true length in 3D).
      • Line 404: perhaps undetected or similar would be a better description than "hidden"?

      Significance

      The main strength of the study is that it provides the first characterisation of the MICOS complex in P. falciparum, a human parasite in which the mitochondrion has been shown to be a drug target. Mic60 and the newly annotated Mic19 are confirmed to be essential for proper cristae formation and morphology, as well as overall mitochondrial morphology. Furthermore, the mutant lines are characterised for their ability to complete the parasite life cycle, and defects in infection efficiency are observed. This work is an important first step for deciphering the role of MICOS in the malaria parasite and the composition and function of this complex in this organism.

      The limitation of the study stems from what is already known about MICOS and its subunits in other organisms. MICOS subunit knockouts have been characterised in great detail in yeast and humans with similar findings regarding loss of cristae and cristae defects. The findings of this study do not provide dramatic new insight on MICOS function or go substantially beyond the vast existing literature in terms of the extent of the study, which focuses on parasitological assays and morphological analysis.

      Exploring the role of MICOS in an early-divergent organism and human parasite is however important given the divergence found in mitochondrial biology and P. falciparum is a uniquely suited model system. One aspect that would increase the impact of the paper would be if the authors could mechanistically link the observed morphological defects to the decreased infection efficiency, e.g. by probing effects on mitochondrial function. This will likely be challenging as the morphological defects are diverse and the fitness defects appear moderate/mild.

      The advance presented in this study is to pioneer the study of MICOS in P. falciparum, thus widening our understanding of the role of this complex to a different model organism. This study will likely be mainly of interest for specialised audiences such as basic research parasitologists and mitochondrial biologists. My own field of expertise is mitochondrial biology and structural biology.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Major comments:

      1) In my opinion, the authors tend to sensationalize or overinterpret their results. The title of the manuscript is very misleading. While MICOS is certainly important for crista formation, it is not the only factor, as ATP synthase dimer rows make a highly significant contribution to crista morphology. Thus, one can argue with equal validity that ATP synthase should be considered the 'architect', as it is the conformation of the dimers and rows that modulates positive curvature. Secondly, while cristae are still formed upon mic60/mic19 gene knockout (KO), they are severely deformed, and likely dysfunctional (see below). Thus, I do not agree with the title that MICOS is dispensable for crista formation, because the authors' results show that it clearly is essential. So, the title should be changed.

      The Discussion section starting from line 373 also suffers from overinterpretation as well as being repetitive and hard to understand. The authors infer that MICOS stability is compromised less in the single KOs (sKO) compared to the mic60/mic19 double KO (dKO). MICOS stability was never directly addressed here and the composition of the MICOS complex is unaddressed, so it does not make sense to speculate on the basis of such tenuous connections. The data suggest to me that mic60 and mic19 are equally important for crista formation and crista junction (CJ) stabilization, and the dKO has a more severe phenotype than either KO, further demonstrating neither is epistatic.

      The following paragraphs (line 387 to 422) continue with such unnecessary overinterpretation to the point that it is confusing and contradictory. Line 387 mentions an 'almost complete loss of CJs' and then line 411 mentions an increase in CJ diameter, both upon Mic60 ablation. I do not think this discussion brings any added value to the manuscript and it should be shortened. Yes, maybe there are other putative MICOS subunits that may linger in the KOs that are further destabilized in the dKO, or maybe Mic60 remains in the mic19 KO (and vice versa) to somehow salvage more CJs, which is not possible in the dKO. It is impossible to say with confidence how ATP synthase behaves in the KOs with the current data.

      2) While the authors went to impressive lengths to detect any effect on lifecycle progression, none was found except for a reduction in oocyst count. However, the authors did not address any direct effect on mitochondria, such as OXPHOS complex assembly, respiration, or membrane potential. This seems like a missed opportunity, given the team's previous and very nice work mapping these complexes by complexome profiling. However, I think there are some experiments the authors can still do to address any mitochondrial defects using what they have and without resorting to complexome profiling (although this would be definitive if it is feasible):

      i) Quantification of MitoTracker Red staining in WT and KOs. The authors used this dye to visualize mitochondria to assay their gross morphology, but unfortunately not to assay membrane potential in the mutants. The authors can compare relative intensities of the different mitochondria types they categorized in Fig. 3A in 20-30 cells to determine if membrane potential is affected when the cristae are deformed in the mutants. One would predict they are affected.

      ii) Sporozoites are shown in Fig S5. The authors can use the same set up to track their motion, with the hypothesis that they will be slower in the mutants compared to WT due to less ATP. This assumes that sporozoite mitochondria are active as in gametocytes.

      iii) Shotgun proteomics to compare protein levels in mutants compared to WT, with the hypothesis that OXPHOS complex subunits will be destabilized in the mutants with deformed cristae. This could be indirect evidence that OXPHOS assembly is affected, resulting in destabilized subunits that fail to incorporate into their respective complexes.

      To expedite resubmission, the authors can restrict the cell lines to WT and the dKO, as the latter has a stronger phenotype than the individual KOs and conclusions from this cell line are valid for overall conclusions about Plasmodium MICOS.

      I will also add that complexome/shotgun proteomics may be a useful tool for identifying other putative MICOS subunits, by determining if proteins sharing the same complexome profile as PfMic60 and Mic19 are affected. This would address the overinterpretation problem of point 1.

      3) I am aware of the authors' previous work in which they were not able to detect cristae in ABS, and thus have concluded that these are truly acristate. This can very well be true, or there can be immature cristae forms that evaded detection at the resolution they used in their volumetric EM acquisitions. The mitochondria and gametocyte cristae are pretty small anyway, so it is not unreasonable to assume that putative rudimentary cristae in ABS may be even smaller still. Minute levels of sampled complex III and IV plus complex V dimers in ABS that were detected previously by the authors by complexome profiling would argue for the presence of minuscule and/or very few cristae.

      I think that the authors should hedge their claim that ABS is acristate by briefly stating that there still is a possibility that minuscule cristae may have been overlooked previously.

      This brings me to the claim that Mic19 and Mic60 proteins are not expressed in ABS. This is based on the lack of signal from the epitope tag; a weak signal is detected in gametocytes. Thus, one can counter that Mic19 and Mic60 are also expressed, but below the expression limits of the assay, as the protein exhibits low expression levels when mitochondrial activity is upregulated.

      To address this point, the authors should determine if mature mic60 and mic19 mRNAs are detected in ABS in comparison to the dKO, which will lack either transcript. RT-qPCR using polyT primers can be employed to detect these transcripts. If the levels of these mRNAs in WT ABS are equivalent to those in the dKO, the authors can make a pretty strong case for the absence of cristae in ABS.

      4) The major finding of the manuscript, a Mic19 analog in Plasmodium, should be highlighted. As far as I know, this manuscript could represent the first instance of Mic19 outside of opisthokonts that was not found by sensitive profile HMM searches and certainly the first time such a Mic19 was functionally analyzed.

      They should highlight the twin CX9C motifs that are a hallmark of Mic19 and other proteins that undergo oxidative folding via the MIA pathway. Interestingly, the Mia40 oxidoreductase that is central to MIA in yeast and animals, is absent in apicomplexans (DOI: 10.1080/19420889.2015.1094593).

      Did the authors try to align Plasmodium Mic19 orthologs with conventional Mic19s? This may reveal some conserved residues within and outside of the CHCH domain.

      5) Statistical significance. Sometimes my eyes see population differences that are considered insignificant by the statistical methods employed by the authors, e.g. Fig. 4E, mutants compared to WT, especially the dKO. Have the authors considered using other methods such as Student's t-test for pairwise comparisons?
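
      As an illustration of the pairwise comparisons raised in this point (mutant versus mutant, not only mutant versus WT), the following is a minimal Python sketch and not the authors' analysis. It assumes a two-sided permutation test on group means followed by a Holm correction for multiple testing; the function names, group labels, and all numeric values are hypothetical and invented for the example.

```python
import itertools
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(_mean(a) - _mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:len(a)]) - _mean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)


def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


def all_pairwise(groups, n_perm=2000, seed=0):
    """Compare every pair of groups, not only each mutant against WT."""
    pairs = list(itertools.combinations(groups, 2))
    raw = [permutation_p_value(groups[x], groups[y], n_perm, seed)
           for x, y in pairs]
    return dict(zip(pairs, holm_adjust(raw)))


if __name__ == "__main__":
    # Hypothetical crista-coverage values (%), invented purely for illustration.
    example = {
        "WT": [42.0, 38.5, 45.1, 40.2, 39.8, 43.3],
        "mic19_KO": [30.1, 27.4, 33.0, 29.5, 31.2, 28.8],
        "mic60_KO": [29.0, 31.5, 26.9, 30.4, 28.1, 32.2],
        "dKO": [22.3, 25.1, 19.8, 24.0, 21.5, 23.7],
    }
    for pair, p_adj in all_pairwise(example).items():
        print(pair, round(p_adj, 4))
```

      Each key of the returned dictionary is a pair of group labels and each value is its Holm-adjusted p-value. A permutation test is used here only because it avoids distributional assumptions; other pairwise tests could be substituted.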

      Minor comments:

      Line 33. Anaerobes (e.g. Giardia) have mitochondria that do not produce ATP, unlike aerobic mitochondria.

      Line 56: Unclear what authors mean by "canonical model of mitochondria"

      Lines 75-76: This applies to Mic10 only

      Line 80: Cite DOI: 10.1016/j.cub.2020.02.053

      Fig 2D: I find this table difficult to read. If the authors keep the table format, at least get rid of the 'mean' column, as this data is better depicted in 2C. I suggest depicting this data as in 3B, showing the proportion of infected vs uninfected flies in all experiments, and then moving the modified table to the supplement. It is important to point out that experiment 5 appears to be an outlier with reduced infectivity across all cell lines, including WT.

      Fig. 3C-G: I feel like these data repeatedly lead to the same conclusions. These are all different ways of showing what is depicted in Fig 2B: mitochondria gross morphology is affected upon ablation of MICOS. I suggest that these graphs be moved to the supplement and replaced by the beautiful images.

      Line 180: Be more specific with which tubulin isoform is used as a male marker and state why this marker was used in supplemental Fig S6.

      Line 196 and Fig 3C: the word 'intensities' in this context is very ambiguous. Please choose a different term (puncta, elements, parts?). This is related to major point 2i above.

      Line 222: Report male/female crista measurements

      Fig. 4B-E: depict data as violin plots or scatter plots like Fig. 2C to get a better grasp of how the crista coverage is distributed. It seems like the data spread is wider in the double KO. This would also solve the problem with the standard deviation extending beyond 0%.

      Lines 331-333: Please clarify that this applies for some, but not all MICOS subunits. Please also see major point 1 above. Also, the authors should point out that despite their structural divergence, trypanosomal cryptic mitofilins Mic34 and Mic40 are essential for parasite growth, in contrast to their findings with PfMic60 (DOI: https://doi.org/10.1101/2025.01.31.635831).

      Line 320: incorrect citation. Related to point 1 above.

      Lines 333-335. This is related to the above. Again, some subunits appear to affect cell growth under lab conditions, and some do not. This and the previous sentence should be rewritten to reflect this.

      Line 343-345: The sentence and citation 45 are strange. Regarding the former, it is about CHCHD10, whose status as a bona fide MICOS subunit is very tenuous, so I would omit this. About the phenomenon observed, I think it makes more sense to write that Mic60 ablation results in partially fragmented mitochondria in yeast (Rabl et al., 2009 J Cell Biol. 185: 1047-63). Mitochondrial fragmentation is often a physiological response to stress. I would just rewrite so as not to imply that mitochondrial fission (or fusion) is impaired in these KOs, or at least so that this is presented as one of several possibilities.

      Line 373: 'This indicates' is too strong. I would say 'may suggest' as you have no proof that any of the KOs disrupt MICOS. This hypothesis can be tested by other means, but not by penetrance of a phenotype.

      Line 376-377; 'deplete functionality' does not make sense, especially in the context of talking about MICOS subunit stability. In my opinion, this paragraph overinterprets the KO effects on MICOS stability. None of the experiments address this phenomenon, and thus the authors should not try to interpret their results in this context. See major point 1.

      Other suggestions for added value

      1) Does Plasmodium Sam50 co-fractionate with Mic60 and Mic19 in BN PAGE (Fig. 1E)?

      2) Can AlphaFold3 predict a heterotetramer of PfMic60? What about the four Mic19 and Mic60 subunits together? Is this tetramer consistent with the Bock-Bierbaum model? Is this model consistent with the CJ diameter measured in Plasmodium, which is perhaps better evidence than that in lines 419-422?

      Significance

      The manuscript by Tassan-Lugrezin is predicated on the idea that Plasmodium represents the only system in which de novo crista formation can be studied. They leverage this system to ask the question whether MICOS is essential for this process. They conclude based on their data that the answer is no, which the authors consider unprecedented. But even if their claim is true that ABS is acristate, this supposed advantage does not really bring any meaningful insight into how MICOS works in Plasmodium.

      First the positives of this manuscript. As has been the case with this research team, the manuscript is very sophisticated in the experimental approaches that are made. The highlights are the beautiful and often conclusive microscopy performed by the authors. Only the localization of Mic60 and Mic19 was inconclusive due to their very low expression unfortunately.

      The examination of the MICOS mutants during the in vitro life cycle of Plasmodium falciparum is extremely impressive and yields convincing results. Mitochondrial deformation is tolerated by life cycle stage differentiation, with a modest but significant reduction of oocyst production being observed.

      However, despite the herculean efforts of the authors, the manuscript as it currently stands represents only a minor advance in our understanding of the evolution of MICOS, which from the title and focus of the manuscript, is the main goal of the authors.

      In its current form, the manuscript reports some potentially important findings:

      1) Mic60 is verified to play a role in crista formation, as is predicted by its orthology to other characterized Mic60 orthologs.

      2) The discovery of a novel Mic19 analog (since the authors maintain there is no significant sequence homology), which exhibits a similar (or the same?) complexome profile with Mic60. This protein was upregulated in gametocytes like Mic60 and phenocopies Mic60 KO.

      3) Both of these MICOS subunits are essential (not dispensable) for proper crista formation.

      4) Surprisingly, neither MICOS subunit is essential for in vitro growth or differentiation from ABS to sexual stages, and from the latter to sporozoites. This says more about the biology of Plasmodium itself than anything about the essentiality of Mic60, i.e. Plasmodium life cycle progression tolerates defects in mitochondrial morphology. But yes, I agree with the authors that Mic60's apparent insignificance for cell growth in the examined conditions does differ from its essentiality in other eukaryotes. However, fitness costs were not assayed (e.g. by competition between mutants and WT in infection of mosquitoes).

      5) Decreased fitness of the mutants is implied by a reduction of oocyst formation.

      While interesting in their own way, collectively they do not represent a major advance in our understanding of MICOS evolution. Furthermore, the findings bifurcate into categories informing MICOS or Plasmodium biology. Both aspects are somewhat underdeveloped in their current form.

      This is unfortunate because there seem to be many missed opportunities in the manuscript that could, with additional experiments, lead to a manuscript with much wider impact.

      For me, what is remarkable about Plasmodium MICOS that sets it apart from other iterations is the apparent absence of the Mic10 subunit. Purification of Plasmodium MICOS via the epitope-tagged Mic60 and Mic19 could have verified that MICOS is assembled without this core subunit. Perhaps Mic60 and Mic19 are the vestiges of the complex, and thus operate alone in shaping cristae. Such a reduction may also suggest the declining importance of mitochondria in Plasmodium.

      Another missed opportunity was to assay the impact of MICOS depletion on OXPHOS in Plasmodium. This is a salient issue, as crista morphology may be decoupled from OXPHOS capacity in Plasmodium, which links to the apparent tolerance of altered mitochondrial morphology in cell growth and differentiation. I suggested in section A experiments to address this deficit.

      Finally, the authors could assay fitness costs of MICOS ablation and associated phenotypes by assaying whether mosquito infectivity is reduced in the mutants when they are directly competing with WT Plasmodium. Like the authors, I am also surprised that MICOS mutants can pass population bottlenecks represented by differentiation events. Perhaps the apparent robustness of differentiation may contribute to Plasmodium's remarkable ability to adapt.

      I realize that the authors put a lot of effort into their study and again, I am very impressed by the sophistication of the methods employed. Nevertheless, I think there are still better ways to increase the impact of the study aside from overinterpreting the conclusions from the data. But this would require more experiments along the lines I suggest in Section A and here.

    4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary: This manuscript reports the identification of putative orthologues of mitochondrial contact site and cristae organizing system (MICOS) proteins in Plasmodium falciparum, an organism that unusually shows an acristate mitochondrion during the asexual part of its life cycle and then develops cristae as it enters the sexual stage of its life cycle and beyond into the mosquito. The authors identify PfMIC60 and PfMIC19 as putative members and study these in detail. The authors add HA tags to both proteins, examine the timing of expression during the parasite life cycle, and attempt (unsuccessfully) to localise them within the parasite. They also genetically deleted both genes, singly and in combination, and phenotyped the effect on parasite development. They show that both proteins are expressed in gametocytes and not asexuals, suggesting they are present at the same time as cristae development. They also show that the proteins are dispensable for the entire parasite life cycle investigated (asexuals through to sporozoites); however, there is some reduction in mosquito transmission. Using EM techniques they show that the morphology of gametocyte mitochondria is abnormal in the knockout lines, although there is great variation.

      Major comments: The manuscript is interesting and is an intriguing use of a well studied organism of medical importance to answer fundamental biological questions. My main comments are that there should be greater detail in areas around methodology and statistical tests used. Also, the mosquito transmission assays (which are notoriously difficult to perform) show substantial variation between replicates and the statistical tests and data presentation are not clear enough to conclude the reduction in transmission that is claimed. Perhaps this could be improved with clearer text?

      More specific comments to address:

      Line 101/Fig1E (and figure legend) - What is this heatmap showing? It would be helpful to have a sentence or two linking it to a specific methodology. I could not find details in the M+M section and "specialized, high molecular mass gels" does not adequately explain what experiments were performed. The reference to Supplementary Information 1 also did not provide information.

      Line 115 and Supplementary Figure 2C + D - The main text says that the transgenic parasites contained a mitochondrially localized mScarlet for visualization and localization, but supplementary figure 2 shows MitoTracker labelling rather than mScarlet. This is very confusing. The figure legend also mentions both mScarlet and MitoTracker. I assume that mScarlet was used for viewing in regular IFAs (Fig S2C) and that MitoTracker was used for the expansion microscopy (Fig S2D)? Please clarify.

      Figure 2C - What is the statistical test being used (the methods say "Mean oocysts per midgut and statistical significance were calculated using a generalized linear mixed effect model with a random experiment effect under a negative binomial distribution." but what test is this)? Also, a log10 scale for oocyst intensity is an unusual choice: how are the mosquitoes with 0 oocysts being represented on this graph? It looks like they are being plotted at 10^-1 (which would be 0.1 oocysts in a mosquito, which is impossible).

      Figure 2D - It is great that the data from all feeding replicates have been shared; however, it is difficult to conclude any meaningful impact on transmission with the knockout lines when there is so much variation and so few mosquitoes dissected for some datapoints (10 mosquitoes is a very small sample size). For example, Exp1 shows a clear decrease in mic19- transmission, but Exp2 does not show as great an effect. Similarly, why does the double knockout have better transmission than the single knockouts? Surely there would be a greater effect?

      Figure 3 legend - Please add which statistical test was used and the number of replicates.

      Figure 4 legend - Please add which statistical test was used and the number of replicates.

      Figure 5C - The 3D reconstructions are very nice, but what do the red and yellow colorings show?

      Line 352 - "Still, it is striking that, despite the pronounced morphological phenotype, and the possibly high mitochondrial stress levels, the parasites appeared mostly unaffected in life cycle propagation, raising questions about the functional relevance of mitochondria at these stages." How do the authors reconcile this statement with the proven fact that mitochondria-targeted antimalarials (such as atovaquone) are very potent inhibitors of parasite mosquito transmission?
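
      One way to address the zero-count question raised for Figure 2C is sketched below in Python, under stated assumptions: it is not the authors' pipeline, the function names are hypothetical, and the counts are invented. It reports prevalence separately from intensity and maps counts to log10(count + 1), so that mosquitoes with 0 oocysts fall at 0 on the transformed axis rather than at an arbitrary 10^-1.

```python
import math


def summarize_feed(oocyst_counts):
    """Return prevalence and mean intensities for one feeding experiment.

    oocyst_counts: one non-negative integer per dissected mosquito.
    """
    n = len(oocyst_counts)
    infected = [c for c in oocyst_counts if c > 0]
    return {
        "n_dissected": n,
        # Fraction of mosquitoes carrying at least one oocyst.
        "prevalence": len(infected) / n,
        # Mean over all mosquitoes, zeros included.
        "mean_all": sum(oocyst_counts) / n,
        # Mean over infected mosquitoes only (0.0 if none were infected).
        "mean_infected": sum(infected) / len(infected) if infected else 0.0,
    }


def log10_plus_one(counts):
    """Map counts to log10(count + 1); a count of zero maps to exactly 0."""
    return [math.log10(c + 1) for c in counts]


if __name__ == "__main__":
    # Hypothetical counts for ten mosquitoes, invented for illustration only.
    counts = [0, 0, 1, 3, 0, 12, 7, 0, 2, 25]
    print(summarize_feed(counts))
    print([round(v, 2) for v in log10_plus_one(counts)])
```

      Reporting prevalence alongside intensity makes explicit how many mosquitoes carried no oocysts, which a plain log10 axis cannot display.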

      Significance

      This manuscript is a novel approach to studying mitochondrial biology and does open a lot of unanswered questions for further research directions. Currently there are limitations in the use of statistical tests and detail of methodology, but these could easily be addressed with a bit more analysis or better explanation in the text. This manuscript could be of interest to readers with a general interest in mitochondrial cell biology and those within the specific field of Plasmodium research.

      My expertise is in Plasmodium cell biology.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      (1) I have to admit that it took a few hours of intense work to understand this paper and to even figure out where the authors were coming from. The problem setting, nomenclature, and simulation methods presented in this paper do not conform to the notation common in the field, are often contradictory, and are usually hard to understand. Most importantly, the problem that the paper is trying to solve seems to me to be quite specific to the particular memory study in question, and is very different from the normal setting of model-comparative RSA that I (and I think other readers) may be more familiar with.

      We have revised the paper for clarity at all levels: motivation, application, and parameterization. We clarify that there is a large unmet need for using RSA in a trial-wise manner, and that this approach offers benefits to any team interested in decoding trial-wise representational information linked to behavioral responses; as such, it is not a problem specific to a single memory study.

      (2) The definition of "classical RSA" that the authors are using is very narrow. The group around Niko Kriegeskorte has developed RSA over the last 10 years, addressing many of the perceived limitations of the technique. For example, cross-validated distance measures (Walther et al. 2016; Nili et al. 2014; Diedrichsen et al. 2021) effectively deal with an uneven number of trials per condition and unequal amounts of measurement noise across trials. Different RDM comparators (Diedrichsen et al. 2021) and statistical methods for generalization across stimuli (Schütt et al. 2023) have been developed, addressing shortcomings in sensitivity. Finally, both a Bayesian variant of RSA (pattern component modelling; Diedrichsen, Yokoi, and Arbuckle 2018) and an encoding model (Naselaris et al. 2011) can effectively deal with continuous variables or features across time points or trials in a framework that is closely related to RSA (Diedrichsen and Kriegeskorte 2017). The authors may not consider these newer developments to be classical, but they are in common use and certainly provide the solution to the problems raised in this paper in the setting of model-comparative RSA in which there is more than one repetition per stimulus.

      We appreciate the summary of relevant literature and have revised the Introduction to address this bounty of relevant work. While much is owed to these authors, new developments from a diverse array of researchers outside of a single group can aid in new research questions, and should always have a place in our research landscape. We owe much to the work of Kriegeskorte’s group, and in fact, Schütt et al., 2023 served as a very relevant touchpoint in the Discussion and helped to highlight specific needs not addressed by the assessment of the “representational geometry” of an entire presented stimulus set. Principal amongst these needs is the application of trial-wise representational information that can be related to trial-wise behavioral responses and thus used to address specific questions on brain-behavior relationships. We invite the Reviewer to consider the utility of this shift with the following revisions to the Introduction.

      Page 3. “Recently, methodological advancements have addressed many known limitations in cRSA. For example, cross-validated distance measures (e.g., Euclidean distance) have improved the reliability of representational dissimilarities in the presence of noise and trial imbalance (Walther et al., 2016; Nili et al., 2014; Diedrichsen et al., 2021). Bayesian approaches such as pattern component modeling (Diedrichsen, Yokoi, & Arbuckle, 2018) have extended representational approaches to accommodate continuous stimulus features or temporal variation. Further, model comparison RSA strategies (Diedrichsen et al., 2021) and generalization techniques across stimuli (Schütt et al., 2023) have improved sensitivity and inference. Nevertheless, a common feature shared across most of these improvements is that they require stimulus repetition to examine the representational structure. This requirement limits their ability to probe brain-behavior questions at the level of individual events”.

      Page 8. “While several extensions of RSA have addressed key limitations in noise sensitivity, stimulus variance, and modeling (e.g., Diedrichsen et al., 2021; Schütt et al., 2023), our tRSA approach introduces a new methodological step by estimating representational strength at the trial level. This accounts for the multi-level variance structure in the data, affords generalizability beyond the fixed stimulus set, and allows one to test stimulus- or trial-level modulations of neural representations in a straightforward way”.

      Page 44. “Despite such prevalent appreciation for the neurocognitive relevance of stimulus properties, cRSA often does not account for the fact that the same stimulus (e.g., “basketball”) is seen by multiple subjects and produces statistically dependent data, an issue addressed by Schütt et al., 2023, who developed cross validation and bootstrap methods that explicitly model dependence across both subjects and stimulus conditions”.

      (3) The stated problem of the paper is to estimate "representational strength" in different regions or conditions. By this, the authors mean the correlation of the brain RDM with a model RDM. This metric conflates a number of factors, namely the variances of the stimulus-specific patterns, the variance of the noise, the true differences between different dissimilarities, and the match between the assumed model and the data-generating model. It took me a long time to figure out that the authors are trying to solve a quite different problem in a quite different setting from the model-comparative approach to RSA that I would consider "classical" (Diedrichsen et al. 2021; Diedrichsen and Kriegeskorte 2017). In this approach, one is trying to test whether local activity patterns are better explained by representation model A or model B, and to estimate the degree to which the representation can be fully explained. In this framework, it is common practice to measure each stimulus at least 2 times, to be able to estimate the variance of noise patterns and the variance of signal patterns directly. Using this setting, I would define "representational strength" very differently from the authors. Assume (using LaTeX notation) that the activity patterns $y_{j,n}$ for stimulus $j$, measurement $n$, are composed of a true stimulus-related pattern ($u_j$) and a trial-specific noise pattern ($e_{j,n}$). As a measure of the strength of representation (or pattern), I would use an unbiased estimate of the variance of the true stimulus-specific patterns across voxels and stimuli ($\sigma^2_{u}$). This estimator can be obtained by correlating patterns of the same stimuli across repeated measures, or equivalently, by averaging the cross-validated Euclidean distances (or with spatial prewhitening, Mahalanobis distances) across all stimulus pairs.
In contrast, the current paper addresses a specific problem in a quite specific experimental design in which there is only one repetition per stimulus. This means that the authors have no direct way of distinguishing true stimulus patterns from noise processes. The trick that the authors apply here is to assume that the brain data comes from the assumed model RDM (a somewhat sketchy assumption IMO) and that everything that reduces this correlation must be measurement noise. I can now see why tRSA does make some sense for this particular question in this memory study. However, in the more common model-comparative RSA setting, having only one repetition per stimulus in the experiment would be quite a fatal design flaw. Thus, the paper would do better if the authors could spell out the specific problem addressed by their method right at the beginning, rather than trying to set up tRSA as a general alternative to "classical RSA".
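
      The definition proposed in this comment can be written out as a short worked example. It is a sketch under assumptions that the review does not state explicitly: $P$ voxels, $J$ stimuli, two measurements per stimulus, true patterns whose elements have zero mean and variance $\sigma^2_u$, and zero-mean noise that is independent of the true patterns and independent across the two measurements.

```latex
\begin{align}
y_{j,n} &= u_j + e_{j,n}, \qquad n \in \{1, 2\} \\
\hat{\sigma}^2_u &= \frac{1}{JP} \sum_{j=1}^{J} y_{j,1}^{\top} y_{j,2},
\qquad \mathbb{E}\big[\hat{\sigma}^2_u\big] = \sigma^2_u \\
\hat{d}^2_{jk} &= \frac{1}{P} \, (y_{j,1} - y_{k,1})^{\top} (y_{j,2} - y_{k,2}),
\qquad \mathbb{E}\big[\hat{d}^2_{jk} \mid u_j, u_k\big] = \frac{1}{P} \, \lVert u_j - u_k \rVert^2
\end{align}
```

      Because noise terms from different measurements are independent, they drop out of both expectations. With a single measurement per stimulus neither estimator can be formed, which is the design constraint this comment points to.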

      At a general level, our approach rests on the premise that there is meaningful information present in a single presentation of a given stimulus. This assumption may have less utility when the research goals are more focused on estimating the fidelity of signal patterns for RSA, as in designs with multiple repetitions. But it is an exaggeration to state that such a trial-wise approach cannot address the difference between “true” stimulus patterns and noise. This trial-wise approach has explicit utility in relating trial-wise brain information to trial-wise behavior, across multiple cognitive domains (not only memory studies, as applied here). We have added substantial text to the Introduction distinguishing cRSA, which is widely employed, often in cases with a single repetition per stimulus, from model-comparative methods that employ multiple repetitions. We clarify that we do not consider tRSA an alternative to the model-comparative approach, and discuss that operational definitions of representational strength are constrained by the study design.

      Page 3. “In this paper, we present an advancement termed trial-level RSA, or tRSA, which addresses these limitations in cRSA (not model comparison approaches) and may be utilized in paradigms with or without repeated stimuli”.

      Page 4. “Representational geometry usually refers to the structure of similarities among repeated presentations of the same stimulus in the neural data (as captured in the brain RSM) and is often estimated utilizing a model comparison approach, whereas representational strength is a derived measure that quantifies how strongly this geometry aligns with a hypothesized model RSM. In other words, geometry characterizes the pattern space itself, while representational strength reflects the degree of correspondence between that space and the theoretical model under test”.

      Finally, we clarified that in our simulation methods we assume a true underlying activity pattern and a random error pattern. The model RSM is computed based on the true pattern, whereas the brain RSM comes from the noisy pattern, not the model RSM itself.

      Page 9. “Then, we generated two sets of noise patterns, which were controlled by parameters σ<sub>A</sub> and σ<sub>B</sub>, respectively, one for each condition”.
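
      The simulation logic described in this response (a true pattern, an added noise pattern, a model RSM from the former and a brain RSM from the latter) can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names, the use of Pearson correlation for the RSM cells, and the single `noise_sd` parameter are all assumptions made for the example.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rsm(patterns):
    """Similarity matrix: correlation between every pair of stimulus patterns."""
    n = len(patterns)
    return [[pearson(patterns[i], patterns[j]) for j in range(n)] for i in range(n)]


def simulate_rsms(n_stimuli, n_voxels, noise_sd, seed=0):
    """Return (model_rsm, brain_rsm) for one simulated subject.

    Hypothetical setup: true patterns are i.i.d. standard normal, and the
    measured pattern is the true pattern plus Gaussian noise with SD noise_sd.
    """
    rng = random.Random(seed)
    true = [[rng.gauss(0.0, 1.0) for _ in range(n_voxels)] for _ in range(n_stimuli)]
    noisy = [[v + rng.gauss(0.0, noise_sd) for v in row] for row in true]
    model_rsm = rsm(true)   # computed from the true pattern
    brain_rsm = rsm(noisy)  # computed from the noisy pattern, not from the model RSM
    return model_rsm, brain_rsm


model_rsm, brain_rsm = simulate_rsms(n_stimuli=6, n_voxels=50, noise_sd=1.0)
```

      Raising `noise_sd` weakens the correspondence between the two matrices, which is the quantity the simulations manipulate across conditions.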

      (4) The notation in the paper is often conflicting and should be clarified. The actual true and measured activity patterns should receive a unique notation that is distinct from the variances of these patterns across voxels. I assume that $\sigma_{ijk}$ is the noise variance (not the standard deviation)? Normally, variances are denoted with $\sigma^2$. Also, if these are variances, they cannot come from a normal distribution as indicated on page 10. Finally, multi-level models are usually defined at the level of means (i.e., patterns) rather than at the level of variances (as seems to be done here).

      We have added notation for the true and measured activity patterns to differentiate them from our notation for variance. We agree that multilevel models are usually defined at the level of means rather than at the level of variances, and we include a figure (Fig 1D) that describes the model in terms of the means. We clarify that the σ ($\sigma$) terms used in the manuscript were not variances or standard deviations themselves; rather, they were meant to denote components of the actual (multilevel) variance parameter. Each component was sampled from a normal distribution, and the components were summed to form the final variance parameter for each trial. We have changed our notation for each component to the lowercase letter s to minimize confusion. We have also made our R code publicly available on our lab GitHub, which should provide more clarity on the exact simulation process.

      (5) In the first set of simulations, the authors sampled both model and brain RSM by drawing each cell (similarity) of the matrix from an independent bivariate normal distribution. As the authors note themselves, this way of producing RSMs violates the constraint that correlation matrices need to be positive semi-definite. Likely more seriously, it also ignores the fact that the different elements of the upper triangular part of a correlation matrix are not independent from each other (Diedrichsen et al. 2021). Therefore, it is not clear that this simulation is close enough to reality to provide any valuable insight and should be removed from the paper, along with the extensive discussion about why this simulation setting is plainly wrong (page 21). This would shorten and clarify the paper.

      We have added justification of the mixed-effects model given the potential assumption violations. We caution readers to investigate the robustness of their models, and to employ permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. Finally, we agree that the first simulation setting does not possess several properties of realistic RDMs/RSMs; however, we believe that there is utility in understanding the mathematical properties of correlations, an essential component of RSA, in a straightforward simulation where the ground truth is known. We have therefore moved this simulation to Appendix 1.

      (6) If I understand the second simulation setting correctly, the true pattern for each stimulus was generated as an NxP matrix of i.i.d. standard normal variables. Thus, there is no condition-specific pattern at all, only condition-specific noise/signal variances. It is not clear how the tRSA would be biased if there were a condition-specific pattern (which, in reality, there usually is). Because of the i.i.d. assumption of the true signal, the correlations between all stimulus pairs within conditions are close to zero (and only differ from it by the fact that you are using a finite number of voxels). If you added a condition-specific pattern, the across-condition RSA would lead to much higher "representational strength" estimates than a within-condition RSA, with obvious problems and biases.

      The Reviewer is correct that the voxel values in the true pattern are drawn from i.i.d. standard normal distributions. We take the Reviewer’s suggestion of a “condition-specific pattern” to mean that there could be a condition-voxel interaction in two non-mutually exclusive ways. The first is additive: some common underlying multi-voxel pattern like [6, 34, -52, …, 8] for all condition A trials, and a different such pattern for condition B trials, etc. The second is multiplicative: a vector of scaling factors [x1.5, x0.5, x0.8, …, x2.7] for all condition A trials, and a different such vector for condition B trials, etc. Both possibilities could indeed affect tRSA as much as they would cRSA.

      Importantly, if such a strong condition-specific pattern is expected, one can build a condition-specific model RDM using one-hot coding of conditions (see example figure; src: https://www.newbi4fmri.com/tutorial-9-mvpa-rsa), either to capture this interesting phenomenon or to remove it as a confounding factor. This practice has been applied in multiple regression cRSA approaches (e.g., Cichy et al., 2013) and can also be applied to tRSA.
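
      A condition-specific model RDM of the kind described here can be sketched in a few lines. The sketch reads the condition coding as one-hot indicator coding, which is an assumption, and the function names and example labels are hypothetical.

```python
def one_hot(labels):
    """Code each trial as an indicator vector over the sorted condition levels."""
    levels = sorted(set(labels))
    return [[1 if lab == lev else 0 for lev in levels] for lab in labels]


def condition_model_rdm(labels):
    """Dissimilarity is 0 for trials in the same condition and 1 otherwise.

    Computed as 1 minus the dot product of the two one-hot vectors.
    """
    codes = one_hot(labels)
    return [[1 - sum(a * b for a, b in zip(ci, cj)) for cj in codes] for ci in codes]


# Hypothetical trial sequence with two conditions.
trial_conditions = ["A", "A", "B", "B", "A"]
rdm = condition_model_rdm(trial_conditions)
```

      Such a matrix could then enter a regression alongside the model of interest, either as a predictor of interest or as a nuisance term.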

      (7) The trial-level brain RDM to model Spearman correlations was analyzed using a mixed-effects model. However, given the symmetry of the RDM, the correlations coming from different rows of the matrix are not independent, and independence is an assumption of the mixed-effects model. This does not seem to induce an increase in Type I errors in the conditions studied, but the procedure still needs a clear justification.

      We appreciate this important warning, and now caution readers to investigate the robustness of their models, and consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the supplement.

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models. The multilevel structure of RSA data introduces potential dependencies across subjects, stimuli, and trials, which can violate assumptions of independence if not properly modeled. In the present study, we used a model that included random intercepts for both subjects and stimuli, which accounts for variance at these levels and improves the generalizability of fixed-effect estimates. Still, there is a potential for systematic dependence across trials within a subject. To ensure that the model assumptions were satisfied, we conducted a series of diagnostic checks on an exemplar ROI (right LOC; middle occipital gyrus) in the Object Perception dataset, including visual inspection of residual distributions and autocorrelation (Appendix 3, Figure 13). These diagnostics supported the assumptions of normality, homoscedasticity, and conditional independence of residuals. In addition, we conducted permutation-based inference, similar to prior improvements to cRSA (Nili et al. 2014), using a nested model comparison to test whether the mean similarity in this ROI was significantly greater than zero. The observed likelihood ratio test statistic fell in the extreme tail of the null distribution (Appendix 3, Figure 14), providing strong nonparametric evidence for the reliability of the observed effect. We emphasize that this type of model checking and permutation testing is not merely confirmatory but can help validate key assumptions in RSA modeling, especially when applying mixed-effects models to neural similarity data. Researchers are encouraged to adopt similar procedures to ensure the robustness and interpretability of their findings”.

      Exemplar Permutation Testing

      To test whether the mean representational strength in the ROI right LOC (middle occipital gyrus) was significantly greater than zero, we used a permutation-based likelihood ratio test implemented via the permlmer function. This test compares two nested linear mixed-effects models fit using the lmer function from the lme4 package, both including random intercepts for Participant and Stimulus ID to account for between-subject and between-item variability.

      The null model excluded a fixed intercept term, effectively constraining the mean similarity to zero after accounting for random effects:

      ROI ~ 0 + (1 | Participant) + (1 | Stimulus)

      The full model included the same random effects structure but allowed the intercept to be freely estimated:

      ROI ~ 1 + (1 | Participant) + (1 | Stimulus)

      By comparing the fit of these two models, we directly tested whether the average similarity in this ROI was significantly different from zero. Permutation testing (1,000 permutations) was used to generate a nonparametric p-value, providing inference without relying on normality assumptions. The full model, which estimated a nonzero mean similarity in the right LOC (middle occipital gyrus), showed a significantly better fit to the data than the null model that fixed the mean at zero (χ²(1) = 17.60, p = 2.72 × 10⁻⁵). The permutation-based p-value obtained from permlmer confirmed this effect as statistically significant (p = 0.0099), indicating that the mean similarity in this ROI was reliably greater than zero. These results support the conclusion that the right LOC contains representational structure consistent with the HMAXc2 RSM. The density of the permuted likelihood ratio statistics is plotted together with the observed statistic in Appendix 3, Figure 14.
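
      To illustrate the general idea of a permutation p-value for "mean similarity greater than zero", the sketch below uses a much simpler stand-in: a one-sample sign-flipping test on a flat list of trial-level values. It is not the permlmer likelihood-ratio procedure described above, it ignores the Participant and Stimulus random intercepts, and it assumes the values are symmetric around zero under the null. All names are hypothetical.

```python
import random


def sign_flip_p_value(values, n_perm=1000, seed=0):
    """One-sided permutation p-value for the hypothesis mean(values) > 0.

    Each permutation flips the sign of every value with probability 0.5 and
    recomputes the mean; the p-value counts permuted means at least as large
    as the observed mean, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = sum(values) / len(values)
    exceed = 0
    for _ in range(n_perm):
        flipped = [v if rng.random() < 0.5 else -v for v in values]
        if sum(flipped) / len(flipped) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Hypothetical trial-level similarity values for one ROI.
similarities = [0.12, 0.08, 0.15, -0.02, 0.10, 0.05, 0.09, 0.11, 0.03, 0.07]
p_value = sign_flip_p_value(similarities)
```

      The +1 in numerator and denominator keeps the p-value strictly positive, so its smallest attainable value is 1 / (n_perm + 1).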

      (8) For the empirical data, it is not clear to me to what degree the "representational strength" of cRSA and tRSA is actually comparable. In cRSA, the Spearman correlation assesses whether the distances in the data RSM are ranked in the same order as in the model. For tRSA, the comparison is made for every row of the RSM, which introduces a larger degree of flexibility (possibly explaining the higher correlations in the first simulation). Thus, could the gains presented in Figure 7D not simply arise from the fact that you are testing different questions? A clearer theoretical analysis of the difference between the average row-wise Spearman correlation and the matrix-wise Spearman correlation is urgently needed. The behavior will likely vary with the structure of the true model RDM/RSM.

      We agree that a comparison between mean row-wise Spearman correlations and the matrix-wise Spearman correlation is needed. We believe that the simulations are the best approach for this comparison, since they are much more robust than the empirical dataset and have the advantage of known true pattern and noise levels. We expand on our comparison of mean tRSA values and matrix-wise Spearman correlations on page 42.

      Page 42. “Although tRSA and cRSA both aim to quantify representational strength, they differ in how they operationalize this concept. cRSA summarizes the correspondence between RSMs as a single measure, such as the matrix-wise Spearman correlation. In contrast, tRSA computes such correspondence for each trial, enabling estimates at the level of individual observations. This flexibility allows trial-level variability to be modeled directly, but also introduces subtle differences in what is being measured. Nonetheless, our simulations showed that, although numerical differences occasionally emerged—particularly when comparing between-condition tRSA estimates to within-condition cRSA estimates—the magnitude of divergence was small and did not affect the outcome of downstream statistical tests”.
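
      The distinction between the two measures can be made concrete with a small sketch. It assumes that the matrix-wise measure correlates the upper triangles of the brain and model RSMs and that the row-wise measure correlates each trial's row with the matching model row, excluding the diagonal; the authors' implementation may differ in details (for example, excluding within-run cells). Function names and the toy matrix are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    return pearson(ranks(x), ranks(y))


def matrix_wise(brain, model):
    """cRSA-style: one correlation over all upper-triangular cells."""
    n = len(brain)
    cells = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return spearman([brain[i][j] for i, j in cells], [model[i][j] for i, j in cells])


def row_wise(brain, model):
    """tRSA-style: one correlation per trial (row), diagonal excluded."""
    n = len(brain)
    out = []
    for i in range(n):
        cols = [j for j in range(n) if j != i]
        out.append(spearman([brain[i][j] for j in cols], [model[i][j] for j in cols]))
    return out


# Toy symmetric model RSM for four trials.
model = [
    [1.0, 0.1, 0.2, 0.3],
    [0.1, 1.0, 0.4, 0.5],
    [0.2, 0.4, 1.0, 0.6],
    [0.3, 0.5, 0.6, 1.0],
]
# A brain RSM that is a monotone transform of the model RSM.
brain = [[v ** 3 for v in row] for row in model]

single_value = matrix_wise(brain, model)
per_trial_values = row_wise(brain, model)
```

      The row-wise measure only requires the rank order to match within each row, whereas the matrix-wise measure requires it across all cells, so the two can diverge when rows differ in overall similarity level.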

      (9) For the real data, there are a number of additional sources of bias that need to be considered for the analysis. What if there are not only condition-specific differences in noise variance, but also a condition-specific pattern? Given that the stimuli were measured in 3 different imaging runs, you cannot assume that all measurement noise is i.i.d. - stimuli from the same run will likely have a higher correlation with each other.

      We recognize the potential of condition-specific patterns and chose to constrain the analyses to those most comparable with cRSA. However, depending on their hypotheses, researchers may consider testing condition RSMs and utilizing a model comparison approach or employ the z-scored approach, as employed in the simulations above. Regarding the potential run confounds, this is always the case in RSA and why we exclude within-run comparisons. We have also added to the Discussion the suggestion to include run as a covariate in their mixed-effects models. However, we do not employ this covariate here as we preferred the most parsimonious model to compare with cRSA.

      Page 46 - 47. “Further, while analyses here were largely employed to be comparable with cRSA, researchers should consider taking advantage of the flexibility of the mixed-effects models and include covariates of no interest (run, trial order, etc.)”.

      (10) The discussion should be rewritten in light of the fact that the setting considered here is very different from the model-comparative RSA in which one usually has multiple measurements per stimulus per subject. In this setting, existing approaches such as RSA or PCM do indeed allow for the full modelling of differences in the "representational strength" - i.e., pattern variance across subjects, conditions, and stimuli.

      We agree that studies advancing designs with multiple repetitions of a given stimulus image are useful in estimating the reliability of concept representations. We would argue however that model comparison in RSA is not restricted to such data. Many extant studies do not in fact have multiple repetitions per stimulus per subject (Wang et al., 2018 https://doi.org/10.1088/1741-2552/abecc3, Gao et al, 2022 https://doi.org/10.1093/cercor/bhac058, Li et al, 2022 https://doi.org/10.1002/hbm.26195, Staples & Graves, 2020 https://doi.org/10.1162/nol_a_00018) that allow for that type of model-comparative approach. While beneficial in terms of noise estimation, having multiple presentations was not a requirement for implementing cRSA (Kriegeskorte, 2008 https://doi.org/10.3389/neuro.06.004.2008). The aim of this manuscript is to introduce the tRSA approach to the broad community of researchers whose research questions and datasets could vary vastly, including but not limited to the number of repeated presentations and the balance of trial counts across conditions.

      (11) Cross-validated distances provide a powerful tool to control for differences in measurement noise variances and possible covariances in measurement noise across trials, which has many distinct advantages and is conceptually very different from the approach taken here.

      We have added language on the value of cross-validation approaches to RSA in the Discussion:

      Page 47. “Additionally, while our proposed tRSA framework provides a flexible and statistically principled approach for modeling trial-level representational strength, we acknowledge that there are alternative methods for addressing trial-level variability in RSA. In particular, the use of cross-validated distance metrics (e.g., crossnobis distance) has become increasingly popular for controlling differences in measurement noise variance and accounting for possible covariance structures across trials (Walther et al., 2016). These metrics offer several advantages, including unbiased estimation of representational dissimilarities under Gaussian noise assumptions and improved generalization to unseen data. However, cross-validated distances are conceptually distinct from the approach taken here: whereas cross-validation aims to correct for noise-related biases in representational dissimilarity matrices, our trial-level RSA method focuses on estimating and modeling the variability in representation strength across individual trials using mixed-effects modeling. Rather than proposing a replacement for cross-validated RSA, tRSA adds a complementary tool to the methodological toolkit, one that supports hypothesis-driven inference about condition effects and trial-level covariates, while leveraging the full structure of the data”.
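
      The noise-related bias that cross-validated distances are meant to correct can be shown numerically. The sketch below is an illustration only: it uses a cross-validated squared Euclidean distance without the spatial prewhitening that the crossnobis distance adds, simulates two stimuli with identical true patterns, and relies on hypothetical function names and parameter values.

```python
import random


def sq_distance(a, b):
    """Ordinary squared Euclidean distance per voxel, within one measurement."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def cv_sq_distance(a1, b1, a2, b2):
    """Cross-validated version: the pattern difference from partition 1 is
    multiplied by the difference from partition 2, so independent noise
    cancels in expectation."""
    return sum((x1 - y1) * (x2 - y2) for x1, y1, x2, y2 in zip(a1, b1, a2, b2)) / len(a1)


def simulate_mean_distances(n_voxels=200, noise_sd=1.0, n_sims=200, seed=0):
    """Average both distances over simulations where the true distance is zero."""
    rng = random.Random(seed)
    true = [rng.gauss(0.0, 1.0) for _ in range(n_voxels)]  # shared by both stimuli

    def measure():
        return [v + rng.gauss(0.0, noise_sd) for v in true]

    plain_total = 0.0
    cv_total = 0.0
    for _ in range(n_sims):
        a1, b1, a2, b2 = measure(), measure(), measure(), measure()
        plain_total += sq_distance(a1, b1)
        cv_total += cv_sq_distance(a1, b1, a2, b2)
    return plain_total / n_sims, cv_total / n_sims


plain_mean, cv_mean = simulate_mean_distances()
```

      In this setup the ordinary distance settles near twice the noise variance even though the true patterns are identical, while the cross-validated distance averages out near zero.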

      (12) One of the main limitations of tRSA is the assumption that the model RDM is actually the true brain RDM, which may not be the case. Thus, in theory, there could be a different model RDM, in which representational strength measures would be very different. These differences should be explained more fully, hopefully leading to a more accessible paper.

      Indeed, the chosen model RSM may not be the true RSM, but as the noise level increases the correlation between RSMs practically becomes zero. In our simulations we assume this to be true as a straightforward way to manipulate the correspondence between the brain data and the model. However, just like cRSA, tRSA is constrained by the model selections the researchers employ. We encourage researchers to have carefully considered theoretically-motivated models and, if their research questions require, consider multiple and potentially competing models. Furthermore, the trial-wise estimates produced by tRSA encourage testing competing models within the multiple regression framework. We have added this language to the Discussion.

      Page 46. ..”choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should consider if their models and model alternatives”.

      Pages 45-46. “While a number of studies have addressed the validity of measuring representational geometry using designs with multiple repetitions, a conceptual benefit of the tRSA approach is the reliance on a regression framework that engenders the testing of competing conceptual models of stimulus representation (e.g., taxonomic vs. encyclopedic semantic features, as in Davis et al., 2021)”.

      Reviewer #2 (Public review):

      (1) While I generally welcome the contribution, I take some issue with the accusatory tone of the manuscript in the Introduction. The text there (using words such as 'ignored variances', 'erroneous inferences', 'one must', 'not well-suited', 'misleading') appears aimed at turning cRSA into a 'straw man' with many limitations that other researchers have not recognized but that the new proposed method supposedly resolves. This can be written in a more nuanced, constructive manner without accusing the numerous users of this popular method of ignorance.

      We apologize for the unintended accusatory tone. We have clarified the many robust approaches to RSA and have made our Introduction and Discussion more nuanced throughout (see also 3, 11 and 16).

      (2) The described limitations are also not entirely correct, in my view: for example, statistical inference in cRSA is not always done using classic parametric statistics such as t-tests (cf Figure 1): the rsatoolbox paper by Nili et al. (2014) outlines non-parametric alternatives based on permutation tests, bootstrapping and sign tests, which are commonly used in the field. Nor has RSA ever been conducted at the row/column level (here referred to by the authors as 'trial level'; cf King et al., 2018).

      We agree there are numerous methods that go beyond cRSA in addressing these limitations and have added discussion of them to our manuscript, as well as an example analysis implementing permutation tests on tRSA data (see response to 7). We thank the reviewer for bringing King et al., 2014 and their temporal generalization method to our attention; we have added a reference to acknowledge their decoding-based temporal generalization approach.

      Page 8. “It is also important to note that some prior work has examined similarly fine-grained representations in time-resolved neuroimaging data, such as the temporal generalization method introduced by King et al. (see King & Dehaene, 2014). Their approach trains classifiers at each time point and tests them across all others, resulting in a temporal generalization matrix that reflects decoding accuracy over time. While such matrices share some structural similarity with RSMs, they do not involve correlating trial-level pattern vectors with model RSMs nor do their second-level models include trial-wise, subject-wise, and item-wise variability simultaneously”.

      (3) One of the advantages of cRSA is its simplicity. Adding linear mixed effects modeling to RSA introduces a host of additional 'analysis parameters' pertaining to the choice of the model setup (random effects, fixed effects, interactions, what error terms to use) - how should future users of tRSA navigate this?

      We appreciate the opportunity to offer more specific prescriptions for those employing a tRSA technique, and have added them to the Discussion:

      Page 46. “While linear mixed-effects modeling offers a powerful framework for analyzing representational similarity data, it is critical that researchers carefully construct and validate their models and choose their model RSMs carefully. In our simulations, we designed our model RSM to be the “true” RSM for demonstration purposes. However, researchers should always consider whether their models match the goals of their analysis, including 1) constructing a random effects structure that will converge in their dataset, 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020), and 3) considering which effects should be treated as random or fixed depending on their research question”.

      (4) Here, only a single real fMRI dataset is used with a quite complicated experimental design for the memory part; it's not clear if there is any benefit of using tRSA on a simpler real dataset. What's the benefit of tRSA in classic RSA datasets (e.g., Kriegeskorte et al., 2008), with fixed stimulus conditions and no behavior?

      To clarify, our empirical approach uses two different tasks: an Object Perception task more akin to the classic RSA datasets employing passive viewing, and a Conceptual Retrieval task that more directly addresses the benefits of the trial-wise approach. We consider our Object Perception dataset a simpler empirical fMRI dataset without explicit task conditions or a dichotomous behavioral outcome, whereas the Retrieval dataset is more involved (though old/new recognition is the most common form of memory retrieval testing) and dependent on behavioral outcomes. However, we recognize the utility of replication by other research groups and invite researchers to apply tRSA to their datasets.

      (5) The cells of an RDM/RSM reflect pairwise comparisons between response patterns (typically a brain but can be any system; cf Sucholutsky et al., 2023). Because the response patterns are repeatedly compared, the cells of this matrix are not independent of one another. Does this raise issues with the validity of the linear mixed effects model? Does it assume the observations are linearly independent?

      We recognize the potential danger of not meeting model assumptions. Though our simulation results and model checks suggest this is not a fatal flaw in the model design, we caution readers to investigate the robustness of their models, and to consider employing permutation testing that does not make independence assumptions. We have also added checks of the model residuals and an example of permutation testing in the Appendix. See response to R1.
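
      As an illustration of the kind of permutation test mentioned here, the following is a minimal sketch and not the example from the authors' Appendix. It assumes a Mantel-style scheme in which stimulus labels of the model RSM are shuffled (rows and columns together), so the dependence among cells that share a stimulus is preserved under the null. The function names and the toy matrices are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def upper_triangle(matrix):
    """Off-diagonal upper-triangle cells of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def permutation_p(brain, model, n_perm=2000, seed=0):
    """One-sided p-value for the brain-model RSM correlation.

    Stimulus labels of the model RSM are permuted jointly for rows and
    columns, so no independence between cells is assumed.
    """
    rng = random.Random(seed)
    n = len(brain)
    brain_vec = upper_triangle(brain)
    observed = pearson(brain_vec, upper_triangle(model))
    exceed = 0
    for _ in range(n_perm):
        order = list(range(n))
        rng.shuffle(order)
        shuffled = [[model[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(brain_vec, upper_triangle(shuffled)) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)


def toy_rsms(n=8, noise_sd=0.1, seed=1):
    """Hypothetical data: a model RSM and a noisy copy standing in for the brain RSM."""
    rng = random.Random(seed)
    model = [[-abs(i - j) for j in range(n)] for i in range(n)]
    brain = [[float(v) for v in row] for row in model]
    for i in range(n):
        for j in range(i + 1, n):
            brain[i][j] = brain[j][i] = model[i][j] + rng.gauss(0.0, noise_sd)
    return brain, model


brain, model = toy_rsms()
r_obs, p_perm = permutation_p(brain, model)
print(f"observed r = {r_obs:.3f}, permutation p = {p_perm:.4f}")
```

      Shuffling whole stimuli rather than individual cells is the design choice that matters here: it keeps each permuted matrix a valid RSM.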

      (6) The manuscript assumes the reader is familiar with technical statistical terms such as Type I/II error, sensitivity, specificity, homoscedasticity assumptions, as well as linear mixed models (fixed effects, random effects, etc). I am concerned that this jargon makes the paper difficult to understand for a broad readership or even researchers currently using cRSA that might be interested in trying tRSA.

      We agree this jargon may make the paper difficult to understand. We have expanded or added definitions of these terms throughout the Methods and Results sections.

      Page 12. “Given data generated with 𝑠<sub>𝑐𝑜𝑛𝑑,𝐴</sub> = 𝑠<sub>𝑐𝑜𝑛𝑑,B</sub>, the correct inference should be a failure to reject the null hypothesis of b<sub>B-𝐴</sub> = 0; any significant (𝑝 < 0.05) result in either direction was considered a false positive (spurious effect, or Type I error). Given data generated with 𝑠<sub>𝑐𝑜𝑛𝑑,𝐴</sub> ≠ 𝑠<sub>𝑐𝑜𝑛𝑑,B</sub>, the inference was considered correct if it rejected the null hypothesis of b<sub>B-𝐴</sub> = 0 and yielded the expected sign of the estimated contrast (b<sub>B-𝐴</sub> < 0). A significant result with the reverse sign of the estimated contrast (b<sub>B-𝐴</sub> > 0) was considered a Type I error, and a nonsignificant (𝑝 ≥ 0.05) result was considered a false negative (failure to detect a true effect, or Type II error)”.

      Page 2. “Compared to cRSA, the multi-level framework of tRSA was both more theoretically appropriate and significantly more sensitive to (better able to detect) true effects”.

      Page 25. “The performance of cRSA and tRSA was quantified by their specificity (avoidance of false positives; 1 - Type I error rate) and sensitivity (avoidance of false negatives; 1 - Type II error rate)”.
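
      The scoring rules quoted above can be sketched in a few lines. This is an illustrative reading, not the authors' simulation code: it assumes specificity is computed over null simulations and sensitivity over simulations with a true negative contrast, and all names and the toy results are hypothetical.

```python
def classify(p_value, estimate, true_sign, alpha=0.05):
    """Label one simulated test as 'correct', 'type_I', or 'type_II'.

    true_sign is 0 when the data were generated under the null, and -1 or +1
    when a true effect of that sign was simulated (assumed coding).
    """
    significant = p_value < alpha
    if true_sign == 0:
        # Under the null, any significant result is a false positive.
        return "type_I" if significant else "correct"
    if not significant:
        # A true effect was present but went undetected.
        return "type_II"
    # Significant results count as correct only with the expected sign;
    # a significant reverse-sign estimate is scored as a Type I error.
    return "correct" if estimate * true_sign > 0 else "type_I"


def specificity(labels):
    """1 - Type I error rate over a list of outcome labels."""
    return 1 - labels.count("type_I") / len(labels)


def sensitivity(labels):
    """1 - Type II error rate over a list of outcome labels."""
    return 1 - labels.count("type_II") / len(labels)


# Hypothetical (p-value, estimate) pairs, for illustration only.
null_runs = [(0.40, 0.02), (0.03, -0.10), (0.75, -0.01), (0.20, 0.05)]
effect_runs = [(0.01, -0.30), (0.30, -0.05), (0.02, -0.25), (0.04, 0.20)]

null_labels = [classify(p, b, true_sign=0) for p, b in null_runs]
effect_labels = [classify(p, b, true_sign=-1) for p, b in effect_runs]
print("specificity:", specificity(null_labels))
print("sensitivity:", sensitivity(effect_labels))
```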

      Page 6. “One of the fundamental assumptions of general linear models (step 4 of cRSA; see Figure 1D) is homoscedasticity or homogeneity of variance — that is, all residuals should have equal variance”.

      Page 11. “Specifically, a linear mixed-effects model with a fixed effect of condition (which estimates the average effect across the entire sample, capturing the overall effect of interest) and random effects of both subjects and stimuli (which model variation in responses due to differences between individual subjects and items, allowing generalization beyond the sample) was fitted to tRSA estimates via the `lme4 1.1-35.3` package in R (Bates et al., 2015), and p-values were estimated using Satterthwaite’s method via the `lmerTest 3.1-3` package (Kuznetsova et al., 2017)”.
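
      For readers less familiar with mixed models, the structure described in this passage can be written out as below. This is a sketch that assumes random intercepts only for subjects and stimuli; the quoted text does not say whether random slopes were included, and the symbols are introduced here for illustration. Here y is the tRSA estimate for subject i and stimulus j. In lme4 formula syntax, a model of this form would read `y ~ condition + (1 | subject) + (1 | stimulus)`, with hypothetical column names.

```latex
\[
y_{ij} = \beta_0 + \beta_1\,\mathrm{cond}_{ij} + u_i + w_j + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \tau_u^2),\quad
w_j \sim \mathcal{N}(0, \tau_w^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\]
```

      The fixed effect is the coefficient on condition, while the subject and stimulus intercepts absorb subject-level and stimulus-level variability.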

      (7) I could not find any statement on data availability or code availability. Given that the manuscript reuses prior data and proposes a new method, making data and code/tutorials openly available would greatly enhance the potential impact and utility for the community.

      We thank the reviewer for pointing out our oversight here. We have added our code and data availability statements.

      Page 9. “Data are available upon request to the corresponding author, and our simulations and example tRSA code are available at https://github.com/electricdinolab”.

      Reviewer #1 (Recommendations for the authors):

      (13) Page 4: The limitations of cRSA seem to be based on the assumption that within each different experimental condition, there are different stimuli, which get combined into the condition. The framework of RSA, however, does not dictate whether you calculate a condition x condition RDM or a larger and more complete stimulus x stimulus RDM. Indeed, in practice we often do the latter? Or are you assuming that each stimulus is only shown once overall? It would be useful at this point to spell out these implicit assumptions.

      We agree that stimulus x stimulus RDMs can be constructed and are often used. However, as we mentioned in the Introduction, researchers are often interested in the difference between two (or more) conditions, such as “remembered” vs. “forgotten” (Davis et al., https://doi.org/10.1093/cercor/bhaa269) or “high cognitive load” vs. “low cognitive load” (Beynel et al., https://doi.org/10.1523/JNEUROSCI.0531-20.2020). In those cases, the most common practice with cRSA is to construct condition-specific RDMs, compute cRSA scores separately for each condition, and then compare the scores at the group level. The number of times each stimulus gets presented does not prevent one from creating a model RDM that has the same rows and columns as the brain RDM, either in the same condition (“high load”) or across different conditions.

      (14) Page 5: The difference between condition-level and stimulus-level is not clear. Indeed, this definition seems to be a function of the exact experimental design and is certainly up for interpretation. For example, if I conduct a study looking at the activity patterns for 4 different hand actions, each repeated multiple times, are these actions considered stimuli or conditions?

      We have added clarifying language about what is considered stimuli vs. conditions. Indeed, this will depend on the specific research questions being asked and will affect how researchers construct their models. In this specific example, one would most likely consider each different hand action a condition, treating them as fixed effects rather than random effects, given their very limited number and the lack of need to generalize findings to the broader “hand actions” category.

      Page 5. “Critically, the distinction between condition-level and stimulus-level is not always clear as researchers may manipulate stimulus-level features themselves. In these cases, what researchers ultimately consider condition-level and stimulus-level will depend on their specific research questions. For example, researchers intending to study generalized object representation may consider object category a stimulus-level feature, while researchers interested in if/how object representation varies by category may consider the same category variable condition-level”.

      (15) Page 5: The fact that different numbers of trials / different levels of measurement noise / noise-covariance of different conditions biases non-cross-validated distances is well known and repeatedly expressed in the literature. We have shown that cross-validation of distances effectively removes such biases - of course, it does not remove the increased estimation variability of these distances (for a formal analysis of estimation noise on condition patterns and variance of the crossnobis estimator, see (Diedrichsen et al. 2021)).

      We thank the reviewer for drawing our attention to this literature and have added discussions of these methods.

      (16) Page 5: "Most studies present subjects with a fixed set of stimuli, which are supposedly samples representative of some broader category". This may be the case for a certain type of RSA experiments in the visual domain, but it would be unfair to say that this is a feature of RSA studies in general. In most studies I have been involved in, we use a "stimulus" x "stimulus" RDM.

      We have edited this sentence to avoid the “most” characterization. We also added substantial text to the Introduction and Discussion distinguishing cRSA (which is nonetheless widely employed, especially in cases with a single repetition per stimulus; Macklin et al., 2023; Liu et al., 2024) from the model-comparative method, and explicitly stating that we do not consider tRSA an alternative to the model-comparative approach.

      (17) Page 5: I agree that "stimuli" should ideally be considered a random effect if "stimuli" can be thought of as sampled from a larger population and one wants to make inferences about that larger population. Sometimes stimuli/conditions are more appropriately considered a fixed effect (for example, when studying the response to stimulation of the 5 fingers of the right hand). Techniques to consider stimuli/conditions as a random effect have been published by the group of Niko Kriegeskorte (Schütt et al. 2023).

      Indeed, in some cases what may be thought of as “stimuli” would be more appropriately entered into the model as a fixed effect; such questions are increasingly relevant given the focus on item-wise stimulus properties (Bainbridge et al., Westfall & Yarkoni). We have added text on this issue to the Discussion and caution researchers to employ models that most directly answer their research questions.

      Page 46. “However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset and 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020) and 3) considering which effects should be considered random or fixed depending on their research question. An effect is fixed when the levels represent the specific conditions of theoretical interest (e.g., task condition) and the goal is to estimate and interpret those differences directly. In contrast, an effect is random when the levels are sampled from a broader population (e.g., subjects) and the goal is to account for their variability while generalizing beyond the sample tested. Note that the same variable (e.g., stimuli) may be considered fixed or random depending on the research questions”.

      (18) Page 6: It is correct that the "classical" RSA depends on a categorical assignment of different trials to different stimuli/conditions, such that a stimulus x stimulus RDM can be computed. However, both Pattern Component Modelling (PCM) and Encoding models are ideally set up to deal with variables that vary continuously on a trial-by-trial or moment-by-moment basis. tRSA should be compared to these approaches, or - as it should be clarified - that the problem setting is actually quite a different one.

      We agree that PCM and encoding models offer a flexible approach and handle continuous trial-by-trial variables. We have clarified on page 6 that the problem setting of cRSA is distinct, and we have added a discussion of the robustness and limitations of encoding models to the Discussion.

      Page 6. “While other approaches such as Pattern Component Modeling (PCM) (Diedrichsen et al., 2018) and encoding models (Naselaris et al., 2011) are well-suited to analyzing variables that vary continuously on a trial-by-trial or moment-by-moment basis, these frameworks address different inferential goals. Specifically, PCM and encoding models focus on estimating variance components or predicting activation from features, while cRSA is designed to evaluate representational geometry. Thus, cRSA as well as our proposed approach address a problem setting distinct from PCM and encoding models”.

      (19) Page 8: "Then, we generated two noise patterns, which were controlled by parameters 𝜎 𝐴 and 𝜎𝐵, respectively, one for each condition." This makes little sense to me. The noise patterns should be unique to each trial - you should generate n_a + n_b noise patterns, no?

      We clarify that the “noise patterns” here are n_voxel x n_trial in size; in other words, all trial-level noise patterns are generated together and each trial has its own unique noise pattern. We have revised our description to “two sets of noise patterns” for clarity, starting on page 9.
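
      To make the shape of these “sets of noise patterns” concrete, here is a minimal sketch, assuming independent Gaussian noise with one standard deviation per condition. It is not the authors' simulation code, and the voxel counts, trial counts, and standard deviations are hypothetical placeholders.

```python
import random


def noise_patterns(n_voxel, n_trial, sd, seed=None):
    """Return an n_voxel x n_trial matrix of Gaussian noise.

    Each column is the noise pattern for one trial, so every trial gets its
    own unique pattern even though the whole set is generated in one call.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sd) for _ in range(n_trial)] for _ in range(n_voxel)]


def trial_pattern(noise, trial):
    """Extract the noise pattern (one value per voxel) for a single trial."""
    return [row[trial] for row in noise]


# Hypothetical sizes and noise levels for conditions A and B.
noise_a = noise_patterns(n_voxel=50, n_trial=10, sd=1.0, seed=1)
noise_b = noise_patterns(n_voxel=50, n_trial=12, sd=2.0, seed=2)

print(len(noise_a), "voxels x", len(noise_a[0]), "trials in condition A")
print(len(noise_b), "voxels x", len(noise_b[0]), "trials in condition B")
```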

      (20) Page 9: First, I assume if this is supposed to be a hierarchical level model, the "noise parameters" here correspond to variances? Or do these \sigma values mean to signify standard deviations? The latter would make little sense. Or is it the noise pattern itself?

      As clarified in 4., the σ values are meant to denote hierarchical components of the composite standard deviation; we have updated our notation to use lower case letter s instead for clarity.

      (21) Page 10: your formula states "𝜎<sub>𝑠𝑢𝑏𝑗</sub>~ 𝙽(0, 0.5^2)". This conflicts with your previous mention that the \sigmas are noise "levels". Are they the noise patterns themselves now? Variances cannot be normally distributed, as they cannot be negative.

      As clarified in 4., the σ values are meant to denote hierarchical components of the composite standard deviation; we have updated our notation to use lower case letter s instead for clarity.

      (22) Page 13: What was the task of the subject in the Memory retrieval task? Old/new judgements relative to encoding of object perception?

      We apologize for the lack of clarity about the Memory Retrieval task and have added that information and clarified that the old/new judgements were relative to a separate encoding phase, the brain data for which has been reported elsewhere.

      Page 14. “Memory Retrieval took place one day after Memory Encoding and involved testing participants’ memory of the objects seen in the Encoding phase. Neural data during the Encoding phase has been reported elsewhere. In the main Memory Retrieval task, participants were presented with 144 labels of real-world objects, of which 114 were labels for previously seen objects and 30 were unrelated novel distractors. Participants made old/new judgements and rated their confidence in those judgements on a four-point scale (1 = Definitely New, 2 = Probably New, 3 = Probably Old, 4 = Definitely Old)”.

      (23) Page 13: If "Memory Retrieval consisted of three scanning runs", then some of the stimulus x stimulus correlations for the RSM must have been calculated within a run and some between runs, correct? Given that all within-run estimates share a common baseline, they share some dependence. Was there a systematic difference between the within-run and the between-run correlations?

      We have clarified in this portion of the Methods that within-run comparisons were excluded from our analyses. We also double-checked that the within-run exclusion was included in the description of the Neural RSMs.

      Page 14. “Retrieval consisted of three scanning runs, each with 38 trials, lasting approximately 9 minutes and 12 seconds (within-run comparisons were later excluded from RSA analyses)”.

      Page 18. “This was done by vectorizing the voxel-level activation values within each region and calculating their correlations using Pearson’s r, excluding all within-run comparisons.”
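
      The procedure in the two quoted passages (pairwise Pearson correlations between trial patterns, with within-run pairs dropped) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the authors' pipeline: excluded cells are marked with `None`, and the patterns and run labels are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def neural_rsm(patterns, runs):
    """Trial x trial similarity matrix that keeps only between-run cells.

    patterns: one vectorized voxel pattern per trial.
    runs:     the scanning run each trial belongs to.
    Within-run pairs (including the diagonal) are left as None.
    """
    n = len(patterns)
    rsm = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if runs[i] != runs[j]:
                rsm[i][j] = pearson(patterns[i], patterns[j])
    return rsm


# Hypothetical data: four trials, three voxels, two runs.
patterns = [[1.0, 2.0, 3.0], [2.0, 1.0, 0.5], [2.0, 4.0, 6.0], [0.3, 0.9, 0.1]]
runs = [1, 1, 2, 2]
for row in neural_rsm(patterns, runs):
    print(["  None" if v is None else f"{v:6.2f}" for v in row])
```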

      (24) Page 20: It is not clear why the mean estimate of "representational strength" (i.e., model-brain RSM correlations) is important at all. This comes back to Major point #2, namely that you are trying to solve a very different problem from model-comparative RSA.

      We have clarified that our approach is not an alternative to model-comparative RSA, and that depending on the task constraints researchers may choose to compare models with tRSA or other approaches requiring stimulus repetition (see 3).

      (25) Page 21: I believe the problems of simulating correlation matrices directly in the way that the authors in their first simulation did should be well known and should be moved to an appendix at best. Better yet, the authors could start with the correct simulation right away.

      We agree the paper is more concise with these simulations moved to the appendix and discussed more briefly. We have implemented these changes (Appendix 1). However, we are not certain that this problem is widely known: we have several anecdotes of researchers inquiring about this “alternative” approach in talks with colleagues. We therefore still discuss the issues with this method.

      (26) Page 26: Is the "underlying continuous noise variable 𝜎𝑡𝑟𝑖𝑎𝑙 that was measured by 𝑣𝑚𝑒𝑎𝑠𝑢𝑟𝑒𝑑 " the variance of the noise pattern or the noise pattern itself? What does it mean it was "measured" - how?

      𝜎<sub>𝑡𝑟𝑖𝑎𝑙</sub> is a vector of standard deviations, one per trial; its i-th element, 𝜎<sub>𝑡𝑟𝑖𝑎𝑙,i</sub>, is used to generate the noise pattern for trial i. 𝑣<sub>𝑚𝑒𝑎𝑠𝑢𝑟𝑒𝑑</sub> is a hypothetical measurement of trial-level variability, such as “memorability” or “heartbeat variability”. We have revised our description to clarify our methods.

      Reviewer #2 (Recommendations for the authors):

      (8) It would be helpful to provide more clarity earlier on in the manuscript on what is a 'trial': in my experience, a row or column of the RDM is usually referred to as 'stimulus condition', which is typically estimated on multiple trials (instances or repeats) of that stimulus condition (or exemplars from that stimulus class) being presented to the subject. Here, a 'trial' is both one measurement (i.e., single, individual presentation of a stimulus) and also an entry in the RDM, but is this the most typical scenario for cRSA? There is a section in the Discussion that discusses repetitions, but I would welcome more clarity on this from the get-go.

      We have added discussion of stimulus repetition methods and datasets to the Introduction and clarified our use of the terms.

      Page 8. “Critically, in single-presentation designs, a “trial” refers to one stimulus presentation, and corresponds to a row or column in the RSM. In studies with repeated stimuli, these rows are often called “conditions” and may reflect aggregated patterns across trials. tRSA is compatible with both cases: whether rows represent individual trials or averaged trials that create “conditions”, tRSA estimates are computed at the row level”.
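
      One way to read “computed at the row level” is sketched below: for each row (trial or condition), the neural RSM row is correlated with the matching model RSM row, skipping the diagonal and any excluded cells. This is an assumption made for illustration; the manuscript's exact estimator (for example the correlation type or any transformation) may differ, and the toy matrices are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def rowwise_estimates(neural, model):
    """One brain-model correlation per RSM row.

    The diagonal and cells marked None in the neural RSM are skipped, so each
    estimate reflects how well one row's similarity profile matches the model.
    """
    estimates = []
    for i, (n_row, m_row) in enumerate(zip(neural, model)):
        pairs = [(a, b) for j, (a, b) in enumerate(zip(n_row, m_row))
                 if j != i and a is not None]
        estimates.append(pearson([a for a, _ in pairs], [b for _, b in pairs]))
    return estimates


# Hypothetical 4 x 4 RSMs with two categories of two items each.
neural = [[1.0, 0.8, 0.2, 0.1],
          [0.8, 1.0, 0.3, 0.2],
          [0.2, 0.3, 1.0, 0.7],
          [0.1, 0.2, 0.7, 1.0]]
model = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
print([round(r, 3) for r in rowwise_estimates(neural, model)])
```

      Because the output has one value per row, these estimates can be carried forward as trial-level observations, which is what allows subject-, stimulus-, and trial-level predictors to enter a later model.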

      (9) The quality of the results figures can be improved. For example, axes labels are hard to read in Figure 3A/B, panels 3C/D are hard to read in general. In Figure 7E, it's not possible to identify the 'dark red' brain regions in addition to the light red ones.

      We thank the reviewer for raising these and have edited the figures to be more readable in the manner suggested.

      (10) I would be interested to see a comparison between tRSA and cRSA in other fMRI (or other modality) datasets that have been extensively reported in the literature. These could be the original Kriegeskorte 96 stimulus monkey/fMRI datasets, commonly used open datasets in visual perception (e.g., THINGS, NSD), or the above-mentioned King et al. dataset, which has been analyzed in various papers.

      We recognize the great utility of replication from other research groups and do invite researchers to utilize tRSA on their datasets.

      (11) On P39, the authors suggest 'researchers can confidently replace their existing cRSA analysis with tRSA': Please discuss/comment on how researchers should navigate the choice of modeling parameters in tRSA's linear mixed effects setting.

      We have added discussion of the mixed-effects parameters and encourage researchers to follow best practices for their model selection.

      Page 46. “However, researchers should always consider if their models match the goals of their analysis, including 1) constructing the random effects structure that will converge in their dataset and 2) testing their model fits against alternative structures (Meteyard & Davies, 2020; Park et al., 2020) and 3) considering which effects should be considered random or fixed depending on their research question”.

      (12) The final part of the Results section, demonstrating the tRSA results for the continuous memorability factor in the real fMRI data, could benefit from some substantiation/elaboration. It wasn't clear to me, for example, to what extent the observed significant association between representational strength and item memorability in this dataset is to be 'believed'; the Discussion section (p38). Was there any evidence in the original paper for this association? Or do we just assume this is likely true in the brain, based on prior literature by e.g. Bainbridge et al (who probably did not use tRSA but rather classic methods)?

      Indeed, memorability effects have been replicated in the literature, but not using the tRSA method. We have expanded our discussion to clarify the relationship of our findings and the relevant literature and methods it has employed.

      Page 38. “Critically, memorability is a robust stimulus property that is consistent across participants and paradigms (Bainbridge, 2022). Moreover, object memorability effects have been replicated using a variety of methods aside from tRSA, including univariate analyses and representational analyses of neural activity patterns where trial-level neural activity pattern estimates are correlated directly with object memorability (Slayton et al, 2025).”

      (13) The abstract could benefit from more nuance; I'm not sure if RSA can indeed be said to be 'the principal method', and whether it's about assessing 'quality' of representations (more commonly, the term 'geometry' or 'structure' is used).

      We have edited the abstract to reflect the true nuance in the current approaches.

      Abstract. Neural representation refers to the brain activity that stands in for one’s cognitive experience, and in cognitive neuroscience, a prominent method of studying neural representations is representational similarity analysis (RSA). While there are several recent advances in RSA, the classic RSA (cRSA) approach examines the structure of representations across numerous items by assessing the correspondence between two representational similarity matrices (RSMs): usually one based on a theoretical model of stimulus similarity and the other based on similarity in measured neural data.

      (14) RSA is also not necessarily about models vs. neural data; it can also be between two neural systems (e.g., monkey vs. human as in Kriegeskorte et al., 2008) or model systems (see Sucholutsky et al., 2023). This statement is also repeated in the Introduction paragraph 1 (later on, it is correctly stated that comparing brain vs. model is most likely the 'most common' approach).

      We have added these examples in our introduction to RSA.

      Page 3. “One of the central approaches for evaluating information represented in the brain is representational similarity analysis (RSA), an analytical approach that queries the representational geometry of the brain in terms of its alignment with the representational geometry of some cognitive model (Kriegeskorte et al., 2008; Kriegeskorte & Kievit, 2013), or, in some cases, compares the representational geometry of two neural systems (e.g., Kriegeskorte et al., 2008) or two model systems (Sucholutsky et al., 2023)”.

      (15) 'theoretically appropriate' is an ambiguous statement, appropriate for what theory?

      We apologize for the ambiguous wording, and have corrected the text:

      Page 11. “Critically, tRSA estimates were submitted to a mixed-effects model which is statistically appropriate for modeling the hierarchical structure of the data, where observations are nested within both subjects and stimuli (Baayen et al., 2008; Chen et al., 2021)”.

      (16) I found the statement that cRSA "cannot model representation at the level of individual trials" confusing, as it made me think, what prohibits one from creating an RDM based on single-trial responses? Later on, I understood that what the authors are trying to say here (I think) is that cRSA cannot weigh the contributions of individual rows/columns to the overall representational strength differently.

      We thank the reviewer for their clarifying language and have added it to this section of the manuscript.

      Abstract. “However, because cRSA cannot weigh the contributions of individual trials (RSM rows/columns), it is fundamentally limited in its ability to assess subject-, stimulus-, and trial-level variances that all influence representation”.

      (17) Why use "RSM" instead of "RDM"? If the pairwise comparison metric is distance-based (e..g, 1-correlation as described by the authors), RDM is more appropriate.

      We apologize for the error, and have clarified the Methods text:

      Page 3-4. “First, brain activity responses to a series of N trials are compared against each other (typically using Pearson’s r) to form an N×N representational similarity matrix”.

      (18) Figure 2: please write 'Correlation estimate' in the y-axis label rather than 'Estimate'.

      We have edited the label in Figure 2.

      (19) Page 6 'leaving uncertain the directionality of any findings' - I do not follow this argument. Obviously one can generate an RDM or RSM from vector v or vector -v. How does that invalidate drawing conclusions where one e.g., partials out the (dis)similarity in e.g., pleasantness ratings out of another RDM/RSM of interest?

      We agree such an approach does not invalidate the partial method; we have clarified what we mean by “directionality”.

      Page 8. “For instance, even though a univariate random variable, such as pleasantness ratings, can be conveniently converted to an RSM using pairwise distance metrics (Weaverdyck et al., 2020), the very same RSM would also be derived from the opposite (sign-flipped) random variable, leaving uncertain the directionality of any findings with the RSM (i.e., whether representation is strongest for pleasant or unpleasant items; see also Bainbridge & Rissman, 2018)”.
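
      The directionality point can be checked directly: a matrix built from pairwise absolute differences is unchanged when the underlying variable is sign-flipped. The sketch below assumes absolute difference as the pairwise distance metric, and the ratings are hypothetical.

```python
def distance_matrix(values):
    """Pairwise absolute differences between all items of a univariate variable."""
    return [[abs(a - b) for b in values] for a in values]


# Hypothetical pleasantness ratings and their sign-flipped counterpart.
ratings = [1.0, 2.5, 4.0, 3.0]
flipped = [-r for r in ratings]

same = distance_matrix(ratings) == distance_matrix(flipped)
print("identical matrices:", same)
```

      Since both variables yield the same matrix, a model fit to that matrix cannot by itself say whether high or low values of the variable drive the effect.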

      (20) P7 'sampled 19900 pairs of values from a bi-variate normal distribution', but the rows/columns in an RDM are not independent samples - shouldn't this be included in the simulation? I.e., shouldn't you simulate first the n=200 vectors, and then draw samples from those, as in the next analysis?

      This section has been moved to Appendix 1 (see responses to Reviewer 1.13).

      (21) Under data acquisition, please state explicitly that the paper is re-using data from prior experiments, rather than collecting data anew for validating tRSA.

      We have clarified this in the data acquisition section.

      Page 13. “A pre-existing dataset was analyzed to evaluate tRSA. Main study findings have been reported elsewhere (S. Huang, Bogdan, et al., 2024)”.

      (22) Figure 4 could benefit from some more explanation in-text. It wasn't clear to me, for example, how to interpret the asterisks depicted in the right part of the figure.

      We clarified the meaning of the asterisks in the main text in addition to the existent text in the figure caption.

      Page 26. “(see Figure 4, off-diagonal cells in blue; asterisks indicate where tRSA was statistically more sensitive than cRSA)”.

      (23) Page 38 "the outcome of tRSA's improved characterization can be seen in multiple empirical outcomes:" it seems there is one mention of 'outcomes' too many here.

      We have revised this sentence.

      Page 41. “tRSA's improved characterization can be seen in multiple empirical outcomes”.

      (24) Page 38 "model fits became the strongest" it's not clear what aspect of the reported results in the paragraph before this is referring to - the Appendix?

      Yes, the model fits are in the Appendix; we have added this in-text citation.

      Moreover, model-fits became the strongest when the models also incorporated trial-level variables such as fMRI run and reaction time (Appendix 3, Table 6).

      References

      Diedrichsen, J., Berlot, E., Mur, M., Schütt, H. H., Shahbazi, M., & Kriegeskorte, N. (2021). Comparing representational geometries using whitened unbiased-distance-matrix similarity. Neurons, Behavior, Data and Theory, 5(3). https://arxiv.org/abs/2007.02789

      Diedrichsen, J., & Kriegeskorte, N. (2017). Representational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis. PLoS Computational Biology, 13(4), e1005508.

      Diedrichsen, J., Yokoi, A., & Arbuckle, S. A. (2018). Pattern component modeling: A flexible approach for understanding the representational structure of brain activity patterns. NeuroImage, 180, 119-133.

      Naselaris, T., Kay, K. N., Nishimoto, S., & Gallant, J. L. (2011). Encoding and decoding in fMRI. NeuroImage, 56(2), 400-410.

      Nili, H., Wingfield, C., Walther, A., Su, L., Marslen-Wilson, W., & Kriegeskorte, N. (2014). A toolbox for representational similarity analysis. PLoS Computational Biology, 10(4), e1003553.

      Schütt, H. H., Kipnis, A. D., Diedrichsen, J., & Kriegeskorte, N. (2023). Statistical inference on representational geometries. eLife, 12. https://doi.org/10.7554/eLife.82566

      Walther, A., Nili, H., Ejaz, N., Alink, A., Kriegeskorte, N., & Diedrichsen, J. (2016). Reliability of dissimilarity measures for multi-voxel pattern analysis. NeuroImage, 137, 188-200.

      King, M. L., Groen, I. I., Steel, A., Kravitz, D. J., & Baker, C. I. (2019). Similarity judgments and cortical visual responses reflect different properties of object and scene categories in naturalistic images. NeuroImage, 197, 368-382.

      Kriegeskorte, N., Mur, M., Ruff, D. A., Kiani, R., Bodurka, J., Esteky, H., ... & Bandettini, P. A. (2008). Matching categorical object representations in inferior temporal cortex of man and monkey. Neuron, 60(6), 1126-1141.

      Sucholutsky, I., Muttenthaler, L., Weller, A., Peng, A., Bobu, A., Kim, B., ... & Griffiths, T. L. (2023). Getting aligned on representational alignment. arXiv preprint arXiv:2310.13018.

    1. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Farber and colleagues have performed single cell RNAseq analysis on bone marrow-derived stem cells from DO mice. By performing network analysis, they look for driver genes that are associated with bone mineral density GWAS associations. They identify two genes as potential candidates to showcase the utility of this approach.

      Strengths:

      The study is very thorough and the approach is innovative and exciting. The manuscript contains some interesting data relating to how cell differentiation is occurring and the effects of genetics on this process. The section looking for genes with eQTLs that differ across the differentiation trajectory (Figure 4) was particularly exciting.

      Weaknesses:

      The manuscript is, in parts, hard to read due to the use of acronyms and there are some questions about data analysis that still need to be addressed.

      Comments on revisions:

      Dillard et al have made several improvements to their manuscript.

      (1) We previously asked the authors to determine whether any cell types were enriched for BMD-related traits since the premise of the paper is that 'many genes impacting BMD do so by influencing osteogenic differentiation or ... adipogenic differentiation'. Given the potential for the cell culture method to skew the cell type distribution non-physiologically, it is important to establish which cell types in their assay are most closely associated with BMD traits. The new CELLECT analysis and Figure 1E address this point nicely. However, it would still be nice to see the correlations between these cell types and BMD traits in the mice as this would provide independent evidence to support their physiological importance more broadly.

      (2) Shortening the introduction.

      (3) Addressing limitations that arise from not accounting for founder genome SNPs when aligning scRNA-seq data.

      (4) The main take-away of this paper is, to us, the development of a single cell approach to studying BMD-related traits. It is encouraging that the cells post-culture appear to be representative of those pre-culture (supplemental figure 3).

      However, the authors seem to have neglected several comments made by both reviewers. While we share the authors' enthusiasm for the single cell analytical approach, we do not understand their reluctance to perform further statistical tests. We feel that the following comments have still not been addressed:

      (1) The manuscript still contains the following:

      "To provide further support that tradeSeq-identified genes are involved in differentiation, we performed a cell type-specific expression quantitative trait locus (eQTL) analysis for each mesenchymal cell type from the 80 DO mice. We identified 563 genes (eGenes) regulated by a significant cis-eQTL in specific cell types of the BMSC-OB scRNA-seq data (Supplementary Table S14). In total, 73 eGenes were also tradeSeq-identified genes in one or more cell type boundaries along their respective trajectories (Supplementary Table S9)."

      The purpose of this paragraph is to convince readers that the eGenes approach aligns with the tradeSeq approach (and that their approach can therefore be trusted). It is essential that such claims are supported by statistical reasoning. Given that it would be very simple to perform permutation/enrichment analyses to address this point, and both reviewers requested similar analyses, we do not understand the authors' reluctance here. Otherwise, this section should be rewritten so that it does not imply that the identification of these genes provides support for their approach.

      (2) Given that a central purpose of this manuscript is to establish a systematic workflow for identifying candidate genes, the manuscript could still benefit from more explanation as to why the authors chose to highlight Tpx2 and Fgfrl1. Tpx2 does already have a role in bone physiology through the IMPC. The authors should comment on why they did not explore Kremen1, for instance, as this gene seems important for the transition to both OB1 and 2.

      A final minor comment is that it would be very helpful if the authors could indicate if the DDGs in Table 1 are also eGenes for the relevant cell type. This is much more meaningful than looking through GTEx.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review): 

      In this manuscript, Dillard and colleagues integrate cross-species genomic data with a systems approach to identify potential driver genes underlying human GWAS loci and establish the cell type(s) within which these genes act and potentially drive disease. Specifically, they utilize a large single-cell RNA-seq (scRNA-seq) dataset from an osteogenic cell culture model - bone marrow-derived stromal cells cultured under osteogenic conditions (BMSC-OBs) - from a genetically diverse outbred mouse population called the Diversity Outbred (DO) stock to discover network driver genes that likely underlie human bone mineral density (BMD) GWAS loci. The DO mice segregate over 40M single nucleotide variants, many of which affect gene expression levels, therefore making this an ideal population for systems genetic and co-expression analyses. The current study builds on previously published work from the same group that used co-expression analysis to identify co-expressed "modules" of genes that were enriched for BMD GWAS associations. In this study, the authors utilize a much larger scRNA-seq dataset from 80 DO BMSC-OBs, infer co-expression-based and Bayesian networks for each identified mesenchymal cell type, focused on networks with dynamic expression trajectories that are most likely driving differentiation of BMSC-OBs, and then prioritized genes ("differentiation driver genes" or DDGs) in these osteogenic differentiation networks that had known expression or splicing QTLs (eQTL/sQTLs) in any GTEx tissue that colocalized with human BMD GWAS loci. The systems analysis is impressive, the experimental methods are described in detail, and the experiments appear to be carefully done. The computational analysis of the single-cell data is comprehensive and thorough, and the evidence presented in support of the identified DDGs, including Tpx2 and Fgfrl1, is for the most part convincing. 
      Some limitations in the data resources and methods hamper enthusiasm somewhat and are discussed below. Overall, while this study will no doubt be valuable to the BMD community, the cross-species data integration and analytical framework may be more valuable and generally applicable to the study of other diseases, especially for diseases with robust human GWAS data but for which robust human genomic data in relevant cell types is lacking. 

      Specific strengths of the study include the large scRNA-seq dataset on BMSC-OBs from 80 DO mice, the clustering analysis to identify specific cell types and sub-types, the comparison of cell type frequencies across the DO mice, and the CELLECT analysis to prioritize cell clusters that are enriched for BMD heritability (Figure 1). The network analysis pipeline outlined in Figure 2 is also a strength, as is the pseudotime trajectory analysis (results in Figure 3). One weakness involves the focus on genes that were previously identified as having an eQTL or sQTL in any GTEx tissue. The authors rightly point out that the GTEx database does not contain data for bone tissue, but they reason that eQTLs can be shared across many tissues. This assumption is valid for many cis-eQTLs, but it could also exclude many genes as potential DDGs with effects that are specific to bone/osteoblasts. Indeed, the authors show that important BMD driver genes have cell-type-specific eQTLs. Furthermore, the mesenchymal cell type-specific co-expression analysis by iterative WGCNA identified an average of 76 co-expression modules per cell cluster (range 26-153). Based on the limited number of genes that are detected as expressed in a given cell due to sparse per-cell read depth (400-6200 reads/cell) and dropouts, it's hard to believe that as many as 153 co-expression modules could be distinguished within any cell cluster. I would suspect some degree of model overfitting here and would expect that many/most of these identified modules have very few gene members, but the methods list a minimum module size of 20 genes. How do the numbers of modules identified in this study compare to other published scRNA-seq studies that use iterative WGCNA? 

      In the section "Identification of differentiation driver genes (DDGs)", the authors identified 408 significant DDGs and found that 49 (12%) were reported by the International Mouse Knockout [sic] Consortium (IMPC) as having a significant effect on whole-body BMD when knocked out in mice. Is this enrichment significant? E.g., what is the background percentage of IMPC gene knockouts that show an effect on whole-body BMD? Similarly, they found that 21 of the 408 DDGs were genes that have BMD GWAS associations that colocalize with GTEx eQTLs/sQTLs. Given that there are > 1,000 BMD GWAS associations, is this enrichment (21/408) significant? Recommend performing a hypergeometric test to provide statistical context to the reported overlaps here. 
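
      The hypergeometric test recommended above could be set up roughly as follows. This is only a sketch: the counts 408 and 49 come from the review text, whereas the background sizes (`background_genes`, `background_hits`) are hypothetical placeholders that would need to be replaced with the real number of IMPC-tested genes and the number of those with a BMD effect. It also assumes that all 408 DDGs fall within that tested background.

```python
from math import comb

def hypergeom_sf(k, M, n, N):
    """Upper-tail probability P(X >= k) when N genes are drawn without
    replacement from M background genes, n of which are 'hits'."""
    upper = min(n, N)
    total = comb(M, N)
    tail = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return tail / total

# Counts taken from the review text.
n_ddg = 408        # significant DDGs
n_overlap = 49     # DDGs with a reported BMD effect in IMPC

# HYPOTHETICAL placeholders, not real IMPC figures.
background_genes = 8000   # genes assumed to have been tested for BMD
background_hits = 500     # tested genes assumed to show a BMD effect

p_value = hypergeom_sf(n_overlap, background_genes, background_hits, n_ddg)
expected = n_ddg * background_hits / background_genes
print(f"expected overlap by chance: {expected:.1f}")
print(f"one-sided hypergeometric p-value: {p_value:.3g}")
```

      The same function would apply to the 21/408 GWAS colocalization overlap once a suitable background is defined; the choice of background is the main judgment call and largely determines the result.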

      We thank the reviewer for their constructive feedback and thoughtful questions. With regard to iterativeWGCNA, a larger number of modules is sometimes an outcome of the analysis, as reported in the iterativeWGCNA preprint (Greenfest-Allen et al., 2017). While we did not make a comparison to other works leveraging this tool for scRNA-seq, it has been used broadly across other published studies, such as PMID: 39640571, 40075303, 33677398, 33653874. While model overfitting, as you mention, may be a cause for more modules, the Bayesian network analysis we perform after iterativeWGCNA highlights smaller aspects of coexpression modules, as opposed to focusing on the entirety of any given module.

      We did not perform enrichment or statistical tests as our goal was to simply highlight attributes or unique features of these genes for additional context.

      Reviewer #2 (Public review): 

      Summary: 

      In this manuscript, Farber and colleagues have performed single-cell RNAseq analysis on bone marrow-derived stem cells from DO Mice. By performing network analysis, they look for driver genes that are associated with bone mineral density GWAS associations. They identify two genes as potential candidates to showcase the utility of this approach. 

      Strengths: 

      The study is very thorough and the approach is innovative and exciting. The manuscript contains some interesting data relating to how cell differentiation is occurring and the effects of genetics on this process. The section looking for genes with eQTLs that differ across the differentiation trajectory (Figure 4) was particularly exciting. 

      Weaknesses: 

      The manuscript is in parts hard to read due to the use of acronyms and there are some questions about data analysis that need to be addressed. 

      We thank the reviewer for their feedback and shared enthusiasm for our work. We tried to minimize the use of technical acronyms as much as we could without compromising readability. Additionally, we addressed questions regarding aspects of data analysis. 

      Reviewer #1 (Recommendations for the authors):

      (1) For increased transparency and to allow reproducibility, it would be necessary for the scripts used in the analysis to be shared along with the publication of the preprint. Also, where feasible, sharing the processed data in addition to the raw data would allow the community greater access to the results and be highly beneficial. 

      Thank you for this suggestion. The raw data will be available via GEO accession codes listed in the data availability statement. We will make available scripts for some analyses on our GitHub (https://github.com/Farber-Lab/DO80_project) and processed scRNA-seq data in a Seurat object (.rds) on Zenodo (https://zenodo.org/records/15299631).

      (2) Lines 55-76: I think the summary of previous work here is too long. I understand that they would like to cover what has been done previously, but this seems like overkill. 

      Good suggestion. We have streamlined some of the summary of our previous work.

      (3) Did the authors try to map QTL for cell-type proportion differences in their BMSC-OBs? While 80 samples certainly limit mapping power, the data shown in Figs 4C/D suggest that you might identify a large-effect modifier of LMP/OB1 proportions. 

      We did try to map QTL for cell type proportion differences, but no significant associations were identified. 

      (4) Methods question: Does the read alignment method used in your analysis account for SNPs/indels that segregate among the DO/CC founder strains? If not, the authors may wish to include this in their discussion of study limitations and speculate on how unmapped reads could affect expression results. 

      The read alignment method we used does not account for SNPs/indels from the DO founder strains that fall in RNA transcripts captured in the scRNA-seq data. We have included this as a limitation in our discussion (line 422-424). 

      (5) Much of the discussion reads as an overview of the methods, while a discussion of the results and their context to the existing BMD literature is relatively lacking in comparison.

      We have added additional explanation of the results and context to the discussion (line 381-382, 396-407). 

      (6) Figure 1E and lines 146-149: Adjusted p values should be reported in the figure and accompanying text instead of switching between unadjusted and adjusted p values. 

      We updated Figure 1E to portray adjusted p-values, listed the adjusted p-values in the legend of Figure 1E, and listed them in the main text (line 153-154).

      (7) Why do the authors bring the IMPC KO gene list into the analysis so late? This seems like a highly relevant data resource (more so than the GTEx eQTLs/sQTLs) that could have been used much earlier to help identify DDGs. 

      Given that our scRNA-seq data is also from mice, we did choose to integrate information from the IMPC to highlight supplemental features of genes in networks (i.e., genes that have an experimentally-tested and significant effect on BMD in mice). However, our primary goal was to inform human GWAS and leverage our previous work in which we identified colocalizations between human BMD GWAS and eQTL/sQTL in a human GTEx tissue, which is why this information was used to guide our network analysis.

      (8) Do Fgfrl1 and/or Tpx2 have a cis-eQTL in your BMSC-OB scRNA-seq dataset? 

      We did not identify cis-eQTL effects for Fgfrl1 and Tpx2.

      (9) Figure 4B-C: These eQTLs may be real, but based on the diplotype patterns in Figure 4C, I suspect they are artifacts of low mapping power that are driven by rare genotype classes with one or two samples having outlier expression results. For example, if you look at the results in Fig 4C for S100a1 expression, the genotype classes with the highest/lowest expression have lower sample numbers. In the case of Pkm eQTL showing a PWK-low effect, the PWK genome has many SNPs that differ from the reference genome in the 3' UTR of this gene, and I wonder if reads overlapping these SNPs are not aligning correctly (see point 4 above) and resulting (falsely) in lower expression values for samples with a PWK haplotype. 

      As mentioned above, our alignment method did not consider DO founder genetic variation that is specifically located in the 3’ end of RNA transcripts in the scRNA-seq data. We have included this as a limitation in our discussion (line 422-424).

      In future studies, we intend to include larger populations of mice to potentially overcome, as you mention, any artifacts that may be attributable to low statistical power, rare genotype classes, or outlier expression.

      Reviewer #2 (Recommendations for the authors):

      Major Points 

      (1) The authors hypothesize "that many genes impacting BMD do so by influencing osteogenic differentiation or possibly bone marrow adipogenic differentiation". However, cell type itself does not correlate with any bone trait. Does this indicate that the hypothesis is not entirely correct, as genes that drive these phenotypes would not be enriched in one particular cell type? The authors have previously identified "high-priority target genes". So, are there any cell types that are enriched for these target genes? If not, this would indicate that all these genes are more ubiquitously expressed and this is probably why they would have a greater effect on the overall bone traits. Furthermore, are the 73 eGenes (so genes with eQTLs in a particular cell type that change around cell type boundaries) or the DDGs (Table 1) enriched for these high-priority target genes? 

      The bone traits measured in the DO mice are complex and impacted by many factors, including the differentiation propensity and abundance of certain cell types, both within and outside of bone. Though we did not identify correlations between cell type abundance and the bone traits we measured, we tailored our investigations to focus on cellular differentiation using the scRNA-seq data. However, future studies would need to be performed to investigate any connections between cellular differentiation, cell type abundance, and bone traits.

      We did not perform enrichment analyses of either the target genes identified from our other work or eGenes identified here, but instead used the target gene list to center our network analysis and the eGenes to showcase the utility of the DO mouse population.

      (2) The readability of the paper could be improved by minimising the use of acronyms and there are several instances of confusing wording throughout the paper. In many cases, this can be solved by re-organising sentences and adding a bit more detail. For example, it was unclear how you arrived at Fgfrl1 or Tpx2.

      One of the goals of our study was to identify genes that have (to our knowledge) little to no known connection to BMD. We chose to highlight Fgfrl1 and Tpx2 because there is minimal literature characterizing these genes in the context of bone, which we speak to in the results (line 296-297). Additionally, we prioritized these genes in our previous work and they were identified in this study by using our network analyses using the scRNA-seq data, which we mention in the results (line 276-279).

      (3) Technical aspects of the assay. In Figure 1d you show that the cell populations vary considerably between different DO mice. It would be useful to give some sense of the technical variance of this assay given that the assay involves culturing the cells in an exogenous environment. This could take the form of tests between mice within the same inbred strain, or even between different legs of the same DO mice to show that results are technically very consistent. It might also be prudent to identify that this is a potential limitation of the approach as in vitro culturing has the potential to substantially change the cell populations that are present. 

      We agree that in vitro culturing, in addition to the preparation of single cells for scRNA-seq, are unavoidable sources of technical variation in this study. However, the total number of cells contributed by each of the 80 DO mice after data processing does not appear to be skewed and the distribution appears normal (see added figures, now included as Supplemental Figure 3). Therefore, technical variation is at least consistent across all samples. Nevertheless, we have mentioned the potential for technical variation artifacts in our study in the discussion (line 414-416).

      (4) Need for permutation testing. "We identified 563 genes regulated by a significant eQTL in specific cell types. In total, 73 genes with eQTLs were also tradeSeq-identified genes in one or more cell type boundaries". These types of statements are fine but they need to be backed up with permutation testing to show that this level of enrichment is greater than one would expect by chance. 
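
      The permutation test requested here could be sketched as below. The 563 eGenes and the overlap of 73 are taken from the quoted text; the background size and the number of tradeSeq-identified genes are hypothetical placeholders. The sketch draws random gene sets uniformly from the background, which is a simplifying assumption: a real analysis would probably match random sets on expression level and cell type.

```python
import random

def permutation_overlap_p(n_background, n_set_a, n_set_b, observed_overlap,
                          n_perm=2000, seed=0):
    """Empirical one-sided p-value for the overlap between two gene sets.

    Set B is held fixed; set A is redrawn at random from the background
    n_perm times, and we count how often the random overlap is at least
    as large as the observed one.
    """
    rng = random.Random(seed)
    background = range(n_background)
    set_b = set(range(n_set_b))  # gene labels are arbitrary, so fix B as the first n_set_b IDs
    hits = 0
    for _ in range(n_perm):
        random_a = rng.sample(background, n_set_a)
        overlap = sum(1 for gene in random_a if gene in set_b)
        if overlap >= observed_overlap:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Counts from the quoted manuscript text.
n_egenes = 563
observed = 73

# HYPOTHETICAL placeholders, not values from the manuscript.
n_expressed_genes = 15000   # assumed background of tested genes
n_tradeseq_genes = 1500     # assumed number of tradeSeq-identified genes

p_value = permutation_overlap_p(n_expressed_genes, n_egenes,
                                n_tradeseq_genes, observed)
print(f"empirical p-value: {p_value:.4f}")
```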

      We did not perform enrichment tests, as our only goals were to (1) determine whether eQTL could be resolved in the DO mouse population using our scRNA-seq data and (2) predict in which cell type the eQTL and its associated eGene may have an effect.

      (5) The main novelty of the paper seems to be that you have used single-cell RNA seq (given that you appear to have already detailed the candidates at the end). I don't think this makes the paper less interesting, but I think you need to reframe the paper more about the approach, and not the specific results. How you landed on these candidates is also not clear. So the paper might be improved by more robustly establishing the workflow and providing guidelines for how studies like this should be conducted in the future. 

      We sought to not only devise a rigorous approach to analyze our single cell data, but also showcase the utility of the approach in practice by highlighting targets for future research (i.e., Fgfrl1 and Tpx2).

      Our goal was to identify novel genes and we landed on these candidate genes (Fgfrl1 and Tpx2) because they had substantial data supporting their causality and they have yet to be fully characterized in the context of bone and BMD (line 295-297).

      With regard to establishing the workflow, we have included rationale for specific aspects of our approach throughout the paper. For example, Figure 2 itemizes each step of our network analysis and we explain why each step is utilized throughout various parts of the results (e.g., lines 168-170, 179-181, 191-193, 202-203, 257-260, 276-277).

      We have added a statement advocating for large-scale scRNA-seq from genetically diverse samples and network analyses for future studies (line 436-438).

      Minor Points 

      (1) In the summary you use the word "trajectory". Trajectories for what? I assume the transition between cell types, but this is not clear. 

      We added text to clarify the use of trajectory in the summary (line 34).

      (2) This sentence: "By identifying networks enriched for genes implicated in GWAS we predicted putatively causal genes for hundreds of BMD associations based on their membership in enriched modules." is also not clear. Do you mean: "we predicted putatively causal genes by identifying clusters of co-expressed genes that were enriched for GWAS genes"? It is not clear how you identify the causal gene in the network. Is this just based on the hub gene? 

      The aforementioned sentence has since been removed to streamline the introduction, as suggested by Reviewer 1.

      With regard to causal gene identification, it is not based on whether a gene is a hub gene. We prioritized a DDG (and its associated network) if it was a causal gene that we identified in our previous work as having an eQTL/sQTL in a GTEx tissue that colocalizes with human BMD GWAS.

      (3) Figure 3C. This is good but the labels are quite small. Would be good to make all the font sizes larger. 

      We have enlarged Figure 3C.

      (4) Line 341 in the Discussion should be "pseudotemporal". 

      We have edited “temporal” to “pseudotemporal”.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this fMRI study, the authors wished to assess neural mechanisms supporting flexible "temporal construals". For this, human participants learned a story consisting of fifteen events. During fMRI, events were shown to them, and they were instructed to consider the event from "an internal" or from "an external" perspective. The authors found opposite patterns of brain activity in the posterior parietal cortex and the anterior hippocampus for the internal and the external viewpoint. They conclude that allocentric sequences are stored in the hippocampus, whereas egocentric sequences are used in the parietal cortex. The claims align with previous fMRI work addressing this question.

      We appreciate the reviewer's concise summary of our research. We would like to offer two clarifications to prevent any potential misunderstandings.

      First, the activity patterns in the parietal cortex and hippocampus are not entirely opposite across internal and external perspectives. Specifically, the activation level in the posterior parietal cortex shows a positive correlation with sequential distance during external-perspective tasks, but a negative correlation during internal-perspective tasks. In contrast, the activation level in the anterior hippocampus positively correlates with sequential distance, irrespective of the observer's perspective. Therefore, our results suggest that the parietal cortex, with its perspective-dependent activity, supports egocentric representation; the hippocampus, with its consistent activity across perspectives, supports allocentric representation.

      Second, while some of our findings align with previous fMRI studies, to our knowledge, no prior research has explicitly investigated how the neural representation of time may vary depending on the observer's viewpoint. This gap in the literature is the primary motivation for our current study.

      Strengths:

      The research topic is fascinating, and very few labs in the world are asking the question of how time is represented in the human brain. Working hypotheses have been recently formulated, and this work seems to want to tackle some of them.

      We appreciate the reviewer's acknowledgment of the theoretical significance of our study.

      Weaknesses:

      The current writing is fuzzy both conceptually and experimentally. I cannot provide a sufficiently well-informed assessment of the quality of the experimental work because there is a paucity of details provided in the report. Any future revisions will likely improve transparency.

      (1) Improving writing and presentation:

      The abstract and the introduction make use of loaded terms such as "construals", "mental timeline", "panoramic views" in very metaphoric and unexplained ways. The authors do not provide a comprehensive and scholarly overview of these terms, which results in verbiage and keywords/name-dropping without a clear general framework being presented. Some of these terms are not metaphors. They do refer to computational concepts that the authors should didactically explain to their readership. This is all the more important because some statements in the Introduction are misattributed or factually incorrect; some statements lack attributions (uncited published work). Once the theory, the question, and the working hypothesis are clarified, the authors should carefully explain the task.

      We appreciate the reviewer's critique.

      The formulation of the scientific question in the introduction is grounded in the spatial construals of time hypothesis and conceptual metaphor theory (e.g., Traugott, 1978; Lakoff & Johnson, 1980; see recent reviews by Núñez & Cooperrider, 2013; Bender & Beller, 2014). These frameworks were originally developed through analyses of how spatial metaphors are used to describe temporal concepts in natural language. Consequently, it is theoretically motivated and largely unavoidable to introduce the two primary temporal construals (mental time travel and mental time watching) using metaphorical expressions.

      However, we do agree with the reviewer that the introduction in the original manuscript was overly long and that the working hypothesis was not clearly stated. In the revised manuscript, we have streamlined the introduction and substantially revised the following two paragraphs to clarify the formulation of our working hypothesis (Pages 5-6):

      “Recent studies have already begun to investigate the neural representation of the memorized event sequence (e.g., Deuker et al., 2016; Thavabalasingam et al., 2018; Bellmund et al., 2019, 2022; see reviews by Cohn-Sheehy & Ranganath, 2017; Bellmund et al., 2020). Yet, the neural mechanisms that enable the brain to construct distinct construals of an event sequence remain largely unknown. Valuable insights may be drawn from research in the spatial domain, which differentiates the neural representation in allocentric and egocentric reference frames. According to an influential neurocomputational model (Byrne et al., 2007; Bicanski & Burgess, 2018; Bicanski & Burgess, 2020), allocentric and egocentric spatial representations are dissociable in the brain: they are respectively implemented in the medial temporal lobe (MTL), including the hippocampus, and the parietal cortex. Various egocentric representations in the parietal cortex derived from different viewpoints can be transformed and integrated into a unified allocentric representation and stored in the MTL (i.e., bottom-up process). Conversely, the allocentric representation in the MTL can serve as a template for reconstructing diverse egocentric representations across different viewpoints in the parietal cortex (i.e., top-down process).”

      “In line with the spatial construals of time hypothesis, several authors have recently suggested that such mutually engaged egocentric and allocentric reference frames (in the parietal cortex and the medial temporal lobe, respectively) proposed in the spatial domain might also apply to the temporal one (e.g., Gauthier & van Wassenhove, 2016ab; Gauthier et al., 2019, 2020; Bottini & Doeller, 2020). If this hypothesis holds, it could explain how the brain flexibly generates diverse construals of the same event sequence. Specifically, the hippocampus may encode a consistent representation of an event sequence that is independent of whether an individual adopts an internal or external perspective, reflecting an allocentric representation of time. In contrast, parietal cortical representations are expected to vary flexibly with the adopted perspective that is shaped by task demands, reflecting an egocentric representation of time.”

      In the revised manuscript, we also corrected statements in the Introduction that may have been misattributed (see Reviewer 2, comment 4(ii)) and added several relevant and important publications.

      (2) The experimental approach lacks sufficient details to be comprehensible to a general audience. In my opinion, the results are thus currently uninterpretable. I highlight only a couple of specific points (out of many). I recommend revision and clarification.

      (a) No explanation of the narrative is being provided. The authors report a distribution of durations with no clear description of the actual sequence of events. The authors should provide the text that was used, how they controlled for low-level and high-level linguistic confounds.

      We thank the reviewer for the suggestions. The event sequence for the odd-numbered participants is shown in the original Figure 1. In the revised manuscript, we added to Figure 1 the figure supplement 1 to illustrate the actual sequence of events for the participants with both odd and even numbers. We also added the narratives used in the reading phase of the learning procedures for the participants with both odd and even numbers (Figure 1—source data 1).

      To control for low-level linguistic confounds, we included the number of syllables as a covariate in the first-level general linear model in the fMRI analysis. To address high-level linguistic confounds, such as semantic information (which is difficult to quantify), we randomly assigned event labels to the 15 events twice, creating two counterbalanced versions for participants with even and odd numbers (see Comment 2b below).

      (b) The authors state, "we randomly assigned 15 phrases to the events twice". It is impossible to comprehend what this means. Were these considered stimuli? Controls? It is also not clear which event or stimulus is part of the "learning set" and whether these were indicated to be such to participants.

      We apologize for any confusion in the Results section and the legend of Figure 1. Our motivation was explained in the "Stimuli" section of the Methods. In the revised manuscript, we have clarified this by adding an explanation to the legend of Figure 1 and including the supplementary Figure 1: "To minimize potential confounds between the semantic content of the event phrases and the temporal structure of the events, we randomly assigned the phrases to the events, creating two versions for participants with even and odd ID numbers. Both versions can be seen in Figure 1—figure supplement 1 and Figure 1—source data 1."

      (c) The left/right counterbalancing is not being clearly explained. The authors state that there is counterbalancing, but do not sufficiently explain what it means concretely in the experiment. If a weak correlation exists between sequential position and distance, it also means that the position and the distance have not been equated within. How do the authors control for these?

      We thank the reviewer for highlighting this point and apologize for the lack of clarity in the original manuscript. In the current version (Page 40), we have provided further clarification: “We carefully selected two sets of 20 event pairs from the 210 possible combinations, assigning them to the odd and even runs of the fMRI experiment. Using a brute-force search, we identified 20 pairs in which sequential distance showed only weak correlations with positional information for both reference and target events (ranging from 1 to 15), as well as with behavioral responses (Same vs. Different or Future vs. Past, coded as 0 and 1), with all correlation coefficients below 0.2. At the same time, we balanced the proportion of correct responses across conditions: for the external-perspective task, Same/Different = 11/9 and 12/8; for the internal-perspective task, Future/Past = 12/8 and 8/12. Under these constraints, the sequential distances in both sets ranged from 1 to 5. To further mitigate spatial response biases, we pseudorandomized the left/right on-screen positions of the two response options within each task block, while ensuring an equal number of correct responses mapped to the left and right buttons (i.e., 10 per block).”
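
      To make the selection procedure described above concrete, the sketch below shows one way such a search could work. It is not the authors' code. It uses a randomized search instead of an exhaustive one, enforces only three of the stated constraints (weak correlation of sequential distance with reference position, target position, and the Future/Past answer), and omits the Same/Different coding because the assignment of events to parts of the day is not given here. The figure of 210 candidates is consistent with 15 × 14 ordered reference/target pairs.

```python
import itertools
import random

def pearson(x, y):
    """Pearson correlation; returns NaN if either variable is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / (sxx * syy) ** 0.5

def select_pairs(n_events=15, n_pairs=20, max_dist=5, r_max=0.2,
                 max_tries=100_000, seed=0):
    """Randomly search for a set of (reference, target) event pairs whose
    sequential distance is only weakly correlated with nuisance variables."""
    rng = random.Random(seed)
    all_pairs = list(itertools.permutations(range(1, n_events + 1), 2))
    candidates = [(r, t) for r, t in all_pairs if abs(r - t) <= max_dist]
    for _ in range(max_tries):
        pairs = rng.sample(candidates, n_pairs)
        dist = [abs(r - t) for r, t in pairs]
        ref = [r for r, _ in pairs]
        tgt = [t for _, t in pairs]
        future = [1 if t > r else 0 for r, t in pairs]  # Future = 1, Past = 0
        # A NaN correlation fails the comparison, so degenerate sets are rejected.
        if all(abs(pearson(dist, v)) < r_max for v in (ref, tgt, future)):
            return pairs
    raise RuntimeError("no pair set satisfied the constraints")

pairs = select_pairs()
print(pairs)
```

      Additional criteria mentioned in the response, such as the balance of correct responses, could be added as further conditions inside the loop.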

      The event pairs we selected already represent the best possible choice given all the criteria we aimed to satisfy. It is impossible to completely eliminate all potential correlations. For instance, if the target event occurs near the beginning of the day, it will tend to fall in the past, whereas if it occurs near the end of the day, it is more likely to fall in the future. To further ensure that the significant results were not driven by these weak confounding factors, we constructed another GLM that included three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.

      (d) The authors used two tasks. In the "external perspective" one, the authors asked participants to report whether events were part of the same or a different part of the day. In the "internal perspective" one, the authors asked participants to project themselves to the reference event and to determine whether the target event occurred before or after the projected viewpoint. The first task is a same/different recognition task. The second task is a temporal order task (e.g., Arzy et al. 2009). These two tasks are radically different and do not require the same operationalization. The authors should minimally provide a comprehensive comparison of task requirements, their operationalization, and, more importantly, assess the behavioral biases inherent to each of these tasks that may confound brain activity observed with fMRI.

      We understand the reviewer’s concern. We agree that there is a substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Rather, the neural correlates of temporal distance were first identified as brain regions showing a significant correlation between neural activity and temporal distance using the parametric modulation analysis. We then compared these neural correlates between the two tasks. Therefore, any general differences between the tasks should not be a confound for our main results. Our aim was to examine whether the hippocampal representation of temporal distance remains consistent across different perspectives, and whether the parietal representation of temporal distance varies as a function of the perspective adopted.

      Therefore, the main aim of our task manipulation was to ensure that participants adopted either an external or an internal perspective on the event sequence, depending on the task condition. In the Introduction (Pages 6–7), we clarify this manipulation as follows: “In the external-perspective task, participants localized events with respect to external temporal boundaries, judging whether the target event occurred in the same or a different part of the day as the reference event. In the internal-perspective task, participants were instructed to mentally project themselves into the reference event and localize the target event relative to their own temporal point, judging whether the target event happened in the future or the past of the reference event (see Methods for details of the scanning procedure).”

      We believe this task manipulation was successful. Behaviorally, the two tasks showed opposite correlations between reaction time and temporal distance, resembling the symbolic distance versus mental scanning effect. Neurally, contrasting the internal- and external-perspective tasks revealed activation of the default mode network, which is known to play a central role in self-projection (Buckner et al., 2017).

      (e) The authors systematically report interpreted results, not factual data. For instance, while not showing the results on behavioral outcomes, the authors directly interpret them as symbolic distance effects.

      Thank you for this comment. In the original paper, we reported the relevant statistics before our interpretation: “Sequential Distance was correlated positively with RT in the external-perspective task (z = 3.80, p < 0.001) but negatively in the internal-perspective task (z = -3.71, p < 0.001).” However, they may have been difficult to notice, and we are including a figure for the RT analysis in the revised manuscript.

      Crucially, the authors do not comment on the obvious differences in task difficulty in these two tasks, which demonstrates a substantial lack of control in the experimental design. The same/different task (task 1 called "external perspective") comes with known biases in psychophysics that are not present in the temporal order task (task 2 called "internal perspective"). The authors also did not discuss or try to match the performance level in these two tasks. Accordingly, the authors claim that participants had greater accuracy in the external (same/different) task than in the internal task, although no data are shown and provided to support this report. Further, the behavioral effect is trivialized by the report of a performance accuracy trade-off that further illustrates that there is a difference in the task requirements, preventing accurate comparison of the two tasks.

      As noted in Question 2d, we acknowledge the substantial difference between the two tasks. However, the primary goal of this study was not to directly compare these tasks to isolate a specific cognitive component. Instead, we first identified the neural correlates of temporal distance as brain regions showing a significant correlation between neural activity and temporal distance, independent of task demands. We then compared these neural correlates across the two task conditions, which were designed to engage different temporal perspectives. Therefore, any general differences between the tasks should not be a confound for our main findings and interpretation.

      Our aim was to investigate whether the hippocampal representation of temporal distance remains consistent across different perspectives and whether the parietal representation of temporal distance varies as a function of the perspective adopted. We do not see how this double-dissociation pattern could be explained by differences in task difficulty.

      While we do not consider the overall difference in task difficulty between the two tasks to be a confounding factor, we acknowledge the potential confound posed by variations in task difficulty across temporal distances (1 to 5). This concern arises from the similarity between the activity patterns in the posterior parietal cortex and reaction time across temporal distances. To address this, we conducted control analyses to test this hypothesis (see the second and third points from Reviewer 2 for details).

      On page 8, we present the behavioral accuracy data: “Participants showed significantly higher accuracy in the external-perspective task than in the internal-perspective task (external-perspective task: M = 93.5%, SD = 4.7%; internal-perspective task: M = 89.5%, SD = 8.1%; paired t(31) = 3.33, p = 0.002).”

      All fMRI contrasts are also confounded by this experimental shortcoming, seeing as they are all reported at the interaction level across a task. For instance, in Figure 4, the authors report a significant beta difference between internal and external tasks. It is impossible to disentangle whether this effect is simply due to task difference or to an actual processing of the duration that differs across tasks, or to the nature of the representation (the most difficult to tackle, and the one chosen by the authors).

      We thank the reviewer for pointing out this important issue. Like temporal distance, the neural correlates of duration were not derived from a direct contrast between the two tasks. Instead, they were identified by detecting brain regions showing a significant correlation between neural activity and the implied duration of each event using the parametric modulation analysis. Therefore, what is shown in Figure 4 reflects the significant differences in these neural correlations with duration between the two tasks.

      The observed difference in the neural representation of duration between the two tasks was unexpected. In the original manuscript, we provided a post hoc explanation: “Since the external-perspective task in the current study encouraged the participants to compare the event sequence with the external parallel temporal landmarks, duration representation in the hippocampus may be dampened.”

      However, we agree that this difference might also arise from other factors distinguishing the two tasks. In the revised manuscript, we have clarified this possibility as follows: “The difference in duration representation between the two tasks remains open to interpretation. One possible explanation is that the hippocampus is preferentially involved in memory for durations embedded within event sequences (see review by Lee et al., 2020). In the internal-perspective task, participants indeed localized events within the event sequence itself. In contrast, the external-perspective task encouraged participants to compare the event sequence with external temporal landmarks, which may have attenuated the hippocampal representation of duration.”

      Conclusion:

      In conclusion, the current experimental work is confounded and lacks controls. Any behavioral or fMRI contrasts between the two proposed tasks can be parsimoniously accounted for by difficulty or attentional differences, not the claim of representational differences being argued for here.

      We hope that our explanations and clarifications above adequately address the reviewer’s concerns. We would like to reiterate that we did not directly compare the two tasks. Rather, we first identified the neural representations of sequential distance and duration, and then examined how these representations differed across tasks. It is unclear to us how the overall difference in task difficulty or attentional demands could lead to the observed pattern of results.

      By determining where the neural representations were consistent and where they diverged, we were able to differentiate brain regions that encode temporal information allocentrically from those that represent temporal information in a perspective-dependent manner, modulated by task demands.

      Reviewer #2 (Public review):

      Summary:

      Xu et al. used fMRI to examine the neural correlates associated with retrieving temporal information from an external compared to internal perspective ('mental time watching' vs. 'mental time travel'). Participants first learned a fictional religious ritual composed of 15 sequential events of varying durations. They were then scanned while they either (1) judged whether a target event happened in the same part of the day as a reference event (external condition); or (2) imagined themselves carrying out the reference event and judged whether the target event occurred in the past or will occur in the future (internal condition). Behavioural data suggested that the perspective manipulation was successful: RT was positively correlated with sequential distance in the external perspective task, while a negative correlation was observed between RT and sequential distance for the internal perspective task. Neurally, the two tasks activated different regions, with the external task associated with greater activity in the supplementary motor area and supramarginal gyrus, and the internal condition with greater activity in default mode network regions. Of particular interest, only a cluster in the posterior parietal cortex demonstrated a significant interaction between perspective and sequential distance, with increased activity in this region for longer sequential distances in the external task, but increased activity for shorter sequential distances in the internal task. Only a main effect of sequential distance was observed in the hippocampus head, with activity being positively correlated with sequential distance in both tasks. No regions exhibited a significant interaction between perspective and duration, although there was a main effect of duration in the hippocampus body with greater activity for longer durations, which appeared to be driven by the internal perspective condition. 
      On the basis of these findings, the authors suggest that the hippocampus may represent event sequences allocentrically, whereas the posterior parietal cortex may process event sequences egocentrically.

      We sincerely thank the reviewer for providing an accurate, comprehensive, and objective summary of our study.

      Strengths:

      The topic of egocentric vs. allocentric processing has been relatively under-investigated with respect to time, having traditionally been studied in the domain of space. As such, the current study is timely and has the potential to be important for our understanding of how time is represented in the brain in the service of memory. The study is well thought out, and the behavioural paradigm is, in my opinion, a creative approach to tackling the authors' research question. A particular strength is the implementation of an imagination phase for the participants while learning the fictional religious ritual. This moves the paradigm beyond semantic/schema learning and is probably the best approach besides asking the participants to arduously enact and learn the different events with their exact timings in person. Importantly, the behavioural data point towards successful manipulation of internal vs. external perspective in participants, which is critical for the interpretation of the fMRI data. The use of syllable length as a sanity check for RT analyses, as well as neuroimaging analyses, is also much appreciated.

      We thank the reviewer for the positive and encouraging comments.

      Weaknesses/Suggestions:

      Although the design and analysis choices are generally solid, there are a few finer details/nuances that merit further clarification or consideration in order to strengthen the readers' confidence in the authors' interpretation of their data.

      (1) Given the known behavioural and neural effects of boundaries in sequence memory, I was wondering whether the number of traversed context boundaries (i.e., between morning-afternoon, and afternoon-evening) was controlled for across sequential length in the internal perspective condition? Or, was it the case that reference-target event pairs with higher sequential numbers were more likely to span across two parts of the day compared to lower sequential numbers? Similarly, did the authors examine any potential differences, whether behaviourally or neurally, for day part same vs. day part different external task trials?

      We thank the reviewer for the thoughtful comments. When we designed the experiment, we minimized the correlation between the sequential distance separating the target and reference events and whether the two events occurred within the same or different parts of the day (coded as Same = 0, Different = 1). The point-biserial correlation coefficient between these two variables, computed across all trials within the same run, was kept below 0.2.

      To investigate the effect of day-part boundaries on behavior, as well as the contribution of other factors, we fitted a new linear mixed-effects model incorporating four additional variables: whether the target and reference events fall within the same or different parts of the day (Same vs. Different), whether the target event is in the future or the past of the reference event (Future vs. Past), and the interactions of these two factors with Task Type (internal- vs. external-perspective task).

      The results are largely the same as those of the original model reported in the table: there was a significant main effect of Syllable Length, and the interactions between Task Type and Sequential Distance and between Task Type and Duration remained significant. In addition, we found a significant interaction between Task Type and Same vs. Different.

      As shown in Figure 2—figure supplement 1, this Same vs. Different effect was in line with the effect of Sequential Distance, with two events in the same and different parts of the day corresponding to the short and long sequential distances, respectively. Given that Sequential Distance had already been included in the model, the effect of day part should result from the boundary effect across day parts or the chunking effect within day parts, i.e., the sequential distance across different parts of the day was perceived as longer, while the sequential distance within the same part of the day was perceived as shorter. We have incorporated these findings into the manuscript.

      Neurally, to further verify that the significant effects of sequential distance were not driven by its weak correlation with the Same/Different judgment or other potential confounding factors, we constructed another GLM that incorporated three additional parametric modulators: the sequence position of the target event (ranging from 1 to 15) and the behavioral responses (Future vs. Past in the internal-perspective task; Same vs. Different in the external-perspective task, coded as 0 and 1). The significant findings were unaffected.

      (2) I would appreciate further insight into the authors' decision to model their task trials as stick functions with duration 0 in their GLMs, as opposed to boxcar functions with varying durations, given the potential benefits of the latter (e.g., Grinband et al., 2008). I concur that in certain paradigms, RT is considered a potential confound and is taken into account as a nuisance covariate (as the authors have done here). However, given that RTs appear to be critical to the authors' interpretation of participant behavioural performance, it would imply that variations in RT actually reflect variations in cognitive processes of interest, and hence, it may be worth modelling trials as boxcar functions with varying durations.

      We appreciate the reviewer’s insightful comment on this important issue. Whether to control for RT’s influence on fMRI activation is indeed a long-standing paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved—a domain-general effect. For example, regions within the multiple-demand network are often positively correlated with RT across different cognitive domains.

      Our strategy in the manuscript is to first present the results without including RT as a control variable and then examine whether the effects are preserved after controlling for RT. In the revised manuscript, we have clarified this approach (Page 13): “Here, changes in activity levels within the PPC were found to align with RT. Whether to control for RT’s influence on fMRI activation represents a well-known paradox. On the one hand, RT reflects underlying cognitive processes and therefore should not be fully controlled for. On the other hand, RT can independently influence neural activity, as several brain networks vary with RT irrespective of the specific cognitive process involved (a domain-general effect). For instance, regions within the multiple-demand network are often positively correlated with RT and task difficulty across diverse cognitive domains (e.g., Fedorenko et al., 2013; Mumford et al., 2024). To evaluate the second possibility, we conducted an additional control analysis by including trial-by-trial RT as a parametric modulator in the first-level model (see Methods). Notably, the same PPC region remained the only area in the entire brain showing a significant interaction between Task Type and Sequential Distance (voxel-level p < 0.001, cluster-level FWE-corrected p < 0.05). This finding indicates that PPC activity cannot be fully attributed to RT. Furthermore, we do not interpret the effect as reflecting a domain-general RT influence, as regions within the multiple-demand system, which are typically sensitive to RT and task difficulty, did not exhibit significant activation in our data.”

      The reason we did not use boxcar functions with varying durations in our original manuscript is that we also applied parametric modulation in the same model. In the parametric modulation, all parametric modulators inherit the onsets and durations of the events being modulated. Consequently, the modulators would also take the form of boxcar functions rather than stick functions—the height of each boxcar reflecting the parameter value and its length reflecting the RT. We were uncertain whether this approach would be appropriate, as we have not encountered other studies implementing parametric modulation in this manner.
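
      To make the distinction concrete, the sketch below builds one parametric-modulator regressor in both ways: as sticks (duration 0) and as boxcars whose length is the trial's RT and whose height is the mean-centred parameter value. It is illustrative only. The haemodynamic response is a simplified double-gamma shape, the time grid, onsets, RTs, and distance values are made up, no amplitude normalisation is applied to the sticks, and the code does not reproduce any particular analysis package.

```python
import math

DT = 0.1  # time resolution in seconds (assumed for the example)


def hrf(t):
    """Simplified double-gamma response: peak near 5 s, undershoot near 15 s."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def modulator_regressor(onsets, durations, values, n_seconds):
    """Mean-centred parametric modulator convolved with the HRF.

    The modulator inherits each trial's onset and duration: a duration
    of 0 yields a stick (one time bin), a positive duration yields a
    boxcar whose height is the centred parameter value.
    """
    n = int(round(n_seconds / DT))
    mean_value = sum(values) / len(values)
    neural = [0.0] * n
    for onset, duration, value in zip(onsets, durations, values):
        start = int(round(onset / DT))
        n_bins = max(1, int(round(duration / DT)))
        for i in range(start, min(n, start + n_bins)):
            neural[i] += value - mean_value

    kernel = [hrf(k * DT) for k in range(int(round(32 / DT)))]
    out = [0.0] * n
    for i, amplitude in enumerate(neural):
        if amplitude == 0.0:
            continue
        for k, h in enumerate(kernel):
            if i + k >= n:
                break
            out[i + k] += amplitude * h * DT
    return out


# Hypothetical trials: onsets (s), RTs (s), and sequential distances.
onsets = [10.0, 40.0, 70.0]
rts = [1.2, 2.5, 3.8]
distances = [1, 3, 5]

stick = modulator_regressor(onsets, [0.0] * 3, distances, 100)
boxcar = modulator_regressor(onsets, rts, distances, 100)
```

      In the boxcar version, the same parameter value is spread over the RT window, so the regressor's amplitude depends on both the parameter and the RT. This is one way to see why the two quantities become entangled when modulators inherit variable durations.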

      For exploratory purposes, we also conducted a first-level analysis using boxcar functions with variable durations. The same PPC region remained the area showing the strongest interaction between Task Type and Sequential Distance in the entire brain. However, the cluster size was slightly reduced (voxel-level p < 0.001, cluster-level FWE-corrected p = 0.0610; see Author response image 1 below). The cross indicates the MNI coordinates at [38, –69, 35], identical to those shown in the main results (Figure 4A).

      Author response image 1.

      (3) The activity pattern across tasks and sequential distance in the posterior parietal cortex appears to parallel the RT data. Have the authors examined potential relationships between the two (e.g., individual participant slopes for RT across sequential distance vs. activity betas in the posterior parietal cortex)?

      We thank the reviewer for this helpful suggestion. As shown in Author response image 2, the interaction between Task Type and Sequential Distance was a stronger predictor of PPC activation than of RT. Because PPC activation and RT are measured on different scales, we compared their standardized slopes (standardized β), which express the change in a dependent variable, in standard deviations, for a one-standard-deviation increase in an independent variable. The standardized β for the Task Type × Sequential Distance interaction was −0.30 (95% CI [−0.42, −0.19]) for PPC activation and −0.21 (95% CI [−0.30, −0.13]) for RT. The larger standardized effect for PPC activation indicates that the Task Type × Sequential Distance interaction was a stronger predictor of neural activation than of behavioral RT.
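
      As a rough illustration of why standardized slopes allow comparison across measures with different units, the sketch below computes a standardized β for a simple one-predictor regression. It is a simplification under stated assumptions: the authors' values come from mixed-effects models with an interaction term, whereas this example uses ordinary least squares on made-up numbers, and the function names are hypothetical.

```python
import math
import statistics


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def zscore(values):
    """Centre on the mean and scale by the sample standard deviation."""
    m, s = statistics.fmean(values), statistics.stdev(values)
    return [(v - m) / s for v in values]


def standardized_beta(x, y):
    """Raw slope rescaled to standard-deviation units of x and y."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Made-up data: one predictor and an outcome in arbitrary units.
distance = [1, 2, 3, 4, 5]
outcome = [2.1, 1.9, 3.2, 3.8, 5.1]

beta_std = standardized_beta(distance, outcome)
# Rescaling the outcome (e.g., seconds to milliseconds) leaves it unchanged.
beta_std_rescaled = standardized_beta(distance, [1000 * v for v in outcome])
# Fitting on z-scored variables gives the same number.
beta_from_z = slope(zscore(distance), zscore(outcome))
```

      The raw slope changes with the unit of the outcome while the standardized slope does not, which is the property the comparison between PPC activation and RT relies on.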

      Author response image 2.

      A more relevant question is whether PPC activation can be explained by temporal information (i.e., the sequential distance) independently of RT. To test this, we included both Sequential Distance and RT in the same linear mixed-effects model predicting PPC Activation Level. As shown in Author response table 1, although RT independently influenced PPC activation (F(1, 288) = 4.687, p = 0.031), the interaction between Task Type and Sequential Distance was a much stronger independent predictor (F(1, 290) = 19.319, p < 0.001).

      Author response table 1.

      PPC Activation Level Predicted by Sequential Distance and RT

      Linear Mixed Model Formula: PPC Activation Level ~ 1 + Task Type * (Sequential Distance + RT) + (1 | Participant)
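
      Read in the usual way for this formula notation, the model above expands to the equation below, where i indexes observations and j indexes participants. This is a sketch of the standard expansion, not a statement of the authors' exact specification: the coding of Task Type (e.g., dummy or effect coding) and the estimation method are not given here, and the symbol names are chosen for the example.

```latex
\[
\begin{aligned}
\mathrm{PPC}_{ij} ={}& \beta_0 + \beta_1\,\mathrm{Task}_{ij} + \beta_2\,\mathrm{Dist}_{ij} + \beta_3\,\mathrm{RT}_{ij} \\
&+ \beta_4\,(\mathrm{Task}_{ij}\times\mathrm{Dist}_{ij}) + \beta_5\,(\mathrm{Task}_{ij}\times\mathrm{RT}_{ij}) + u_j + \varepsilon_{ij}, \\
u_j \sim{}& \mathcal{N}(0,\sigma_u^2), \qquad \varepsilon_{ij} \sim \mathcal{N}(0,\sigma^2)
\end{aligned}
\]
```

      Here $\beta_4$ is the Task Type × Sequential Distance interaction, $\beta_3$ is the RT effect, and $u_j$ is the random intercept for participant j.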

      (4) There were a few places in the manuscript where the writing/discussion of the wider literature could perhaps be tightened or expanded. For instance:

      (i) On page 16, the authors state 'The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study (Gauthier & van Wassenhove, 2016b). The authors found a similar region (the reported MNI coordinate of the peak voxel was 42, -70, 40, and the MNI coordinate of the peak voxel in the present study was 39, -70, 35), of which the activation level went up when the target event got closer to the self-positioned event. This finding aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.' Without providing a little more detail here about the Gauthier & van Wassenhove study and what participants were required to do (i.e., mentally position themselves at a temporal location and make 'occurred before' vs. 'occurred after' judgements of a target event), it could be a little tricky for readers to follow why this convergence in finding supports a role for the posterior parietal cortex in egocentric representations.

      We appreciate the reviewer’s comments. In the revised manuscript, we have provided a more detailed explanation of Gauthier and van Wassenhove’s study (Page 17): “The negative correlation between the activation level in the right PPC and sequential distance has already been observed in a previous fMRI study by Gauthier & van Wassenhove (2016b). In their study, the participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that time point. The authors identified a similar brain region (reported MNI coordinates of the peak voxel: 42, −70, 40), closely matching the activation observed in the present study (MNI coordinates of the peak voxel: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, which aligns with the evidence suggesting that the posterior parietal cortex implements egocentric representations.”

      (ii) Although the authors discuss the Lee et al. (2020) review and related studies with respect to retrospective memory, it is critical to note that this work has also often used prospective paradigms, pointing towards sequential processing being the critical determinant of hippocampal involvement, rather than the distinction between retrospective vs. prospective processing.

      We sincerely thank the reviewer for highlighting these important points. In response, we have revised the section of the Introduction discussing the neural underpinnings of duration (Pages 3-4). “Neurocognitive evidence suggests that the neural representation of duration engages distinct brain systems. The motor system—particularly the supplementary motor area—has been associated with prospective timing (e.g., Protopapa et al., 2019; Nani et al., 2019; De Kock et al., 2021; Robbe, 2023), whereas the hippocampus is considered to support the representation of duration embedded within an event sequence (e.g., Barnett et al., 2014; Thavabalasingam et al., 2018; see also the comprehensive review by Lee et al., 2020).”

      (iii) The authors make an interesting suggestion with respect to hippocampal longitudinal differences in the representation of event sequences, and may wish to relate this to Montagrin et al. (2024), who make an argument for the representation of distant goals in the anterior hippocampus and immediate goals in the posterior hippocampus.

      We thank the reviewer for bringing this intriguing and relevant study to our attention. We have incorporated it into the Discussion of the revised manuscript (Page 21): “Evidence from the spatial domain has suggested that the anterior hippocampus (or the ventral rodent hippocampus) implements global and gist-like representations (e.g., larger receptive fields), whereas the posterior hippocampus (or the dorsal rodent hippocampus) implements local and detailed ones (e.g., finer receptive fields) (e.g., Jung et al., 1994; Kjelstrup et al., 2008; Collin et al., 2015; see reviews by Poppenk et al., 2013; Robin & Moscovitch, 2017; see Strange et al., 2014 for a different opinion). Recent evidence further shows that the organizational principle observed along the hippocampal long axis may also extend to the temporal domain (Montagrin et al., 2024). In that study, the anterior hippocampus showed greater activation for remote goals, whereas the posterior hippocampus was more strongly engaged for current goals, which are presumed to be represented in finer detail.”

      Reviewing Editor Comments:

      While both reviewers acknowledged the significance of the topic, they raised several important concerns. We believe that providing conceptual clarification, adding important methodological details, as well as addressing potential confounds will further strengthen this paper.

      We thank the editor for the suggestions.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Please, provide the actual ethical approval #.

      We have added the ethical approval number in the revised manuscript (P 36): “The ethical committee of the University of Trento approved the experimental protocol (Approval Number 2019-018),”

      (2) Thirty-two participants were tested. Please report how you estimated the sample size was sufficient to test your working hypothesis.

      We thank the editor for pointing out this omission. In the revised manuscript, we have added an explanation for our choice of sample size (p. 36): “The sample size was chosen to align with the upper range of participant numbers reported in previous fMRI studies that successfully detected sequence or distance effects in the hippocampus (N = 15–34; e.g., Morgan et al., 2011; Howard et al., 2014; Deuker et al., 2016; Garvert et al., 2017; Theves et al., 2019; Park et al., 2021; Cristoforetti et al., 2022).”

      (3) All MRI figures: please orient the reader; left/right should be stated.

      In the revised manuscript, we have added labels to all MRI figures to indicate the left and right hemispheres.

      (4) In Figure 3A-B, the clear lateralization of the activation is not discussed in the Results or in the Discussion. Was it predicted?

      We thank the editors for highlighting this important point regarding hemispheric lateralization. The right-lateralization observed in our findings is indeed consistent with previous literature. In the revised manuscript, we have expanded our discussion to emphasize this aspect more clearly.

      For the parietal cortex, we now note (Page 17-18): “The negative correlation between activation in the right posterior parietal cortex (PPC) and sequential distance has previously been reported in an fMRI study by Gauthier and van Wassenhove (2016b). In their paradigm, participants were instructed to mentally position themselves at a specific time point and judge whether a target event occurred before or after that point. The authors identified a similar region (peak voxel MNI coordinates: 42, −70, 40), closely corresponding to the activation observed in the present study (peak voxel MNI coordinates: 39, −70, 35). In both studies, activation in this region increased as the target event approached the self-positioned time point, consistent with evidence suggesting that the posterior parietal cortex supports egocentric representations. Neuropsychological studies have further shown that patients with lesions in the bilateral or right PPC exhibit ‘egocentric disorientation’ (Aguirre & D’Esposito, 1999), characterized by an inability to localize objects relative to themselves (e.g., Case 2: Levine et al., 1985; Patient DW: Stark, 1996; Patients MU: Wilson et al., 1997, 2005).”

      For the hippocampus, we have added (Page 19): “Previous research has shown that hippocampal activation correlates with distance (e.g., Morgan et al., 2011; Howard et al., 2014; Garvert et al., 2017; Theves et al., 2019; Viganò et al., 2023), and that distributed hippocampal activity encodes distance information (e.g., Deuker et al., 2016; Park et al., 2021). Most studies have reported hippocampal effects either bilaterally or predominantly in the right hemisphere, whereas only one study (Morgan et al., 2011) found the effect localized to the left hippocampus.”

    1. Reviewer #1 (Public review):

      In this study, the authors provide an integrated proteogenomics pipeline to enable the discovery of novel peptides in an Ewing sarcoma cell line (A673). To identify novel full-length resolved isoforms, they performed long-read RNA sequencing (Oxford Nanopore Technology). Then, to increase the chance of detecting Ewing-specific neopeptides, the authors combined two approaches: a multi-protease digestion and a multi-dimensional proteomics approach.

      Given the importance of novel isoforms and cryptic sites in neoantigen discovery and its putative applications in immunotherapy, this method and resource paper are of interest for the Ewing community and potentially for a broader cancer audience. The originality of this paper relies mostly on this optimized method to discover novel peptides (long-read sequencing with multiprotease, multi-dimensional trapped ion mobility spectrometry parallel accumulation-serial fragmentation mass spectrometry). Although, to my knowledge, no study combining long-read sequencing and proteomics methods has been published on Ewing Sarcoma, this study appears limited by a few aspects:

      (1) The study is restricted to the analysis of a single cell line (A673). The authors should consider extending the analysis to other Ewing cell lines.

      (2) The characterization of the 1121 non-canonical transcripts can be improved. How many are just splice variants of known genes, and how many are bona fide neogenes? In this respect, the definition of what the authors call neogene is quite unclear. Is a transcript with a new exon reported as a neogene? Is a transcript with a new start site reported as a neogene? It should be clearly indicated which categories of Figure 4B are reported on Figure 4D. A general flow chart would be very useful to help follow the analysis process.

      (3) Similarly, the authors detect 3216 A673 specific proteins with no match in SwissProt. This number decreases to 72 "putative non-canonical proteoforms with unique peptides after BLASTp" against Uniprot. Again, a flow chart would conveniently enable one to follow the step-by-step analysis.

      (4) Finally, only 17 spectral matches are suggested to be derived from non-canonical proteoforms. It would be important to compare the spectrum of these detected peptides with that of synthetic peptides. Such an analysis would enable us to assess the number of reliably detected proteoforms that can be expected in an Ewing sarcoma cell line.

      (5) It is very unclear what the authors want to highlight in Supplementary Figure 5. Is it that non-canonical transcripts are broadly expressed in normal tissue? Which again raises the question of definitions of neogenes, non-canonical... Apparently, this figure shows that these non-canonical transcripts contain a large part of canonical sequences, which account for the strong signal in many normal tissues. A similar heatmap could be presented, including only the non-canonical sequences of the non-canonical transcripts. This figure should also include Ewing sarcoma samples.

    2. Reviewer #2 (Public review):

      The paper from Kulej et al. reports a set of tools for proteogenomic analysis of cancer proteomes. Their approach utilizes modern methods in long-read RNA sequencing to assemble a proteome database that is specific to Ewing sarcoma-derived A673 cells. To maximize proteome coverage and therefore increase the odds of detecting cancer-specific alterations at the protein level, the authors use multiple enzymes (trypsin, gluC, etc.) to digest cellular proteins and then perform multidimensional peptide fractionation. Peptide samples are then analyzed by LC-MS/MS using data-dependent and data-independent schemes on a timstof mass spectrometer. Proteogenomics is an important area of investigation for cancer research and does require new informatics tools.

      The authors describe an end-to-end workflow where they claim to have optimized four different steps:

      (1) Assembly of a sample-specific protein database using long-read transcriptomic data.

      (2) Use of 8 different proteolytic enzymes to maximize diversity of peptides.

      (3) Multiple stages of peptide fractionation using SCX and high pH rp chromatography.

      (4) Utilize acquisition methods on the timstof mass spec to provide MS/MS data from single-charged peptides and multiply-charged peptides.

      The authors published two earlier versions of ProteomeGenerator (versions 1 and 2) in the Journal of Proteome Research. In these earlier versions, 'ProteomeGenerator' was the set of software tools designed to integrate DNA and RNA sequencing to create a sample-specific protein database. To test the performance of each ProteomeGenerator version, the authors generated LC-MS/MS data using a combination of trypsin and LysC in one paper, and trypsin, LysC, and GluC in the other. In both papers, they performed some level of peptide fractionation prior to LC-MS/MS. They acquired LC-MS/MS data on a Thermo Q-Exactive in one paper and a Thermo Orbitrap mass spec in the other paper.

      In the current paper, the primary innovation is the use of long-read sequencing to potentially improve the quality of the sample specific protein database. The other three components noted above are incremental compared to the authors' previous two papers and generally accepted practices in the field of proteomics. To note one example, the authors previously digested proteins using three enzymes and now use eight. Similarly, they are now using a timstof Bruker mass spec instead of one from Thermo. The detailed descriptions around the use of many enzymes and peptide fractionation, etc., create a very technically oriented paper, similar to or more so than the authors' earlier papers in J. Proteome Research. So, while there is enthusiasm for the use of long-read sequencing across biomedical research, the impact here for proteogenomic applications is somewhat lost with all of the technical description for experimental details that are not particularly innovative. In this respect, the report is not well matched to a broad readership.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a well-structured and interesting manuscript that investigates how herbivorous insects, specifically whiteflies and planthoppers, utilize salivary effectors to overcome plant immunity by targeting the RLP4 receptor.

      Strengths:

      The authors present a strong case for the independent evolution of these effectors and provide compelling evidence for their functional roles.

      Weaknesses:

      Western blot evidence for effector secretion is weak. The possibility of contamination from insect tissues during the sample preparation should be avoided.

      Below are some specific comments and suggestions to strengthen the manuscript.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      (1) Western blot evidence for effector secretion:

      The western blot evidence in Figure 1, which aims to show that the insect protein is secreted into plants, is not fully convincing. The band of the expected size (~30 kDa) in the infested tissues is very weak. Furthermore, the high and low molecular weight bands that appear in the infested tissues do not match the size of the protein in the insects themselves, and a high molecular weight band also appears in the uninfested control tissues. It is difficult to draw a definitive conclusion that this protein is secreted into the plants based on this evidence. The authors should also address the possibility of contamination from insect tissues during the sample preparation and explain how they have excluded this possibility.

      Thank you for pointing this out. One or two bands between 25 and 35 kDa were specifically identified in B. tabaci-infested plants, but not in the non-infested plants, and the smaller, high-intensity band is the same size as that of BtRDP in salivary glands. This experiment has been repeated six times. In the current version, we repeated this experiment and included a salivary gland sample as a positive control, which showed the same molecular weight as the specific band in the infested sample. It is noteworthy that in the experiment shown in the current version, only the smaller, high-intensity band appeared, while the low-intensity band did not. The detection of a protein within infested plant tissue is a key criterion for validating the secretion of salivary effectors, an approach supported by numerous studies in this field. Furthermore, our previous LC-MS/MS analysis of B. tabaci watery saliva identified six unique peptides matching BtRDP, providing independent evidence for its presence in saliva. Therefore, as we now state in the manuscript, “the detection of BtRDP in infested plants (Fig. 1a) and in watery saliva (Fig. S1) collectively indicates that BtRDP is a salivary protein”.

      Regarding the higher molecular weight band that is present in both infested and non-infested samples, we agree that it most likely represents a non-specific band, which is a common occurrence in Western blot assays. Such bands are sometimes used to indicate comparable sample loading. To address the possibility of contamination by insect tissues, we wish to clarify that all insects and deposited eggs were carefully removed from the infested leaves prior to sample processing. Moreover, BtRDP is undetectable at the egg stage, so no BtRDP-associated band would be detected even in the event of egg contamination. We have revised the Methods section to explicitly state this procedure:

      “After feeding, the eggs deposited on the infested tobacco leaves were removed. The leaves showing no visible insect contamination were immediately frozen in liquid nitrogen and ground to a fine powder.”

      (2) Inconsistent conclusion (Line 156 and Figure 3c):

      The statement in line 156 is inconsistent with the data presented in Figure 3c. The figure clearly shows that the LRR domain of the protein is the one responsible for the interaction with BtRDP, not the region mentioned in the text. This is a critical misrepresentation of the experimental findings and must be corrected. The conclusion in the text should accurately reflect the data from the figure.

      We apologize for any confusion caused by the original phrasing. In our previous manuscript, the description “NtRLP4 without signal peptides and transmembrane domains” referred specifically to the truncated construct NtRLP4<sub>(23-541)</sub> used in the experiment. To prevent any misunderstanding, we have revised the sentence in the updated version to state explicitly: “Point-to-point Y2H assays reveal that NtRLP4<sub>(23-541)</sub> (a truncated version lacking the signal peptide and transmembrane domains) interacts with BtRDP<sup>-sp</sup>”.

      (3) Role of SOBIR1 in the RLP4/SOBIR1 Complex:

      The authors demonstrate that the salivary effectors destabilize the RLP4 receptor, leading to a decrease in its protein levels and a reduction in the RLP4/SOBIR1 complex. A key question remains regarding the fate of SOBIR1 within this complex. The authors should clarify what happens to the SOBIR1 protein after the destabilization of RLP4. Does SOBIR1 become unbound, targeted for degradation itself, or does it simply lose its function without RLP4? This would provide further insight into the mechanism of action of the effectors.

      Thank you for the suggestion. In the current version, we assessed the impact of BtRDP on NtSOBIR1 following NtRLP4 destabilization. The results showed that while NtRLP4-myc accumulation was markedly reduced, NtSOBIR1-flag levels remained unchanged, suggesting that destabilization of NtRLP4 did not affect NtSOBIR1 accumulation.

      (4) Clarification on specificity and evolutionary claims:

      The paper's most significant claim is that the effectors from both whiteflies and planthoppers "independently evolved" to target RLP4. While the functional data is compelling, this evolutionary claim would be more convincing with stronger evidence. Showing that two different effector proteins target the same host protein is a fascinating finding but without a robust phylogenetic analysis, the claim of independent evolution is not fully supported. It would be valuable to provide a more detailed evolutionary analysis, such as a phylogenetic tree of the effector proteins, showing their relationship to other known insect proteins, to definitively rule out a shared, but highly divergent, common ancestor.

      We appreciate the reviewer’s valuable suggestion to investigate a potential evolutionary link between BtRDP and NlSP104. Our initial analysis already indicated no detectable sequence similarity. To address this point more thoroughly, we attempted a phylogenetic analysis. However, we were unable to generate a meaningful alignment due to a complete lack of conserved amino acid sequences. Therefore, we conducted a comparative genomics analysis by blasting both proteins against the genomic or transcriptomic data of 30 diverse insect species. This analysis revealed that RDP is exclusively present in Aleyrodidae species, and SP104 is exclusively present in Delphacidae species (Table S1). Taken together, the absence of sequence similarity, the distinct protein structures, and the lineage-specific distributions lead us to conclude that BtRDP and NlSP104 are highly unlikely to be homologous and thus did not originate from a common ancestor.

      (5) Role of SOBIR1 in the interaction:

      The results suggest that the effectors disrupt the RLP4/SOBIR1 complex. It is not entirely clear if the effectors are specifically targeting RLP4, SOBIR1, or both. Further experiments, such as a co-immunoprecipitation assay with just RLP4 and the effector, could clarify if the effector can bind to RLP4 in the absence of SOBIR1. This would help to definitively place RLP4 as the primary target.

      We appreciate the reviewer’s insightful comments regarding whether the effector preferentially targets RLP4, SOBIR1, or both. In our study, we conducted reciprocal co-immunoprecipitation assays using RLP4 and BtRDP as controls. These assays showed that BtRDP interacts with RLP4 but does not interact with SOBIR1, supporting the conclusion that SOBIR1 is unlikely to be a direct target of BtRDP. We fully agree that testing the interaction between RLP4 and BtRDP in the absence of SOBIR1 would further strengthen the conclusion. However, we were unable to obtain N. tabacum SOBIR1 knockout mutants, and therefore could not experimentally assess whether the RLP4–BtRDP interaction persists in planta without SOBIR1. Nevertheless, our yeast two-hybrid assays demonstrate that RLP4 and BtRDP can directly interact, indicating that their association does not strictly depend on SOBIR1. Together, these results support the interpretation that RLP4 is the primary target of BtRDP, while SOBIR1 is not directly engaged by the effector.

      (6) Transcriptome analysis (Lines 130-143):

      The transcriptome analysis section feels disconnected from the rest of the manuscript. The findings, or lack thereof, from this analysis do not seem to be directly linked to the other major conclusions of the paper. This section could be removed to improve the manuscript's overall focus and flow. If the authors believe this data is critical, they should more clearly and explicitly connect the conclusions of the transcriptome analysis to the core findings about the effector-RLP4 interaction.

      Thank you for the suggestion. As you and Reviewer #2 pointed out, the transcriptomic analysis was not closely linked to the major conclusions of the paper, and it yielded little information. Therefore, we have removed these analyses to improve the manuscript’s overall focus and flow.

      (7) Signal peptide experiments (Lines 145 and beyond):

      The experiments conducted with the signal peptide (SP) are questionable. The SP is typically cleaved before the protein reaches its final destination. As such, conducting experiments with the SP attached to the protein may have produced biased observations and could lead to unjustified conclusions about the protein's function within the plant cell. We suggest the authors remove the experiments that include the signal peptide.

      Thank you for pointing this out. The SP was retained to direct the target proteins to the extracellular space of plant cells. In theory, the SP is cleaved to yield the mature protein. This methodology is widely used in effector biology. For example, the SP directs Meloidogyne graminicola Mg01965 to the apoplast, where it functions in immune suppression, whereas Mg01965 without the SP fails to exert this function (10.1111/mpp.12759). In our study, the SP of BtRDP was expected to guide the target protein to the extracellular space, facilitating its interaction with RLP4. Moreover, the observed protein sizes of BtRDP with and without the SP in transgenic plants were identical, suggesting successful SP cleavage. Therefore, we have retained the experiments involving the SP in the current version.

      (8) Overly strong conclusion and unclear evidence (Line 176):

      The use of the word "must" on line 176 is very strong and presents a definitive conclusion without sufficient evidence. The authors state that the proteins must interact with SOBIR1, but they do not provide a clear justification for this claim. Is SOBIR1 the only interaction partner for NtRLP4? The authors should provide a specific reason for focusing on SOBIR1 instead of demonstrating an interaction with NtRLP4 first. Additionally, do BtRDP or NlSP694 also interact with SOBIR1 directly? The authors should either tone down their language to reflect the evidence or provide a clearer justification for this strong claim.

      Thank you for pointing this out. In the current version, the word “must” has been toned down to “may” due to insufficient supporting evidence. In this study, SOBIR1 was chosen because it has been widely reported to be required for the function of several RLPs involved in innate immunity. However, it remains unclear whether SOBIR1 is the only interaction partner of NtRLP4. In the current version, we have clarified the rationale for focusing on SOBIR1 prior to the experiments “The receptor-like kinase SOBIR1, which contains a kinase domain, has been widely reported to be required for the function of RLPs involved in innate immunity (Gust & Felix, 2014)” and discussed that “Although NtRLP4 interacts with SOBIR1, this alone does not confirm that it operates strictly through this canonical module. Evidence from other RLPs shows that co-receptor usage can be flexible, and some RLPs function partly or conditionally independent of SOBIR1. Therefore, a more definitive assessment of NtRLP4 signaling will therefore require genetic dissection of its co-receptor dependencies, including but not limited to SOBIR1.”. In addition, the direct interaction between BtRDP and SOBIR1 was experimentally tested, and the results showed that BtRDP failed to interact with SOBIR1.

      Minor Comments

      (9) The statement in the abstract, "However, it remains unclear how these invaders are able to overcome receptor perception and disable the plant signaling pathways," is not entirely accurate. The fields of effector biology and host-pathogen interactions have provided significant insight into how pathogens and pests manipulate both Pattern-Triggered Immunity (PTI) and Effector-Triggered Immunity (ETI). While the specific mechanism described in this paper is novel, the broader claim that the field is unclear on these processes weakens the initial hook of the paper. A more precise framing of the problem would be beneficial, perhaps by stating that the specific mechanisms used by these particular herbivores to target RLP4 were previously unknown.

      Thank you for this insightful comment. We agree that the original statement in the abstract overstated the lack of understanding in the field. In the current version, we have refined the sentence to more accurately reflect the current state of knowledge, emphasizing that while microbial suppression of plant immunity has been extensively studied, the strategies used by herbivorous insects to overcome receptor-mediated defenses remain less understood. The revised sentence now reads as follows: “Although the mechanisms used by microbial pathogens to suppress plant immunity are well studied, how herbivorous insects overcome receptor-mediated defenses remains unclear”.

      (10) The introduction is heavily focused on Pattern Recognition Receptors (PRRs), which, while central to the paper's findings, gives a somewhat narrow view of the plant's defense against herbivores. It would be beneficial to briefly acknowledge the broader context of plant defenses, such as physical barriers, direct chemical toxicity, and indirect defenses, before narrowing the focus to the specific molecular interactions of PRRs that are the core of this study. This would provide a more complete picture of the "arms race" between plants and herbivores.

      Thank you for this valuable suggestion. We agree that the original introduction focused too narrowly on pattern-recognition receptors (PRRs). In the current version, we have expanded the introductory section to provide a broader overview of plant defense mechanisms. Specifically, we now acknowledge the multiple layers of plant defenses, including physical barriers (e.g., cuticle and cell wall), chemical defenses (e.g., toxic secondary metabolites and anti-nutritive compounds), and indirect defenses mediated by herbivore-induced volatiles. This addition provides a more complete context for understanding the molecular interactions discussed in this study. The revised paragraph now reads as follows: “Plants have evolved sophisticated defense systems to survive constant attacks from pathogens and herbivorous insects. These defenses operate at multiple levels, including physical barriers such as the cuticle and cell wall, chemical defenses involving toxic secondary metabolites and anti-nutritive compounds, and indirect defenses that attract natural enemies of herbivores through the emission of herbivore-induced volatiles. Beyond these general strategies, plants also rely on highly specialized molecular immune responses that allow them to detect and respond rapidly to invaders.”

      (11) The figure legends are generally clear, but some could be more detailed. For instance, in Figure 2, it would be helpful to explicitly state what each bar represents in the graph and to include the statistical test used. Please ensure all panels in all figures have clear labels.

      Thank you for this helpful suggestion. We have revised the legend of Fig. 2 and other figures to provide more detailed information for each panel. Specifically, we now explicitly describe what each bar represents in the graphs and specify the statistical test used. In addition, we ensured that all panels are clearly labeled. These changes improve clarity and allow readers to better interpret the data.

      (12) The methods section is comprehensive, but it would be helpful to include more specifics on the statistical analyses used. For example, the type of statistical test (e.g., t-test, ANOVA) and the software used should be mentioned for each experiment.

      Thank you for your suggestion. We have revised the Methods section (Statistical analysis) to provide more detailed information on the statistical analysis used for each experiment.

      (13) The manuscript's overall impact is weakened by the inclusion of unnecessary words and a few grammatical issues. A focused revision to tighten the language would make the major findings stand out more clearly. For example, on page 2, line 18, "in whitefly Bemisia tabaci, BtRDP is an Aleyrod..." seems to have an incomplete sentence. A thorough proofreading for typos and grammatical errors is highly recommended to improve the overall readability.

      Thank you for your suggestion. We have carefully revised the abstract and the manuscript to improve clarity, readability, and grammatical correctness. In addition, we sought the assistance of a professional English editor to thoroughly proofread and polish the manuscript, ensuring that the language meets high academic standards.

      (14) The discussion section is strong, but it could benefit from a more explicit connection between the findings and the broader ecological implications. For instance, how might the independent evolution of these effectors in different insect species impact plant-insect co-evolutionary dynamics?

      We thank the reviewer for the valuable suggestion. In the current version, we have added a paragraph in the Discussion section highlighting the broader ecological and evolutionary implications of our findings. Specifically, we discuss how the independent evolution of RLP4-targeting effectors in different insect lineages may drive plant-insect co-evolution, influence selection pressures on both plants and herbivores, and potentially shape defense diversification across plant communities. This addition helps to link our molecular findings to ecological outcomes and co-evolutionary dynamics.

      (15) The sentence on line 98, which reads " A few salivary proteins have been reported to attach to salivary sheath after secretion" seems to serve an unclear purpose in the introduction. It would be helpful for the authors to clarify its relevance to the surrounding context or to the paper's overall argument. Its inclusion currently disrupts the flow of the introduction and makes it difficult for the reader to understand its intended purpose.

      We thank the reviewer for the comment. We have revised the paragraph to clarify the relevance of salivary sheath localization to the study. Specifically, we now introduce the role of the salivary sheath as a potential scaffold for effector delivery and explicitly link previous reports of sheath-associated salivary proteins to our observation that BtRDP localizes to the salivary sheath after secretion.

      (16) The writing in lines 104-106 is both grammatically inconsistent and overly wordy. The authors switch between present and past tense ("is" and "was"), and the sentences could be made more concise to improve the clarity and flow of the text. Also check entire paper.

      We thank the reviewer for pointing this out. We have revised the sentence to improve grammatical consistency and clarity, and also checked the manuscript for similar issues. The sentence is now split into two concise statements. In addition, we have thoroughly checked the entire manuscript for similar tense inconsistencies and overly wordy sentences, and have made revisions throughout to ensure consistent past tense usage and improved readability.

      (16) The sentences on lines 111-113 are quite wordy. The core conclusion, which is that the protein affects the insect's feeding probe, could be expressed more simply and directly to improve clarity and flow. I suggest rephrasing this section to be more concise and to highlight the primary finding without the added language.

      We thank the reviewer for the helpful suggestion. We have revised the sentences to make them more concise and to emphasize the main finding that BtRDP influences the whitefly’s feeding behavior, as follows: “Compared with the dsGFP control, dsBtRDP-treated B. tabaci showed a marked reduction in phloem ingestion and a longer pathway duration, indicating that BtRDP is required for efficient feeding (Fig. 2c).”

      (17) On line 118, the authors mention "subcellular location." It is not clear where the protein is localized. The authors should explicitly state the specific subcellular compartment of the protein, as this is crucial for understanding its function and interaction with other proteins.

      We thank the reviewer for this valuable comment. To clarify the subcellular localization of BtRDP, we have revised the manuscript accordingly. The transgenic line overexpressing the full-length BtRDP including the signal peptide (oeBtRDP) is expected to localize in the apoplast (extracellular space), whereas the line expressing BtRDP without the signal peptide (oeBtRDP<sup>-sp</sup>) is likely retained in the cytoplasm.

      (18) Lines 121-128, the description of the fecundity and choice assays in this section is overly wordy. The authors should present the main conclusion of these experiments more directly and concisely. The key finding is that the protein affects feeding behavior; this central point is somewhat lost in the detailed, and sometimes repetitive, phrasing.

      We thank the reviewer for this suggestion. In the revised manuscript, we have simplified the description of the fecundity and two-choice assays to highlight the main conclusion, as follows: “Fecundity and two-choice assays showed that BtRDP, whether localized in the apoplast (oeBtRDP) or cytoplasm (oeBtRDP<sup>-sp</sup>), enhanced whitefly settling and oviposition compared with EV controls (Fig. 2d-i; Fig. S10), indicating that BtRDP promotes whitefly feeding behavior regardless of its subcellular location.”

      (19) Line 148, the manuscript mentions experiments involving transformation, but the transformation efficiency is not provided. Please include the transformation efficiency for all transformation experiments, as this is crucial for the reproducibility of the results.

      We thank the reviewer for raising this point. We would like to clarify that no transformation experiments were performed in this section. The experiments described involved Y2H screening using BtRDP<sup>-sp</sup> as a bait to identify interacting proteins from a N. benthamiana cDNA library. Therefore, there is no transformation efficiency to report.

      (20) Line 159, the manuscript refers to a sequence similarity around line 159 but does not provide the specific data. It is important to show the actual sequence similarity, perhaps in a supplementary figure or table, to support the claims being made.

      We thank the reviewer for this suggestion. To support our statement regarding sequence similarity, we have added the corresponding alignment figure as Fig. S11.

      (21) Line 159, the manuscript refers to "three randomly selected salivary proteins." It is unclear from where these proteins were selected. The authors should clarify the source of this selection (e.g., a specific database or a previous study) to ensure the methodology is transparent and the results are reproducible.

      We thank the reviewer for raising this point. These proteins were selected based on previous reports (10.1093/molbev/msad221; 10.1111/1744-7917.12856). In the current version, we provide the accession numbers of these proteins in the MS.

      (22) Line 160, the description "NtcCf9 without signal peptide and transmembrane domains" is difficult to understand. It would be clearer and more consistent to use a term like "truncated NtcCf9" and then specify which domains were removed, as this is a standard practice in molecular biology for describing protein constructs.

      We thank the reviewer for this suggestion. We have revised the manuscript to describe the construct as “truncated NtCf9” and specified that the signal peptide and transmembrane domains were removed.

      (23) The phrase "incubated with anti-flag beads" on line 172 is a detail of a routine method. Such details are more appropriate for the Methods section rather than the main text, which should focus on the results and their implications. Please remove such descriptions from the main text to improve readability and flow.

      We thank the reviewer for this suggestion. We have removed the methodological detail from the main text to improve readability. We also checked for similar details throughout the MS.

      I am excited about the potential of this work and look forward to seeing the current version.

      We sincerely thank the reviewer for the positive feedback and encouragement. We appreciate your time and thoughtful comments.

      Reviewer #2 (Public review):

      Summary:

      The authors tested an interesting hypothesis that whiteflies and planthoppers independently evolved salivary proteins to dampen plant immunity by targeting a receptor-like protein.

      Strengths:

      The authors used a wide range of methods to dissect the function of the whitefly protein BtRDP and identify its host target NtRLP4.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      Weaknesses:

      (1) Serious concerns about protein work.

      I did not find the indicated protein bands for anti-BtRDP in Figures 1a and 1b in the original blot pictures shown in Figure S30. In Figure 1a, I do not see the point of showing an unspecific protein band with a size of ~190 kD as a loading control for a protein of ~30 kD.

      The data discrepancy led me to check other Western blot pictures. Similarly, Figures 2d, 3b, 3d, and S15b (anti-Myc) do not correspond to the original blots shown. In addition, the anti-Myc blot in Figure 4i, all blot pictures in Figures 5b, 5h, and S19a appeared to be compressed vertically. These data raised concerns about the quality of the manuscript.

      Blots shown in Figures 3d, 4f, 4g, and 4h appeared to be done at a different exposure rate compared to the complete blot shown in Figure S30. The poor correspondence between the Western blot pictures shown in the figures and the original data might be due to the reduced quality of compressed figures during submission. Nevertheless, clarification will be necessary to support the strength of the data provided.

      We sincerely thank the reviewer for carefully examining our Western blot data and for pointing out these inconsistencies. The discrepancy between the figures in the main text and the original blots (Figure S30) resulted from an oversight during manuscript revision. This manuscript had undergone multiple rounds of revision after submission to another journal. During this process, the main figures and supplementary figures were updated separately, and we mistakenly failed to replace the original blot files with the corresponding current versions.

      Regarding the different exposure rate, the blots shown in the main text were adjusted for overall contrast and brightness to enhance band visibility and presentation clarity, whereas the original images in Figure S30 were raw, unprocessed scans directly from the imaging system. For example, in Author response image 1 below, the output figure was adjusted for overall contrast and brightness to visualize the loading of the input sample. Such adjustment applied across the entire image is an acceptable form of image processing (https://www.nature.com/nature-portfolio/editorial-policies/image-integrity).

      Author response image 1.

      The same figure with brightness and contrast changes across the entire image.

      For the vertical compression, in the previous version, some images were vertically compressed for layout purposes to make the composite figures appear more visually balanced. However, after consulting relevant publication guidelines, we realized that such one-dimensional compression is not encouraged by certain journals as it may alter the original aspect ratio of the image. Therefore, in the manuscript, we have avoided any non-proportional scaling and retained the original aspect ratio of all images.

      We have now carefully rechecked all Western blot data, replaced the outdated raw blot images with the correct corresponding ones, avoided vertical compression, and ensured that the processed figures in the main text match their original data. The revised supplementary figures now accurately reflect the raw experimental results.

      (2) Misinterpretation of data.

      I am afraid the authors misunderstood pattern-triggered immunity through receptor-like proteins. It is true that several LRR-type RLPs constitutively associate with SOBIR1, and further recruit BAK1 or other SERKs upon ligand binding. One should not take it for granted that every RLP works this way. To test the hypothesis that NtRLP4 confers resistance to B.tabaci infestation, the author compared transcriptional profiles between an EV plant line and an RLP4 overexpression line. If I understood the methods and figure legends correctly, this was done without B. tabaci treatment. This experimental design is seriously flawed. To provide convincing genetic evidence, independent mutant lines (optionally independent overexpression lines) in combination with different treatments will be necessary. Otherwise, one can only conclude that overexpressing the RLP4 protein generated a nervous plant. In addition, ROS burst, but not H2O2 accumulation, is a common immune response in pattern-triggered immunity.

      We agree with the reviewer that not every RLP functions through the same mechanism as the canonical SOBIR1–BAK1 pathway. In the current version, we further examined the interaction between the whitefly salivary protein and SOBIR1, and found that they do not interact. However, our interaction assays clearly demonstrated that NtRLP4 does interact with SOBIR1. Whether NtRLP4 functions through, or exclusively through, SOBIR1 remains uncertain, and we have emphasized this limitation in the Discussion section as follows: “Although NtRLP4 interacts with SOBIR1, this alone does not confirm that it operates strictly through this canonical module. Evidence from other RLPs shows that co-receptor usage can be flexible, and some RLPs function partly or conditionally independent of SOBIR1 [39]. A more definitive assessment of NtRLP4 signaling will therefore require genetic dissection of its co-receptor dependencies, including but not limited to SOBIR1.”

      Regarding the transcriptome analysis, our original aim was to explore why B. tabaci showed such a pronounced preference among tobacco plants. As this preference was assessed using uninfested plants, we also performed transcriptome sequencing using plants without B. tabaci treatment. The enrichment analysis demonstrated that the majority of up-regulated DEGs were associated with plant–pathogen interaction, environmental adaptation, MAPK signaling, and signal transduction pathways, while down-regulated DEGs were enriched in glutathione, carbohydrate, and amino acid metabolism. Notably, many DEGs were annotated as RLK/RLPs or WRKY transcription factors, most of which were upregulated, suggesting an enhanced defense state in the NtRLP4-overexpressing plants. The altered expression of JA- and SA-related genes (e.g., upregulation of FAD7 and downregulation of PAL and NPR1) further supported this enhanced defense and hormonal crosstalk. We agree that combining overexpression or knockout lines with insect infestation treatments would provide more direct genetic evidence for NtRLP4-mediated resistance, and we have acknowledged this as an important future direction. Nevertheless, our current data are consistent with the conclusion that NtRLP4 overexpression confers increased resistance to B. tabaci infestation.

      Finally, DAB staining for H2O2 accumulation is also a well-established indicator of PTI responses, and many studies have shown that overexpression of salivary elicitors can trigger such accumulation.

      (3) Lack of logic coherence.

      The written language needs substantial improvement. This impeded the readability of the work. More importantly, the logic throughout the manuscript appeared scattered. The choice of testing protein domains for protein-protein interactions, using plants overexpressing an insect protein to study its subcellular localization, switching back and forth between using proteins with signal peptides and without signal peptides, among others, lacks a clear explanation.

      We appreciate the reviewer’s careful reading and valuable comments regarding the logical coherence of our manuscript.

      (1) To improve the English quality, the entire manuscript has been professionally edited by a certified language-editing service.

      (2) Regarding the rationale for testing protein domains in the protein–protein interaction assays: NtRLP4 is a membrane-anchored receptor-like protein composed of extracellular, transmembrane, and short intracellular domains. We aimed to determine which region of NtRLP4 is responsible for interacting with the salivary protein, as this would help infer the likely site of interaction in planta. In addition, not all RLPs contain a malectin-like domain, and we sought to verify whether the BtRDP–NtRLP4 interaction depends on this domain. To enhance the logical flow, we introduced a brief statement explaining the experimental purpose before presenting the interaction assays in the current version as follows: “These findings raised the question of which domain of NtRLP4 is responsible for binding BtRDP, as identifying the interacting domain could help infer where the salivary protein contacts the receptor in planta. We therefore dissected the NtRLP4 domains accordingly.”

      (3) With respect to using plants overexpressing an insect protein to examine subcellular localization: since both the brown planthopper and the whitefly are non-model species for which stable genetic transformation is technically unfeasible, many previous studies have used Agrobacterium-mediated transient expression or transgenic plant systems to investigate the subcellular localization of insect salivary proteins within host cells. Following these precedents, our study also employed plant systems to determine the localization of the insect protein and to assess how different localizations affect plant defense responses.

      (4) As for switching between constructs with or without signal peptides: the subcellular localization of effectors can influence their biological activity and interactions. Previous studies have used the presence or absence of signal peptides, or replacement with a PR1 signal peptide, to direct protein targeting (for example, Frontiers in Plant Science, 2022, 13:813181). Because salivary sheaths are generally considered to localize in the apoplastic space, we generated two transgenic N. tabacum lines overexpressing BtRDP: one carrying the full-length coding sequence including the signal peptide (oeBtRDP), expected to be secreted into the apoplast, and another lacking the signal peptide (oeBtRDP-sp), likely retained in the cytoplasm. In the current version, we clarified this rationale and added references to similar studies to improve the manuscript’s logic and readability. Details are as follows: “To investigate the role of BtRDP in different subcellular locations of host plants, we constructed two transgenic N. tabacum lines overexpressing BtRDP: one carrying the full-length coding sequence including the signal peptide (oeBtRDP), which is expected to be secreted into the apoplast (extracellular space), and the other lacking the signal peptide (oeBtRDP-sp), which is likely retained in the cytoplasm.”

      Reviewer #3 (Public review):

      Summary:

      In this study, Wang et al. investigate how herbivorous insects overcome plant receptor-mediated immunity by targeting plant receptor-like proteins. The authors identify two independently evolved salivary effectors, BtRDP in whiteflies and NlSP694 in brown planthoppers, that promote the degradation of plant RLP4 through the ubiquitin-dependent proteasome pathway. NtRLP4 from tobacco and OsRLP4 from rice are shown to confer resistance against herbivores by activating defense signaling, while BtRDP and NlSP694 suppress these defenses by destabilizing RLP4 proteins.

      Strengths:

      This work highlights a convergent evolutionary strategy in distinct insect lineages and advances our understanding of insect-plant coevolution at the molecular level.

      Thank you very much for your comments. We have carefully revised the MS following your valuable suggestions and comments.

      Weaknesses:

      (1) I found the naming of BtRDP and NlSP694 somewhat confusing. The authors defined BtRDP as "B. tabaci RLP-degrading protein," whereas NlSP694 appears to have been named after the last three digits of its GenBank accession number (MF278694, presumably). Is there a standard convention for naming newly identified proteins, for example, based on functional motifs or sequence characteristics? As it stands, the inconsistency makes it difficult for readers to clearly distinguish these proteins from those reported in other studies.

      Thank you for your comment. These are species-specific salivary proteins that have not been reported or annotated in previous studies. Because no homologous genes could be identified in other species, there are no existing names or annotations for these proteins. For such lineage-specific salivary proteins, it is common in recent studies to name them according to their experimentally identified functions. For example, a recently reported salivary protein was named SR45-interacting salivary protein (SISP) based on its function (10.1111/nph.70668). Following this convention, we adopted a similar functional naming strategy in this study. We acknowledge that there may not yet be a standardized rule for naming such proteins, and we would be glad to follow a more authoritative naming guideline if possible.

      (2) Figure 2 and other figures. Transgenic experiments require at least two independent lines, because results from a single line may be confounded by position effects or unintended genomic alterations, and multiple lines provide stronger evidence for reproducibility and reliability.

      We appreciate the reviewer’s suggestion. In our study, two independent transgenic lines were used to ensure the reproducibility and reliability of the results. One representative line was presented in the main figures, while data from the second independent line were included in the supplementary figures. To make this clearer, we have emphasized in the manuscript that bioassays were conducted using two independent transgenic lines.

      (3) Figure 3e. Quantitative analysis of NtRLP4 was required. Additionally, since only one band was observed in oeRLP, were any tags included in the construct?

      Thank you for your comment. In the current version, quantitative analysis of NtRLP4 expression has been performed and is now presented in Figure 3. For the oeRLP plants, no tag was fused to NtRLP4; thus, anti-RLP serum was used to detect the target bands. In contrast, oeBtRDP and oeBtRDP-sp were fused with C-terminal FLAG tags, and their detection was carried out using anti-FLAG serum. This information has been clarified in the revised Methods section as follows: “The oeBtRDP and oeBtRDP-sp were fused with C-terminal FLAG tags, while no tag was fused to oeNtRLP4.”

      (4) Figure 4a. The RNAi effect appears to be well rescued in Line 1 but poorly in Line 2. Could the authors clarify the reason for this difference?

      Thank you for pointing this out. We also noticed that the RNAi effect appeared to be better rescued in Line 2 than in Line 1. Based on our measurements, the silencing efficiency of NtRLP4 in RNAi-RLP4 Line 1 was markedly weaker than in Line 2, which likely explains the difference in rescue efficiency. In the current version, we have clarified this point as follows: “Both RNAi-RLP lines showed reduced NtRLP4 levels compared with EV plants, with RNAi-RLP#2 exhibiting a stronger silencing effect (Fig. S19a).” “The differential rescue effect between the two RNAi lines likely resulted from their different NtRLP4 silencing efficiencies, with the lower NtRLP4 level in RNAi-RLP#2 leading to a more complete rescue phenotype.”

      (5) ROS accumulation is shown for only a single leaf. A quantitative analysis of ROS accumulation across multiple samples would be necessary to support the conclusion. The same applies to Figure S16f.

      Thank you for pointing this out. The H2O2 accumulation experiments were repeated five times for Figure 4 and Figure S16f. In the current version, we state in the figure legends that “the experiment is repeated five times with similar results”.

      (6) Figure 4f: NtRLP4 abundance was significantly reduced in oeBtRDP plants but not in oeBtRDP-sp. Although coexpression analysis suggests that BtRDP promotes NtRLP4 degradation in a ubiquitin-dependent manner, the reduced NtRLP4 levels may not result from a direct interaction between BtRDP and NtRLP4. It is possible that BtRDP influences other factors that indirectly affect NtRLP4 abundance. The authors should discuss this possibility.

      Thank you for your valuable suggestion. We agree that the reduced NtRLP4 abundance may not necessarily result from a direct interaction between BtRDP and NtRLP4. In the manuscript, we have further discussed this possibility as follows: “Notably, BtRDP and NlSP694 share no sequence or structural similarity and lack resemblance to known eukaryotic ubiquitin-ligase domains. Their interaction with RLP4s occurs in the extracellular space (Fig. 3d; Fig. 5c), whereas the ubiquitin-proteasome system primarily functions in the cytosol and nucleus [46]. Furthermore, NtRLP4 reduction is observed only in oeBtRDP transgenic plants, not in oeBtRDP-sp plants (Fig. 4f), suggesting that BtRDP exerts its influence on NtRLP4 in the extracellular space. These observations collectively argue against the possibility that BtRDP or NlSP694 possesses intrinsic E3 ligase activity capable of directly ubiquitinating RLP4s within plant cells. Importantly, the reduced NtRLP4 levels may not result from a direct physical interaction between BtRDP and NtRLP4. Instead, BtRDP may indirectly affect RLP4 post-translational modification, thereby accelerating its degradation, which warrants further investigation.”

      (7) The statement in lines 335-336 that 'Overexpression of NtRLP4 or NtSOBIR1 enhances insect feeding, while silencing of either gene exerts the opposite effect' is not supported by the results shown in Figures S16-S19. The authors should revise this description to accurately reflect the data.

      Thank you for pointing this out. We agree that our original statement was not precise, as we measured the insect settling preference and oviposition on transgenic plants, but did not directly assess the feeding behavior of B. tabaci. Therefore, we have revised the description in the manuscript to more accurately reflect our data as follows: “Overexpression of NtRLP4 or NtSOBIR1 in N. tabacum is attractive to B. tabaci and promotes insect reproduction, whereas silencing of either gene exerts the opposite effect.”

      (8) BtRDP is reported to attach to the salivary sheath. Does the planthopper NlSP694 exhibit a similar secretion localization (e.g., attachment to the salivary sheath)? The authors should supplement this information or discuss the potential implications of any differences in secretion localization between BtRDP and NlSP694 for their respective modes of action.

      Thank you for your insightful suggestion. We agree that determining the secretion localization of NlSP694 would provide valuable information for understanding its potential mode of action. Immunohistochemical (IHC) staining is indeed a critical approach for such analysis. However, in this study, we were unable to express NlSP694 in Escherichia coli, and the antibody generated using a synthesized peptide did not show sufficient specificity or sensitivity for IHC detection. Consequently, we were unable to determine whether NlSP694 is attached to the salivary sheath. Therefore, whether BtRDP and NlSP694 act through different modes requires further investigation.

      Recommendations for the authors:

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 1e. The BtRDP-labeled fluorescent signal is difficult to discern. An enlarged view of the target region would be helpful for clarity.

      Thank you for your suggestion. In the current version, an enlarged view of the target region is provided below the figure.

      (2) The finding that BtRDP accumulates in the salivary sheath secreted by Bemisia tabaci is important for understanding the subcellular localization of this protein during actual insect feeding. I suggest moving Figure S5 to the main text.

      Thank you for your suggestion. Figure S5 has been moved to Fig. 1f in the current version.

      (3) Please carefully cross-check the figure numbering to ensure that all in-text citations correspond to the correct figures and panels, e.g., lines 136, 188, 192, and 194.

      Thank you for pointing this out. We corrected them in the current version.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      The manuscript by Haghighi and McPhie et al. builds upon their previous findings by exploring mitochondrial localization as a disease-associated phenotype in mental disorders, particularly in psychotic disorders. They recruited a cohort of patients diagnosed with schizophrenia, schizoaffective disorder, bipolar disorder and MDD. By taking advantage of skin biopsies, they screened patient-derived fibroblasts for aberrant mitochondrial localization and morphology using common staining techniques. They then used a machine learning approach to classify patients into their respective groups, which was effective for BP, SZA and pooled psychotic patients. The authors then developed a single feature for phenotyping, mitoSLOPE, a metric of mitochondrial density distribution across a cell by radial areas. With this metric, psychotic patients tend to have more nuclear-localized than edge-localized mitochondria, whereas MDD patients show a trend for higher edge-to-nucleus distribution. To find candidate drugs, the authors screened publicly available datasets of cells treated with small compounds using mitoSLOPE. Furthermore, the authors applied mitoSLOPE to a CRISPR screen dataset, showcasing the role of mitochondrial dynamics genes and three genes of interest because of their association with psychosis. Finally, they identified the top genes whose KO or overexpression may explain (or reverse) the mitoSLOPE phenotype.

      Overall, the manuscript is well-written, the conclusions are supported within their limitations and this work represents an advancement in the field. I recommend it for publication provided these concerns are addressed:

      Major comments:

      1. The mitoSLOPE measure is very interesting and most likely reflects subtle changes in mitochondrial transport. What does the microtubule network look like in the patient fibroblasts? Are there obvious alterations in, e.g., their posttranslational modifications? Is there a difference in mito transport speed or pausing frequency?
      2. I concur with the exclusion of compounds that obviously alter cell shape, as the authors mention for the cancer therapeutics. Some cancer therapeutics actually affect microtubule dynamics (see 1st point), which may underlie their effect on both cell shape and mitoSLOPE. To understand the mechanism of action, the top hits should also be tested for the integrity of the microtubular network and mitochondrial transport parameters.
      3. While I agree with the authors' reasoning that the observed phenotype could be a result of the disease or the result of a compensatory mechanism, their hypothesis could be experimentally tested by addition of any of the top hits in order to reverse mitoSLOPE in their patient cell lines. It may not have worked for Lithium in their last manuscript, but the mechanism of action of the novel compounds could be cell intrinsic.
      4. Does recreation of the CRISPR cell line in their hands produce the same phenotype?
      5. Additionally, the observed phenotypes could also be a product of the medication taken by the patients. Deeper patient data from the cohort may be relevant to put the findings in context. How were patients diagnosed? Which medications were the patients taking? Was substance abuse present? In Mertens et al, Lithium responders and Lithium non-responders showed a differential mitochondrial response, how does this affect their dataset?
      6. While MDD itself is not a psychotic disorder, it can still present with psychotic features. Was this evaluated during the recruitment? Also important, were they on antipsychotic medication in addition to antidepressant therapy?
      7. The fact that CACNA1C is excluded from the "unbiased" hit discovery (Fig 8) undermines the power of the filtering criteria selected by the authors. Authors should include some discussion around this.

      Minor comments:

      1. Colored images should be made colorblind-accessible. This applies to microscopy images and graphs.
      2. Fig 3: Exact p-values should be reported in the graphs.
      3. Fig. 5 and Fig 7a-b: It is not immediately clear what the lines in these graphs represent. Is it the individual drug/gene hits in a pre-ranked manner?
      4. Fig 6 b-c: should the "m" be capitalized for Molarity?
      5. The annotation of divalproex/valproic acid as a "benzodiazepine receptor agonist" is incorrect. While it is known to enhance GABAergic neurotransmission, the mechanism is thought to be through GABA synthesis rather than GABA-A receptor agonism (see e.g. PMID: 23407051).
      6. Supplementary Fig 3 and 4 could be swapped to match the main text order.
      7. One reference was inaccessible: Anon, Phenomics-Enabled Discovery and Optimization of Small Molecule RBM39 Degraders as Alternative to CDK12 Targeting in High-Grade Serous Ovarian Cancer (HGSOC).

      Significance

      Recently, mitochondria have emerged as mediators of anxious behavior and are increasingly studied in the context of neuropsychiatric disorders. However, the molecular mechanisms that connect altered mitochondrial performance to specific neuropathological conditions are unknown. This study extends our knowledge in this realm. While it is in principle an extension of earlier work from the authors (Cataldo, A.M. et al. Am. J. Pathol. 2010), it has added value due to the application of their automated analysis to publicly available datasets, providing a clear technical advance. This identified known as well as novel compounds that could revert the mitochondrial phenotype and makes this study specifically interesting to an audience interested in translational research. The strength of the manuscript certainly lies in the large number of examples studied and their well-rounded discussion of their findings. It is limited by the fact that the phenotype of neuropsychiatric conditions is studied in peripheral cells, and thus may not be a simple cell-autonomous response but a compensatory, systemic response that is not easy to replicate in a fibroblast in isolation. No mechanistic insight is gained on the underlying cell biology in the current format.

    1. Reviewer #1 (Public review):

      Summary:

      In this manuscript, Seegren and colleagues demonstrate that in a mouse model of neonatal E. coli meningitis, loss of endothelial toll-like receptor 4 (TLR4) leads to a marked decrease in transcriptional dysregulation across multiple leptomeningeal cell types, a decrease in vascular permeability, and a decrease in macrophage abundance. In contrast, loss of macrophage TLR4 had less pronounced effects. Using cultured wild-type and TLR4-knockout endothelial cells, the authors further demonstrate that TLR4-NF-κB signaling leads to reversible internalization of the tight junction protein claudin-5, establishing a potential mechanism of increased vascular permeability. Finally, the authors use RNA-sequencing of wild-type and TLR4-knockout endothelial cells to define the TLR4-dependent cell-autonomous transcriptional response to E. coli.

      Strengths:

      (1) The authors address an important, well-motivated hypothesis related to the cellular and molecular mechanisms of leptomeningeal inflammation.

      (2) The authors use model systems (mouse conditional knockouts and cultured endothelial cells) that are appropriate to address their hypotheses. The data are of high quality.

      Weaknesses:

      (1) The authors perform single-nucleus RNA-seq on dissected leptomeninges from control and E. coli-infected mice across three genotypes (WT, Tlr4MKO, and Tlr4ECKO). A major discovery from this experiment, as summarized by the authors, is: "Tlr4ECKO mice exhibited a global attenuation of infection-induced transcriptional responses across all major leptomeningeal cell types, as judged by the positions of cell clusters in the UMAP." This conclusion could be considerably strengthened by improving the qualitative and quantitative analysis.

      (2) The authors interpret E. coli infection-induced increases in leptomeningeal sulfo-NHS-biotin as evidence of compromised BBB integrity (i.e., extravasation from the vasculature) (Results, page 7), but another possible route in this context is sulfo-NHS-biotin entry from the dura across a compromised arachnoid barrier. The complete rescue in Tlr4ECKOs is strongly suggestive that the vascular route dominates, but it would strengthen the work if the authors could assess arachnoid barrier fidelity (e.g. via immunohistochemistry). At a minimum, authors should mention that the sulfo-NHS-biotin signal in this context may represent both vascular and arachnoid barrier extravasation.

      (3) The authors state that "deletion of TLR4 prevented both NF-κB nuclear translocation and Cldn5 internalization in response to E. coli (Figure 4A-D)" (Results, page 9). In Figures 4C and D, however, there is no indicator of a statistical test directly comparing the two genotypes. A comparison of within-genotype P-values should not be used to support a genotype difference (PMID: 34726155).

      (4) In the first paragraph of the Results, the authors summarize the meningeal layers as (1) pia, (2) subarachnoid space, (3) arachnoid, and (4) dura, and then state "The second and third layers constitute the leptomeninges." This definition of leptomeninges seems to omit the pia, which is widely considered part of the leptomeninges (PMID: 37776854).

      (5) The Cdh5-CreER/+;Tlr4 fl/- mouse lacks TLR4 in all endothelial cells (i.e., in peripheral organs as well as CNS/leptomeninges), and, as the authors note, the periphery is exposed to E. coli. It would be helpful if the authors could comment in the Discussion on the possibility that peripheral effects (e.g., peripheral endothelial cytokine production, changes to blood composition as a result of changes to peripheral endothelial permeability) may contribute to the observed leptomeningeal phenotypes.

    2. Reviewer #2 (Public review):

      Summary:

      The authors use a postnatal mouse model of E. coli bacterial meningitis and a mouse brain endothelioma cell line combined with cell-type-specific gene deletion to study the function of endothelial TLR4, a cell surface receptor that recognizes Gram-negative bacterial wall components, in the local leptomeningeal (LPM) response with a focus on endothelial barrier breakdown mediated by TLR4. Single-cell transcriptional profiling and imaging studies using whole-mount preps of the LPM support that the inflammatory response of LPM endothelial cells, CD206+ local macrophages, LPM fibroblasts, and arachnoid barrier cells is abrogated in the endothelial-specific KO of TLR4, pointing to a role for endothelial TLR4 in the local LPM response. Culture studies using Bend3.1 cells (a mouse brain endothelioma cell line) support a direct role for TLR4 in the bacteria-mediated inflammatory response and in internalization of Cldn5 via the endosomal-lysosomal pathway, resulting in loss of barrier integrity.

      Strengths:

      The local LPM cell response in meningitis and the role of specific LPM cells in inflammation and CNS barrier breakdown have not been extensively studied, despite ample evidence for a primary immune response in the meninges in human patients and in animal models. The authors employ a robust, multi-model approach using both in vivo and in vitro models with cell-type-specific knockout to study the function of TLR4 in the brain endothelial cell response. The authors nicely combine functional barrier assays with IF for junctional localization in their experimental design, and they delve into potential mechanisms of Cldn5 internalization using markers of endosomal-lysosomal pathway localization. The authors also describe a new type of barrier assay using a streptavidin-coated plate upon which barrier-forming cell cultures can be plated; this could be a very useful alternative or complement to other size-selective barrier assays and presumably could work for other barrier-forming cell types, likely epithelial cells.

      Weaknesses:

      (1) There are no measures of bacterial burden in peripheral organs, blood, the LPM, or the brain in the TLR4 endothelial cKO mice. Lack of TLR4 in endothelial cells could prevent bacterial 'access' to the LPM and brain, essentially preventing meningitis and leading to a lack of inflammatory responses in the LPM-located cells simply because no bacteria are present. Bacteremia may also be reduced, as might inflammatory responses in peripheral organs with TLR4-deficient peripheral endothelium. Bacterial counts and inflammatory measures in peripheral organs and blood are important to better understand the mechanism(s) underlying the reduced inflammatory profile in LPM cells and the absence of LPM endothelial breakdown in the Tlr4 endothelial cKO mice. In other words, does deleting TLR4 in ECs protect against the development of meningitis by somehow blocking bacterial access to the LPM (this would be supported by low or no CFU counts in infected Tlr4 endothelial cKO mice), or is it, as the authors appear to propose in Figure 1J, that ECs, via TLR4, are the only cells responding to the bacteria to trigger the immune cascade in the LPM? More data are needed to resolve this, as this is a major claim of the paper.

      (2) The authors look at the underlying cortical response (cerebral vasculature for ICAM and immune cells) but do not use markers that could identify microglia (Iba1), the primary resident immune cell (CD206 is not useful, at this stage, in perivascular macrophages that are extremely sparse in the postnatal brain). This would be important to better study the impact on CNS resident immune cell morphological activation.

      (3) The authors suggest that Cldn5 junctional localization is selectively disrupted upon bacterial exposure, mediated by TLR4. They base this on studying PECAM, GLUT-1, ZO-1, and β-catenin (all normally located at junctions or the cell surface in cultured bEnd.3 cells) in relation to Cldn5 localization (normally high). It is possible that these are also impacted by bacterial exposure (perhaps through different mechanisms). A better measure would be to apply to these markers the cyto/PM measure used for Cldn5 in Fig. 4D, or to use intensity measurements.

      (4) The discussion could benefit from delving more into the prior literature on E. coli-mediated breakdown of junctions in cultured human brain microvascular endothelial cell models and on critical host-pathogen interactions of the bacteria with ECs (PMID: 14593586), and how this might involve TLR4.

      (5) It would be important to discuss how their results relate to earlier studies on TLR4-/- and TLR2-/- global knockout mice and protection vs vulnerability to development of meningitis (see PMCID: PMC3524395). That paper showed that TLR4 global KO mice have increased susceptibility to death from meningitis and much higher CFU counts in the CNS. In this manuscript and their prior work (Wang et al., 2023), this group has shown that both global TLR4-/- mutants and their EC-specific KO have reduced barrier permeability, but we don't have any information about CFU or susceptibility to death from meningitis in their models.

    3. Reviewer #3 (Public review):

      Summary:

      This study investigates the molecular underpinnings of immune responses in the leptomeninges in neonatal bacterial meningitis. Bacterial meningitis is a major disease burden, particularly for neonates, and it has previously been noted that the meningeal immune environment in infants is permissive to opportunistic infection (Kim et al., Sci Immunol, 2023). There is less known about the contribution of the stromal compartment to meningeal immune responses. Seegren et al. interrogate the role of leptomeningeal endothelium in host defence in E. coli infected neonatal mice using mouse genetic tools to delete the LPS receptor Tlr4 from either endothelial cells (using Cdh5-CreER) or macrophages (using LysM-Cre). The authors use snRNAseq, cleared cortical mounts, and in vitro work to define the impact of E. coli infection on leptomeningeal endothelial cells. This study uses a range of innovative techniques to probe the role of the stromal compartment in meningitis.

      Strengths:

      This study makes excellent use of cleared cortical mounts to examine the biology of the leptomeninges, in particular changes to the endothelium, in unprecedented detail. In combination with high-quality sequencing data, these mounts provide new insights into the impact of meningitis on the leptomeninges. The data presented by the authors are of very high quality.

      Weaknesses:

      The weaknesses of the study were in terms of interpretation and perhaps study design.

      (1) Most importantly, the authors need to provide additional validation of their conditional knockout models. The authors need to confirm that the Cdh5-CreER does not impact leptomeningeal fibroblasts and to confirm gene deletion in macrophages.

      (2) The authors could also strengthen the paper by providing data on the impact of these conditional knockout models on the course of meningitis and bacterial burden.

      (3) Finally, it is perhaps not surprising that Tlr4 is required for meningitis responses with E. coli. However, it is unclear if these findings can be generalised to other, more common, meningitis infections (streptococcal/pneumococcal).

      (4) There are additional minor issues; for instance, the arachnoid fibroblast 2 population appears to closely resemble dural border cells.

      (5) The cell line model (bEnd.3) is a relatively low-fidelity model of BBB endothelial cells, and this should be acknowledged.

      With these caveats, it is difficult to be certain that the endothelium alone is the driver of meningeal immune responses in meningitis, and what the impact of these is.

    1. Reviewer #1 (Public review):

      This is an excellent paper from Dr. Yokoyama and colleagues. The experiments are technically demanding, given the very low cell numbers and the challenges of working with implantation sites at gestational days 6.5, 10.5, and 14.5. Overall, the impact of TGF-β receptor II deficiency in the NK lineage on uterine trNK cell numbers and litter size is convincing, and the authors' conclusions are well supported by the data. Less convincing, however, is the claim that the decrease in trNK cells is compensated by an increase in cNK cells; rather, the absence of TGF-β receptor II appears to result in an overall reduction of NK/ILC1 cells.

      Major Points:

      (1) Figure 1A and B

      Although a trend is evident, it does not appear that the absolute number of cNK cells at day 14.5 is significantly changed from day 6.5.

      (2) Figure 2E

      The authors state, "This reduction of uterine trNK cells was accompanied by a concomitant increase in the absolute number and frequency of CD49b+Eomes+ cNK cells within the pregnant uterus of TGF-βRIINcr1Δ dams (Figure 2 D, E)." The number of cNK cells appears relatively low (visually ~1,000-1,300), and although the difference is statistically significant, its physiological relevance is unclear. More importantly, this modest increase does not correlate with the marked decrease in trNK and ILC1 populations, as cNK cells do not appear to accumulate. In my opinion, the conclusion "Collectively, these findings indicate that a TGF-β-driven differentiation pathway directs the conversion of peripheral cNK cells into uterine trNK cells during murine pregnancy" should be slightly toned down.

      (3) Figures 2-4

      It is unclear whether the littermate controls are floxed mice or floxhet-Ncr1iCre mice. This distinction is important, as Ncr1iCre expression itself could potentially lead to a phenotype.

    2. Reviewer #2 (Public review):

      In their manuscript "TGF-β drives the conversion of conventional NK cells into uterine tissue-resident NK cells to support murine pregnancy", Yokoyama and colleagues investigate the role of Tgfbr2 expression by NK cells in the formation of tissue-resident uterine NK cells and subsequent importance in murine pregnancy. By transferring congenic splenic conventional NK cells into pregnant mice, they show conversion of circulating NK cells into uterine ivCD45 negative tissue-resident NK cells. When interfering with the formation of uterine trNK cells, spiral artery remodelling was impaired, fetal resorption rates were increased, and litter sizes were reduced.

      Generally, this is a research topic of high interest, yet the manuscript is lacking detailed mechanistic insights, and some questions remain open. At the current state, the data represent an interesting characterisation of the Tgfbr2-fl/fl Ncr1-Cre mice in pregnancy, but considering (a) the recent publication by the group (Reference 17) on the role of Eomes+ cNK cells during pregnancy, (b) the previously described role of Tgfbr2 and autocrine TGFb expression for uterine NK cell differentiation in virgin mice (also cited by the authors), and (c) the well-known relevance of uterine NK cells during pregnancy, additional experiments addressing the specific role of Tgfb during pregnancy would help to improve novelty and significance of the manuscript. To this end, the following aspects should be discussed and, where applicable, experimentally addressed by the authors:

      (1) The authors suggest cNK extravasation and local differentiation into iv- trNK.

      Can it be estimated how much this process contributes to the trNK pool vs. a potential local proliferation of already existing trNK? How do absolute numbers of CD49a+ Eomes+ trNK change during pregnancies? (In Figure 1A, the cell numbers of CD49a+ Eomes+ trNK seem to go down dramatically between gd 6.5 and 14.5). The plot in 1B could also include absolute numbers of ILC1s and trNKs. Would recruited cNK cells compensate for a potential loss of CD49a+ Eomes+ trNK?

      (2) Figure 1C

      2.5 million cNK cells have been transferred, but only very few cells can be detected within the uterus (concatenated FACS plot shown). What may represent the limit to generate uterine trNK out of cNK? Is the niche supporting cNK-trNK differentiation limited? Is it only a specific subset of (splenic) cNK capable of differentiating into trNK? Is gd 0.5 the optimal timepoint for the transfer? Is there continuous recruitment of cNK into the uterus and differentiation into trNK, or is it enhanced at specific timepoints of pregnancy? Could there be local proliferation of cNK-derived trNK? This could be studied by proliferation dye dilution of WT cNK cells in this transfer-setup.

      (3) The authors should consider inducible Tgfbr2 deletion (e.g. with Tamoxifen-inducible Cre) to enable development of the uterine NK compartment in virgin mice and only ablate trNK differentiation during pregnancy. This could help to estimate the turnover of cNK into trNK, or to understand if constant cNK recruitment is required to form the uterine trNK compartment during pregnancy.

      (4) Did the authors consider transfer of Tgfbr2-floxed Ncr1-Cre cNK in the same setup as in Fig. 1C? This experiment could confirm the requirement of Tgfbr-dependent signalling for cNK to trNK conversion during pregnancy versus effects of Tgfb signals on trNK numbers in the uterus at steady state (before pregnancy).

      (5) Figures 2D/E

      The authors should state that ILC1s are reduced in the virgin uterus of female Tgfbr2-floxed or Tgfb1-floxed Ncr1-Cre mice and cite the relevant work (the Ref #29 discussed in this context did not show that?). It would be helpful to include an analysis of all three uterine ILC subsets in steady state. This could help to answer the question if the cNK cell changes are pregnancy-specific or a general phenomenon in Tgfbr2-floxed Ncr1-Cre mice.

      (6) Figure 2E

      Please phrase the statement about the "concomitant increase" of cNKs more carefully, since this increase is much less pronounced than the very strong reduction (absence) of trNKs in Tgfbr2-floxed Ncr1-Cre mice. Do the authors suggest that cNKs are halted at this stage and cannot differentiate into trNK, based on these data?

      (7) Figure 3/4

      Can the reduced litter size and the abnormal spiral artery formation be rescued by transfer of WT cNK into Tgfbr2-floxed Ncr1-Cre mice?

    1. Parent-School Partnership: A Pillar of Academic Success

      Analytical Summary

      This briefing document analyzes the key points of the conference organized by Parents Partenaires en Éducation (PPE) Ontario on the crucial importance of partnership between families and schools.

      The central message is that student success does not rest on the school alone but on close, proactive collaboration in which parents act as "co-educators."

      Parental engagement is structured around three dimensions: personal investment, cognitive investment, and institutional engagement.

      For families, particularly immigrant families, this involvement is a major lever for dismantling unconscious biases, affirming cultural identity, and ensuring successful integration.

      The analysis shows that inclusion is a deliberate choice and that a sense of belonging can emerge only when parents' voices take an active part in decision-making within school councils and committees.

      --------------------------------------------------------------------------------

      1. Conceptual Framework of Parental Engagement

      Parental engagement is not limited to supervising homework; it is a multidimensional investment that directly influences the child's academic performance and socio-emotional well-being.

      The Three Dimensions of Engagement

      According to the scientific literature cited, engagement breaks down as follows:

      | Dimension | Description | Concrete examples |
      | --- | --- | --- |
      | Personal investment | Aspirations and interest shown in the child's school life. | Conversations about the day; interest in classmates and activities. |
      | Cognitive investment | Support with tasks and respect for school structures. | Supervising homework, visiting the library, respecting rules (e.g., use of electronic devices). |
      | Institutional engagement | Actual presence and participation in decision-making. | Participation in school councils, parent committees, meetings, and active volunteering. |

      --------------------------------------------------------------------------------

      2. Identity and Values: Foundations of the Partnership

      Parents' identity and values should not be left at the school door. They are the filters through which the partnership is expressed.

      Identity as a decoding tool: The school system needs to know families' sociocultural identity in order to adapt the services it offers (teachers, social workers).

      Decolonizing the mind: For immigrant parents, it is essential to articulate their identity in the face of culture shock and to take pride in their origins so that the child feels safe in the school environment.

      The values filter: Major decisions about the child's education should be passed through the filter of family values. Involvement in school councils makes it possible to challenge the "one size fits all" approach of school policies.

      --------------------------------------------------------------------------------

      3. Analysis of the Benefits of Collaboration

      Collaboration between parents and the school creates a "win-win" dynamic for all stakeholders.

      For the Student

      Stronger confidence: The child is proud to see their family involved and valued.

      Increased motivation: Parents' closeness stimulates the student's engagement in their own learning.

      Reduced bias: Close collaboration can change how school staff see the child, sometimes turning a negative perception (e.g., hyperactivity perceived as a disorder) into recognition of positive traits (e.g., curiosity and creativity).

      For Parents

      Smoother communication: Direct exchanges with teachers make it easier to resolve issues quickly.

      Agents of change: Parents can influence policies (e.g., dress code, introduction of uniforms, financial literacy).

      Combating isolation: Involvement fosters social and cultural integration, especially for newcomers.

      For School Staff

      Better cultural understanding: Parents help teachers decode students' behavior from a culturally appropriate perspective.

      Operational support: Parental volunteering (e.g., accompanying a museum visit) enriches the educational experience.

      --------------------------------------------------------------------------------

      4. Diversity, Inclusion, and Belonging

      A crucial distinction is drawn among these three concepts to guide parental action:

      1. Diversity: A statistical fact (numbers, quotas, linguistic and cultural plurality).

      2. Inclusion: An individual and collective choice. It is the will to welcome others and to integrate actively.

      3. Belonging: The final stage, reached only when minority voices are included in discussions and decision-making.

      --------------------------------------------------------------------------------

      5. Examples of Impact Through Proactive Engagement

      The source highlights several cases in which parental initiative transformed the school environment:

      Cultural accommodation: A proposal for a quiet corner for prayer allowed a student to practice their faith safely, bringing the values of home and school into harmony.

      Affirming identity: A session of African storytelling and dance transformed one student's perception of her traditional clothing, from shame to pride.

      Curricular innovation: One parent's initiative led a school council to adopt financial literacy as a priority.

      Strategic redirection: The close relationship between a mother and a teacher made it possible to redirect a student toward a program better suited to the student's profile (International Baccalaureate), thereby changing the student's academic trajectory.

      --------------------------------------------------------------------------------

      6. Conclusion and Call to Action

      The document concludes that lack of time is often a perceived barrier rather than a real one. One hour a month given to the school council can be enough to exert a positive influence.

      Key messages for the future:

      • Parents are the first educators; the school provides instruction, and parents provide upbringing.

      • Parental involvement is the only effective way for the school system to know and respect the identity of the families it serves.

      • Every parent has the power to influence and must choose to be an agent of change to ensure a pluralistic society enriched by its differences.

    1. Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

      Comments on revisions:

      The authors have successfully addressed my previous questions and concerns. The public comments above reflect my views on the initial submission and, in my opinion, will remain helpful for general readers. Given this, I do not have additional public comments and will keep my previous public review unchanged.

    2. Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses, including t-tests, ANOVAs, and post-hoc tests, were conducted, and the results are clearly reported.

      Comments on revisions:

      The authors have addressed our comments in their revised manuscript and in their response to the reviewers. We don't have any further suggestions or comments.

    3. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The manuscript titled "The distinct role of human PIT in attention control" by Huang et al. investigates the role of the human posterior inferotemporal cortex (hPIT) in spatial attention. Using fMRI experiments and resting-state connectivity analyses, the authors present compelling evidence that hPIT is not merely an object-processing area, but also functions as an attentional priority map, integrating both top-down and bottom-up attentional processes. This challenges the traditional view that attentional control is localized primarily in frontoparietal networks.

      The manuscript is strong and of high potential interest to the cognitive neuroscience community. Below, I raise questions and suggestions to help with the reliability, methodology, and interpretation of the findings.

      Thank you for a nice summary of the key points of our study. Below you will find our reply to your questions.

      (1) The authors argue that hPIT satisfies the criteria for a priority map, but a clearer justification would strengthen this claim. For example, how does hPIT meet all four widely recognized criteria, such as spatial selectivity, attentional modulation, feature invariance, and input integration, when compared to classical regions such as LIP or FEF? A more systematic summary of how hPIT meets these benchmarks would be helpful. Additionally, to what extent are the observed attentional modulations in hPIT independent of general task difficulty or behavioral performance?

      Great suggestions! For the first suggestion, we have included a clearer justification in the Discussion section of the manuscript (lines 405-406). For the second, all participants practiced the tasks before scanning, and task accuracy exceeded 90%, suggesting that the tasks were not overly demanding. Although ceiling effects limit the interpretability of behavioral-performance correlations, we argue that higher task demands would likely require greater attentional effort, leading to stronger modulation in hPIT, which aligns with our findings.

      (2) The authors report that hPIT modulation is invariant to stimulus category, but there appear to be subtle category-related effects in the data. Were the face, scene, and scrambled images matched not only in terms of luminance and spatial frequency, but also in terms of factors such as semantic familiarity and emotional salience? This may influence attentional engagement and bias interpretation.

      The response of hPIT is not sensitive to stimulus category, but attentional modulation in hPIT is slightly stronger for faces than for scenes and scrambled images. Although the faces used in the task had neutral expressions and the scene pictures were also neutral, we acknowledge that we cannot entirely rule out the possibility that semantic familiarity or emotional salience contributed to the subtle category-related effects in the results of Experiment 3. This limitation has been noted in the Discussion section of the manuscript (lines 440-442).

      (3) The result that attentional load modulates hPIT is important and adds depth to the main conclusions. However, some clarifications would help with the interpretation. For example, were there observable individual differences in the strength of attentional modulation? How consistent were these effects across participants?

      Yes, individual differences exist. In the manuscript, we have included individual subject data points in Figure 6B. No data point exceeded three standard deviations from the group mean, suggesting that the attentional modulation effects were generally consistent across participants.
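
      As an illustration of the three-standard-deviation criterion mentioned in this reply, here is a minimal sketch. The function names `beyond_k_sd` and `max_possible_z` and the example values are hypothetical and are not the authors' code or data. The second helper reflects a general property of the sample standard deviation: with n values, no single point can lie more than (n - 1)/√n sample SDs from the mean, which is about 3.61 for n = 15, so a 3-SD rule is a fairly lenient screen at this sample size.

```python
from statistics import mean, stdev


def beyond_k_sd(values, k=3.0):
    """Return indices of values more than k sample SDs from the mean."""
    m = mean(values)
    s = stdev(values)  # sample SD (n - 1 denominator)
    return [i for i, v in enumerate(values) if abs(v - m) > k * s]


def max_possible_z(n):
    """Largest |z| a single point can reach among n values (sample SD)."""
    return (n - 1) / n ** 0.5


# Hypothetical per-participant modulation indices, invented for illustration.
modulation = [0.42, 0.55, 0.38, 0.61, 0.47, 0.50, 0.44, 0.58,
              0.36, 0.52, 0.49, 0.63, 0.41, 0.46, 0.54]

print("flagged indices:", beyond_k_sd(modulation))
print("upper bound on |z| for n = 15:", round(max_possible_z(15), 2))
```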

      (4) The resting-state data reveal strong connections between hPIT and both dorsal and ventral attention networks. However, the analysis is correlational. Are there any complementary insights from task-based functional connectivity or latency analyses that support a directional flow of information involving hPIT? In addition, do the authors interpret hPIT primarily as a convergence hub receiving input from both DAN and VAN, or as a potential control node capable of influencing activity in these networks? Also, were there any notable differences between hemispheres in either the connectivity patterns or attentional modulation?

      Although it is hard to infer the direction of information flow from fMRI because of its low temporal resolution, we agree that, beyond resting-state connectivity, task-based functional connectivity analyses could provide additional information about whether hPIT serves as a convergence node or a control hub. We have conducted task-based functional connectivity analyses, specifically psychophysiological interaction (PPI) analyses, using data from Experiment 2, which revealed task-modulated right hPIT connectivity with FFA, LOp, and TPJ, suggesting that hPIT may allocate attentional resources to object-processing regions following priority map generation (lines 378-383). Given the limited number of significant PPI results and the inherent constraints of fMRI in capturing fast or transient attention-related interactions, the present data do not allow us to determine the role of hPIT. Future studies combining effective connectivity or causal perturbation methods (e.g., DCM, TMS-fMRI) would be ideal to test whether hPIT acts as a control node influencing activity within DAN and VAN.

      We also observed modest hemispheric asymmetries in connectivity—for instance, both left and right hPIT showed stronger connectivity with right-hemisphere attention nodes. This has been described in the results part of manuscript (line 373-377).

      (5) A few additional questions arise regarding the anatomical characteristics of hPIT: How consistent were its location and size across participants? Were there any cases where hPIT could not be reliably defined? Given the proximity of hPIT to FFA and LOp, how was overlap avoided in ROI definition? Were the functional boundaries confirmed using independent contrasts?

      The size and location of hPIT were relatively consistent across subjects, as shown in Supplementary Figure 1, where the voxel size and location are reported for individual subjects. This consistency is also demonstrated by Figure 4C.

      We avoided overlap with the FFA and LOp by manually delineating hPIT, which was defined by conjunction maps across three tasks, and by excluding overlapping voxels. The FFA was defined using an independent contrast (Exp3 contrast [face-scene]), and the LOp location was defined by anatomical parcellation (Glasser et al., 2016).

      Reviewer #2 (Public review):

      Summary

      This study investigates the role of the human posterior inferotemporal cortex (hPIT) in attentional control, proposing that hPIT serves as an attentional priority map that integrates both top-down (endogenous) and bottom-up (exogenous) attentional processes. The authors conducted three types of fMRI experiments and collected resting-state data from 15 participants. In Experiment 1, using three different spatial attention tasks, they identified the hPIT region and demonstrated that this area is modulated by attention across tasks. In Experiment 2, by manipulating the presence or absence of visual stimuli, they showed that hPIT exhibits strong attentional modulation in both conditions, suggesting its involvement in both bottom-up and top-down attention. Experiment 3 examined the sensitivity of hPIT to stimulus features and attentional load, revealing that hPIT is insensitive to stimulus category but responsive to task load - further supporting its role as an attentional priority map. Finally, resting-state functional connectivity analyses showed that hPIT is connected to both dorsal and ventral attention networks, suggesting its potential role as a bridge between the two systems. These findings extend prior work on monkey PITd and provide new insights into the integration of endogenous and exogenous attention.

      Strengths

      (1) The study is innovative in its use of specially designed spatial attention tasks to localize and validate hPIT, and in exploring the region's role in integrating both endogenous and exogenous attention, as prior works focus primarily on its involvement in endogenous attention.

      (2) The authors provided very comprehensive experiment designs with clear figures and detailed descriptions.

      (3) A broad range of analyses was conducted to support the hypothesis that hPIT functions as an attentional priority map -- including experiments of attentional modulation under both top-down and bottom-up conditions, sensitivity to stimulus features and task load, and resting-state functional connectivity. These analyses showed consistent results.

      (4) Multiple appropriate statistical analyses - including t-tests, ANOVAs, and post-hoc tests - were conducted, and the results are clearly reported.

      Thank you for a nice summary of the key points and strengths of our study.

      Weaknesses

      (1) The sample size is relatively small (n = 15), and inter-subject variability is big in Figures 5 and 6, as seen in the spread of individual data points and error bars. The analysis of attention-modulated voxel map intersections appears to be influenced by multiple outliers.

      We agree that the sample size (n = 15) is not ideal, and we acknowledge that some data points in Figures 5 and 6 appear to be potential outliers. However, according to conventional outlier detection criteria, all data points fell within three standard deviations of the group mean and were therefore retained for analysis.
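      As a minimal sketch of the three-standard-deviation criterion mentioned above: the function name and the example values below are hypothetical, and this is not the analysis code used in the study. It flags any value lying more than k sample standard deviations from the group mean.

```python
from statistics import mean, stdev


def flag_outliers(values, k=3.0):
    """Return one boolean per value: True if it lies more than k sample
    standard deviations from the mean of all values (hypothetical helper)."""
    group_mean = mean(values)
    group_sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [abs(v - group_mean) > k * group_sd for v in values]


# Hypothetical modulation indices for 15 participants (illustrative only).
example_effects = [0.8, 0.9, 1.0, 1.1, 1.2] * 3
print(any(flag_outliers(example_effects)))
```

      With small samples this criterion is lenient, because a single extreme value also inflates the standard deviation it is compared against.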

      Moreover, the attention-modulated voxel intersection map shown in Figure 4C is insensitive to outliers, because the plotted intersection is based on the number of subjects.

      (2) The authors acknowledge important limitations, including the lack of exploration of feature-based attention and the temporal constraints inherent to fMRI.

      Yes, we have mentioned these limitations in the discussion.

      (3) Prior research has established that regions such as the prefrontal cortex (PFC) and posterior parietal cortex (PPC) are involved in both endogenous and exogenous attention and have been proposed as attentional priority maps. It remains unclear what is uniquely contributed by hPIT, how it functionally interacts with these classical attentional hubs, and whether its role is complementary or redundant. The study would benefit from more direct comparisons with these regions.

      In this study, we defined the ROI based on the intersection across three different types of spatial attention tasks, which is a stricter criterion. The results did not reveal spatial attentional modulation across tasks in regions other than PITd. This could be due to the lack of lateralized responses in PFC/PPC. To evaluate whether a region qualifies as a priority map, we applied four widely accepted criteria (as mentioned in the Introduction). While dorsal and ventral attention network (DAN and VAN) regions can be considered supportive components of the priority map system, our findings suggest that among the regions tested, only hPIT fully meets all criteria. In Experiment 2, we included regions such as VFC (as part of PFC) and IPS (as part of PPC), and our findings suggest these areas are more involved in top-down attention. In the revision, we have performed additional analyses on PPC (IPS) and PFC (FEF, VFC), shown in Figure S2.

      (4) The functional connectivity analysis is only performed on resting-state data, and this approach does not capture context-dependent interactions. Task-based data analysis can provide stronger evidence.

      We acknowledge that resting-state FC is limited in assessing task-specific communication. To further investigate the role of hPIT, we have conducted a PPI analysis, which revealed task-modulated right hPIT connectivity in attention allocation (lines 378-383).

      (5) The study does not report whether attentional modulation in hPIT is consistent across the two hemispheres. A comparison of hemispheric effects could provide important insight into lateralization and inter-individual variability, especially given the bilateral localization of hPIT.

      We thank the reviewer for this suggestion. hPIT was localized bilaterally using the same intersection-based method in Experiment 1. We have now performed additional analyses and found hemispheric differences in hPIT attentional modulation (Experiment 2). In addition, in Experiment 3, the difference in load modulation (averaged across stimulus categories) between left and right hPIT was not significant. These results are reported in the Results section of the manuscript (lines 347-351).

    1. The meditation training involved six 60-minute group sessions (held over 7 weeks, because of religious holidays) with 20–30 participants per group. All sessions were led by a stress-management specialist (Sandra M. Finkel) with extensive experience practicing and teaching LKM. The median number of sessions attended was five (M = 4.3, SD = 1.8). At the first session, participants were given a CD that included three guided meditations of increasing scope, led by the workshop instructor. During Week 1, participants practiced a meditation directing love and compassion toward themselves. During Week 2, the meditation added loved ones. During subsequent weeks, the meditation built from self, to loved ones, to acquaintances, to strangers, and finally, to all living beings. The first meditation lasted 15 min, and the final one lasted 22 min.

      LKM and the different levels the meditation group was exposed to: 1) guided CD meditation led by instructor, 2) added loved ones, 3) acquaintances, 4) strangers, 5) all living beings.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      General statements.

      We thank the reviewers for their positive response and useful suggestions on our manuscript. They recognize the ‘proof of concept’ nature of the work and the importance of extending the number of human mutation-specific DMD mouse models from one to five for preclinical research. We feel that the quality of the manuscript has been improved by implementing the reviewers’ suggestions.

      Reviewer 1.

      OPTIONAL - From the point of view of the reviewer, it seems plausible to use CRISPR/Cas9 to "clean up" the original hDMDmdx mouse line by selectively removing one of the YACs forming the tail-to-tail tandem in the mouse genome. Once such a single-copy mouse line is generated (and proven viable?), any subsequent rearrangement of the hDMD transgene would prove much less challenging. Such a mouse line would also better represent the human model, where only one DMD copy is carried on the X chromosome.

      The reviewer gives the optional suggestion that the generation of these models could have been combined with the removal of one of the copies of the YAC to extend the use of the new models to CRISPR-based therapies. This is correct, but we note that when the data on the removal of a copy of the YAC were published, our new models were already generated and in different stages of QC, colony building and analysis. The procedure described by Chey et al could be used on our new models, but this would require additional time and funding and is therefore outside the scope of this manuscript.

      The labels in figure 2B and 3A would benefit from showing the PCR fragment lengths as well as the sizes of obtained hDMD exon deletions. One could also include an additional figure panel demonstrating the principle of ASO-induced exon skipping.

      Reviewer #1 also has a minor comment regarding the exact deletions in figure 2B and 3A. For fig. 2B, he/she suggests including the sizes of the PCR fragments next to the gel. For the gel of PCR1 in particular, which detects the deleted YAC copy, this would not be very informative, as the size can be (and is) different for different clones depending on the NHEJ-mediated repair in the specific clone. Sizes are only of interest for each specific clone, and adding them all would make a very messy figure. The important message from this gel is the presence of any fragment, as the undeleted copy is not amplified under the conditions used. For the gel of PCR2 the opposite is the case: the PCR fragment shown is simply the undeleted YAC copy, and here we are only interested in the absence of the PCR fragment.

      We thank the reviewer for the suggestion of adding the deletion sizes to fig 3A. This made us realize that an additional table with the details of the mutant alleles in all models had been omitted, and we apologize for this error. With the revised version we include details on the size of the deletions and their genomic coordinates (in the human genome as it is in the human YAC) of each of the new models (revised Sup. Table 1). We trust that adding these details will clarify this reviewer’s minor comment.

      The reviewer requests to include an additional figure panel demonstrating the principle of ASO-induced exon skipping. We have now added this to the revised version of the manuscript (new fig. 5).

      The study is fairly limited in scope and will be of primary interest to those working in the DMD field.

      We are aware of 9 clinical trials for exon 51 and 53 skipping that are ongoing or were recently stopped. For four of these compounds, companies have a license to our hDMDdel52/mdx mouse model, and one of these studies has been published. An additional 7 clinical trials are planned or ongoing for exon 44, 45 and 50 skipping, for which the newly developed models are being or can be used for preclinical studies.

      Reviewer 2.

      To further strengthen the rigor of the study, it would be valuable to include an analysis of potential off-target effects of CRISPR editing, particularly given that double targeting of two YAC copies was required. This is especially important for germline edits, as off-target mutations could introduce confounding phenotypes in the resulting mice. Demonstrating minimal or absent off-target activity would increase confidence in the specificity and safety of the generated models.

      There has indeed been one major study suggesting a large number of CRISPR-induced off-target mutations in mouse models. However, this publication was rapidly questioned by multiple groups for having used the wrong control animals and the original publication was retracted (https://doi.org/10.1038/nmeth0518-394a). Another study at that time, using the correct controls, did not find mutations that could be attributed to CRISPR-induced off-target mutations. A more recent study analysed founder animals from transgenic projects using 163 different guide RNAs and concluded ‘In total, only 4.9% (8/163) of guides tested have detectable off-target activity, at a rate of 0.2 Cas9 off-target mutations per founder analysed. In comparison, we observe ~1,100 unique variants in each mouse regardless of genome exposure to Cas9 indicating off-target variants comprise a small fraction of genetic heterogeneity in Cas9-edited mice.’ In short, the background mutation rate in mice is much higher than the Cas9 off-target mutation rate. In addition to this, we only used guide RNAs that did not have any predicted off-target sites (according to the CRISPOR tool; https://crispor.gi.ucsc.edu/crispor.py) on the same chromosome or in protein coding sequences, so that any undetected off-target mutation will rapidly be lost in the subsequent breeding. We also would like to refer the reviewer to the ‘referee cross-commenting remark’ from reviewer #3 on this topic.

      The validation of the dystrophic phenotype is generally convincing. However, the authors should clarify how "human dystrophin" is detected in the deletion models. Since only part of the dystrophin gene in these mice is humanized (the remainder is murine), it is important to specify, also in the results, which antibody was used and which epitope/exon it recognizes. If the antibody targets a deleted exon in a given model, this could lead to misinterpretation of the dystrophin signal. Providing this clarification would ensure the conclusions regarding dystrophin expression are fully supported.

      This question is based on the incorrect assumption that only part of the DMD gene in these models is humanized. As described in the original publication on the YAC transgenics, the complete human gene is in the YAC. Here, we deleted a particular exon from this complete human DMD gene. In combination with the mdx allele, these mice lack the full-length mouse and human dystrophin isoforms expressed in muscle. As mentioned in the materials section, the human dystrophin protein was detected with the Mandys 106 antibody (recognizing exon 43; amino acids 2063-2078), which only has reactivity with human dystrophin according to the product specification of Sigma Aldrich. We confirmed this in wild-type mouse tissue, which showed no dystrophin signal with this antibody. In fig 4 we confirm the lack of human dystrophin in the deletion models using this antibody. The mouse and human dystrophin protein was detected with the AB154168 antibody of Abcam (recognizing the last 100 amino acids of the C-terminal part of the protein), which has reactivity with both mouse and human. Thus, neither antibody targeted a deleted exon. For the exon skipping validation, solely the Abcam antibody was used, as none of the deleted or skipped exons was recognized by this antibody. Information regarding the targeted protein region has now been added to the materials section.

      Additionally, to further strengthen the characterization of the muscular dystrophy phenotype, the authors could quantify muscle fibre size and the percentage of centrally nucleated fibres, both of which are widely accepted quantitative markers of ongoing degeneration/regeneration in DMD models.

      and

      The validation of exon skipping in the new hDMD deletion models is convincing at the molecular level. However, since the ASOs were injected into both gastrocnemius and triceps muscles, it would be helpful to include at least a brief characterization of the triceps, even in supplementary data, as different muscles can show slightly different pathology and responses. Additionally, while the molecular readouts (RT-PCR and Western blot) demonstrate restoration of dystrophin expression, including simple histological analysis, such as H&E staining, could further support functional improvement and reinforce the physiological relevance of exon skipping in these models.

      The proof-of-principle nature of the current manuscript is focused on restoration of dystrophin expression shortly after ASO treatment, and the current sample sizes (n=3 mice per strain) are too limited for actual quantification of histopathological improvements. Furthermore, the timespan between the intramuscular injection and tissue collection (2 weeks) does not allow sufficient time for histopathological improvements to develop. Notably, a large natural history analysis of all these new models is currently ongoing, which includes a large variety of in vivo functional outcome measures and provides a full description of the histopathological aspects of these mice. The proposed characterization of the triceps is now included as supplementary data of the manuscript (Sup. Fig 1).

      Reviewer 3.

      This reviewer starts by pointing out some typos and requesting that some sentences be rephrased for clarification. We appreciate this and have addressed these points in the revised version of the manuscript.

      Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".

      The simple answer to this is that we were (pleasantly) surprised ourselves by the efficiency we obtained in the ES cells (our expectation was based on previous experience generating the del52 model). For animal welfare reasons, we prefer to generate models via ES cells if we expect a long and cumbersome quality control process and/or very low efficiency, as ES cells allow us to do this QC before the actual animals are generated, thus reducing the number of animals generated during the model generation phase. Expecting very low efficiency, we originally picked 10 x 96-well plates of clones for this del44 targeting, but after pre-screening the first two plates (192 clones), we realized this was far more clones than needed, and the additional 8 plates were not analysed. With this much higher than expected efficiency, and the power of the two-step pre-screen described in the manuscript, we decided to try the next model (the del45) directly in zygotes. This proved efficient enough to also generate the last two models directly in zygotes. We can only speculate on the reason for the much higher efficiency than observed for the del52 targeting. Clearly, the fact that we knew of the double integration this time allowed us to develop the successful two-step pre-screen. Another difference is that the del52 model was generated using TALENs as genome editors, whereas now we could use CRISPR/Cas9.

      Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.

      The requested additional details on ASO design and dosing have been added to the materials section of the revised manuscript. The reviewer also pointed out that fig 4 includes protein samples diluted to 10% from both a C57BL/6J and an hDMD/mdx control mouse, and requested a justification for this. We included samples of both wildtype strains to confirm species reactivity of the dystrophin antibodies used, with the AB145168 antibody reacting with both mouse and human protein (showing a dystrophin band in both wildtype samples), and the Mandys106 antibody being specific to human protein only (showing a dystrophin band in the hDMD/mdx control only).

      Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies).

      It is true, as he/she points out, that qRT-PCR data in the original YAC transgenic publication showed double expression of the human transcript, consistent with the double integration. However, fig. 3b in the same paper shows that at the protein level the expression of human DMD is comparable to the mouse protein. We don’t know the reason for the discrepancy between transcript and protein levels in this model, but in the current manuscript we are referring to this protein expression.

      A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4.

      We feel that proper quantification can only be done with the utilization of a standard curve. As we expected no, or trace levels of dystrophin in the deletion models, we only included wildtype samples diluted to 10% of wildtype protein. This prevents us from accurate quantification of the trace dystrophin levels observed in the del45 and del51 models. However, as can be appreciated from fig 4, expression is very minimal. We added information on the marker in the materials section, and indicated the size (85 kDa) in the figure legend.

      Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers not familiar with DMD.

      We added the requested arrows to the pictures of fig. 4B to allow distinction between different histopathological hallmarks, and refer to these in the figure legend.

      Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      This is indeed based on the extension time of the PCR reaction shown in PCR 1 from fig 2B, amplified with primers upstream and downstream of the deleted region (see fig 1 and 2A). However, the approx. 1.6 kb fragment the reviewer refers to is the undeleted-specific amplification shown in Fig 2B PCR 2, which is the result of a primer outside and a primer inside the deleted region (fig 1 and 2A). Amplification of the undeleted copy with the primers used in PCR 1 would give a fragment of 3902 nt. The deletion of exon 44 in the final model is 3584 nt (details are given in the Excel file that was erroneously omitted; see our response to reviewer #1), so the PCR 1 product of the deleted copy in the clone used for the mouse model is 318 nt. It is straightforward to select an extension time that is insufficient for a 3.9 kb fragment but can amplify fragments that are shorter due to the deletion. Even in a clone with a single copy of exon 44 deleted, one would not expect to see the 3902 nt fragment due to preferential amplification of the much shorter mutant band. This has now been clarified in the legend of figure 2 of the revised version of the manuscript.
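      The fragment-size reasoning above can be checked with a short calculation. This is an illustrative sketch only: the two fragment sizes come from the response, whereas the extension rate (a common rule of thumb of roughly 1 kb per minute) and the 30-second extension time are assumptions, since the actual polymerase and cycling conditions are not stated here.

```python
# Fragment sizes quoted in the response (in nucleotides).
UNDELETED_PCR1_PRODUCT_NT = 3902
EXON44_DELETION_NT = 3584

# ASSUMPTIONS for illustration only: neither value is given in the response.
ASSUMED_EXTENSION_RATE_NT_PER_MIN = 1000
ASSUMED_EXTENSION_TIME_S = 30


def deleted_product_size(undeleted_nt, deletion_nt):
    """Size of the PCR 1 product once the deleted region is removed."""
    return undeleted_nt - deletion_nt


def can_amplify(fragment_nt, extension_time_s,
                rate_nt_per_min=ASSUMED_EXTENSION_RATE_NT_PER_MIN):
    """Rough check: can the polymerase copy the whole fragment within
    one extension step at the assumed rate?"""
    max_length_nt = rate_nt_per_min * extension_time_s / 60
    return fragment_nt <= max_length_nt


deleted_nt = deleted_product_size(UNDELETED_PCR1_PRODUCT_NT, EXON44_DELETION_NT)
print(deleted_nt)
print(can_amplify(deleted_nt, ASSUMED_EXTENSION_TIME_S))
print(can_amplify(UNDELETED_PCR1_PRODUCT_NT, ASSUMED_EXTENSION_TIME_S))
```

      Under these assumed conditions the short product from the deleted copy is amplified while the 3902 nt product from the undeleted copy is not, which is the behaviour described for PCR 1.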

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #3

      Evidence, reproducibility and clarity

      The manuscript by van Putten et al. describes the generation and initial characterization of four new mouse models of DMD, based on the previously generated hDMD/mdx murine model, which expressed human dystrophin from a yeast artificial chromosome (YAC) in a DMD null (mdx) background. The four new models are based on the deletion of four Exons (44, 45, 51 & 53), which accounts for most human deletions (hotspot) in DMD.

      The description of the generation of these models using CRISPR/Cas9 gene editing is thorough, and the quality control is adequate. Moreover, preliminary testing of exon skipping therapy using ASO showed it is possible to restore the production of dystrophin protein (albeit truncated) in these models, which increases their translational value. Although the study is valuable and methodologically sound, there are minor points that need to be addressed:

      • A few typos need to be corrected:
        • "Therapeutic approaches aiming to restore dystrophin for DMD are based on the discrepancy between DMD and BMD mutations." This needs to be rephrased to clarify the meaning for readers not familiar with DMD.
        • "Western blot and immune fluorescence analysis on gastrocnemius muscles..." replace "immune fluorescence" with immunofluorescence.
        • "Two weeks after the last injection muscles were isolated, and RNA and protein was isolated from muscle..." protein WERE isolated.
        • "However, gene editing-based therapies could run into the same unpredictable outcome reduced efficiency of a therapy ..." This sentence is confusing, consider rephrasing.
      • Generation of the models: it is not clear why the authors generated line 44 in ES cells, then switched to direct gene editing in zygotes. Was this due to advent of electroporation of zygotes at the time? This may need clarification beyond the sentence "Encouraged by the specificity of our new prescreen workflow and the efficiency of correct targeting of human exon 44 in ES cells, we generated additional models ... directly in mouse zygotes".
      • Antisense oligonucleotide treatment: there is no description of the design of the ASOs beyond their sequence in suppl. Table 4. How were they designed? Moreover, they have been injected at two different doses (i.e., 50ul for Exon 51 & 53; 100ul for Exon 44 & 45). What is the rationale for this? There is no justification in the manuscript.
      • Phenotypic validation of the new models: a description of the mdx line with C57BL/6J mice is mentioned. Is this why Fig.4 includes "10% Bl6" and "10% hDMD/mdx"? If so, this should be clarified in the text (or deleted from the figure). The authors mentioned "As expected, the gastrocnemius of healthy hDMD/mdx mice expressed dystrophin of human origin at wildtype levels". Why would this be expected? If 2 copies of the gene, including the human promoter, are integrated, why would one expect a wildtype level of expression? In fact, in the original paper describing the hDMD/mdx model ('t Hoen et al. 2008), the human transcripts are expressed at 2 to 4-fold higher than their endogenous counterparts (which is in line with the integration of 2 copies). A quantification of the expression levels on Figure 4 should be done (normalized to actinin) to resolve this. The size of the Marker should also be added on Figure 4. Finally, the authors observed histological hallmarks of the disease in the new models (i.e., muscle degeneration and fibrosis). Although obvious on the images, it may be useful to add indications (e.g., arrows) on the images for readers not familiar with DMD.
      • Prescreen PCR of hDMD/mdx ES cells (Fig. 2): the authors mentioned that "The PCR conditions were chosen for not being able to amplify the undeleted allele." What does this mean? Was the elongation time reduced? As per the text, the theoretical size of a WT band is around 1.6kb. Yet, on the gel, bands higher than 1kb are visible for some clones.

      Referee cross-commenting

      The comments from the other reviewers seem fair, reasonable, and should be easily addressed by the authors. The off-target analysis might however be a bit of a stretch, given that (as per published data) the off-target rate is low (i.e., no higher than genetic drift) in mouse zygotes when using CRISPR RNPs, and any potential off-target mutation could easily be segregated out by means of backcrossing.

      Significance

      The four new mouse models generated in this study will advance the field both at the preclinical and the clinical levels, because they more closely recapitulate the human mutations linked to DMD than previous models, while presenting with a translational potential (the authors showed a proof of concept of exon-skipping therapy in these mice).

    1. Reviewer #1 (Public review):

      Summary:

      The article investigates how the Japanese macaque makes gait transitions between quadruped and biped gaits. It presents a compelling neuromechanical simulation that replicates the transition and an interesting analysis based on an inverted pendulum that can explain why some transition strategies are successful and others are not.

      Strengths:

      I enjoyed reading this article. I think it presents an interesting study and elegant modeling approaches (musculoskeletal + inverted pendulum). The study is well conducted, and the results are interesting. I particularly liked how the success of gait transitions could be predicted based on the inverted pendulum and its saddle node stability. I think it makes a useful and interesting contribution to the state of the art.

      Weaknesses:

      The article is already in great shape, but could be improved a bit by:

      (1) Strengthening the comparison to animal data. In particular, videos of the real animal should be included + snapshots of their gaits (quadruped, biped, and transitions).

      (2) Exploring and testing a broader range of conditions. I think it would be very interesting to test gaits and gait transitions on up and down slopes (both with the musculoskeletal model and with the inverted pendulum model). This could be used to make predictions on how the real animal adapts to those conditions. Ideally, this should be tested on the animal as well. I think this could increase (even more) the impact of this work.

      (3) Better explaining several aspects of the PSO optimization.

      (4) (Ideally) performing a sensitivity analysis on the optimized parameters (e.g. variations of ±5, 10, 20%) in order to determine their respective importance and how much their instantiated values have influenced the results.

      (5) Running a spell checker, as there are quite a few typos.
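
      The sensitivity analysis suggested in point (4) could be sketched as a one-at-a-time perturbation. This is a minimal sketch under stated assumptions, not the authors' procedure: `toy_cost`, the parameter names `gain` and `phase`, and their values are hypothetical stand-ins for the simulation objective and the PSO-optimized parameters, which the review does not specify.

```python
from typing import Callable, Dict, List, Tuple


def one_at_a_time_sensitivity(
    cost: Callable[[Dict[str, float]], float],
    optimum: Dict[str, float],
    fractions: Tuple[float, ...] = (0.05, 0.10, 0.20),
) -> Dict[str, List[Tuple[float, float]]]:
    """Perturb each parameter by +/- each fraction, holding the others fixed.

    Returns, per parameter, a list of (signed fraction, cost change vs. baseline).
    """
    baseline = cost(optimum)
    report: Dict[str, List[Tuple[float, float]]] = {}
    for name, value in optimum.items():
        rows = []
        for frac in fractions:
            for sign in (-1.0, 1.0):
                perturbed = dict(optimum)  # copy so other parameters stay at the optimum
                perturbed[name] = value * (1.0 + sign * frac)
                rows.append((sign * frac, cost(perturbed) - baseline))
        report[name] = rows
    return report


# Hypothetical objective for illustration only; NOT the authors' cost function.
def toy_cost(p: Dict[str, float]) -> float:
    return (p["gain"] - 2.0) ** 2 + 0.01 * (p["phase"] - 1.0) ** 2


toy_optimum = {"gain": 2.0, "phase": 1.0}
sensitivity = one_at_a_time_sensitivity(toy_cost, toy_optimum)
```

      Parameters whose perturbation changes the cost most are the ones whose optimized values matter most; a one-at-a-time scheme ignores interactions between parameters, so it is only a first pass.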

    2. Reviewer #2 (Public review):

      Summary:

      This article presents a neuromusculoskeletal (NMS) model of the Japanese macaque. The model is combined with a neural feedforward controller, based on CPG and synergy, that reproduces quadrupedal and bipedal gait as well as the transition between them. The model and controller were validated using experimental data. Results were also compared to an inverted pendulum model to show that the macaque's transition between quadrupedal and bipedal gait relies on this kind of representation for transition and stability. Overall, the article is very interesting, but it sometimes lacks clarity.

      Strengths:

      The model produces impressive results for quadrupedal gait, bipedal gait, and the transition between them, validated by experimental data. NMS controllers based on feedforward controllers are very difficult to fine-tune.

      Weaknesses:

      (1) The movement regulator is not clear and should be better explained. At first, it seems that it is just a new CPG/synergy (feedforward) added, but in the methods, it seems to be a feedback controller.

      (2) It is also not clear what is meant by discretizing the weight for the trigger limb from 0 to 1 (page 8).

      (3) The controller is mainly feedforward, allowing only anticipatory movement. Animals also use reflex-based feedback control. A controller with feedback/reflex could reduce failed attempts in training and better represent the transition.

      (4) There are small typos throughout the article that should be corrected.

    1. Author response:

      eLife Assessment

      This study provides a valuable contribution to understanding how negative affect influences food-choice decision making in bulimia nervosa, using a mechanistic approach with a drift diffusion model (DDM) to examine the weighting of tastiness and healthiness attributes. The solid evidence is supported by a robust crossover design and rigorous statistical methods, although concerns about the interpretation of group differences across neutral and negative conditions limit the interpretability of the results.

      We are grateful for this improved assessment. Below, we provide detailed responses that we believe address the noted concerns about interpreting group differences across conditions. If these clarifications resolve the interpretability concerns, we would be grateful if the editors would consider updating the eLife assessment accordingly.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Using a computational modeling approach based on the Drift and Diffusion Model (DDM) introduced by Ratcliff and McKoon in 2008, the article by Shevlin and colleagues investigates whether there are differences between neutral and negative emotional states in:

      (1) The timings of the integration in food choices of the perceived healthiness and tastiness of food options in individuals with bulimia nervosa (BN) and healthy participants

      (2) The weighting of the perceived healthiness and tastiness of these options.

      Strengths:

      By looking at the mechanistic part of the decision process, the approach has potential to improve the understanding of pathological food choices.

      Weaknesses:

      I thank the authors for revising their manuscript.

      However, I still have major concerns.

      The authors say that they removed any causal claims in their revised version of the manuscript. The penultimate sentence of the abstract still says "bias for high-fat foods predicted more frequent subjective binge episodes over three months". This is a causal claim that I already highlighted in my previous review, specifically for that sentence (see the second sentence of major point 2 in my previous review).

      We appreciate the Reviewer's continued attention to causal language. We acknowledge that our use of the term 'predicted', though intended to refer to statistical prediction in a regression model, could be misinterpreted as implying causation. We have therefore revised this sentence to read: 'bias for high-fat foods was associated with more frequent subjective binge episodes over three months'.

      I also noticed that a comment that I added was not sent to the authors. In this comment I was highlighting that in Figure 2 of Galibri et al., I was uncertain about a difference between neutral and negative inductions of the average negative rating after the induction in the BN group (i.e. comparing the negative rating after negative induction in BN to the negative rating after neutral induction in BN). Figure 2 of Galibri et al. looks to me that:

      (1) The BN participants were more negative before the induction when they came to the neutral session than when they came to the negative session.

      (2) The BN participants looked almost equally negative (taking into account the error bars reported) after the induction in both sessions.

      These observations are of high importance because they may support the fact that BN patients were likely in a similar negative state to run the food decision task in both conditions (negative and neutral). Therefore, the lack of difference in food choices in BN patients is unsurprising and nothing could be concluded from the DDM analyses. Moreover, the strong negative ratings of BN patients in the neutral condition as compared to healthy participants together with almost similar negative ratings after the two inductions contradict the authors' last sentence of their abstract.

      I appreciate that the authors reproduced an analysis of their initial paper regarding the negative ratings (i.e. Table S1). It partly answers my aforementioned point but does not address the fact that BN may have been in a similar negative state in both conditions (neutral and negative) when running the food decision task: if BN patients were similarly negative after both inductions (neutral and negative), nothing can be concluded from their differences in their results obtained from the DDM. As the authors put it, "not all loss-of-control eating occurs in the context of negative state", I add that far from all negative states lead to loss-of-control eating in BN patients. This grounds all my aforementioned remarks and my remarks of my first review.

      A solution for that is to run a paired t-test in BN patients only comparing the score after the induction in the two conditions (neutral and negative) reported in Figure 2 of their initial article.

      We appreciate the reviewer’s concern. We understand how the visual representation in Figure 2, which displays between-subject error bars, might suggest similar post-induction affect levels. However, the within-subject paired comparison (which appropriately accounts for individual differences in baseline affect) reveals a significant difference, which we detail below.

      While BN participants did report higher baseline negative affect than the HC group prior to the mood inductions, this does not negate the effectiveness of the manipulation. The critical comparison is the within-subject change from pre- to post-induction (detailed below) which shows that negative affect was significantly higher after the negative induction than the neutral induction.

      As we reported in the Supplementary Information (Table S1), our initial analyses of self-reported affect ratings used a linear mixed-effects model with group (HC = 0, BN = 1), condition (Neutral = 0, Negative = 1), and time (pre-induction = 0, post-induction = 1) as fixed effects, including all interactions, and random intercepts for participants. This approach accounts for individual differences in baseline affect.
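
      For readers who want the model written out, the description above corresponds to a random-intercept model of the following form. This is a sketch under stated assumptions: the symbols ($y_{ij}$ for rating $j$ of participant $i$, $G$, $C$, $T$ for the group, condition, and time indicators as coded above, $u_i$ for the participant intercept) and the Gaussian error terms are notation introduced here, not taken from the manuscript.

```latex
\[
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 G_i + \beta_2 C_{ij} + \beta_3 T_{ij} \\
       &\quad + \beta_4 G_i C_{ij} + \beta_5 G_i T_{ij} + \beta_6 C_{ij} T_{ij}
              + \beta_7 G_i C_{ij} T_{ij} \\
       &\quad + u_i + \varepsilon_{ij}, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2), \qquad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2).
\end{aligned}
\]
```

      With this coding, $\beta_7$ is the three-way group × condition × time interaction, and $u_i$ shifts all four ratings of participant $i$ by the same amount.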

      However, to address the reviewer's concerns, we conducted two simple effects analyses using estimated marginal means. As the reviewer suggested, we directly compared post-induction affect between conditions within the BN group (described in the second analysis below). In the first analysis, we examined the diagnosis × time interaction within each condition separately. In the Negative condition, individuals with BN demonstrated a substantial increase in negative affect from pre- to post-induction (mean difference = 20.36, t = 4.84, p < 0.0001, Cohen’s d = 0.97). In the second analysis, we examined the condition × time interaction within each group separately. Among the BN group, we found that reported affect was significantly higher following the negative mood induction than after the neutral affect induction (mean difference = -17.40, t = -4.13, p = 0.0003, Cohen’s d = 0.83). This difference in post-induction negative affect between conditions within the BN group represents a meaningful and statistically robust difference in affective states. These within-group effects confirm that the negative mood induction was (1) effective in the BN group and (2) produced significantly greater negative affect than the neutral mood induction.

      These findings confirm that participants completed the food decision task under meaningfully different affective states, supporting the interpretability of the subsequent DDM analyses. We now report these analyses in the Supplementary Information.

      I appreciate the analysis that the authors added with the restrictive subscale of the EDE-Q.

      That this analysis does not show any association with the parameters of interest does not show that there is a difference in the link between self reported restrictions and self reported binges. Only such a difference would allow us to claim that the results the authors report may be related to binges.

      We thank the reviewer for raising this important point about specificity. To address this concern, we examined the correlation between self-reported binge frequency (both subjective binge episodes and objective binge episodes over the past three months) and EDE-Q Restraint subscale in our BN sample.

      The correlations between these measures were modest and non-significant (subjective binge frequency: Spearman’s ρ = 0.21, p = 0.306; objective binge frequency: Spearman’s ρ = 0.05, p = 0.806), indicating that both binge frequency measures and dietary restraint were relatively independent dimensions of eating pathology in our sample. This dissociation supports the specificity of our findings: the fact that our DDM parameters were associated with binge frequency but not with dietary restraint suggests that the affect-induced changes in decision-making we observed are specifically related to binge-eating behavior rather than reflecting a correlate of dietary restraint. We now report this analysis in the Supplementary Information.
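
      The rank correlation reported above can be illustrated with a short standard-library sketch. It shows how Spearman's coefficient is obtained as a Pearson correlation on tie-averaged ranks; the function names are chosen here for illustration, and no study data are reproduced.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the rank-transformed data.

    Raises ZeroDivisionError if either variable is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

      Tie averaging matters for count data such as binge episodes, where several participants may report the same frequency.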

      I appreciate the wording of the answer of the authors to my third point: "the results suggest that individuals whose task behavior is more reactive to negative affect tend to be the most symptomatic, but the results do not allow us to determine whether this reactivity causes the symptoms". This sentence is crystal clear and sums very well the limits of the associations the authors report with binge eating frequency. However, I do not see this sentence in the manuscript. I think the manuscript would benefit substantially from adding it.

      We thank the reviewer for the suggestion. We have added the following sentences that convey this information to the end of the third paragraph of the discussion:

      “These results suggest that individuals whose task behavior is more reactive to negative affect tend to be the most symptomatic. However, our correlational design does not allow us to determine whether this reactivity causes the symptoms.”

      Statistical analyses:

      If I understood the mixed models correctly, the analyses in Supplementary Tables S1 and S27 to S32 treat all measures as independent, meaning that the scores for each condition (neutral vs negative) and each time (before vs after induction), which were rated by the same participants, are considered independent. This type of analysis does not take into account the potential correlation between the 4 scores of a given participant. As a consequence, results may lead to false positives that a linear mixed model does not address. The appropriate analysis would be to run adapted statistical tests pairing the data without running any mixed model.

      We appreciate the reviewer's attention to the statistical approach. However, we respectfully note that mixed-effects models do account for within-subject correlations, contrary to the reviewer’s interpretation.

      The linear mixed-effects model we employed explicitly accounts for the correlation among repeated measures from the same participant through the random intercept term. This random effect structure models the non-independence of observations within participants, allowing for correlated errors within individuals while assuming independence between individuals. This is a standard and appropriate approach for analyzing repeated-measures data (Bates et al., 2015).
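
      The within-participant correlation induced by a random intercept can be made explicit with a short derivation. It assumes notation introduced here (participant intercept $u_i$ with variance $\sigma_u^2$, independent residuals $\varepsilon_{ij}$ with variance $\sigma_\varepsilon^2$) and conditions on the fixed effects:

```latex
\[
\begin{aligned}
\operatorname{Var}(y_{ij}) &= \sigma_u^2 + \sigma_\varepsilon^2, \\
\operatorname{Cov}(y_{ij}, y_{ik}) &= \sigma_u^2 \quad (j \neq k), \\
\operatorname{Corr}(y_{ij}, y_{ik}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}.
\end{aligned}
\]
```

      Under these assumptions, any two ratings from the same participant share the same correlation (a compound-symmetry structure), while ratings from different participants are uncorrelated.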

      The mixed-effects model is, in fact, more appropriate than separate paired t-tests for our design because it:

      (1) Simultaneously models all fixed effects (group, condition, time) and their interactions in a single unified framework;

      (2) Properly partitions variance into within-subject and between-subject components;

      (3) Provides greater statistical power and more precise estimates by using all available data simultaneously; and

      (4) Allows for direct testing of three-way interactions that cannot be assessed through pairwise comparisons alone.

      Paired tests (e.g., t-tests), as the reviewer suggests, would require multiple separate analyses and would not allow us to test our primary hypotheses about group × condition × time interactions. The mixed-effects approach provides a more comprehensive and statistically rigorous analysis of our repeated-measures design. To clarify this even further in the manuscript, we have added the following in our methods when describing our model, “participant-level random intercepts were included to account for within-subject correlations across repeated measurements.”

      Notes:

      The fact that specific methods, such as correlating self-reported measures over long periods with almost instantaneous behaviors (like tasks), have been used extensively in studies does not mean that these methods are suited to answering a given scientific question. Measures aggregated over long periods miss the variations in instantaneous behaviors over these periods.

      We acknowledge the reviewer’s concern about the temporal mismatch between our session-level task measures and the 3-month aggregated symptom reports. This is a valid limitation of cross-sectional designs, and we agree that examining how task performance fluctuates in relation to real-time symptom variation would provide richer insights into the potential dynamics of these relationships.

      We agree that we cannot capture how daily changes in task performance relate to momentary symptom occurrence. In response to previous rounds of helpful reviews, we added this limitation to the Discussion section, noting that future research employing ecological momentary assessment (EMA) or daily diary methods could examine whether the decision-making processes we identified also fluctuate in relation to real-time symptom occurrence.

      We note that our finding that affect-induced changes in decision-making parameters were associated with subjective binge frequency suggests that this laboratory-measured reactivity may reflect a stable individual difference that manifests across contexts and time periods. While our current study provides initial evidence that individual differences in affect-related decision-making are associated with symptom severity, we acknowledge that longitudinal designs with repeated assessments would strengthen causal and temporal inferences.

      Reviewer #2 (Public review):

      Summary:

      Binge eating is often preceded by heightened negative affect, but the specific processes underlying this link are not well-understood. The purpose of this manuscript was to examine whether affect state (neutral or negative mood) impacts food choice decision-making processes that may increase the likelihood of binge eating in individuals with bulimia nervosa (BN). The researchers used a randomized crossover design in women with BN (n=25) and controls (n=21), in which participants underwent a negative or neutral mood induction prior to completing a food-choice task. The researchers found that despite no differences in food choices in the negative and neutral conditions, women with BN demonstrated a stronger bias toward considering the 'tastiness' before the 'healthiness' of the food after the negative mood induction.

      Strengths:

      The topic is important and clinically relevant, and the methods are sound. The use of computational modeling to understand nuances in decision-making processes and how that might relate to eating disorder symptom severity is a strength of the study.

      Weaknesses:

      Sample size was relatively small, and participants were all women with BN, which limits generalizability of findings to the larger population of individuals who engage in binge eating. It is likely that the negative affect manipulation was weak and may not have been potent enough to change behavior. These limitations are adequately noted in the discussion.

      We are grateful to Reviewer #2 for their careful and supportive review of our manuscript. We appreciate their recognition that computational modeling can reveal nuanced alterations in decision-making processes that may not be apparent in overt behavioral choices. Their balanced assessment of both the strengths and limitations of our work has been helpful in contextualizing our findings appropriately. We have carefully considered their comments regarding sample size and the potential limitations of our mood induction procedure, both of which we discuss in detail in the manuscript's limitations section.

      Reviewer #3 (Public review):

      Summary:

      The study uses the food choice task, a well-established method in eating disorder research, particularly in anorexia nervosa. However, it introduces a novel analytical approach, the diffusion decision model, to deconstruct food choices and assess the influence of negative affect on how and when tastiness and healthiness are considered in decision-making among individuals with bulimia nervosa and healthy controls.

      Strengths:

      The introduction provides a comprehensive review of the literature, and the study design appears robust. It incorporates separate sessions for neutral and negative affect conditions and counterbalances tastiness and healthiness ratings. The statistical methods are rigorous, employing multiple testing corrections.

      A key finding, that negative affect induction biases individuals with bulimia nervosa toward prioritizing tastiness over healthiness, offers an intriguing perspective on how negative affect may drive binge eating behaviors.

      Weaknesses:

      A notable limitation is the absence of a sample size calculation, which, combined with the relatively small sample, may have contributed to null findings. Additionally, while the affect induction method is validated, it is less effective than alternatives such as image or film-based stimuli (Dana et al., 2020), potentially influencing the results.

      We are grateful to Reviewer #3 for their thoughtful evaluation of our work. We appreciate their recognition that the diffusion decision model provides a novel analytical lens for understanding how negative affect influences the dynamics of food-related decision-making in bulimia nervosa. Their balanced assessment of both the methodological strengths of our design (counterbalancing, rigorous statistical corrections) and its limitations (sample size, mood induction efficacy) has been valuable in ensuring we appropriately contextualize our findings and their implications. Specifically, we have taken their comments regarding sample size and the relative efficacy of different mood induction methods seriously, and we address these important methodological considerations in our discussion of the study's limitations.

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The authors have addressed my previous comments, and I do not have any additional suggestions for improvement.

      We thank the reviewer for their time, effort, and insightful feedback.

      Reviewer #3 (Recommendations for the authors):

      The authors have adequately addressed my feedback. I have no further comments.

      We thank the reviewer for their time, effort, and insightful feedback.

    2. Reviewer #1 (Public review):

      Summary:

      Using a computational modeling approach based on the Drift and Diffusion Model (DDM) introduced by Ratcliff and McKoon in 2008, the article by Shevlin and colleagues investigates whether there are differences between neutral and negative emotional states in:

      (1) The timings of the integration in food choices of the perceived healthiness and tastiness of food options in individuals with bulimia nervosa (BN) and healthy participants

      (2) The weighting of the perceived healthiness and tastiness of these options.

      Strengths:

      By looking at the mechanistic part of the decision process, the approach has potential to improve the understanding of pathological food choices.

      Weaknesses:

      I thank the authors for revising their manuscript.

      I still notice that the authors did not go through their manuscript to look for wordings referring to a prediction interpretation of their results while I already highlighted the inappropriateness of this wording in my first two rounds of reviews: e.g. there is still "we used zero-inflated negative binomial models to predict the three-month frequency" and I can find other statements like this. The design of their study does not allow such claims.

      The authors answered my major concern regarding the experimental induction towards a negative or a neutral state before running the food decision task. My concern is: BN patients already seemed to be in a high negative state before undergoing the neutral induction, while these patients are in a lower negative state before undergoing the negative induction. It is therefore not surprising that patients seem to report a similar level of negative state after the two inductions (according to the figure of the authors' previous article). Of note is that the additional analysis the authors ran within the BN group only provides a significant result: this result shows that there has been an induction but does not rule out that patients were in the exact same magnitude of negative state to perform the task, as the figure in their previously published article suggests. The major issue is to show that:

      (1) As compared to the neutral induction, there has been a higher variation in negative state after as compared to before the negative induction.

      (2) The magnitude of the negative state after the negative induction is higher than the magnitude of the negative state after the neutral induction.

      The first point shows that the induction worked. The second point shows that the participants are in two distinct states. Without showing the second point, it may be possible that one induction increases the negative state of participants to the same level as the one of the second induction that has not increased anything.

      Within this context, how is it possible to associate, in patients, a difference in the DDM between the two sessions to a negative state (which is one of the main focuses of the article) rather than to another parameter that has not been captured? A similar situation would be in an experiment studying the consequence of stress: a stressful induction over relaxed participants attending the lab has high chances to raise the level of stress of those participants to the same level as the one that the same participants would experience after a neutral induction when these participants attend the lab with an already high level of stress. In that case, would it be appropriate to claim that a difference at a task performed after the induction would be related to stress while the participants would be at the same level of stress when performing the task despite the fact that the induction worked?

      In the experiment performed by the authors, the additional analysis to perform would be a paired sample t-test (or the appropriate non-parametric test) to check whether the magnitude of negative state of BN patients was different between the negative and neutral conditions after the induction only. If not, associating the difference at the DDM with negative states in BN is highly misleading.
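
      The paired comparison requested here can be sketched with the standard library. This is a minimal illustration, assuming one post-induction rating per BN participant in each condition; the numbers are invented for the example and are not study data. The sketch returns only the test statistic and degrees of freedom; a library routine such as `scipy.stats.ttest_rel` would also return the p-value.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for the paired differences a[i] - b[i].

    Raises ZeroDivisionError if all differences are identical.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("a and b must have the same length (at least 2)")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # sample SD of the differences
    return mean(diffs) / standard_error, n - 1


# Hypothetical post-induction negative-affect ratings, same participants in both lists.
post_negative = [62.0, 55.0, 71.0, 48.0, 66.0]
post_neutral = [41.0, 50.0, 52.0, 45.0, 49.0]
t_value, dof = paired_t_statistic(post_negative, post_neutral)
```

      Because each participant serves as their own control, between-participant differences in overall affect drop out of the differences before the test is computed.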

      I read carefully the authors' answer related to mixed models: they claim that mixed models take into account correlations within their repeated data. The specification of the structure of the covariance matrix only partly controls for that. I notice that the authors did not specify the structure of that matrix: the article they refer to to justify the appropriateness of their analyses is not adapted. The specification of the structure of the covariance matrix needs to address, in a mixed model, the difference in handling 4 repeated data per participant that cannot be paired as compared to 4 repeated data that can be paired (two per session with one before and one after the neutral or negative priming sessions, if I count right). Of note is that a covariance structure that is left free of constraint for the fit of the model does not capture appropriately the pairing of the data: it has all chances to capture the covariance in a different way. And a covariance structure that has constraints has more chances to lead to a model that cannot be estimated because of an absence of convergence of the algorithms.

      By the way, a single two-sample t-test (or a Mann-Whitney test if appropriate), and not a set of multiple paired-sample t-tests as the authors suggest, would answer the goal of the authors to test for what they call the three-way interaction in their comment. This test would be performed between the two groups of participants (BN/controls) with the computation for each participant separately: (assessment after neutral induction-assessment before neutral induction)-(assessment after negative induction-assessment before negative induction). This analysis answers points 1, 2 and 4 they raise together with my point of controlling for the paired data. I would have agreed with their choice of a mixed model if they had an unbalanced dataset within each participant.
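
      The per-participant computation described above can be sketched as follows. This is an illustration under stated assumptions: the ratings are invented, the helper names are hypothetical, and the two-sample statistic uses Welch's unequal-variance form, which is a choice made here rather than something the review specifies.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def induction_contrast(pre_neutral: float, post_neutral: float,
                       pre_negative: float, post_negative: float) -> float:
    """(post - pre) under neutral minus (post - pre) under negative, one participant."""
    return (post_neutral - pre_neutral) - (post_negative - pre_negative)


def welch_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Welch two-sample t statistic and Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical ratings: (pre_neutral, post_neutral, pre_negative, post_negative).
bn_rows = [(50, 48, 40, 62), (55, 52, 45, 60), (47, 49, 38, 59)]
hc_rows = [(20, 19, 21, 30), (25, 24, 22, 35), (18, 20, 19, 27)]

bn_contrasts = [induction_contrast(*row) for row in bn_rows]
hc_contrasts = [induction_contrast(*row) for row in hc_rows]
t_value, dof = welch_t(bn_contrasts, hc_contrasts)
```

      Reducing the four ratings to one contrast per participant keeps the pairing inside each participant, so the group comparison is made on independent values.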

    1. Author response:

      eLife Assessment

      Hoverflies are known for their sexually dimorphic visual systems and exquisite flight behaviors. This valuable study reports how two types of visual descending neurons differ between males and females in their motion- and speed-dependent responses, yet surprisingly, the behavior they control lacks any sexual dimorphism. The results convincingly support these findings, which will be of interest for studies of visuomotor transformations and network-level brain organization.

      This statement perfectly recapitulates our findings.

      Public Reviews:

      Reviewer #1 (Public review):  

      Summary: 

      Hoverflies are known for a striking sexual dimorphism in eye morphology and early visual system physiology. Surprisingly, the male and female flight behaviors show only subtle differences. Nicholas et al. investigate the sensori-motor transformation of sexually dimorphic visual information to flight steering commands via descending neurons. The authors combined intra- and extracellular recordings, neuroanatomy, and behavioral analysis. They convincingly demonstrate that descending neurons show sexual dimorphisms - in particular at high optic flow velocities - while wing steering responses seem relatively monomorphic. The study highlights a very interesting discrepancy between neuronal and behavioral response properties.

      Thank you for this summary. Most of the statement perfectly recapitulates the main findings of our paper. However, we want to emphasize that some hoverfly flight behaviors are strongly sexually dimorphic, especially those related to courtship and mating. Indeed, only male hoverflies pursue targets at high speed, chase away territorial intruders, and pursue females for mating. However, other flight behaviors, such as those related to optomotor responses and flights between flowers when feeding, are not sexually dimorphic. We will amend the Introduction to make the difference between flight behaviors clear.

      More specifically, the authors focused on two types of descending neurons that receive inputs from well-characterized wide-field sensitive tangential cells: OFS DN1, which receives inputs from so-called HS cells, and OFS DN2, which receives input from a set of VS cells. Their likely counterparts in Drosophila connect to the neck, wing, and haltere neuropils. The authors characterized the visual response properties of these two neuronal classes in both male and female hoverflies and identified several interesting differences. They then presented the same set of stimuli, tracked wing beat amplitude, and analyzed the sum and the difference of right and left wing beat amplitude as a readout of lift or thrust, and yaw turning, respectively. Behavioral responses showed little to no sexual dimorphism, despite the observed neuronal differences.
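
      The behavioral readout described above (sum and difference of left and right wing beat amplitude) can be sketched as below. The sign convention (left minus right) and the example values are assumptions made for illustration; the review does not state which convention the authors used.

```python
from typing import List, Sequence, Tuple


def wba_sum_and_difference(
    left: Sequence[float], right: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Per-sample sum and difference (left minus right) of wing beat amplitude.

    The sum serves as a readout of lift or thrust, the difference of yaw turning.
    """
    if len(left) != len(right):
        raise ValueError("left and right traces must have the same length")
    wba_sum = [l + r for l, r in zip(left, right)]
    wba_diff = [l - r for l, r in zip(left, right)]  # assumed sign convention
    return wba_sum, wba_diff


# Hypothetical amplitude traces in degrees, one value per time sample.
left_wba = [60.0, 62.0, 58.0]
right_wba = [60.0, 58.0, 61.0]
total, yaw = wba_sum_and_difference(left_wba, right_wba)
```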

      Thank you for this very nice summary of our work. We want to clarify that LPTC input to DN1 and DN2 has not been shown directly in hoverflies using e.g. dye coupling, or dual recordings. Instead, the presumed HS and VS input is inferred from morphological and physiological DN evidence, and comparisons to similar data in Drosophila and blowflies. We will amend the Introduction to clarify this. The rest of the paragraph perfectly recapitulates the main findings of our paper.

      Strengths:

      I find the question very interesting and the results both convincing and intriguing. A fundamental goal in neuroscience is to link neuronal responses and behavior. The current study highlights that the transformations - even at the level of descending neurons to motoneurons - are complex and less straightforward than one might expect.

      Thank you.

      Weaknesses:

      The authors investigated two types of descending neurons, but it was not clear to me how many other descending neurons are thought to be involved in wing steering responses to wide-field motion. I would suggest providing a more in-depth overview of what is known about hoverflies and Drosophila, since the conclusions drawn from the study would be different if these two types were the only descending neurons involved, as opposed to representing a subset of the neurons conveying visual information to the wing neuropil.

      This is a great point. There are around 1000 fly DNs, of which many could respond to widefield motion, without being specifically tuned to widefield motion. For example, many looming sensitive neurons also respond to widefield motion, and could therefore be involved in the WBA movements that we measured here. In addition, there are many multimodal neurons that could be involved in optomotor responses in free flight, but these may not have been stimulated when we only provided visual input. Furthermore, many visual neurons are modulated by proprioceptive feedback, which is lacking in immobilized physiology preps. Finally, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically, and in Drosophila 3 have been identified morphologically and physiologically. In summary, it is more than likely that other neurons project visual widefield motion information to the wing neuropil. We will amend our Introduction and Discussion to make this important point clear to the readers.

      Both neuronal classes have counterparts in Drosophila that also innervate neck motor regions. The authors filled the hoverfly DNs in intracellular recordings to characterize their arborization in the ventral nerve cord. In my opinion, these anatomical data could be further exploited and discussed a bit more: is the innervation in hoverflies also consistent with connecting to the neck and haltere motor regions? Are there any obvious differences and similarities to the Drosophila neurons mentioned by the authors? If the arborization also supports a role in neck movements, the authors could discuss whether they would expect any sexual dimorphism in head movements.

      These are all great points. We did not see any clear arborizations to the frontal nerve, where we would expect to find the neck motor neurons (NMNs). In addition, while we did see fine arborizations throughout the length of the thoracic ganglion, we saw no strong outputs projecting directly to the haltere nerve (HN). In the revised version of the MS we will modify figure 4 (morphological characterization) to clarify.

      There are important differences between the morphology of DN1 and DN2 in hoverflies and DNHS1 and DNOVS2 in Drosophila, in terms of their projections in the thoracic ganglion. For example, in Drosophila DNOVS2, there are several fine branches along the length of the neuron in the thoracic ganglia. Similarly, we found fine branches in Eristalis tenax DN2; in addition, however, we found a wide branch projecting to the area of the thoracic ganglion where the prothoracic and pterothoracic nerves likely get their inputs (Figure 4), suggesting that the neuron could contribute to controlling the wings and/or the forelegs (which is why we quantified the WBA). In Drosophila DNHS1, there is a similar fat branch to the prothoracic and pterothoracic nerves, which we also found in Eristalis tenax OFS DN1 (Figure 4). Indeed, while Drosophila DNHS1 and DNOVS2 have quite strikingly different morphology, DN1 and DN2 in Eristalis looked quite similar. We will modify the Results section to make this clear.

      In addition, to investigate this further, in the revised version of the MS we will include analysis of the movement of different body parts (including the head) to investigate the presence of any potential sexual dimorphism. Unfortunately, however, this will not include the halteres, as they cannot be seen well in the videos.

      Reviewer #2 (Public review):

      Summary:

      Many fly species exhibit male-specific visual behaviors during courtship, while little is known about the circuit underlying the dimorphic visuomotor transformations. Nicholas et al focus on two types of visual descending neurons (DNs) in hoverflies, a species in which only males exhibit high-speed pursuit of conspecifics. They combined electrophysiology and behavior analysis to identify these DNs and characterize their response to a variety of visual stimuli in both male and female flies. The results show that the neurons in both sexes have similar receptive fields but exhibit speed-dependent dimorphic responses to different optic flow stimuli.

      This statement perfectly recapitulates the main findings of our paper. However, as mentioned above, while hoverfly flight behaviors related to courtship and mating are strongly sexually dimorphic, other flight behaviours, such as those related to optomotor responses and flights between flowers when feeding, are not. We will amend the Introduction to make the difference between flight behaviors clear.

      Strengths:

      Hoverflies, though not a common model system, show very interesting dimorphic behaviors and provide a unique and valuable entry point to explore the brain organization behind sexual dimorphism. The findings here are not only interesting in their own right but will also likely inspire those working in other systems, particularly Drosophila.

      Thank you.

      The authors employed rigorous morphology, electrophysiology, and behavior methods to deliver a comprehensive characterization of the neurons in question. The precision of the measurements allowed for identifying a subtle and nuanced neuronal dimorphism and set a standard for future work in this area.

      Thank you.

      Weaknesses:

      Cell-typing using receptive field preferred directions (RFPDs): if I understood correctly, this classification method mostly relies on the LPDs near the center of the receptive field (median within the contour in Fig.1). I have two concerns here. First, this method is great if we are certain there are only two types of visual DNs as described in the manuscript. But how certain is this? Given the importance of vision in flight control, I would expect many DNs that transmit optic flow information to the motor center. I'd also like to point out that there are other lobula plate tangential cells (LPTCs) than HS and VS cells, which are much less studied and could potentially contribute to dimorphic behaviors.

      This is very true, and an important point. As mentioned above, in blowflies, up to 5 optic flow sensitive DNs have been identified morphologically; however, whether these correspond to 5 different physiological types remains unclear. In both blowflies and Drosophila 3 have been identified morphologically and physiologically (DNHS1, DNOVS1, DNOVS2). Importantly, in both blowflies and fruitflies DNOVS1 gives graded responses, and no action potentials, meaning that we would not be able to record from it using extracellular electrophysiology.

      We previously used clustering techniques to show that in Eristalis, we can reliably distinguish two types of optic flow sensitive DNs from extracellular electrophysiological data, based on a range of receptive field parameters, and we think that these correspond to DNHS1 and DNOVS2 in Drosophila (Nicholas et al, J Comp Physiol A, 2020, cited in paper). As mentioned above in response to Reviewer 1, this does not mean that there are no other neurons that could respond to widefield optic flow, and which might be involved in the WBA we recorded in the paper. However, the point of this paper was not to conclusively show that there are only two optic flow sensitive descending neurons. The point was to say that there are two quite distinct optic flow sensitive neurons that have similar receptive fields in males and females, while the responses to widefield motion show differences between males and females.

      We will modify the Introduction and Discussion to make these important points clear to the Reader, including the discussion of the 45-60 LPTCs that exist in the lobula plate, and what their role might be.

      Second, this method feels somewhat impoverished given the richness of the data. The authors have nicely mapped out the directional tuning for almost the entire visual field. Instead of reducing this measurement to 2 values (center and direction), I was wondering if there is a better method to fully utilize the data at hand to get a better characterization of these DNs. As the authors are aware, local features alone can be ambiguous in characterizing optic flows. What's more, taking into account more global features can be useful for discovering potentially new cell types.

      This is a great point, and we did an extensive analysis of other receptive field properties in this study (shown in supp fig 1). In addition, and as mentioned above, we have published a clustering analysis across receptive field properties of these neurons (Nicholas et al, J Comp Physiol A, 2020, cited in paper). The point that we attempted to make in this paper was that by using two strikingly simple metrics, we can reliably distinguish which of the two neuron types we are recording from (if we accept that there are two main types that we are likely to record from) simply based on location and overall directional preference. This makes automated analysis very easy and straightforward. Indeed, we now use this routinely to ID what neuron we are recording from, rather than making a human-based assumption.

      However, we agree that further in depth analysis is warranted. Therefore, to address this, we will provide additional receptive field analysis and clustering in the revised version of the MS. In addition, we want to highlight that all data is uploaded to DataDryad for anyone interested in doing additional in-depth analyses.

      Line 131, it wasn't clear to me why full-screen stimuli were used for comparison here, instead of the full receptive field maps. Male flies exhibit sexually dimorphic behaviors only during courtship, which would suggest that small-sized visual stimuli (mimicking an intruder or female conspecific) would be better suited to elicit dimorphic neuronal responses. A similar comment applies to the later results as well. Based on the receptive field mapping in Figure 1, I'm under the impression that these 2 DN types are more suited to detect wide-field optic flows, those induced by self-motion as mentioned in the manuscript. The results are still very interesting, but it's good to make this point clear early on to help set appropriate expectations. Conversely, this would also suggest that there are other visual DN types that are responsible for the courtship-related sexually dimorphic behaviors.

      Thank you for mentioning these important points. Our reasoning for using full-screen stimuli for the analysis on line 131 was that since we used the small sinusoidal gratings for mapping the receptive fields, and to subsequently classify the neurons, it would be unfair to use the same data to investigate potential sexual dimorphism. I.e., we selected neurons that fulfilled certain criteria, and then we cannot rightfully use the same criteria to determine differences. This was not explicitly mentioned in the paper, so we will modify the text to make this clear to the Reader.

      However, in Supp Figure 1d/e we show that there are no striking receptive field differences between males and females in terms of receptive field center or directional preference. In Supp Figure 1f we show that there is no difference between male and female receptive field height and width. We will modify the text to draw the Reader’s attention to this figure, and also mention the additional analysis done in response to the comment above.

      As a side note, I personally expected at least DNHS1 to have a smaller receptive field in males, as the hoverfly HSN is strikingly sexually dimorphic (Nordström et al, Curr Biol 2008), and also very sensitive to small objects. However, while optic flow sensitive DNs do respond to small objects (see e.g. the J Comp Physiol paper mentioned above) we did not detect any obvious sexual dimorphism in receptive field properties. Indeed, we think that a different subset of DNs control target pursuit behavior (target selective DNs (TSDNs)). This will be addressed in the modified version of the paper.

    1. Reviewer #1 (Public review):

      Summary:

      The dysgranular retrosplenial cortex (RSD) and hippocampus both encode information related to an animal's navigation through space. Here, the authors study the different ways in which these two brain regions represent spatial information when animals navigate through interconnected rooms. Most importantly, they find that the RSD contains a small fraction of neurons that encode properties of interconnected rooms by firing in different head directions within each room. This direction is shifted by 180 degrees in 2-room environments, and by 90 degrees in 4-room environments. While it cannot be definitively proven that this encoding is not just related to the presence of exits (doors) in each room, this is a noteworthy finding and will motivate further study in more complex and well-controlled environments to understand this coding scheme in the RSD. The recordings and analyses used to identify these multi-directional cells are mostly solid. Additional conclusions regarding the rotational symmetry across rooms seen in the RSD neurons that do not encode direction (representing the majority of RSD neurons) remain incomplete, given the evidence presented thus far. The differences between RSD and hippocampus encoding of space are clear and consistent with prior observations.

      Strengths:

      (1) Use of tetrode recordings from the RSD to identify multi-direction cells that only encode one direction in each room, but shift the preferred direction by either 180 or 90 degrees depending on the number of rooms in the environment.

      (2) Solid controls to show that this multi-direction encoding is stable over time and across some environmental manipulations.

      (3) Convincing evidence that these multi-direction cells can co-exist with single-direction head direction cells in the RSD (as both cell types can be simultaneously recorded).

      (4) Convincing evidence for clear differences between directional and spatial encoding in the RSD versus hippocampus, consistent with prior observations.

      Weaknesses:

      (1) The paper mostly uses the term "retrosplenial cortex", but it is important to clarify that the study is only focused on the dysgranular retrosplenial cortex (RSD; Brodmann Area 30) and not the granular retrosplenial cortex (Brodmann Area 29). These are two distinct regions (despite the similar names), each with distinct connectivity and distinct behavioral encoding and function, so it is important to clarify in the abstract and title that the present study is solely about the RSD to prevent confusion in the literature.

      (2) The proportion of each observed cell type is not clearly stated, although it is clear that the multi-directional cells are in the minority. Having the proportion of well-isolated neurons in distinct sessions that encode each type of information (e.g., multi vs single direction encoding) would greatly aid the interpretation of the result and help the field know how common each cell type is in the RSD.

      (3) The authors state that "MDCs [multi-directional cells] never exhibited multidirectional activity within a single room" - but many of the single room examples from the 4-room environment (shown in Figures 2E and 2F) reveal multi-peaked directional encoding. This suggests that the multi-direction encoding may be more compatible with encoding some property of the number of exits rather than relative room orientations.

      (4) The spatial rotation analyses of non-directional cells are considered incomplete. This is impacted by the slower speed at the doors and hence altered firing rates (as evidenced in spatial rate plots). The population rate is not relevant as the correlational analyses are done on a single cell level. Since some cells fire more with increasing speed and others fire less, that will necessarily result in a population rate map that minimizes firing rate differences near the doorway, where the animals move more slowly. But on a single cell level, that reduced speed is having a big effect, as evidenced by individual rate map examples, and the rooms will need to be rotated to obtain a higher correlation by overlapping the doorway regions. This does not necessarily say anything about spatial coding across the two or four interconnected rooms being rotationally symmetric, and it would appear difficult to draw any conclusions related to spatial encoding from those analyses.

    2. Reviewer #2 (Public review):

      Summary:

      Laurent et al. perform in vivo electrophysiological recordings in the retrosplenial cortex of rats foraging in multi-compartment environments with either identical or unique visual features. The authors characterize two types of directional signals in the area that they have previously reported: classic head direction cells anchored to the global allocentric reference frame and multi-direction cells (MDCs), which have a rotationally preserved directional field anchored to local compartments. The primary finding of this work is that MDCs seem sensitive to local environmental geometry rather than visual context. They also show that MDC tuning persists in the absence of hippocampal place field repetition, further dissociating the RSC local directional signal from the broader allocentric representation of space. A novel observation is that RSC non-directional spatial signals are anchored to the local environment, which could and should be explored further. While the data is solid and the analyses are mostly appropriate, the primary findings are incremental, and more interesting novel claims are not explored in detail or not explicitly tested.

      Strengths:

      The environmental manipulations clearly demonstrate that tuning is not modulated by complex visual information.

      The finding that RSC two-dimensional spatial responses are stable and anchored to environmental features is novel and can be further explored in future work.

      Weaknesses:

      The observation that BDCs and MDCs are insensitive to visual context builds upon the authors' previous work (and replicates aspects of Zhang et al., 2022) but leaves many open questions that are not addressed with the current set of experiments. Specifically, what exactly are MDCs anchoring to? The primary theory is that they anchor to environmental geometry, but there are no explicit experimental manipulations to test this theory. It is important to note that 2- and 4-compartment environments share many features, including the same cardinal axes, making any differences/similarities in these two conditions difficult to interpret.

      The main finding presented with respect to BDC/MDC tuning is that they are not sensitive to visual context as manipulated by distinct visual patterns on the wall and floor in multicompartment environments. One could argue that the individual rooms are, in actuality, quite similar in low-level visual features - each possesses a large white background square visual feature on a single wall with a fixed relationship to the door(s). How can the authors rule out that i) BDC/MDC responses are modulated by these low-level features rather than geometry and/or ii) that the rats are not paying attention to any visual features at all? There is no task requiring them to indicate which room they are in. Furthermore, the doorways themselves are prominent visual features that are present in each context. It would be interesting to see if MDC/BDC tuning persisted in a square room where the number of doorways was manipulated to rule out this possibility.

      A strong possibility is that the rotational symmetry of both MDCs and non-directional spatial neurons is related to i) door-related firing, ii) stereotyped movement, and iii) stereotyped directional sampling. In Supplemental Figure 8, the authors begin to address this by comparing a 'population ratemap' to a 'population speed map.' I do not think this is sufficient and is difficult to interpret. Instead, the authors should assess whether MDCs and BDCs fire more at doorways and what the overlap is with the speed-modulated cells they report. Moreover, they should assess whether the spatial speed profile itself is rotationally symmetric within each session. It would also be useful to look at the confluence of the variables simultaneously using some form of regression analysis. The authors could generate a directional predictor that captures the main response property of these cells and see if it accounts for greater variability in spiking than speed or x,y position. Finally, rotationally symmetric directional sampling biases could arise from the doors being present on the same two walls in each room. The authors should assess whether MDC tuning is still present if directional sampling is randomly downsampled to match directional observations in each compartment.
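
      The downsampling control suggested at the end of the paragraph above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the reviewer's or the authors' procedure: the data layout (one `(compartment, head_direction_deg, spike_count)` tuple per time bin), the 12 direction bins, and the function names `direction_bin` and `match_direction_sampling` are all hypothetical.

```python
import random
from collections import defaultdict


def direction_bin(hd_deg, n_bins):
    """Index of the head-direction bin (bins of equal width covering 0-360 deg)."""
    width = 360.0 / n_bins
    # The final modulo guards against floating-point results equal to 360.0.
    return int((hd_deg % 360.0) // width) % n_bins


def match_direction_sampling(samples, n_bins=12, seed=0):
    """Randomly subsample time bins so that, within every direction bin,
    each compartment contributes the same number of observations.

    samples: list of (compartment, head_direction_deg, spike_count) tuples
             (hypothetical layout, one tuple per time bin).
    """
    if not samples:
        return []
    rng = random.Random(seed)

    # Group the time bins by (compartment, direction bin).
    groups = defaultdict(list)
    for sample in samples:
        compartment, hd_deg, _ = sample
        groups[(compartment, direction_bin(hd_deg, n_bins))].append(sample)

    compartments = sorted({compartment for compartment, _ in groups})
    matched = []
    for b in range(n_bins):
        # The least-sampled compartment sets the quota for this direction bin.
        quota = min(len(groups.get((c, b), [])) for c in compartments)
        for c in compartments:
            matched.extend(rng.sample(groups.get((c, b), []), quota))
    return matched
```

      Tuning curves would then be recomputed from the matched samples, ideally over many random seeds, to see whether the per-compartment preferred directions survive once sampling is equalized.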

      Recent work has demonstrated that neurons with egocentric corner or boundary tuning are observed in RSC. The authors do not address whether egocentric tuning contributes to MDC signals. An explicit analysis of the relationship and potential overlap of MDC and egocentric populations is warranted.

      Many of the MDCs presented in the main figures are not especially compelling. This includes alterations to MDC tuning in Figure 2, which is a key datapoint. The authors should show significantly more (if not all) examples of MDCs in each environment. It would similarly be useful to see all/more examples of non-directional spatially tuned neurons with rotationally symmetric firing patterns.

      "One might hypothesize that specific environmental cues, such as door orientation or landmark positioning, drive these tuning shifts. However, our results argue against this interpretation. In four-room environments, each room had multiple entry points, yet MDCs never exhibited multidirectional activity within a single room."

      I do not understand the logic here. Can the authors unpack this? Also, it is clear that some of the example cells have more than one peak in individual compartments. How is this quantified?

    3. Reviewer #3 (Public review):

      Summary:

      The authors examine firing of dysgranular retrosplenial cortex (dRSC) neurons in relation to head orientation and location for rats exploring open-field environments. One environment utilized was a square arena with high walls that is split into two rectangular spaces connected by a doorway. Another environment is a square arena split into quadrants connected by doors near the center. For each, the different sub-spaces of the environments are either identical in terms of visual and tactile cues or different. For head direction neurons, the authors present one population where each neuron maintains a single tuning direction for the two or four sub-compartments of the two environments. A second population exhibits what is termed multi-directional firing, wherein neurons exhibit (overall) two or four head direction peaks in firing. For such neurons, firing in each of the sub-compartments is associated with only a single preferred direction, but the directions across compartments are shown to be at 180-degree (two-compartment environment) or 90-degree offsets. The offsets evidence tuning to the "same" orientation for the sub-compartments that are, in the global reference frame, oriented at 180 or 90 degree offsets. The results are similar whether or not the sub-compartments have the same or different tactile and visual cues. Thus, the first population is said to be global in its head direction tuning, while the second relates to each local environment in a way that is systematic across sub-compartments. Spatially-specific activity of another population of non-direction-tuned RSC neurons is examined, and comparisons of sub-compartment spatial firing maps suggest that spatial tuning in RSC also repeats across compartments when the firing maps for the compartments are rotated to match each other (as in physical space). Finally, a population of hippocampal "place" cells exhibited different location mapping across sub-compartments. 
      The findings are interpreted to indicate that RSC can simultaneously map orientation in both local and global reference frames, possibly forming a mechanism whereby the sub-compartments' shared geometry (given by the boundary shapes and the door locations) can be related to each other and to the global space they share.

      Strengths:

      This paper addresses an interesting problem and expands how the field will think about directional tuning.

      Weaknesses:

      It is not clear that the experimental design allows for a clear interpretation of the data. Rates for preferred tuning are low, as are ratemap correlations for spatially-tuned neurons.

      (1) It is concerning that the neurons with head direction tuning have fairly low peak firing rates (mean close to 5 Hz), where prior studies examining head direction tuning in dRSC found head direction-tuned neurons with peak rates more than an order of magnitude higher (100 Hz or more). Under circumstances where neurons are tuned well to variables other than head direction (for example, angular velocity of movement), weak head direction tuning may be observed if those other variables are not sampled equally across head directions. The manuscript contains no rigorous control for this possibility. One place to start to address this issue would be to map out variables such as angular velocity by head orientation, and to test whether such relationships also carry 90 and 180 degree offsets.

      (2) There is some question as to whether dRSC neurons (spatial or directional) following the sub-compartment "geometry" is appropriate in terms of interpreting the data. In the condition with sub-compartments carrying different tactile and visual cues, it seems that such cues pertain only to the floor of the environments. The distal visual space of the boundaries appears to be identical. One is left to wonder whether distinguishing environments according to boundary wall visual cues would lead to different results. The CA1 data does not help to rule this possibility out. A second reason to doubt the "shared geometry" interpretation is that there is no condition where sub-compartment geometry is varied. It is also the case that the sub-compartment doorways may stand as the only salient distal visual cue linking the environments. Local sensory cues and geometry seem not so disentangled in this study, but this is a major claim in the abstract.

      (3) There is some concern with the interpretation that the spatial tuning of some dRSC neurons repeats in rotated form across sub-compartments. The firing rate map correlations are very low on average (~0.2), and far lower than the population of CA1 having repeating fields across the same vs different visual/tactile cue conditions. The authors should define the chance level of ratemap correlation by shuffling neuron identities. Apologies if this is indeed the current approach, but it seems not to be (I was left a bit lost by the description in the methods). For any population of hippocampal place cells, the cross-neuron correlations of firing rate maps are typically not zero, and correlations at 0.2 would normally be evidence for remapping.

      (4) A somewhat picky point here that is not meant to claim that multi-compartment studies are not useful - the introduction states that real-world environments typically consist of multi-compartment rooms. This is certainly not true for rodents and is only sometimes true in humans.

      (5) The discussion lacks a consideration of how such dRSC output might impact the target structures of dRSC.

      (6) The discussion speaks to the idea that multi-directional neurons may aid in transitioning between contexts (sub-compartments). It is notable, however, that none of the multidirectional neurons have multi-directional tuning in all sub-compartments, whereas such firing was seen in the 2017 Nature Neuroscience study by Jacob/Jeffery. The discussion should address this difference and perhaps posit a means by which the firing of global and local head direction neurons can be related to each other to yield navigation that depends on both scales.

      (7) The authors should provide the size of the smoothing function for spatial firing rate maps.

      (8) The authors should devise a measure to define directional tuning in 4 directions (with 90-degree offsets).

      (9) Figures 2D and 2H - The offsets in preferred tuning across sub-compartments are rather variable.
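
      Two of the requests in the list above, the shuffle-based chance level in point (3) and the four-direction tuning measure in point (8), can be illustrated with a short sketch. It is offered only as one possible approach under stated assumptions: the n-fold resultant vector length (angles multiplied by the symmetry order before averaging) is a standard circular-statistics device swapped in here for illustration, not a measure taken from the manuscript, and the function names and the flattened rate-map layout are hypothetical.

```python
import math
import random


def nfold_vector_length(angles_deg, rates, n_fold=4):
    """Rate-weighted mean resultant length after multiplying angles by n_fold.
    Values near 1 indicate peaks spaced 360/n_fold degrees apart."""
    total = sum(rates)
    if total <= 0:
        return 0.0
    c = sum(r * math.cos(math.radians(n_fold * a)) for a, r in zip(angles_deg, rates))
    s = sum(r * math.sin(math.radians(n_fold * a)) for a, r in zip(angles_deg, rates))
    return math.hypot(c, s) / total


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def identity_shuffle_null(maps_a, maps_b, n_shuffles=1000, seed=0):
    """Chance level for across-compartment rate-map correlation.

    maps_a[i], maps_b[i]: flattened rate maps of neuron i in two compartments,
    with the second map already rotated into register (hypothetical layout).
    Returns the observed mean correlation and a list of mean correlations
    obtained after randomly re-pairing neuron identities.
    """
    rng = random.Random(seed)
    n = len(maps_a)
    observed = sum(pearson(maps_a[i], maps_b[i]) for i in range(n)) / n
    order = list(range(n))
    null = []
    for _ in range(n_shuffles):
        # A plain permutation may leave some neurons paired with themselves;
        # a derangement would be a stricter choice.
        rng.shuffle(order)
        null.append(sum(pearson(maps_a[i], maps_b[order[i]]) for i in range(n)) / n)
    return observed, null
```

      With such a null distribution, an average correlation of about 0.2 could be compared directly against the range produced by mismatched neuron pairs, and the four-fold vector length could be reported alongside the usual one-fold and two-fold values.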

    1. Reviewer #1 (Public review):

      Summary:

      This manuscript presents a tunable Bessel-beam two-photon fluorescence microscopy (tBessel-TPFM) platform that enables high-speed volumetric imaging with stable axial focus. The work is technically strong and broadly significant, as it substantially improves the flexibility and practicality of Bessel-beam-based two-photon microscopy. The demonstrations are generally strong and bridge a wide range of neuroimaging applications, namely vascular dynamics, neurovascular coupling, optogenetic perturbation, and microglial responses. These convincingly show that the approach enables biological measurements that are difficult or impractical with existing methods.

      The evidence supporting the technical and biological claims is generally strong. The optical design is carefully motivated, clearly described, and validated through a combination of simulations and experimental characterization. The biological applications are diverse and well chosen to highlight the strengths of the proposed method, and the data are of high quality, with appropriate controls and comparative measurements where relevant.

      Strengths:

      (1) The optical innovation addresses a well-recognized limitation of existing Bessel-TPFM implementations, namely axial focus drift during tuning, and does so using a relatively simple, light-efficient, and cost-effective design.

      (2) The manuscript provides convincing experimental evidence for this being a versatile platform to map flow dynamics across diverse vessel sizes and orientations in both healthy and pathological states.

      (3) Biological demonstrations are comprehensive and span multiple domains such as hemodynamics, neurovascular coupling, and neuroimmune responses.

      (4) Quantitative analyses of blood flow across vessel sizes and orientations, including kilohertz line scanning, are particularly compelling and clearly beyond the reach of standard Gaussian TPFM.

      (5) Particular advantages are that higher blood flow speeds become measurable, up to 23 mm/s (20x more than conventional frame scanning), and that simultaneous (Bessel-)imaging and (Gaussian-)perturbation are possible because of the stable axial focus.

      Weaknesses:

      (1) At present, the paper does not properly position the new Bessel-beam method against previous work, and fails to compare it to alternative fast volumetric imaging methods without Bessel beams.

      (2) The cost-effectiveness of the proposed method is not well described or supported by evidence; it would be useful to include more detail or remove this claim.

      (3) Some biological conclusions, e.g., regarding novel features of microglial dynamics (i.e., the observed two-wave responses and coordinated extension-retraction), are based on relatively limited sample size and would benefit from clearer discussion of variability across animals and fields of view.

      (4) The use of neural network-based denoising for microglial imaging is reasonable but introduces potential concerns about trustworthiness; additional clarification of validation or failure modes would strengthen confidence in these results.

      To conclude, most of the authors' claims are well supported by the data. The central conclusion, namely that tBessel-TPFM provides tunable volumetric imaging enabling experiments not feasible with existing two-photon approaches, is justified. Some biological interpretations would benefit from a more cautious framing, but they do not undermine the main technical and methodological contributions of the study. This is a strong and technically rigorous manuscript that makes a substantial methodological advance with clear relevance to neuroscience and intravital imaging. Minor clarifications and a slightly more measured discussion of certain biological findings are recommended.

    1. Reviewer #2 (Public review):

      Summary:

      This is a very interesting paper bringing new and important information about the poorly understood rhodopsin 7 photoreceptive molecule. The very ancient origin of the gene is revealed in addition to data supporting a signaling pathway that is different from the one known for the canonical rhodopsins. Precise expression data, particularly in the optic lobe of the fly, as well as clear behavioral phenotypes in responses to light changes, make this study a strong contribution to the understanding of the still-debated function of rhodopsin 7.

      Specific comments

      (1) Title and abstract: Contribution of Rh7 to circadian clock regulation

      (a) It is not that clear to me what rhodopsin does in terms of circadian regulation (even though its function might be circadianly regulated). The clear role in the light/dark distribution of activity might not be circadian per se, but mostly light/dark-driven, and there is no evidence here for a role in the entrainment of the clock.

      (b) The authors should cite Lazopulo et al. (2019), who nicely show that Rh7 has an important role in peripheral neurons in allowing flies to escape from blue light (see below).

      (2) Figure 2 C

      The finding that Galphaz but not Galphaq can trigger signaling from light-excited Rh7 is very intriguing and helps to better understand Rh7 function. Since Galphaz is related to Gi/o, it would be interesting to test those, for example, by expressing RNAi with Rh7-gal4 and testing the light-dark or light-off response behavior.

      (3) Figures 3-4

      The change in the locomotor activity distribution between light and dark in LD conditions provides a nice assay for Rh7 function. Since Lazopulo et al. (2019) have shown that wild-type but not Rh7 mutants do escape from blue light, it would be important to compare and discuss these LD behavior data with the Lazopulo results. Precisely, is this nighttime preference linked to blue light?

      The expression data are really nice and show that Rh7 is mostly a non-retinal photoreceptor. However, the paper would be strongly reinforced by correlating this with the LD behavior. The LD phenotype should be tested in flies with Rh7 expression rescued under Rh7-gal4 control (as done for the startle response). This is important to show whether the expression pattern is likely responsible for the described Rh7 function in LD. If L5 and/or M11 drivers are available, they should be used to rescue Rh7. Since expression in some clock neurons is shown, the rescue experiment should also be done with a clock neuron driver.

      Along the same lines, can the LD phenotype (or startle response phenotype of Figure 4) be restored by expressing Rh7 under ppk control, as shown for the blue light avoidance phenotype by Lazopulo et al.?

      Finally, the Rh7 "darkfly" rescued flies should be tested in LD.

    1. The Examined Life is Wise Living: The Relationship Between Mindfulness, Wisdom, and the Moral Foundations. Published in: Journal of Adult Development, Dec 2020, Academic Search Complete. By: Verhaeghen, Paul

This correlational study of two independent samples (260 college students and 173 Mechanical Turk workers aged 21–74) examined whether and how mindfulness (broadly construed as a manifold of self-awareness, self-regulation, and self-transcendence) influences wisdom about the self (Adult Self-Transcendence Inventory and Self-Assessed Wisdom Scale) and wisdom about the (social) world (Three-Dimensional Wisdom Scale), and how mindfulness and wisdom impact ethical sensitivities (the five moral foundations). Mindfulness predicted wisdom about the self, and wisdom about the self was linked to an emphasis on the individualizing moral foundations of care/harm avoidance and fairness and, to a lesser degree, on the binding moral foundations of loyalty, authority, and purity. Wisdom about the (social) world was not associated with either mindfulness or the moral foundations. Age was a significant positive predictor for wisdom about the self once the self-awareness component of mindfulness was taken into account.

Keywords: Wisdom; Mindfulness; Moral foundations; Ethics

This paper investigates the links between trait mindfulness, wisdom, and ethical sensitivities (operationalized as sensitivity to the five moral foundations) in two independent samples, one of college students and one of adults spanning ages 21–74. Two principal ideas guided the study. The first idea is that wisdom, whether one conceptualizes it as a form of expertise or as a virtue or personality characteristic, might be well served by the specific quality or qualities of attention the individual brings to their experiences.
It makes sense to expect that a habitual mindful attitude (i.e., taking an open, non-judgmental, reflective, self-regulatory, and sometimes self-transcendent stance towards life) might be a good indicator or exemplifier of such qualities. The second idea is that most, if not all, current adult-developmental theories consider wisdom to be of practical consequence, in the sense that wise people are expected to generally display prosocial attitudes and behavior (for a review, see Bangen et al. [10]). Consequentially, one might expect this wise stance to give rise to ethical sensitivities that are compatible with the characteristics of wisdom (as defined within these theories). Wisdom It is probably fair to say that within the field of psychology the study of wisdom started from an adult development perspective (e.g., Clayton and Birren [20]; Erikson [26]; Kramer [44]; Pascual-Leone [54]). Initial conceptualizations tended to view wisdom primarily from a cognitive angle, that is, as an advanced form of postformal thought. For instance, Baltes and Staudinger ([ 9 ]) define wisdom as 'expertise in the conduct and meaning of life' (p. 124). In this approach, wisdom is conceptualized as a form of crystallized intelligence, more specifically 'expert knowledge in the fundamental pragmatics of life that permits exceptional insight, judgment, and advice about complex and uncertain matters' (Pasupathi et al. [56], p. 351). Other approaches—Glück and Bluck ([31]) label these 'integrative views'—have supplemented this cognitive view by additionally emphasizing the reflective, affective, and conative qualities of the wise person, making wisdom more akin to a personality characteristic or a virtue (e.g., Ardelt [ 3 ]; Mitchell et al. [52])—wisdom as 'personal, concrete, applied, and involved' (Ardelt [ 3 ], p. 262). The different conceptualizations of wisdom do have a common core. From a review of 24 different key theories or definitions of wisdom, Bangen et al. 
([10]) concluded that five subcomponents were present in at least half of the papers: (a) social decision making and pragmatic knowledge of life; (b) prosocial attitudes and values; (c) reflection and self-understanding (including a desire to learn); (d) acknowledgement of and coping with uncertainty; and (e) emotional homeostasis. Although there are qualitative, performance-based measures of wisdom, such as the Berlin wisdom paradigm (Baltes and Smith [ 8 ]), where participants describe how they would solve a particular life problem and answers are scored along a series of dimensions, self-report measures were used here, simply because quantitative measures allow for more efficient data collection and scoring, which in turn allows to query a larger sample of respondents. Specifically, I used the three quantitative self-report measures for wisdom recommended by Glück ([30]), Glück et al. ([34]), and Staudinger and Glück ([64])—Ardelt's Three-Dimensional Wisdom Scale (3D-WS; [ 2 ]), Levenson's Adult Self-Transcendence Inventory (ASTI; Levenson et al. [47]), and Webster's Self-Assessed Wisdom Scale (SAWS; [71], [72]). These three scales have different emphases. The 3D-WS measures wisdom as the integration of cognitive, reflective, and affective/compassionate personal characteristics; the SAWS gauges five dimensions, namely critical life experience, emotional regulation, reminiscence and reflectiveness, humor, and openness; the ASTI taps into self-transcendent wisdom, defined as a self-expansive process entailing decreased self-concern and increased empathy, understanding, spirituality, and feelings of connectedness with past and future generations. Not all of these scales cover all five subcomponents mentioned above: Arguably, the 3D-WS does; the SAWS covers social decision making, self-reflection, and emotional homeostasis; and the ASTI includes items about prosocial attitudes, self-reflection, and emotional homeostasis. Glück et al. 
([34]) and Staudinger and Glück ([64]) additionally make a distinction between personal and general wisdom. The former refers to a person's insight into themselves and their own lives; the latter to insights into life and the world in general. The assumption is that personal wisdom is obtained through actual personal experience, whereas general wisdom does not have personal experience as a necessary condition. In Glück's conceptualization, all three scales mentioned above measure personal wisdom; only performance-based measures tap into general wisdom. Glück et al. ([34]) also posit a third, often underappreciated facet of wisdom, namely other-related wisdom, which they define as 'an empathy-based caring concern for both concrete other people and humankind at large' (p. 5); it is most evident in two of the three 3D-WS scales, namely the cognitive and reflective scales, and is possibly a subcomponent of personal wisdom. In (partial) confirmation of this view, Glück et al. found that all three 3D-WS scales loaded on a different factor than the two other quantitative scales. Given that the cognitive scale of the 3D-WS contains items that are indeed about the other (e.g., 'People are either good or bad' and 'You can classify almost all people as either honest or crooked'—both items are reverse-scored), but also items that are often general and external (e.g., 'ignorance is bliss' and 'It is better not to know too much about things that cannot be changed'—both items are reverse-scored), it seems to us that this dimension could be labeled more accurately as 'wisdom about the (social) world', in contrast with the 'wisdom about the self' tapped in personal-wisdom scales. Mindfulness Mindfulness is often defined as a particular way of paying attention—the ability or propensity to engage in "nonelaborative, non-judgmental, present-centered awareness in which each thought, feeling, or sensation that arises in the attentional field is acknowledged" (Bishop et al. [12], p. 
232); this awareness requires cultivation (Nilsson and Kazemi [53]). One corollary is that "thought or events are observed as events in the mind without over-identifying with them and without reacting to them in an automatic, habitual pattern of reactivity", thus "introducing a 'space' between one's perception and response" and allowing one "to respond to situations more reflectively (as opposed to reflexively)" (Bishop et al. [12], p. 232). Mindfulness has been found to be broadly beneficial to the individual—mindfulness interventions lead to positive outcomes regarding stress, well-being, anxiety, depression, negative emotions, emotion regulation, rumination, self-compassion, and empathy (Eberth and Sedlmeier [25]; Verhaeghen [68]). These relationships are at least partially causal: changes in dispositional mindfulness after meditation training correlate with changes in self-perceived stress, anxiety, depressed mood, positive affect, negative affect, rumination, and general well-being (Gu et al. [40]; Khoury et al. [43]). Recent theoretical work within the field has converged on the conclusion that mindfulness is a complex concept, more akin to a manifold (or even a cascade of processes) than to a singular construct. The starting point of this work has been an examination of the reasons why mindfulness interventions lead to such a wide array of positive outcomes. Many models have been advanced to explain the translation of mindfulness into positive outcomes (e.g., Baer [ 5 ]; Brown et al. [16]; Chiesa et al. [19]; Creswell and Lindsay [21]; Grabovac et al. [35]; Hölzel et al. [42]; Segal et al. [59]; Shapiro et al. [60]; Vago and Silbersweig [67]), each with their own emphases and levels of complexity. Although details of the different proposed models vary, the list of proposed mechanisms generally contains three categories, as Vago and Silbersweig ([67]) point out. A first proposed mechanism is a change in self-awareness. 
This involves recognizing automatic habits and automatic patterns of reactivity, as well as an increased awareness of momentary states of body and mind—what is typically meant by mindfulness. A second proposed mechanism is a change in self-regulation. This includes better regulation of emotions, heightened self-compassion, increased emotional and cognitive flexibility, decreased rumination and worry, and increased nonattachment and acceptance. A final proposed mechanism is increased self-transcendence . This implies increased decentering, a stronger awareness of interdependence between self and others, and heightened compassion. Vago and Silbersweig label this common-denominator model the S-ART model, after its three components: self-awareness, self-regulation, and self-transcendence. Our own empirical work on the subject (Verhaeghen [69]; Verhaeghen and Aikman [70]), based on exploratory and confirmatory factor analysis as well as structural equation modeling on 3 independent samples of about 300 subjects each has indeed confirmed the plausibility of this S-ART mindfulness manifold, suggesting a flow of influence from self-awareness over self-regulation to self-transcendence, and then outward to well-being and other aspects of psychological health (for a schematic representation, see Fig. 1). Factor analysis showed that additional subdivisions were present within the components of self-awareness and self-regulation: self-awareness incorporated reflective awareness (the more active, deliberate, probing aspect of mindfulness) and controlled sense-of-self in the moment (the more passive, equanimous, non-judgmental aspect of mindfulness) (for more details on these components and how they are measured, see the "Methods" section below); self-regulation was tapped by (the opposite of) self-preoccupation and by self-compassion. Graph: Fig. 
1 The S-ART mindfulness manifold as obtained in Verhaeghen ([69]) Mindfulness and Wisdom There are obvious points of contact between this conceptualization of mindfulness and those of wisdom, suggesting they operate in the same nomological space. First, some of the common-core wisdom subcomponents align with the mindfulness manifold. Clearly, the reflection and self-understanding subcomponent of common-core wisdom has a natural affinity (if not identity) with the reflective awareness component in the mindfulness manifold. A few examples from specific theories illustrate this quite nicely. For instance, Ardelt ([ 3 ]) explicitly claims that '[t]he development of wisdom requires the transcendence of one's subjectivity and projections, which can be accomplished through self-examination, self-awareness, and a reflection on one's own behavior and one's interactions with others' (p. 269). Likewise, Glück and Bluck's ([32]) MORE (mastery, openness, reflectivity, and emotion regulation) model of wisdom posits that wisdom-related knowledge develops through an interaction of life experiences with the four MORE resources, and that therefore wisdom should manifest itself in how people reflect upon past experiences. As a third example, Brown and Greene's model of Wisdom Development ([14]) states that wisdom ripens when individuals go through a core 'learning-from-life' process, comprised of reflection, integration, and application. Pascual-Leone ([55]), as a final example, considers meditation (one possible cultivator of mindfulness) as a path towards wisdom, through its fostering of insight, self-insight, and self-transcendence. Second, emotional homeostasis can be understood as an aspect or outcome of self-regulation. Third, some wisdom researchers explicitly view self-transcendence as a critical component of wisdom (see the Ardelt quote above; also Curnow [22]; Levenson [46]). There are a few empirical indications of a mindfulness-wisdom link as well. 
One study (Brienza et al. [13]) used its own process-based measure of wisdom, and found correlations with mindfulness scales, especially observing and orienting. Two studies used a training approach to foster wisdom by incorporating mindfulness either explicitly (Sharma and Dewangan [61]) or implicitly (as reflective awareness through a self-reflection journal and a life experience journal; Bruya and Ardelt [17]). The former study did not find intervention effects on either mindfulness or wisdom, but did find significant correlations at pretest between mindfulness (measured by the Mindful Attention Awareness Scale, MAAS; Brown and Ryan [15]) and the affective and reflective components of wisdom. The latter study obtained an intervention effect of the reflective exercises over and beyond those of attending a cognitively oriented class on wisdom, but did not include a measure of mindfulness to verify the proximal cause of the effect. These intervention studies, then, are somewhat suggestive of (but far from definitive about) a positive relationship between mindfulness and wisdom. Wisdom and Ethical Sensitivities The psychological study of ethical sensitivities and attitudes (e.g., Greene [37]; Haidt [41]) has converged on the conclusion that ethical actions are not always the product of the careful application of rational thought, but instead tend to be largely (although not exclusively) based on intuitions—evolved, automatic responses, inaccessible to awareness, which sometimes operate in contradiction with logical constraints. Researchers in this field often consider the vessels for these intuitions to be innate—for instance, Haidt's Moral Foundations Theory (MFT; Graham et al. [36]) posits that ethical sensitivities ultimately boil down to the five dimensions of promoting care/avoiding harm, fairness, ingroup loyalty, (respect for) authority, and purity (or sanctity). 
The former two are often combined into an 'individualizing' foundation, because they focus on the provision and protection of individual rights; the remaining three into a 'binding' foundation, because they focus on ingroup cohesion. The idea is that every individual is sensitive to these five aspects, but that the intuitions themselves are built through experience, and are thus open to individual and cultural differences through a tuning up or down of the emotional responses due to experiences that fit into these vessels (Flanagan and Williams [28]). In our previous study (Verhaeghen and Aikman [70]), where we adopted the Moral Foundations framework, we found clear links between the mindfulness manifold and ethical sensitivities, which possibly might be mediated through wisdom. Specifically, we found that reflective awareness and self-transcendence were directly related to the individualizing aspects of morality (i.e., an emphasis on care and fairness); only self-transcendence was related to the binding aspects of morality (i.e., an emphasis on loyalty, authority, and sanctity). One reason to suspect that wisdom might play a role in the individualizing foundation stems from its very definition—prosocial attitudes and values are the second most cited key component in Bangen et al.'s ([10]) literature review (21 out of 24 theories or models incorporated this component). A key mechanism may be the self-transcendental character of wisdom, which it has in common with mindfulness. There are empirical reasons to suspect that wisdom is implicated in moral attitudes (for a review of empirical and theoretical links between wisdom and ethics, see Sternberg and Glück [65]). For instance, wisdom has been found to correlate positively with other-oriented values such as well-being of friends, societal engagement, and ecological protection (Kunzmann and Baltes [45]; Webster [73]). 
Implicit lay theories of wisdom also include value orientations that align, in Haidt's model, with care and fairness (Glück et al. submitted).

The Present Study

The literature reviewed suggests that mindfulness, wisdom, and ethical sensitivities are related, but the pieces of this puzzle have not yet been fit together. One wide-open question is how the different components of mindfulness, broadly defined as self-awareness, self-regulation, and self-transcendence, relate to wisdom; another is whether (or how) wisdom might be a mediator translating, and perhaps crystallizing, mindfully experienced events into ethical attitudes. From the literature reviewed above, I expect that all three aspects of mindfulness would be positively related to wisdom. To assess wisdom, I used the three scales most commonly used in quantitative research: the 3D-WS, the ASTI, and the SAWS. After Glück et al. ([34]), I expect that a factor analysis of these measures will yield two dimensions: wisdom about the self (ASTI and SAWS) and wisdom about the (social) world (3D-WS). Given that mindfulness is primarily associated with knowledge of the self, I would expect that the mindfulness-wisdom connection would be stronger for wisdom about the self than for wisdom about the (social) world. Extending our prior work on mindfulness and ethical sensitivities, as well as building on Glück et al. (submitted), I expect that wisdom will be positively connected to the individualizing moral foundations (care and fairness). For the binding foundations (authority, loyalty, and sanctity/purity), the connection is likely less strong. Because wisdom is very often considered an aspect of adult development, I included a group of adults sampled across a large sweep of the adult life span (Sample B, age 21–74), aside from the more usual sample of college students (Sample A).
Adding the former sample allows me, first, to check if the results from the first sample replicate, and second, to test whether or not any of the wisdom or ethical components are age-sensitive, as has sometimes been claimed (e.g., Ardelt [ 1 ]; Baltes and Kunzmann [ 7 ]; but see, e.g., Grossmann and Kross [39]; Mickler and Staudinger [51]). Methods Participants Sample A consisted of 260 undergraduate students from the Georgia Institute of Technology, who received course credit in return for their participation. They were invited to participate in a study on 'mindfulness, acceptance, and psychology'. They were aged 18–26 (mean = 19.7, SD = 1.5); 54% were women. Sample B consisted of 173 participants recruited from Mechanical Turk. They were invited to participate in a study on 'mindfulness, acceptance, and psychology', and offered $4 in return for their time. Workers needed to be highly qualified in order to participate—more than 5000 Human Intelligence Tasks (HIT; i.e., surveys or other online tasks) completed to the requesters' satisfaction, and at least 98% of all lifetime HITs approved by the requester. They were aged 21–74 (mean = 39.8, SD = 11.7); 44% were women. The age distribution was as follows: age 21–30: 38 participants; age 31–40: 69 participants; age 41–50: 33 participants; age 51–60: 18 participants; age 61–74: 12 participants. On average, participants had completed 14.9 years of education (SD = 1.9). Although Mechanical Turk is generally considered to be a useful, valid, and reliable tool for behavioral researchers (e.g., Mason and Suri [49]), we found it prudent to assess potential differences in data quality between the two samples. We did this by comparing Cronbach's α values for all subscales (see the "Measures and Procedure" section below for all α values). Sample B (Mechanical Turk) tended to have higher reliability values (median = 0.84, ranging from 0.41 to 0.93) than Sample A (students) (median = 0.71, ranging from 0.48 to 0.90). 
The correlation between the Fisher z-transformed reliability values of the two samples was 0.78 (this transformation was applied to linearize the measurement scale), suggesting that both groups were about equally sensitive to differences in the item characteristics that drive reliability.

Measures and Procedure

Participants filled out all questionnaires online; they took about 45–60 min to complete. Below, questionnaires are grouped thematically; the mindfulness measures (i.e., self-awareness, self-regulation, and self-transcendence) are presented as they resulted from the set of factor analyses (an exploratory analysis on 488 participants, and a confirmatory analysis on an independent sample of 222 participants) in Verhaeghen ([69]); this structure was replicated in Verhaeghen and Aikman ([70]). All measures were collected from both samples. The Cronbach's α values reported are those obtained in the present study, given for Samples A and B, respectively. Note that some scales (notably the subscales of the Self-Compassion Scale) contain a very small number of items, possibly depressing the α values.

Control Variables

The Mini-IPIP (Donnellan et al. [23]) is a 20-item measurement of the Big Five personality factors, 4 items for each factor: Extraversion (sample item: 'I am the life of the party', Cronbach's α = 0.83 and 0.87), Agreeableness (sample item: 'I sympathize with others' feelings', Cronbach's α = 0.77 and 0.85), Conscientiousness (sample item: 'I get chores done right away', Cronbach's α = 0.68 and 0.78), Openness (which the IPIP labels Intellect/Imagination; sample item: 'I have a vivid imagination', Cronbach's α = 0.71 and 0.84), and Neuroticism (sample item: 'I have frequent mood swings', Cronbach's α = 0.74 and 0.78). Additionally, participants were asked for their age and gender.
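As an illustration of the reliability comparison described above, the following is a minimal Python sketch, not the author's analysis code. It assumes the standard formula for Cronbach's α and uses only the five Mini-IPIP α pairs quoted in this section, so the resulting correlation is not expected to reproduce the 0.78 reported across all subscales. The function names (`cronbach_alpha`, `fisher_z`, `pearson`) and the toy item scores are hypothetical.

```python
import math
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of respondent scores per item."""
    k = len(items)
    totals = [sum(person) for person in zip(*items)]
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def fisher_z(r):
    """Fisher z-transform (inverse hyperbolic tangent) of a value in (-1, 1)."""
    return math.atanh(r)


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy data: 3 items answered by 5 respondents.
toy_items = [[1, 2, 3, 4, 5], [2, 2, 3, 5, 5], [1, 3, 3, 4, 4]]
toy_alpha = cronbach_alpha(toy_items)

# Mini-IPIP alphas quoted above (Extraversion through Neuroticism).
alpha_a = [0.83, 0.77, 0.68, 0.71, 0.74]  # Sample A
alpha_b = [0.87, 0.85, 0.78, 0.84, 0.78]  # Sample B
r_between_samples = pearson([fisher_z(a) for a in alpha_a],
                            [fisher_z(b) for b in alpha_b])
print(f"toy alpha = {toy_alpha:.2f}, r across samples = {r_between_samples:.2f}")
```

Transforming the α values before correlating them stretches the scale near 1, which is the linearization the text refers to.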
Social Conservatism

Social conservatism was measured via the Social Conservatism subscale (6 items; sample item: 'Please indicate the extent to which you feel positive or negative towards each issue: ... Abortion'; Cronbach's α = 0.62 and 0.69) of the Social and Economic Conservatism Scale (SECS; Everett [27]).

Self-awareness

Two constructs were assessed within self-awareness. The first, reflective awareness, is the unit-weighted composite of the z-scores of three scales: (a) the Observing subscale of the Five Facets Mindfulness Questionnaire (FFMQ; Baer et al. [6]) (8 items; sample item: 'When I'm walking, I deliberately notice the sensations of my body moving', Cronbach's α = 0.73 and 0.87); (b) the Reflectiveness subscale of the Broad Rumination Scale (BRS; Trani et al. in preparation) (4 items; sample item: 'It is important for me to understand why I feel a certain way', Cronbach's α = 0.81 and 0.81); and (c) Search for Insight/Wisdom of the Aspects of Spirituality scale (ASP; Büssing et al. [18]) (7 items; sample item: 'I strive for insight and truth', Cronbach's α = 0.84 and 0.90). In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test (p > 0.2). The second construct, controlled sense-of-self in the moment, is the unit-weighted composite of the z-scores of three scales: (a) the Acting with Awareness subscale from the FFMQ (8 items, sample item: the reverse of 'When I'm doing things, my mind wanders off and I'm easily distracted', Cronbach's α = 0.87 and 0.91); (b) the Sense-of-self Scale (SOSS; Flury and Ickes [29]) (12 items, sample item: 'I have a clear and definite sense of who I am and what I'm all about'; Cronbach's α = 0.86 and 0.88); and (c) the Non-judging of inner experience subscale of the FFMQ (8 items, sample item: the reverse of 'I criticize myself for having irrational or inappropriate emotions', Cronbach's α = 0.90 and 0.93).
In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test (p > 0.2).

Self-regulation

Two constructs were assessed within self-regulation. The first, self-preoccupation, is the unit-weighted composite of the z-scores of two subscales from the BRS, namely Compulsivity (5 items; sample item: 'When I start to worry, it's very hard for me to stop', Cronbach's α = 0.79 and 0.87) and Worrying (3 items; sample item: 'Uncertainty about the future bothers me', Cronbach's α = 0.58 and 0.68), as well as two subscales from the Self-Compassion Scale, Short Form (SCS; Raes et al. [57]), namely Isolation (2 items; sample item: 'When I'm feeling down, I tend to feel like most other people are probably happier than I am', Cronbach's α = 0.56 and 0.63) and Over-Identified (2 items; sample item: 'When I fail at something important to me I become consumed by feelings of inadequacy', Cronbach's α = 0.66 and 0.58). In both samples, the composite was normally distributed, as ascertained via a Kolmogorov–Smirnov test (p > 0.2). In our previous work, as here, self-preoccupation correlated negatively with other aspects of mindfulness, as one would expect: better self-regulation implies lower, not higher, levels of self-preoccupation. This may be confusing for some readers. Because the construct is, however, measured by scales that tap explicitly into the self-preoccupation aspect, and not its absence or opposite, we preferred to keep the self-preoccupation label.
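The unit-weighted composites used throughout this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: it assumes the composite is the plain average of the z-scores (a sum would differ only by a constant factor), standardizes with the sample standard deviation, and uses hypothetical scores for five participants. The function names are likewise hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores to mean 0 and sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def unit_weighted_composite(scales):
    """Average the z-scores of several scales, giving every scale weight 1."""
    standardized = [z_scores(scores) for scores in scales.values()]
    return [mean(person) for person in zip(*standardized)]


# Hypothetical raw subscale means for five participants.
self_preoccupation_scales = {
    "compulsivity": [2.0, 3.5, 4.0, 1.5, 3.0],
    "worrying": [2.5, 4.0, 4.5, 2.0, 3.0],
    "isolation": [1.5, 3.0, 5.0, 2.0, 3.5],
    "over_identified": [2.0, 3.5, 4.5, 1.0, 4.0],
}
composite = unit_weighted_composite(self_preoccupation_scales)
print([round(c, 2) for c in composite])
```

Standardizing first keeps a scale with a wide raw range from dominating the composite, which is the point of unit weighting.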
The second, self-compassion, was measured as the unit-weighted composite of the z-scores of three subscales from the SCS, namely Self-Kindness (2 items; sample item: 'I try to be understanding and patient towards those aspects of my personality I don't like', Cronbach's α = 0.61 and 0.60), Common humanity (2 items; sample item: 'I try to see my failings as part of the human condition', Cronbach's α = 0.49 and 0.57), and Mindfulness (2 items; sample item: 'When something painful happens I try to take a balanced view of the situation', Cronbach's α = 0.66 and 0.68), as well as the Decentering subscale of the Experiences Questionnaire (EQ; Fresco et al. 2007) (13 items, sample item: 'I am better able to accept myself as I am'; Cronbach's α = 0.84 and 0.93). The composite was normally distributed in Sample A, Kolmogorov–Smirnov = 0.042, p > 0.2, but not Sample B, Kolmogorov–Smirnov = 0.075, p = 0.034.

Self-transcendence

Self-transcendence was measured as the unit-weighted composite of the z-scores of 2 subscales from the Dispositional Positive Emotion Scale (DPES; Shiota et al. [62]), namely Joy (6 items; sample item: 'I am an intensely cheerful person', Cronbach's α = 0.84 and 0.90), and Love (6 items; sample item: 'I develop strong feelings of closeness to people easily', Cronbach's α = 0.82 and 0.90), and 1 subscale from the Resilience Scale (RS; Lundman et al. [48]), namely Meaningfulness (7 items, sample item: 'My life has meaning', Cronbach's α = 0.81 and 0.91). The composite was normally distributed in Sample A, Kolmogorov–Smirnov = 0.042, p > 0.2, but not Sample B, Kolmogorov–Smirnov = 0.072, p = 0.046.

Moral Foundations

This construct was measured using the 5 subscales of the Moral Foundations Questionnaire (Graham et al. [36]): (a) Care/harm (6 items; sample item: 'When you decide whether something is right or wrong, to what extent are the following considerations relevant to your thinking?
– Whether or not someone suffered emotionally'; Cronbach's α = 0.52 and 0.76); (b) Fairness (6 items; sample item: '... Whether or not some people were treated differently than others'; Cronbach's α = 0.56 and 0.64); (c) Ingroup loyalty (6 items; sample item: '... Whether or not someone's action showed love for his or her country'; Cronbach's α = 0.48 and 0.84); (d) Authority (6 items; sample item: '... Whether or not someone showed a lack of respect for authority'; Cronbach's α = 0.61 and 0.85); and (e) Purity (6 items; sample item: '... Whether or not someone violated standards of purity and decency'; Cronbach's α = 0.69 and 0.92). Wisdom Scales Participants filled out three self-report wisdom surveys. The Adult Self-Transcendence Inventory (ASTI; Levenson et al. [47]) measures, in the words of the authors, "a decreasing reliance on externals for definition of the self, increasing interiority and spirituality, and a greater sense of connectedness with past and future generations" (p. 127). After factor analysis, Levenson et al. derived a more focused self-transcendence scale, which is used here (Factor 1 of their Table 1; 10 items; sample item: 'My peace of mind is not so easily upset as it used to be'; Cronbach's α = 0.67 and 0.79). The Self-Assessed Wisdom Scale (SAWS; Webster [71]) measures 5 interrelated dimensions of wisdom: experience (8 items; sample item: 'I have experienced many painful events in my life'; Cronbach's α = 0.81 and 0.84), emotions (8 items; sample item: 'I am good at identifying subtle emotions within myself'; Cronbach's α = 0.83 and 0.86), reminiscence (8 items; sample item: 'Reviewing my past helps gain perspective on current concerns'; Cronbach's α = 0.86 and 0.91), openness (8 items; sample item: 'I like to read books which challenge me to think differently about issues'; Cronbach's α = 0.71 and 0.80), and humor (8 items; sample item: 'I can chuckle at personal embarrassments'; Cronbach's α = 0.86 and 0.91). 
The Three-Dimensional Wisdom Scale (3D-WS; Ardelt [2]) consists of 3 subscales, tapping the cognitive (14 items, sample item: 'It is better not to know too much about things that cannot be changed'; Cronbach's α = 0.78 and 0.86), reflective (12 items, sample item: 'When I'm upset at someone, I usually try to "put myself in his or her shoes" for a while'; Cronbach's α = 0.55 and 0.54), and affective (13 items, sample item: 'I can be comfortable with all kinds of people'; Cronbach's α = 0.49 and 0.41) components of wisdom.

Table 1. Factor analysis of the nine wisdom scales in both samples; principal axis analysis with oblimin rotation

| Scale | Sample A: Factor 1 (wisdom about the self) | Sample A: Factor 2 (wisdom about the social world) | Sample B: Factor 1 (wisdom about the self) | Sample B: Factor 2 (wisdom about the social world) |
|---|---|---|---|---|
| ASTI (total) | .67 | | .80 | |
| SAWS-emotion regulation | .72 | | .78 | |
| SAWS-experience | .79 | | .75 | |
| SAWS-humor | .71 | | .77 | |
| SAWS-openness | .65 | | .74 | |
| SAWS-reminisce-reflect | .80 | | .73 | |
| 3D-WS-affective | | .71 | | .80 |
| 3D-WS-cognitive | | .57 | | .68 |
| 3D-WS-reflective | | .76 | | .68 |

N = 260 for Sample A and 173 for Sample B. For legibility reasons, factor loadings below .30 are not represented.

Measures Collected but Not Included in the Analyses

Additionally, participants filled out the Nonattachment Scale (NAS; Sahdra et al. [58]), the Emotional Resilience Scale (ERS; Gross and John [38]), the QUEST scale (Batson and Schoenrade [11]), the Varieties of Inner Speech Questionnaire (VISQ; McCarthy-Jones and Fernyhough [50]), and the Self-Verbalization Scale (SVS; Duncan and Cheyne [24]). Some of those measures were remnants of an earlier (Verhaeghen [69]) attempt at casting a wide net of mindfulness measures; these measures failed to make the final cut after the factor analysis described in that paper (NAS, ERS, and QUEST); others are not relevant to the present project (VISQ and SVS).
Results Factor Analysis of the Wisdom Scales Two exploratory factor analyses (principal axis analysis with oblimin rotation), one for each sample, were conducted on the nine wisdom scales (i.e., the ASTI scale, the three 3D-WS scales and the five SAWS scales). Scale or subscale scores (i.e., not item scores) were the unit of analysis. Eigenvalues and the scree plot suggested a 2-factor solution in both samples. This solution is presented in Table 1; it explains 55% of the variance in Sample A, and 57% of the variance in Sample B. Both analyses converged on the same solution: the ASTI and all the SAWS scales loaded on one factor, and all three 3D-WS scales loaded on another. As mentioned in the introduction, the ASTI and the SAWS scale have in common that they survey wisdom from an intrapersonal perspective, that is, they appear to tap self-knowledge and self-acceptance; the 3D-WS arguably captures skills and wisdom about how to deal with the social world and with external circumstances. Consequently, I will label the first factor wisdom about the self , and the second wisdom about the ( social ) world . The two factors are relatively independent: Their intercorrelation was 0.18 in Sample A and 0.07 in Sample B. Wisdom and the Mindfulness Manifold To examine how the mindfulness manifold is related to self-assessed wisdom, as well as to control for the effects of the set of background variables (personality, age, and gender), hierarchical multiple regression analysis was applied to the data, separated by sample, with the two types of wisdom (wisdom about the self and wisdom about the [social] world) as the final outcome. For these analyses, a unit-weighted composite was constructed from the z -scores for the ASTI and the different SAWS scales to represent wisdom about the self. The unit-weighted composite of the z -scores of the three 3D-WS scales represented wisdom about the (social) world. 
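The construction of a unit-weighted composite from z-scores can be illustrated with a minimal sketch. The scores below are hypothetical, the function names are illustrative, and the use of the sample standard deviation and of the mean (rather than the sum) of the z-scores are assumptions; the text above does not specify these details.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize raw scores to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def unit_weighted_composite(scales):
    """Average the z-scores of several scales with equal (unit) weights.

    `scales` maps a scale name to a list of scores, one per participant,
    with participants in the same order in every list.
    """
    standardized = [z_scores(scores) for scores in scales.values()]
    return [mean(per_person) for per_person in zip(*standardized)]


# Hypothetical scores for four participants on the three 3D-WS subscales.
three_d_ws = {
    "cognitive": [3.1, 3.8, 2.9, 4.2],
    "reflective": [3.5, 3.9, 3.0, 4.4],
    "affective": [2.8, 3.6, 3.1, 4.0],
}
wisdom_world = unit_weighted_composite(three_d_ws)
print([round(x, 2) for x in wisdom_world])
```

Because every scale is standardized before averaging, each contributes equally to the composite regardless of its original metric or number of items.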
Both unit-weighted wisdom composites were normally distributed in both samples; highest Kolmogorov–Smirnov = 0.057, p > 0.200. In the first step, the background variables—the five IPIP scales, age, and gender—were entered. The next step added the two self-awareness composites (reflective awareness and controlled sense-of-self in the moment); the step after that the two self-regulation composites (self-preoccupation and self-compassion); the final step added self-transcendence. Pearson correlations between all variables are reported in Table 2; results from the regression analyses in Table 3. Note that in these analyses, self-preoccupation is scored as defined above, that is, higher values indicate higher levels of self-preoccupation, which indicates a low level of self-regulation. Because of the potential conceptual overlap between the mindfulness concept of self-transcendence and wisdom as defined through the ASTI, analyses were rerun after removing the ASTI from the composite measuring wisdom about the self. The wisdom about the self variable and the wisdom about the self variable with the ASTI removed were virtually identical ( r = 0.98 in Sample A and 0.99 in Sample B); the pattern of the regression results was identical (i.e., variables that were significant remained significant and variables that were not remained non-significant). 
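The logic of the stepwise entry can be made concrete with a small sketch that recovers the R2 change for each step from the cumulative R2 values. The input values are the R2 figures reported in Table 3 for wisdom about the self; the function name is illustrative, and rounding to three decimals is an assumption made to match the table's precision.

```python
def r2_changes(r2_by_step):
    """Return the R2 increment contributed by each step.

    The first step's change equals its R2; each later change is the
    difference between that step's R2 and the previous step's R2.
    """
    changes = [round(r2_by_step[0], 3)]
    for previous, current in zip(r2_by_step, r2_by_step[1:]):
        changes.append(round(current - previous, 3))
    return changes


# Cumulative R2 for Steps 1-4, wisdom about the self (Table 3).
sample_a = [0.296, 0.506, 0.526, 0.561]
sample_b = [0.455, 0.622, 0.625, 0.673]

print(r2_changes(sample_a))
print(r2_changes(sample_b))
```

The increments for Steps 2 to 4 sum to the share of variance attributable to the mindfulness manifold over and above the background variables.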
Table 2. Correlation matrix for the background variables, mindfulness variables, and wisdom factors; Sample A data presented above the diagonal, Sample B below

| Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 IPIP extraversion | 1.00 | .29** | .01 | −.12* | .13* | .09 | .10 | .03 | .12 | .22** | −.22** | .13* | .40** | .31** | .19** | .06 | .06 |
| 2 IPIP agreeableness | .25** | 1.00 | .17** | −.02 | .25** | .18** | .03 | .28** | .36** | .19** | .00 | .20** | .51** | .38** | .23** | .31** | .06 |
| 3 IPIP conscientiousness | .12 | .30** | 1.00 | −.16** | .05 | .18** | .03 | .11 | .09 | .34** | −.11 | .18** | .27** | .10 | −.02 | .05 | .19** |
| 4 IPIP neuroticism | −.43** | −.34** | −.36** | 1.00 | −.09 | −.04 | −.03 | .24** | .08 | −.53** | .60** | −.48** | −.34** | −.18** | −.11 | .06 | −.04 |
| 5 IPIP intellect/imagination | .29** | .18* | −.02 | −.20** | 1.00 | .07 | .04 | −.15* | .35** | .08 | −.08 | .07 | .20** | .36** | .03 | .04 | −.11 |
| 6 Social conservatism | −.04 | .14 | .23** | −.19* | −.11 | 1.00 | −.05 | .07 | .16* | .15* | −.02 | .14* | .24** | .18* | .03 | .11 | .54** |
| 7 Age | −.05 | .13 | .07 | −.08 | −.08 | .30** | 1.00 | −.07 | .05 | .03 | .03 | −.02 | −.03 | .03 | .07 | −.03 | .08 |
| 8 Gender | .05 | −.31** | −.17* | −.02 | .03 | −.07 | −.21** | 1.00 | .04 | −.03 | .21** | −.05 | .13* | .05 | .13* | .30** | .00 |
| 9 Reflective awareness | .22** | .34** | .26** | −.18* | .43** | −.02 | −.12 | −.14 | 1.00 | −.08 | .22** | .23** | .35** | .60** | .15* | .37** | .23** |
| 10 Controlled sense-of-self in the moment | .33** | .40** | .37** | −.62** | .21** | .05 | .17* | −.10 | .17* | 1.00 | −.54** | .42** | .43** | .22** | .14* | −.03 | .01 |
| 11 Self-preoccupation | −.37** | −.22** | −.23** | .57** | −.19* | −.08 | −.17* | −.08 | −.02 | −.56** | 1.00 | −.44** | −.27** | −.08 | −.14* | .30** | .11 |
| 12 Self-compassion | .06 | .16* | −.07 | −.20** | .03 | .05 | .04 | −.04 | .17* | −.01 | .17* | 1.00 | .48** | .41** | .21** | .14* | .17** |
| 13 Self-transcendence | .52** | .59** | .34** | −.66** | .16* | .26** | .04 | −.12 | .43** | .54** | −.47** | .21** | 1.00 | .57** | .27** | .35** | .24** |
| 14 Wisdom about the self | .34** | .51** | .32** | −.47** | .40** | .10 | .11 | −.14 | .66** | .45** | −.28** | .22** | .68** | 1.00 | .28** | .41** | .26** |
| 15 Wisdom about the (social) world | .11 | .06 | .08 | −.08 | .08 | −.05 | .05 | −.06 | .10 | .05 | −.06 | .00 | .11 | .10 | 1.00 | .18** | .10 |
| 16 Individualizing foundation | .09 | .38** | .09 | −.13 | .17* | −.08 | .06 | −.15 | .31** | .13 | −.02 | .03 | .29** | .43** | .11 | 1.00 | .33** |
| 17 Binding foundation | −.04 | .20** | .20* | −.12 | −.20* | .77** | .13 | −.10 | −.01 | −.02 | .09 | .07 | .31** | .16* | .01 | .07 | 1.00 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (https://ipip.ori.org/). * p < .05

Table 3. Results from hierarchical regression analyses to predict the wisdom factors

| Predictor | Step 1, Sample A | Step 1, Sample B | Step 2, Sample A | Step 2, Sample B | Step 3, Sample A | Step 3, Sample B | Step 4, Sample A | Step 4, Sample B |
|---|---|---|---|---|---|---|---|---|
| Wisdom about the self | | | | | | | | |
| IPIP extraversion | 0.19** | 0.08 | 0.16** | 0.02 | 0.17** | 0.03 | 0.11* | −0.06 |
| IPIP agreeableness | 0.24** | 0.26** | 0.08 | 0.17** | 0.06 | 0.17** | −0.01 | 0.05 |
| IPIP conscientiousness | 0.01 | 0.07* | −0.06 | 0.01 | −0.06 | 0.03 | −0.08 | 0.02 |
| IPIP neuroticism | −0.16** | −0.21** | −0.15** | −0.19** | −0.10 | −0.17* | −0.06 | −0.05 |
| IPIP intellect/imagination | 0.28** | 0.31** | 0.13** | 0.11 | 0.16** | 0.11 | 0.14* | 0.18** |
| Age | −0.01 | 0.08 | −0.02 | 0.13* | −0.01 | 0.12* | 0.01 | 0.13* |
| Gender | 0.07 | −0.06 | 0.08 | 0.01 | 0.07 | 0.02 | 0.05 | 0.02 |
| Reflective awareness | | | 0.52** | 0.50** | 0.46** | 0.49** | 0.40** | 0.38** |
| Controlled sense-of-self in the moment | | | 0.15* | 0.12 | 0.12 | 0.13 | 0.07 | 0.09 |
| Self-preoccupation | | | | | 0.04 | −0.01 | 0.05 | 0.05 |
| Self-compassion | | | | | 0.19** | 0.06 | 0.14* | 0.03 |
| Self-transcendence | | | | | | | 0.28** | 0.41** |
| R2 | .296 | .455 | .506 | .622 | .526 | .625 | .561 | .673 |
| R2 change | .296** | .455** | .210** | .167** | .020** | .003 | .035** | .048** |
| Wisdom about the (social) world | | | | | | | | |
| IPIP extraversion | 0.13 | 0.12 | 0.10 | 0.13 | 0.09 | 0.13 | 0.06 | 0.12 |
| IPIP agreeableness | 0.21** | −0.01 | 0.16* | 0.00 | 0.16* | 0.00 | 0.16 | −0.01 |
| IPIP conscientiousness | −0.09 | 0.03 | −0.13 | 0.04 | −0.12 | 0.04 | −0.13* | 0.04 |
| IPIP neuroticism | −0.17** | −0.02 | −0.13 | −0.08 | −0.07 | −0.09 | −0.05 | −0.08 |
| IPIP intellect/imagination | −0.05 | 0.06 | −0.08 | 0.06 | −0.08 | 0.06 | −0.08 | 0.07 |
| Age | 0.05 | 0.04 | 0.05 | 0.04 | 0.06 | 0.05 | 0.07 | 0.05 |
| Gender | 0.11 | −0.07 | 0.10 | −0.07 | 0.11 | −0.07 | 0.10 | −0.07 |
| Reflective awareness | | | 0.11 | 0.04 | 0.13 | 0.04 | 0.10 | 0.02 |
| Controlled sense-of-self in the moment | | | 0.12 | −0.12 | 0.07 | −0.11 | 0.05 | −0.12 |
| Self-preoccupation | | | | | −0.12 | 0.03 | −0.11 | 0.04 |
| Self-compassion | | | | | 0.03 | −0.00 | 0.01 | −0.08 |
| Self-transcendence | | | | | | | 0.13 | 0.06 |
| R2 | .116 | .033 | .132 | .043 | .140 | .043 | .148 | .044 |
| R2 change | .116* | .033 | .016 | .009 | .008 | .000 | .008 | .001 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (ipip.ori.org). * p < .05, ** p < .01

Ethical Sensitivity as Consequence of Mindfulness and Wisdom

Hierarchical regression was applied to investigate how wisdom and the mindfulness manifold potentially shape ethical sensitivity, operationalized here as the moral foundations. To keep the number of analyses manageable, the two individualizing foundations were collapsed into a single construct by taking the average of the z-scores of the Care/Harm and Fairness scales (the correlation between the two individualizing foundations was 0.50 in Sample A, and 0.57 in Sample B); likewise, a unit-weighted z-score composite was built from the three binding foundations, namely Ingroup loyalty, Authority, and Purity (intercorrelations between the three binding foundations ranged from 0.59 to 0.64 in Sample A, and from 0.63 to 0.78 in Sample B). As is usual (because individuals generally tend to skew towards the ethical side of the distribution), these composites were not normally distributed, Kolmogorov–Smirnov = 0.109, 0.112, 0.139, and 0.073, for individualizing in Samples A and B and binding in Samples A and B, respectively, p = 0.000, 0.000, 0.000, and 0.040, respectively. Pearson correlations are reported in Table 2; results from the regression analyses in Table 4. Rerunning the regression analyses with the alternate measure of wisdom about the self, that is, with the ASTI removed, yielded an identical pattern as obtained for the original wisdom about the self concept (i.e., variables that were significant remained significant and variables that were not remained non-significant).
Table 4. Results from hierarchical regression analyses to predict the moral foundations

| Predictor | Step 1, Sample A | Step 1, Sample B | Step 2, Sample A | Step 2, Sample B | Step 3, Sample A | Step 3, Sample B | Step 4, Sample A | Step 4, Sample B | Step 5, Sample A | Step 5, Sample B |
|---|---|---|---|---|---|---|---|---|---|---|
| Individualizing foundation | | | | | | | | | | |
| IPIP extraversion | −0.06 | −0.02 | −0.04 | −0.03 | −0.01 | −0.03 | −0.06 | −0.11 | −0.10 | −0.09 |
| IPIP agreeableness | 0.23** | 0.34** | 0.11 | 0.33** | 0.10 | 0.34** | 0.05 | 0.25* | 0.03 | 0.23* |
| IPIP conscientiousness | 0.06 | 0.01 | 0.01 | −0.02 | −0.00 | −0.04 | −0.03 | −0.04 | 0.01 | −0.05 |
| IPIP neuroticism | −0.04 | −0.03 | −0.10 | −0.10 | −0.21* | −0.16 | −0.17 | −0.07 | −0.17* | −0.05 |
| IPIP intellect/imagination | 0.15* | 0.08 | 0.04 | 0.02 | 0.07 | 0.02 | 0.04 | 0.08 | −0.03 | 0.03 |
| Social conservatism | 0.01 | −0.15 | −0.00 | −0.16 | −0.01 | −0.16 | −0.03 | −0.22* | −0.02 | −0.20* |
| Age | −0.06 | 0.05 | −0.05 | 0.09 | −0.08 | 0.11 | −0.06 | 0.13 | −0.07 | 0.07 |
| Gender | 0.21** | −0.06 | 0.25** | −0.03 | 0.21** | −0.03 | 0.18* | −0.02 | 0.17* | −0.02 |
| Reflective awareness | | | 0.33** | 0.19 | 0.22** | 0.20* | 0.17* | 0.11 | 0.03 | −0.05 |
| Controlled sense-of-self in the moment | | | −0.05 | −0.12 | 0.05 | −0.11 | 0.02 | −0.15 | −0.00 | −0.17 |
| Self-preoccupation | | | | | 0.38** | 0.10 | 0.39** | 0.17 | 0.39** | 0.13 |
| Self-compassion | | | | | 0.10 | −0.11 | 0.04 | −0.15 | −0.01 | −0.15 |
| Self-transcendence | | | | | | | 0.27** | 0.35* | 0.16 | 0.17 |
| Wisdom about the self | | | | | | | | | 0.42** | 0.41** |
| Wisdom about the self (ASTI excluded) | | | | | | | | | (NA) | (NA) |
| Wisdom about the (social) world | | | | | | | | | 0.01 | 0.04 |
| R2 | .158 | .160 | .233 | .191 | .300 | .202 | .329 | .232 | .404 | .285 |
| R2 stepwise change | .158** | .160** | .075** | .033 | .067** | .011 | .029** | .031* | .075** | .053** |
| Binding foundation | | | | | | | | | | |
| IPIP extraversion | −0.02 | 0.03 | 0.00 | 0.04 | 0.03 | 0.05 | 0.00 | −0.02 | −0.01 | −0.02 |
| IPIP agreeableness | −0.08 | 0.09 | −0.12 | 0.10 | −0.13 | 0.11 | −0.15* | 0.04 | −0.15* | 0.03 |
| IPIP conscientiousness | 0.21** | 0.03 | 0.22** | 0.04 | 0.21** | 0.02 | 0.20** | 0.03 | 0.21** | 0.02 |
| IPIP neuroticism | 0.07 | 0.07 | −0.02 | 0.02 | −0.05 | −0.06 | −0.03 | 0.00 | −0.03 | 0.02 |
| IPIP intellect/imagination | 0.02 | −0.10 | −0.03 | −0.10 | −0.01 | −0.11 | −0.02 | −0.06 | −0.06 | −0.09 |
| Social conservatism | 0.54** | 0.80** | 0.54** | 0.80** | 0.54** | 0.80** | 0.53** | 0.74** | 0.53** | 0.75** |
| Age | 0.02 | −0.10 | 0.02 | −0.11 | 0.00 | −0.09 | 0.01 | −0.06 | 0.01 | −0.09 |
| Gender | −0.13 | −0.04 | −0.10 | −0.05 | −0.13* | −0.03 | −0.14* | −0.02 | −0.14* | −0.02 |
| Reflective awareness | | | 0.13 | 0.00 | 0.04 | 0.01 | 0.02 | −0.06 | −0.06 | −0.13 |
| Controlled sense-of-self in the moment | | | −0.15* | −0.08 | −0.12 | −0.06 | −0.13 | −0.09 | −0.15 | −0.10 |
| Self-preoccupation | | | | | 0.21* | 0.15* | 0.22** | 0.20** | 0.21* | 0.19** |
| Self-compassion | | | | | 0.14 | −0.09 | 0.12 | −0.11* | 0.09 | −0.12* |
| Self-transcendence | | | | | | | 0.10 | 0.28** | 0.05 | 0.22* |
| Wisdom about the self | | | | | | | | | 0.23** | 0.15 |
| Wisdom about the self (ASTI excluded) | | | | | | | | | (NA) | (NA) |
| Wisdom about the (social) world | | | | | | | | | −0.04 | 0.04 |
| R2 | .361 | .651 | .391 | .655 | .419 | .668 | .423 | .690 | .447 | .698 |
| R2 stepwise change | .361** | .651** | .030* | .004 | .029* | .013 | .004 | .024** | .023* | .008 |

N = 260 for Sample A and 173 for Sample B. IPIP = International Personality Item Pool (https://ipip.ori.org/). * p < .05, ** p < .01

Discussion

In the present study, I investigated if and how wisdom might be related to dispositional mindfulness, broadly construed as a manifold of self-awareness, self-regulation, and self-transcendence, and if and how it might promote ethical sensitivities. Wisdom was measured using the three self-report surveys most often used in quantitative research on the topic: the 3D-WS, the ASTI, and the SAWS. Two independent samples were included: a sample of college students (Sample A), and one of adult workers on Mechanical Turk with a much wider age range (viz., 21–74; Sample B).

The Structure of Wisdom

A first expectation (after Glück et al. [34]) was that factor analysis on the subscales of the three surveys would reveal a bifurcation between wisdom about the self (ASTI and SAWS) and wisdom about the (social) world (3D-WS). Factor analysis indeed confirmed this divergence, in both samples. The correlation between the two dimensions was small, 0.18 in Sample A and 0.07 in Sample B, underscoring the relative independence of these two aspects of wisdom. This result replicates that of Glück et al., who obtained a correlation of 0.11. The present study is the first to also show functional independence between the two constructs, in that both types of wisdom have different correlates, as explicated in the next two sections.
Predicting Wisdom About the Self From the literature reviewed in the Introduction, I expected that all three aspects of mindfulness—self-awareness, self-regulation, and self-transcendence—would be positively related to wisdom. Regression analysis suggested that this is (partially) true, but only for wisdom about the self. Before I detail these results, note that the background variables explained a fair amount of variance in wisdom about the self: it was negatively related to neuroticism, and positively related to agreeableness and intellect/imagination in both samples, and additionally to extraversion in the college sample and conscientiousness in the Mechanical Turk sample. After taking mindfulness into account, only the influence of intellect/imagination (in both groups) and extraversion (in the college sample) remained significant, but the coefficients were substantially reduced (with β s roughly half of those in Step 1). This suggests that the effects of agreeableness and neuroticism are wholly mediated through the effects of mindfulness, and those of extraversion and intellect/imagination are partially mediated. Levenson et al. ([47]) obtained a negative effect of neuroticism, and a positive effect of openness (i.e., imagination/intellect in this sample), agreeableness, and conscientiousness on the ASTI, a measure of wisdom about the self; only the latter correlation was absent from the present results. Within the Berlin wisdom paradigm, openness to experience is likewise a strong predictor of wisdom scores (e.g., Pasupathi et al. [56]; Staudinger and Glück [64]). This makes sense: if wisdom is at least partially based on experience, an openness to new experiences would be key for its development or flourishing. Crucially, the mindfulness manifold explained an additional 21% to 26% of the variance in wisdom about the self, over and beyond the variance explained by personality, age, and gender. 
In both samples, one aspect of self-awareness—reflective awareness—was a significant and strong predictor of wisdom about the self, with β values around 0.40 for the final step. The other aspect of self-awareness, however—controlled sense-of-self in the moment—was not a significant predictor (except in Step 2 in the college sample). It appears, then, that wisdom about the self is associated with a reflective stance about one's experiences (i.e., reflective awareness), but not with the experience of being present in the moment (i.e., controlled sense-of-self in the moment)—in other words, it is the examination of or the investigation into one's experiences rather than the mere witnessing of those experiences that is important for this type of wisdom, as many models of wisdom (e.g., Ardelt [ 3 ]; Brown and Greene [14]; Glück and Bluck [31]) indeed explicitly predict. It is interesting to note that self-compassion (at least in the college sample) was an additional predictor for wisdom about the self. The reasons might be that self-compassion allows one to step back from the immediacy of the experience, and consider oneself the way one would consider a friend—this friendly distancing, like the reflection/examination component, might possibly help to foster the transcendence Ardelt ([ 3 ]) considers so necessary for the development of wisdom. Self-preoccupation was not related to wisdom in either sample. One additional link found here was that between self-transcendence and wisdom about the self (with β values on par with or a little lower than those for reflective awareness). This association is almost self-evident, given that quite a few theorists consider self-transcendence to be a critical component of wisdom (Ardelt [ 3 ]; Curnow [22]; Levenson [46]). 
Note that this relationship remained unchanged when the ASTI, a measure of wisdom that conceptually relies on self-transcendence, was removed from the composite that tapped wisdom about the self, suggesting that the relationship cannot be explained merely by conceptual overlap between the measure of self-transcendence and the ASTI. The role of reflective awareness and self-compassion in wisdom about the self, however, is not merely to foster self-transcendence: the final step in the regression analyses clearly shows that the effects of reflective awareness (both samples) and self-compassion (college sample) are far from completely mediated by self-transcendence. It is also important to stress that the three background variables and the mindfulness manifold provide us with a very good handle on the individual differences in wisdom about the self: they explain a little more than half to two thirds of the variance (between 56 and 67%, to be precise), indicating that these constructs probably should be important components in any realistic theory of wisdom about the self. Predicting Wisdom About the (Social) World Wisdom about the (social) world, in contrast, was not predicted by the mindfulness manifold at all. There is some indication that wisdom about the (social) world might have roots in individual differences in personality instead: individuals scoring higher on agreeableness and lower on neuroticism scored higher on wisdom about the (social) world; however, this was only true in the student sample. As in wisdom about the self, the effects of agreeableness and neuroticism were wholly mediated through the effects of mindfulness, even though the latter effects did not rise to the level of significance. These personality correlates have some face validity in their predictive value. 
That is, it makes sense that people who are (or want to appear) more friendly, warm, and helpful might be better at picking up on social cues or be more interested in understanding how the social world and the world in general works. Neuroticism, in general, is related to overreactivity, negative emotions, and feeling easily threatened by social situations; none of these qualities would likely be conducive to acquire the type of equanimity associated with wisdom in general (see Wink and Staudinger [74], for a similar argument). Note that Ardelt et al. ([ 4 ]) found that openness and extraversion correlated with the 3D-WS (in a sample of 98 males who were approximately 80 years old); we found such correlations for wisdom about the self, not for wisdom about the (social) world. The reason for the discrepancy is unclear. The reason why the influence of personality variables on wisdom about the (social) world is constrained to the college group is likewise unclear. One potential reason could be adult development: perhaps as people grow older the grip of personality on their outlook on the world loosens. There is a hint of this in the present data: after a median split on the Mechanical Turk sample, the relevant correlations were nominally higher in the younger sample (correlation of wisdom about the [social] world with agreeableness was 0.11, with neuroticism − 0.12) than the older subsample (0.01 and − 0.04, resp.). None of these correlations, however, reached significance. This, then, remains an area for further research. Note that the Mechanical Turk sample was highly educated (about 3 years of college), so educational differences are unlikely to explain the cross-sample differences. Also note that the relationship with personality is much smaller than that observed in wisdom about the self: the background variables (personality, age, and gender) explained 30–46% of the variance in wisdom about the self, versus only 3–12% in wisdom about the (social) world. 
Wisdom about the (social) world is not only distinct from wisdom about the self; it also seems, with the present measures, much harder to explain. Wisdom and the Moral Foundations Turning now to ethical sensitivity as a potential consequence of mindfulness and wisdom, I found, first, a conceptual (partial) replication of our earlier paper (Verhaeghen and Aikman [70]) on the effects of mindfulness on the moral foundations. In that paper, we found, in two independent samples, that reflective awareness, self-preoccupation, and self-transcendence were related to the individualizing aspects of morality (i.e., an emphasis on care and fairness) (note that the relationship with self-preoccupation was only significant in Sample A in the present study). Self-compassion and self-transcendence were positively related to the binding aspects of morality (i.e., an emphasis on loyalty, authority, and sanctity). In the present data, an additional effect of self-preoccupation on binding was obtained, and the effect of self-compassion on binding was not significantly different from zero in one sample, and, surprisingly, negative in the other. Wisdom about the self turned out to be a strong predictor for the individualizing foundation, that is, one's sensitivity to the ethical dimensions of care and fairness ( β for the final step = 0.42 and 0.41, resp.). In contrast, wisdom about the (social) world had only a negligible and non-significant influence on the individualizing foundation ( β = 0.01 and 0.04). While most theories about wisdom posit an effect on ethics, notably "prosocial attitudes and behaviors, which include empathy, compassion, warmth, altruism, and a sense of fairness" (Bangen et al. [10], p. 1257), the present data suggest that this effect remains restricted to wisdom about the self, and does not extend to wisdom about the (social) world. 
Within the group of mindfulness variables, the effects of self-awareness on the individualizing foundation were partially mediated through self-transcendence (i.e., the coefficients associated with self-awareness become smaller once self-transcendence enters the equation) and wholly mediated through wisdom about the self (i.e., the coefficients associated with self-awareness became non-significant once the wisdom variables enter the equation, but only wisdom about the self had a reliable effect). The effects of self-transcendence on individualizing, in turn, were fully mediated through wisdom, and particularly wisdom about the self. One possible interpretation of the latter finding is that self-transcendence is a precursor for wisdom about the self; another that self-transcendence as defined here is subsumed under or maybe even synonymous with wisdom about the self. The latter interpretation is certainly compatible with views about wisdom as a form of self-transcendence (Ardelt [ 3 ]; Curnow [22]; Levenson [46]). Whatever the mechanism, wisdom about the self thus appears to foster an increased emphasis on the ethical dimensions of care and fairness, and this is partially due to the influence of reflective awareness and self-transcendence on wisdom about the self. The effects of wisdom on the binding foundations (i.e., an emphasis on authority, ingroup loyalty, and purity) were rather small. The strongest predictor for the binding foundation remained social conservatism, with people who are more conservative showing larger interest in the binding foundation ( β for the final step = 0.53 and 0.75). Wisdom about the self had a much smaller effect ( β for the final step = 0.23 and 0.15; the latter value was ns ); the contribution of wisdom about the (social) world was essentially nil ( β for the final step = − 0.04 and 0.04, ns ). 
In the college sample, participants who were less agreeable, more conscientious, male, and more self-preoccupied showed a larger interest in the binding foundation. The latter effect replicated for the Mechanical Turk sample, where lower levels of self-compassion and higher levels of self-transcendence were additionally related to a higher interest in binding. If we look at the results that replicate across both samples, the take-away message is that an interest in the binding foundation is determined mostly by social conservatism, and maybe, but to a much smaller extent, by wisdom about the self. This implies a second amendment to the Bangen et al. ([10]) quotation above, to the effect that wisdom's fostering of prosocial attitudes applies mostly to attitudes that make the rights and concerns of others visible (i.e., treating individuals with care and fairness), and less so to attitudes pertaining to ingroup cohesion (i.e., a focus on loyalty, authority, and purity).
    1. Philosophy for Children, Values Education and the Inquiring Society. Published in: Educational Philosophy & Theory, Oct 2014, Professional Development Collection. By: Cam, Philip

      How can school education best bring about moral improvement? Socrates believed that the unexamined life was not worth living and that the philosophical examination of life required a collaborative inquiry. Today, our society relegates responsibility for values to the personal sphere rather than the social one. I will argue that, overall, we need to give more emphasis to collaboration and inquiry rather than pitting students against each other and focusing too much attention on 'teaching that' instead of 'teaching how'. I will argue that we need to include philosophy in the curriculum throughout the school years, and teach it through a collaborative inquiry which enables children to participate in an open society subject to reason. Such collaborative inquiry integrates personal responsibility with social values more effectively than sectarian and didactic religious education.

      Keywords: religion; ethics; community of inquiry; spiral curriculum

      Introduction [1]

      As Socrates would have it, the philosophical examination of life is a collaborative inquiry. The social nature of the enterprise goes with its spirit of inquiry to form his bifocal vision of the examined life. These days, insofar as our society teaches us to think about values, it tends to inculcate a private rather than a public conception of them. This makes reflection a personal and inward journey rather than a social and collaborative one, and a person's values a matter of parental guidance in childhood and individual decision in maturity. The relegation of responsibility for values to the personal sphere also militates against societal self-examination. 
On the other hand, the traditional pontifical alternative is equally presumptive and debilitating in ignoring the possibility of personal judgement. How can education steer a course between the tyranny of unquestionable moral codes and the bankruptcy of individualistic moral relativism? It remains to be seen whether there is a way in which education could teach children to engage productively across their differences rather than responding to difference with suspicion or prejudice. Gilbert Ryle (in Cahn, 1970) made a clear distinction between 'teaching how' and 'teaching that', arguing from a behaviourist perspective that teaching how had a much more lasting impact than simply teaching the facts. However, too much emphasis on 'teaching how' can result in conditioning, training, teaching to conform to habit, teaching obedience with the threat of hellfire if the rules are broken. There is a third way, the way of philosophy espoused by Matthew Lipman ([ 8 ]) in his Philosophy for Children, which involves giving more emphasis to collaboration and inquiry rather than pitting students against each other and focusing too much attention on 'teaching that' instead of 'teaching how'. Philosophy as it is traditionally taught may well involve teaching how to follow the rules of formal logic correctly, or learning facts about the life and death of Socrates, but it also requires a capacity for critical reflection, consideration of alternative possibilities, and a genuine concern for truth and clarity. I argue that we need to include philosophy in the curriculum throughout the school years, but it needs to be a philosophy taught in the spirit of Socrates which balances individual and social values. Religious instruction tends to inculcate values through adult imposition and denies space to critical judgement. Ryle's distinction between 'learning that' and 'learning how' implied that these were discrete and exclusive ways of learning. 
However, learning how to do things is more than a matter of memorizing facts or following procedural instructions. Being able to cook is more than being able to follow a recipe book. Again, while some instruction is useful in learning to ride a bike, it is mostly a matter of trying to ride, and then, under guidance, trying again. It is a case of learning by doing, and doing it under different circumstances, in order to apply it in different circumstances. This is working out for oneself how to exercise individual judgement, rather than first learning a set of instructions and then carrying them out (Ryle, in Cahn, 1970, pp. 413–424). Whatever the rules are, they are heuristic and strategic, depending on different contexts, rather than algorithmic and learnable by rote. 'Learning how' can be important in many areas of the curriculum where training in skills is an important feature, especially in physical education and the arts. However, learning the art of inquiry requires a slightly different type of 'learning how' from training, rehearsal, repetition. A curriculum that is based on inquiry is one that is centred on thinking. There is a world of difference in the outcome to be expected from an education that treats knowledge as material with which to think and one that emphasizes memorization of knowledge. It is the difference between an inquiring society and one in which those few who have developed an inquiring mind have done so in spite of their education rather than because of it (Dewey, 1916/1966, chap. 12; Lipman, [8]). The concept of a community of inquiry owes much to Dewey who, in Democracy and education (1916/1966), described the healthy relation between an individual and his or her environment as functional. Dewey insisted that because the relationship between the individual and his or her environment must be based on mutual adjustment, fitting into society might well involve radically changing it. 
Dewey believed in the importance of preparing students for democratic citizenship. He stressed that consciously guided education aimed at developing the 'mental equipment' and moral character of students was essential to the development of civic character. Is this not what religious instruction tries to do? The relationship between the individual and society was far more important for Dewey than the child's relationship with an abstract God. It was organic and continually evolving in mutual adaptation. It differs from religious instruction in that its aim is to develop a model of free inquiry, which requires tolerance of alternative viewpoints, and free communication. He also believed that children's capacity for the exercise of deliberative, practical reason in moral situations could be cultivated not by ready-made knowledge but by 'a mode of associated living' characteristic of democracy. Lipman ([ 7 ]) was to elaborate on this idea of schools as a model of a participatory democracy and his classroom community of inquiry provided close analogies with the democratic school, a microcosm of the wider society.

Thinking Together

When we move away from the traditional classroom to the inquiring one and the teacher becomes less occupied with conveying information (with teaching 'that'), it becomes educationally desirable for students to engage with one another. When human conduct stimulates moral inquiry it is usually because that conduct is controversial, which is to say that there are different points of view as to how it should be judged. If you and I have different opinions in regard to someone's character or conduct, then we are both in need of justification and our views are subject to each other's objections. When we make a proposal to solve a practical problem of any complexity, we rely upon others who are reasonably well placed for constructive criticism or a better suggestion.
If we want students to grow out of the habit of going with their own first thoughts, to become used to considering a range of possibilities, and to be on the lookout for better alternatives, then we could not do better than to have them learn by exploring issues, problems and ideas together. If we want them to become used to giving reasons for what they think, to expect the same of others, and to make productive use of criticism, then we could not go past giving them plenty of practice with their peers. And if we want them to grow up so that they consider other people's points of view, and not to be so closed minded as to think that those who disagree with them must be either ignorant or vicious, then the combination of intellectual and social engagement to be found in collaborative inquiry is just the thing. These are all good reasons for having our students learn to inquire together.

Philosophy for Children

More than any other discipline, philosophy is an inquiry into fundamental human problems and issues, where all the general conceptions that animate society come under scrutiny. Philosophy as a formal discipline played an important part in its place as a matriculation subject in some Australian states, because there were rigorous rules by which its standards could be maintained. This would involve, say, learning that ignoratio elenchi was an informal fallacy, or that modus tollens is a legitimate move in deductive logic, or learning how to mount a reasoned argument in defence of a position. When, however, we are talking about philosophy for children, its subject matter needs to be adapted to the interests and experience of students of various ages and its tools and procedures adjusted to their stage of development.
There are models to work from, particularly the series of novels and manuals from Matthew Lipman, and in recent years we have begun to find our way forward.[ 2 ] If part of the difficulty is also that some philosophers think of philosophy as being above all that, it is salutary to remember that other disciplines have long since discovered how to recast themselves in educational form. Just as mathematics was forced to become more practical and relevant to the growing range of children who were staying on at school through the New Maths, so philosophy has been forced to become more real and relevant to children. The move towards an integrated curriculum away from discrete learning areas also required philosophy to make the connections across and through disciplines, raising the larger questions of epistemology, ontology, aesthetics and, for the purpose of this article, the important area of axiology or values. For philosophy to have a formative influence, and thereby to significantly affect both the way people think and the character of their concerns, it needs to be part of the regular fare throughout the school years. Only by this means can it effectively supply its nutrients to the developing roots of thought or knowing that and action or knowing how. We need to counter the view that philosophy is an advanced discipline, suitable only for the academically gifted and intellectually mature. Jerome Bruner made famous the startling claim that 'the foundations of any subject may be taught to anybody at any age in some form' (1960, p. 12), and he suggested that the prevailing view of certain disciplines being too difficult for younger students results in our missing important educational opportunities. 
Bruner called this structure a spiral curriculum: one that begins with the child's intuitive understanding of the fundamentals, and then returns to the same basic concepts, themes, issues and problems at increasingly elaborate and more abstract or formal levels over the years. A spiral curriculum is vital for developing the kind of deep understanding that belongs to philosophy and the humanities. What else is to be gained from building philosophy into the curriculum throughout the school years? It seems to me that an education in philosophical inquiry will assist students to achieve a rich understanding of a wide array of issues and ideas that inform life and society through an increasingly deep inquiry into them. It will help students to think more carefully about issues and problems that do not have a unique solution or a settled decision procedure, but where judgements and decisions can be better or worse in all kinds of ways. Since most of the problems that we face in life and in our society are of that character, the general-purpose tools that students acquire through philosophy will ensure that they are better prepared to face those problems. If philosophy is carried out in the collaborative style envisaged above, then its recipients will also be more likely to tackle such problems collaboratively, and thereby to be more constructive and accommodating with one another. Let me spell all this out a little under the headings of 'thinking', 'understanding' and 'community'.

Thinking

Philosophy is a discipline with a particular focus on thinking. It involves thinkers in the cognitive surveillance of their own thought. It is a reflective practice, in the sense that it involves not only careful thinking about some subject matter, but thinking about that thinking, in an effort to guide and improve it.
Since philosophical thinking tends to keep one eye on the thinking process, philosophy can supply the tools that assist the thinker in such tasks as asking probing questions, making needful distinctions, constructing fruitful connections, reasoning about complex problems, evaluating propositions, elaborating concepts, and honing the criteria that are used to make judgements and decisions. Dewey's (1910) five-step model of identifying the problem and placing it in context, making creative and testable hypotheses that move towards a possible solution, analysing the hypotheses in terms of past experience, considering alternative hypotheses that may be more suitable, and checking possible solutions against actual experiences was picked up as a model of individual thinking, especially in science and design work. But in a community of inquiry each of these steps is done from the multiple perspectives of the group at any age, allowing not only the falsifiability of any conservative position to truth but also their complete contingency. The skills, abilities and habits of thinking (acquiring the habit of reflecting carefully upon your own thoughts, as well as what others think; developing the ability to imagine and evaluate new possibilities; developing the habit of changing your mind on the basis of good reasons; and acquiring skill in the establishment and use of appropriate criteria to form sound judgements) provide the methodology of Lipman's community of inquiry.

Understanding

Philosophy deals with ethical questions about how we should behave, social questions about the good community, epistemological questions about the justification of people's opinions, metaphysical questions about our spiritual lives, or logical questions about what we may reasonably infer, and is therefore a rich source of our cultural heritage and of contemporary thought and debate.
In terms of both its history and ways of thinking, philosophy also helps to deepen our understanding of the big ideas and key concepts that have helped to shape civilization and continue to inform the way we live. Our conceptions of what makes something right or wrong, of justice, freedom and responsibility, of our personal, cultural and national identity, of sources of knowledge, of the nature of truth, beauty and goodness, are all central to what we value and how we conduct our affairs. Since such concepts so deeply inform life and society, it is important for students to develop their understanding of them. While we may attempt to deal with these matters elsewhere in the curriculum, philosophical inquiry gives students the tools that they need in order to explore these ideas in depth.

Community

With regard to cooperative thinking and the importance of community, I would stress the virtues of dialogue. As we work to resolve differences in our understandings, or to subject our reasons to each other's judgement, or try to follow an argument where it leads, we are like detectives whose clues are the experience, inferences, judgements and other intellectual considerations that each thinker brings into the dialogue with others. On this view, philosophical inquiry provides a model of the inquiring community: one that is engaged in thoughtful deliberation and decision making, is driven by a desire to make advance through cooperation and dialogue, and values the kinds of regard and reciprocity that grow under its influence. Just because it has these characteristics, philosophical inquiry can provide a training-ground for people who are being brought up to live together in such a community. Dewey's five steps require the philosophical disposition to give reasons when that is appropriate; and, generally, to cooperate with others and respect different points of view.
Values Education

The vital significance of educating for judgement in regard to values is nowhere more clearly recognized than in the writings of John Dewey: 'The formation of a cultivated and effectively operative good judgment or taste with respect to what is aesthetically admirable, intellectually acceptable and morally approvable is the supreme task set to human beings by the incidents of experience' (Dewey, 1929/1980, p. 262). This makes the cultivation of judgement the ultimate educational task and the development of good judgement central to values education in particular. Values education therefore cannot be simply a matter of instructing students as to what they should value (just so much 'teaching that'), as if students did not need to inquire into values or learn to exercise their judgement. In any case, it is an intellectual mistake to think that values constitute a subject matter to be learned by heart. They are not that kind of thing. Values are embodied in commitments and actions and not merely in propositions that are verbally affirmed. Nor can values education be reduced to an effort to directly mould the character of students so that they will make the right moral choices, as if in all the contingencies of life there was never really any doubt about what one ought to do, and having the right kind of character would ensure that one did it. Being what is conventionally called 'of good character' will not prevent you from acting out of ignorance, from being blind to the limitations of your own perspective, from being overly sure that you have right on your side, or even from committing atrocities with a good conscience in the name of such things as nation or faith. History is littered with barbarities committed by men reputedly of good character who acted out of self-righteous and bigoted certainty.
Far from being on solid moral ground, the ancient tradition that places emphasis upon being made of the right stuff has encouraged moral blindness towards those of different ethnicity, religion, politics, and the like. Whatever else we do by way of values education, we must make strenuous efforts to cultivate good judgement. When it comes to deciding what to do in a morally troubling situation, good judgement involves distinguishing more from less acceptable decisions and conduct. Such discernment needs to be made by comparing our options in the circumstances in which they occur. Any such comparison requires us to ensure that, insofar as possible, we have hold of all the relevant facts. It involves us doing our best to make sure that we have not overlooked any reasonable course of action. It requires us to think about the consequences of making one decision, or taking one course of action, by comparison with another, and to be mindful of the criteria against which we evaluate them. It requires us to monitor the consequences of our actions in order to adjust our subsequent thinking to actuality. In short, good moral judgement requires us to follow the ways of inquiry. Dewey (1920/1957, pp. 163–164) says: A moral situation is one in which judgment and choice are required antecedently to overt action. The practical meaning of the situation—that is to say the action needed to satisfy it—is not self-evident. It has to be searched for. There are conflicting desires and alternative apparent goods. What is needed is to find the right course of action, the right good. 
Hence, inquiry is exacted: observation of the detailed make-up of the situation; analysis into its diverse factors; clarification of what is obscure; discounting of the more insistent and vivid traits; tracing of the consequences of the various modes of action that suggest themselves; regarding the decision reached as hypothetical and tentative until the anticipated or supposed consequences which led to its adoption have been squared with the actual consequences. The lack of integration of our advanced empirical and scientific knowledge with the remnants of value systems of much earlier times is already a problem of considerable proportions. We should not be adding to this burden when we teach science and technology, or history, or about society and the environment. Instead, we need to introduce our students to ways of thinking that develop their values in conjunction with their other understandings. This approach to values education fits with the emphasis to be placed upon collaborative inquiry for several reasons. First, the idea that values are to be cultivated by student reflection rather than impressed upon the student from without by moral authority does not imply that the pursuit of values is a purely personal affair. That would be a pendulum swing to individualistic relativism. Collaborative inquiry supplies a middle road—a way forward between an unquestioningly traditional attitude towards values and an individualism that makes each person their own moral authority. The development of good judgement through collaborative inquiry is the path towards a truly social intelligence. Secondly, values inquiry depends upon different points of view. If something is uncontroversial and everyone is of the same opinion, then there is no motivation for inquiry. Inquiry arises in situations where something is uncertain, puzzling, contentious or in some way problematic. 
The collaborative inquiry is organic, synergistic and evolving, a kind of moral practice based on a principle of democracy. Consider such elementary aspects of philosophical practice as: learning to hear someone out when you disagree with what they are saying; learning to explore the source of your disagreement rather than engaging in personal attacks; developing the habit of giving reasons for what you say and expecting the same of others; being disposed to take other people's interests and concerns into account; and generally becoming more communicative and inclusive. To see values education as continuous with all of our other efforts to educate our young in the ways of inquiry is to place it firmly in the tradition of reflective education rather than traditional religious instruction. Religious instruction cannot take on the burden of a systematic exploration of the ethical issues involved in the various areas of the curriculum as they are presented throughout the rest of the week. If we are to cultivate good moral judgement we need to make it integral to the material that we teach and not something we attempt to establish in such a disconnected fashion. From a pedagogical perspective, while it would be possible for religious instructors to introduce students to values inquiry, they are under no obligation to do so and many of them come from traditions that are likely to use the occasion to moralize and engage in indoctrination instead. This is not to say that religious education is incompatible with values inquiry. It is rather to acknowledge the need for change. Much of traditional religious instruction is antithetical to the educational requirements of an inquiring society; and if we are to develop such a society, such an outdated approach should not retain its foothold in our schools. This still leaves it open as to whether the school takes a philosophical approach to values education, or insists upon indoctrination rather than education. 
We should not think of philosophy and religion as representing two incompatible options when it comes to values education. They are representative, however, of a deeper choice that must be made in relation to values education, the choice between appeal to reason and dogmatism as central to the way we teach.

Footnotes

1 Editor's Note: This article has been substantially edited and modified since it was delivered as a keynote address in December 2010. The context in which it was written reflects an ongoing tension between the didactic teaching of ethics through religious education and a more organic process of teaching ethics by modelling it and discussing it in philosophical discussion. In New South Wales (NSW) religious education was not compulsory, but Education Department policy forbade schools from offering alternative lessons to students who chose not to take part in scripture. The NSW government tasked St James Ethics Centre, under the guidance of Professor Cam, to develop and deliver ethics education classes in urban, regional and rural primary schools as an alternative to religious education. St James Ethics Centre promptly established Primary Ethics Limited, an independent not-for-profit organization, to develop an engaging, age-appropriate, interconnected curriculum that spans the primary years from Kindergarten to Year 6 and to then deliver ethics education free of charge via a network of specially trained and accredited volunteers. Despite protests from Church leaders in NSW that they should have sole responsibility for values education, on 1 December 2010 Parliament amended the NSW Education Act to give students who do not attend special religious education/scripture classes in NSW public schools the legal right to attend philosophical ethics classes as an alternative to supervised 'private study'.
Because of the popularity of secular ethics classes, pressure from Church leaders and a change to a conservative state government, it was legislated in 2012 that parents should be told of the availability of ethics classes in their school only after they have opted out of special religious education or scripture.

2 Since the early 1990s Lipman's followers have extended his work and this general approach is now represented in schools in many countries around the world. For a selection of Australasian resources see http://www.fapsa.org.au/resources/catalogue

References

Bruner, J. S. (1960). The process of education. Cambridge, MA: Harvard University Press.
Cahn, S. M. (Ed.). (1970). The philosophical foundations of education. New York: Harper & Row.
3 Dewey, J. (1910). How we think. Chicago, IL: D. C. Heath & Co.
4 Dewey, J. (1957). Reconstruction in philosophy (enlarged ed.). Boston, MA: Beacon Press. (Original work published 1920).
5 Dewey, J. (1966). Democracy and education. London: Collier Macmillan. (Original work published 1916).
6 Dewey, J. (1980). The quest for certainty. New York: Perigee Books. (Original work published 1929).
7 Lipman, M. (1988). Philosophy goes to school. Philadelphia, PA: Temple University Press.
8 Lipman, M. (2002). Thinking in education (2nd ed.). New York: Cambridge University Press.
9 Ryle, G. (1970). Teaching and training. In S. M. Cahn (Ed.), The philosophical foundations of education (pp. 413–424). New York: Harper & Row.

By Philip Cam
Reported by Author
    1. Co-education (Coéducation): Synergy Between School and Family Settings

      Executive summary

      Co-education is defined as a strategic alliance among all the adults around the child (teachers, parents, professionals and support staff), aimed at helping the child develop his or her full potential.

      This approach rests on recognizing and accepting the complementary roles of each party.

      Ideally, this relationship is established from the first parent-teacher meeting, although it can be mobilized at any time, particularly in critical situations.

      The success of the approach depends on a benevolent stance that creates a climate of psychological safety, which in turn fosters transparent communication and concerted action.

      The integration of digital technologies, framed by the Plan d'action numérique en éducation (Digital Action Plan for Education), strengthens this collaboration by offering new levers for learning and by reassuring parents about the pedagogical use of technological tools.

      --------------------------------------------------------------------------------

      1. Foundations and Definition of Co-education

      Co-education is not merely occasional communication; it is a genuine partnership mindset.

      It is built on three pillars: recognizing, accepting and putting into action the respective roles of each party.

      A meeting of worlds: It brings the family world and the school world together into a single, coherent ecosystem in the young person's life.

      A shared mission: The central objective is to support the student in developing his or her competencies and well-being.

      A lasting alliance: This relationship must endure throughout the school year, ensuring continuity between the child's different living environments.

      --------------------------------------------------------------------------------

      2. Establishing a Benevolent Stance

      For co-education to be effective, those involved must adopt a specific stance that encourages openness and listening.

      The climate of psychological safety

      A state of benevolence is the engine of co-education. It makes it possible to:

      • Create a context in which everyone feels comfortable naming their real concerns.

      • Establish genuine mutual listening.

      • Reduce misunderstandings and confrontations.

      Process for anchoring benevolence

      To cultivate this state, participants are invited to:

      1. Recall a past experience of benevolence in order to rediscover its cues (tone, attitude).

      2. Practise self-benevolence before extending it to others.

      3. Ask themselves what conditions best allow them to remain open during exchanges.

      --------------------------------------------------------------------------------

      3. Roles and Responsibilities: Complementarity of the Parties

      Although the ultimate objectives converge, the roles of teachers and parents are distinct and complementary.

      | Party | Mandate and Specific Objectives | Sphere of Influence |
      | --- | --- | --- |
      | Teacher | Instruct, socialize and qualify within a limited time frame (180 days). Deliver the curriculum and ensure the progression of learning. | School setting (classroom) |
      | Parent | The child's first educator. Supports the child through transitions, life challenges and developmental stages. | Family and social setting |
      | Shared roles | Reassure one another, validate information, share the child's lived experience and learn about effective strategies. | Global (co-responsibility) |

      --------------------------------------------------------------------------------

      4. Communication and Action Strategies

      Co-education shows itself in constant questioning directed at the positive impact on the child.

      The shared guiding intention: Before each intervention, adults should ask themselves: "What positive impact will my intervention have for the good of the child?"

      Problem solving: When difficulties arise (harmful behaviours or learning delays), the recommended approach is to ask: "How could we work together to meet the child's needs?"

      Including the child: It is recommended that the young person be included in this questioning, to make sure that the strategies developed truly meet his or her needs.

      --------------------------------------------------------------------------------

      5. Benefits and Signs of Success

      Successful co-education transforms the educational dynamic and produces tangible results:

      Increased engagement: Clear roles and a benevolent climate stimulate adults' motivation to get involved.

      Sense of personal efficacy: Repeated positive experiences strengthen parents' and teachers' belief in their ability to succeed in educating the young person.

      Accelerated progress: Concerted, continuous action between home and school multiplies the child's progress.

      Emotional management: The parties become better able to set aside emotional overload during communications and refocus on the pedagogical objective.

      --------------------------------------------------------------------------------

      6. Co-education in the Digital Age

      Digital technology acts as a lever to support the relationship between school and family.

      The Plan d'action numérique (Digital Action Plan)

      This plan offers a frame of reference inspired by best practices worldwide. It targets two central dimensions:

      1. Developing an ethical citizen in the digital age.

      2. Mobilizing young people's technological competencies.

      Concrete manifestations in the classroom

      Technological integration takes the form of new learning methods that place the child in creation mode:

      • Robotics, programming and coding workshops.

      • Use of virtual reality (e.g., for oral presentations).

      • Use of tablets for reading and other pedagogical contributions.

      This digital framework, supervised by educators, also serves to reassure parents about the technological guidance their children receive, thereby strengthening the bond of trust that co-education requires.

    1. Author response:

      [Note: The final version has been published in Brain, Behavior, and Immunity: https://doi.org/10.1016/j.bbi.2026.106473]

      eLife Assessment

      This useful study raises interesting questions but provides inadequate evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The findings are intriguing but they are correlative and hypothesis-generating with the strong possibility of residual confounding.

      We thank the editors and reviewers for characterizing our work as useful and for the opportunity to publish a Reviewed Preprint with a corresponding response. However, the statements in the Assessment characterizing the evidence as ‘inadequate’ and asserting a ‘strong possibility of residual confounding’ are factually incorrect as applied to our data and incompatible with the empirical findings presented in the manuscript. We have notified the editors of this factual inaccuracy. As the Assessment will be published as originally written, we provide clarification here to ensure an accurate scientific record for readers of the Reviewed Preprint.

      Our study shows that the association between atovaquone–proguanil (A/P) exposure and reduced dementia risk, first identified in a rigorously matched national cohort in Israel, is robustly reproduced across three independently constructed age-stratified cohorts in the U.S. TriNetX network (with exposure at ages 50–59, 60–69, and 70–79). In each cohort, individuals exposed to A/P were compared with rigorously matched individuals who received another medication at the same age and were then followed over a decade for incident dementia. Cases and controls were matched on all major established dementia risk factors: age, sex, race/ethnicity, diabetes, hypertension, obesity, and smoking status.

      Across all three strata, each containing more than 10,000 exposed individuals with an equal number of matched controls, we observed substantial and consistent reductions in cumulative dementia incidence (HR 0.34–0.51), extremely low P-values (10^-16 to 10^-40), and continuously widening divergence of Kaplan–Meier curves over the follow-up period. To more rigorously exclude the possibility of unmeasured baseline differences in health status, we additionally performed, for the purpose of this response, comparative analyses of key indicators of frailty and clinical utilization, including emergency and inpatient encounters, as well as the prevalence of mild cognitive impairment prior to medication exposure (values provided below in response to Reviewer #2, Weakness 1). These analyses show no pattern suggesting that exposed individuals were medically or cognitively healthier at baseline.

      Taken together, these findings constitute a rigorously matched and independently replicated association across two national health systems, using TriNetX, the most widely cited real-world evidence platform in published cohort studies. Replication across three age strata, each with >10,000 exposed individuals, followed for a decade, and matched on all major known risk factors for dementia, meets the accepted epidemiologic definition of strong and reproducible evidence.

      Although we disagree with elements of the editorial Assessment that appear inconsistent with the empirical findings, we will proceed with publication of the current manuscript as a Reviewed Preprint in order to ensure timely dissemination of findings with meaningful implications for public health and dementia prevention. In this initial public version, the point-by-point responses below provide concise explanations addressing the critiques underlying the Assessment. A revised manuscript, incorporating expanded baseline comparisons across each TriNetX age stratum, additional stringent exclusions, and an expanded discussion that will address the remarks presented in this review, will be submitted shortly.

      Reviewer #1 (Public review):

      Summary:

      This useful study provides incomplete evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The study reinforces findings that VZ vaccine lowers AD risk and suggests that this vaccine may be an effect modifier of A-P's protective effect. Strengths of the study include two extremely large cohorts, including a massive validation cohort in the US. Statistical analyses are sound, and the effect sizes are significant and meaningful. The CI curves are certainly impressive.

      Weaknesses include the inability to control for potentially important confounding variables. In my view, the findings are intriguing but remain correlative / hypothesis generating rather than causative. Significant mechanistic work needs to be done to link interventions which limit the impact of Toxoplasmosis and VZV reactivation on AD.

      We thank the reviewer for describing our study as useful and for highlighting several of its strengths, including the very large cohorts, sound statistical analyses, meaningful effect sizes, and the impressive CI curves. We also appreciate the reviewer’s recognition that our findings reinforce prior evidence linking VZV vaccination to reduced AD risk.

      Regarding the statement that the evidence remains incomplete due to “inability to control for potentially important confounding variables,” we refer to our introductory explanation above. As noted there, our analyses meet the accepted criteria for reproducible epidemiological evidence, and the assumption of uncontrolled confounding is contradicted by rigorous matching and by additional baseline evaluations. We fully agree that mechanistic work is warranted, and our epidemiologic findings strongly motivate such efforts.

      We address the reviewer’s specific comments in detail below.

      (1) Most of the individuals in the study received A-P for malaria prophylaxis as it is not first line for Toxo treatment. Many (probably most) of these individuals were likely to be Toxo negative (~15% seropositive in the US), thereby eliminating a potential benefit of the drug in most people in the cohort. Finally, A-P is not a first line treatment for Toxo because of lower efficacy.

      We agree that individuals in our cohort received Atovaquone-Proguanil (A-P) for malaria prophylaxis rather than for treatment of toxoplasmosis. However, this does not contradict our interpretation. Because latent CNS colonization by T. gondii is not currently considered clinically actionable, asymptomatic carriers are not offered treatment, and therefore would only receive an anti-Toxoplasma regimen unintentionally, through a medication prescribed for another indication such as malaria prophylaxis. Importantly, atovaquone is an established therapy for toxoplasmosis, including CNS disease, with documented efficacy and CNS penetration in current treatment guidelines. It is therefore reasonable to assume that, during the multi-week course typically administered for malaria prophylaxis, A-P would exert significant anti-Toxoplasma activity in individuals with latent CNS infection, potentially reducing or eliminating parasite burden even though the medication was not prescribed for that purpose.

      The reviewer notes that only ~15% of individuals in the U.S. are Toxoplasma-seropositive, based on surveys performed primarily in young adults of reproductive age (serologic testing is most commonly obtained in women during prenatal care). However, seropositivity increases cumulatively over the lifespan, and few reliable estimates exist for the age groups in which Alzheimer’s disease and dementia occur. Even if we accept the lower estimate of ~15% latent colonization in older adults, this proportion is still smaller than the lifetime cumulative incidence of dementia in the general population.

      Therefore, if latent toxoplasmosis contributes causally to dementia risk, and A-P is capable of eliminating latent Toxoplasma in the subset of individuals who harbor it, then a multi-week course of treatment—such as the one routinely taken for malaria prophylaxis—would be expected to produce a substantial reduction in dementia incidence at the population level, of the same order of magnitude reported here. A protective effect concentrated in a minority of exposed individuals is fully compatible with, and can mechanistically explain, the large overall reduction in risk that we observe.

      Finally, the reviewer notes that A-P is not a first-line treatment for toxoplasmosis due to assumed lower efficacy. This point does not undermine our results. Even a second-line agent, when administered over several weeks (as is routinely done for malaria prophylaxis), is expected to exert substantial anti-Toxoplasma activity. The long duration of exposure in large populations receiving A-P for travel provides a natural experiment that does not exist for other anti-Toxoplasma medications, which, when prescribed for their non-Toxoplasma indications, are not taken for more than a few days. Thus, the widespread use of A-P for malaria prophylaxis offers a unique opportunity to evaluate long-term outcomes following inadvertent anti-Toxoplasma treatment.

      Moreover, “first line” recommendations in clinical guidelines refer to treatment of acute toxoplasmosis in immunosuppressed individuals, where tachyzoites are actively replicating. These guidelines do not consider efficacy against latent CNS colonization, which is dominated by bradyzoites, a biologically distinct form, in immunocompetent individuals. Therefore, the guideline hierarchy is not informative regarding which medication is more effective at clearing latent brain infection, the stage we consider most relevant to dementia risk.

      (2) A-P exposure may be a marker of subtle demographic features not captured in the dataset such as wealth allowing for global travel and/or genetic predisposition to AD. This raises my suspicion of correlative rather than causal relationships between A-P exposure and AD reduction. The size of the cohort does not eliminate this issue, but rather narrows confidence intervals around potentially misleading odds ratios which have not been adjusted for the multitude of other variables driving incident AD.

      We agree that, prior to matching, A-P exposure may be associated with demographic features such as health or the ability to travel internationally. However, this does not apply after matching. In all age-stratified analyses, exposed and control individuals were rigorously matched on all major risk factors known to influence dementia risk, including age, sex, race/ethnicity, smoking status, hypertension, diabetes, and obesity. Owing to the extremely large pool of individuals in TriNetX (~120M), our matching was performed stringently, producing exposed and unexposed cohorts that are near-identical with respect to the established determinants of dementia risk.

      The reviewer correctly identifies that large cohorts alone do not eliminate confounding; however, confounding must still be biologically and epidemiologically plausible. Any hypothetical confounder capable of producing a 50–70% reduction in dementia incidence over a decade would need to: (1) produce a very large protective effect against dementia; (2) be strongly associated with A-P exposure; and (3) remain entirely uncorrelated with age, sex, race/ethnicity, smoking, diabetes, hypertension and obesity, which have been rigorously matched. No such factor has been proposed. The suggestion that an unspecified ‘subtle demographic feature’ could produce effects of this magnitude remains hypothetical, and no such factor has been described in the dementia risk literature.

      If a specific evidence-supported confounder is proposed that meets these criteria, we would be pleased to test it empirically in our cohorts. In the absence of such a proposal, the interpretation that the association is merely “correlative rather than causal” remains speculative and does not negate the strength of a replicated, rigorously matched, long-term association across large cohorts in two national health systems.

      (3) The relationship between herpes virus reactivation and Toxo reactivation seems speculative.

      We respectfully disagree with the characterization of the herpesvirus–Toxoplasma interaction as speculative. The mechanism we describe is biologically valid, based on established virology and parasitology literature showing that latent T. gondii infection can reactivate from its bradyzoite state under inflammatory or immune-modifying conditions, including viral triggers. A published clinical report has documented CNS co-reactivation of T. gondii and a herpesvirus, explicitly noting that HHV-6 reactivation can promote Toxoplasma reactivation in neural tissue (Chaupis et al., Int J Infect Dis, 2016).

      Moreover, this mechanism is the only currently evidence-supported explanation that simultaneously and parsimoniously accounts for all of the epidemiologic observations in our study:

      (1) Substantially higher cumulative incidence of dementia in individuals with positive Toxoplasma serology, indicating that latent infection is a risk factor for subsequent cognitive decline;

      (2) Strong protective association following A-P exposure, a medication with established activity against Toxoplasma gondii, including in the CNS;

      (3) Independent protection conferred by VZV vaccination, observed consistently for two vaccines with distinct formulations (one live attenuated, one recombinant protein), whose only shared property is suppression of VZV reactivation;

      (4) Greater protective effect of A-P among individuals who were not vaccinated against VZV, consistent with a model in which dementia risk requires both herpesvirus reactivation and persistent latent Toxoplasma infection—such that reducing either factor alone (via VZV vaccination or anti-Toxoplasma suppression) substantially lowers risk.

      Taken together, these observations are difficult to reconcile under any alternative hypothesis.  

      To date, we are unaware of any other biologically coherent mechanism that can explain all four findings simultaneously. We would welcome any alternative explanation capable of accounting for these converging epidemiologic signals, as such a proposal could meaningfully advance the scientific discussion. In the absence of a competing explanation, the interaction between latent toxoplasmosis and herpesvirus reactivation remains the most parsimonious hypothesis supported by current knowledge.

      Finally, while observational studies are inherently limited in their ability to provide causal inference, the mechanism we propose is biologically grounded and experimentally testable. Our results provide a strong rationale for mechanistic studies and clinical trials, and warrant publication precisely because they generate a verifiable hypothesis that can now be evaluated directly.

      (4) A direct effect of A-P on AD lesions independent of infection is not considered as a hypothesis. Given the limitations above and effects on metabolic pathways, it probably should be. The Toxo hypothesis would be more convincing if the authors could demonstrate an enhanced effect of the drug in Toxo positive individuals with no effect in Toxo negative individuals.

      A direct effect of A-P on established AD lesions is indeed possible, and this hypothesis would be of significant therapeutic interest. However, we did not consider it within the scope of our epidemiologic analyses because all cohorts explicitly excluded individuals with existing dementia. Under these conditions, proposing a disease-modifying effect on established Alzheimer’s lesions based on our data would itself be speculative. Such a mechanism would be better evaluated by mechanistic or interventional studies than by inference from populations without baseline disease.

      We also agree that demonstrating a stronger protective effect among Toxoplasma-positive individuals would be informative. Unfortunately, this “natural experiment” cannot be performed using the available data: Toxoplasma serology is rarely ordered in older adults, and A-P exposure is itself uncommon, resulting in a cohort overlap far too small to yield valid statistical inference (n≈25 in TriNetX).

      Thus, while both proposed hypotheses are scientifically attractive and merit further study, neither can be resolved using currently available real-world clinical data. Our findings provide the rationale to investigate both hypotheses experimentally, and we hope our report will motivate such studies.

      Reviewer #2 (Public review):

      Summary:

      This manuscript examines the association between atovaquone/proguanil use, zoster vaccination, toxoplasmosis serostatus and Alzheimer's Disease, using 2 databases of claims data. The manuscript is well written and concise. The major concerns about the manuscript center around the indications of atovaquone/proguanil use, which would not typically be active against toxoplasmosis at doses given, and the lack of control for potential confounders in the analysis.

      Strengths:

      (1) Use of 2 databases of claims data.

      (2) Unbiased review of medications associated with AD, which identified zoster vaccination associated with decreased risk of AD, replicating findings from other studies.

      We thank the reviewer for the thoughtful assessment and for noting key strengths of our work, including (1) the use of two large national databases, and (2) the unbiased discovery approach that replicated the widely reported association between zoster vaccination and reduced Alzheimer’s disease (AD) risk. We agree that these features highlight the validity and reproducibility of the analytic framework.

      Below we respond to the reviewer’s perceived weaknesses.

      Weaknesses:

      (1) Given that atovaquone/proguanil is likely to be given to a healthy population who is able to travel, concern that there are unmeasured confounders driving the association.

      We agree that, prior to matching, A-P exposure may correlate with demographic or health-related differences (e.g., ability to travel). However, this potential bias was explicitly controlled for in the study design. Across all three age-stratified TriNetX cohorts, exposed and unexposed individuals were rigorously matched on all major established dementia risk factors: age, sex, race/ethnicity, smoking status, obesity, diabetes mellitus, and hypertension. Comparative analyses confirm that these risk factors are equivalently distributed at baseline.

      As noted in our response to Reviewer #1, for any hypothetical unmeasured confounder to explain the results, it would need to satisfy three conditions simultaneously:

      (1) Be capable of producing a 50–70% reduction in dementia incidence sustained over a decade and across three distinct age strata (ages 50–79);

      (2) Be strongly associated with likelihood of receiving A-P;

      (3) Remain entirely uncorrelated with age, sex, race/ethnicity, smoking, diabetes, hypertension, or obesity, all of which were rigorously matched and balanced at baseline.

      No such factor has been proposed in the literature or by the reviewer. Thus, the concern remains hypothetical and unsupported by any measurable demographic or biological mechanism.
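
      One way to put a rough number on the first two conditions is the E-value of VanderWeele and Ding, a sensitivity measure that is not part of the analyses described in this response and is introduced here only as an illustration. It gives the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both the exposure and the outcome to fully explain away an observed estimate. The sketch below assumes that the reported hazard ratios approximate risk ratios (reasonable only when the outcome is rare); the function name `e_value` is hypothetical.

```python
import math


def e_value(ratio: float) -> float:
    """E-value for a point estimate on the risk-ratio scale.

    Protective estimates (ratio < 1) are inverted first, so the
    formula is always applied to a ratio >= 1.
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    rr = 1.0 / ratio if ratio < 1 else ratio
    return rr + math.sqrt(rr * (rr - 1.0))


if __name__ == "__main__":
    # Range of hazard ratios quoted in the response (0.34 to 0.51),
    # treated here as approximate risk ratios (an assumption).
    for hr in (0.34, 0.51):
        print(f"HR {hr:.2f} -> E-value {e_value(hr):.2f}")
```

      Under these assumptions the quoted range corresponds to E-values of roughly 5.3 (HR 0.34) and 3.3 (HR 0.51). The E-value concerns the point estimate only and does not say whether a confounder of that strength exists.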

      Importantly, empirical evidence contradicts the notion of a “healthy traveler” bias:

      Emergency and inpatient encounter rates prior to exposure were comparable between A-P users and controls. Across the three age-stratified cohorts, emergency visits were similar or slightly higher among A-P users (EMER: 19.6% vs 16.4%, 19.9% vs 14.2%, 22.0% vs 14.8%), and inpatient encounters were effectively equivalent (IMP: 14.8% vs 15.2%, 17.7% vs 17.6%, 22.1% vs 22.2%). These patterns directly contradict the suggestion that A-P users were a healthier or less medically burdened population at baseline.

      Prevalence of mild cognitive impairment was not lower among A-P users and was, in fact, slightly higher in the oldest cohort. Across the three age groups, baseline diagnoses of mild cognitive impairment (MCI) were comparable or slightly higher among exposed individuals (0.1% vs 0.1%, 0.3% vs 0.2%, 1.1% vs 0.6%). These data contradict the suggestion that A-P users had superior baseline cognition.
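
      Baseline comparisons of this kind are often summarized with the standardized mean difference (SMD) for a binary covariate, with an absolute value below about 0.1 conventionally read as good balance. The sketch below applies that formula to the percentages quoted above. It is an illustration only: the SMD is not reported in this response, the helper name `smd_proportions` is hypothetical, and the inputs are the rounded percentages as written, so the results are approximate.

```python
import math


def smd_proportions(p_exposed: float, p_control: float) -> float:
    """Standardized mean difference between two proportions."""
    pooled_var = (p_exposed * (1 - p_exposed) + p_control * (1 - p_control)) / 2
    if pooled_var == 0:
        return 0.0  # both proportions are 0 or 1: no variation to scale by
    return (p_exposed - p_control) / math.sqrt(pooled_var)


# (exposed, control) proportions per age stratum, copied from the text above.
BASELINE = {
    "EMER": [(0.196, 0.164), (0.199, 0.142), (0.220, 0.148)],
    "IMP": [(0.148, 0.152), (0.177, 0.176), (0.221, 0.222)],
    "MCI": [(0.001, 0.001), (0.003, 0.002), (0.011, 0.006)],
}

if __name__ == "__main__":
    for indicator, strata in BASELINE.items():
        values = ", ".join(f"{smd_proportions(e, c):+.3f}" for e, c in strata)
        print(f"{indicator}: {values}")
```

      A positive value means the indicator was more frequent among A-P users than among controls.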

      The strongest protective association occurred in the youngest stratum (age 50–59; HR 0.34). At this age, when nearly all individuals are sufficiently healthy to travel internationally, A-P use is least likely to act as a proxy for good health. A frailty-based “healthy traveler” hypothesis would instead predict the opposite pattern, with older adults showing the greatest apparent benefit, since health limitations are more likely to restrict travel in later life. In contrast, the protective association weakens with increasing age, empirically contradicting any explanation based on differential travel capacity.

      In conclusion, the empirical evidence directly contradicts the existence of a ‘healthy traveler’ effect.

      (2) The dose of atovaquone in atovaquone/proguanil is unlikely to be adequate for suppression of toxo (much less for treatment/elimination of toxo), raising questions about the mechanism.

      A few important points should address the reviewer’s concern:

      In our cohorts, A-P was prescribed for malaria prophylaxis, as correctly noted. In this setting, it is taken for the entire duration of travel, plus several days before and after, typically resulting in many weeks of continuous exposure. This creates an unintentional but scientifically valuable natural experiment, in which a CNS-penetrating anti-Toxoplasma agent is administered for long durations.

      Atovaquone is an established treatment for CNS toxoplasmosis, has strong CNS penetration, and is included in current clinical guidelines for acute toxoplasmosis in immunocompromised patients, although at higher doses. Because latent, asymptomatic CNS colonization is not treated in clinical practice, there are currently no data establishing the dose required to eliminate bradyzoite-stage Toxoplasma in immunocompetent individuals.

      Our observations concern atovaquone–proguanil (A-P), a fixed-dose combination of atovaquone with proguanil, a DHFR inhibitor targeting a key metabolic pathway shared by malaria parasites and T. gondii. The combination has well-established synergistic effects in malaria prophylaxis and the same mechanism would be expected to enhance anti-Toxoplasma activity. This fixed-dose regimen has never been formally evaluated for toxoplasmosis treatment at prolonged durations or against latent bradyzoite infection.

      Our hypothesis does not require or imply complete eradication of Toxoplasma. A clinically meaningful reduction in latent cyst burden among the subset of colonized individuals may be sufficient to alter long-term disease trajectories. Thus, a population-level decrease in dementia incidence does not require universal clearance of infection, but only partial suppression or reduction of parasite load in susceptible individuals, which is entirely compatible with the known pharmacology and duration of A-P exposure.

      (3) Unmeasured bias in the small number of people who had toxoplasma serology in the TriNetX cohort.

      The relatively small number of older adults with Toxoplasma serology stems from current clinical practice: serologic testing is mostly performed in women during reproductive years due to risks in pregnancy, whereas in older adults a positive result has no clinical consequence and therefore testing is rarely ordered.

      Importantly, the seropositive and seronegative groups were drawn from the same underlying population of individuals who underwent serology testing, and the only difference between groups is the test result itself. Because the decision to order a test is made prior to and independent of the result, there is no plausible rationale by which the serology outcome (positive or negative) would introduce a bias favoring either group beyond the result of the test itself.

      Furthermore, the two groups were also rigorously matched on all major dementia risk factors, including age, sex, race/ethnicity, smoking, diabetes, hypertension, and BMI, and these characteristics are similarly distributed between groups. A small sample size does not imply bias; it simply reduces statistical power. Despite this limitation, the observed association (HR = 2.43, p = 0.001) remains strongly significant.

      Finally, this result is consistent with multiple published studies reporting higher rates of Toxoplasma seropositivity among individuals with Alzheimer’s disease, dementia, and even mild cognitive impairment, such that our finding reinforces a broader and independently observed epidemiologic pattern. Importantly, in our cohort the serology testing clearly preceded dementia diagnosis, which supports the plausibility of a causal rather than merely correlative relationship between latent toxoplasmosis and cognitive decline.

      To conclude our provisional response, we thank the editor and reviewers for raising points that will be further addressed and expanded upon in the discussion of the forthcoming revision. We welcome transparent scientific dialogue and acknowledge that, as with all observational research, residual confounding cannot be eliminated with absolute certainty. However, we disagree with the overall Assessment and emphasize that our findings, reproduced independently across two national health systems and three age-stratified cohorts, each rigorously matched on all major determinants of dementia risk, meet, and in many respects exceed, current standards for high-quality observational evidence.

      Assigning the results to “residual confounding” requires more than speculation: it requires identification of a confounding factor that is (1) anchored in established dementia risk literature, (2) empirically plausible, and (3) quantitatively capable of generating a sustained ~50 percent reduction in dementia incidence over a decade. No such factor has been identified to date. We note that the assertion of “residual confounding” has not been supported by a specific, quantitatively plausible mechanism. A hypothetical bias that is both extremely large in effect and uncorrelated with all major risk factors is not statistically or biologically credible.

      The explanation we propose, reduction in dementia risk through elimination of latent Toxoplasma gondii, is biologically grounded, directly supported by independent epidemiologic literature, and uniquely capable of accounting for all convergent observations in our data. No alternative hypothesis has been put forward that can plausibly explain these findings.

      A revised version of the manuscript will be submitted shortly, incorporating expanded baseline analyses, with the strictest possible exclusion criteria (including congenital, vascular, chromosomal, and neurodegenerative disorders such as Parkinson’s disease), and complete tabulated comparisons. These data will further reinforce that the observed protective associations are not attributable to any measurable confounding. We also plan to enhance the discussion in order to address the points raised by the reviewers.

      In light of the expanded analyses, any reservations expressed in the initial Assessment can now be re-evaluated on the basis of the empirical evidence. The findings reported in our study meet, and in several respects exceed, current epidemiologic standards for high-quality observational research, clearly warrant publication, and provide a robust scientific foundation for future mechanistic and interventional studies to determine whether elimination of latent toxoplasmosis can prevent or treat dementia.

    2. Reviewer #1 (Public review):

      Summary:

      This useful study provides incomplete evidence of an association between atovaquone-proguanil use (as well as toxoplasmosis seropositivity) and reduced Alzheimer's dementia risk. The study reinforces findings that VZ vaccine lowers AD risk and suggests that this vaccine may be an effect modifier of A-P's protective effect. Strengths of the study include two extremely large cohorts, including a massive validation cohort in the US. Statistical analyses are sound, and the effect sizes are significant and meaningful. The CI curves are certainly impressive.

      Weaknesses include the inability to control for potentially important confounding variables. In my view, the findings are intriguing but remain correlative / hypothesis generating rather than causative. Significant mechanistic work needs to be done to link interventions which limit the impact of Toxoplasmosis and VZV reactivation on AD.

      Weaknesses:

      Major:

      (1) Most of the individuals in the study received A-P for malaria prophylaxis as it is not first line for Toxo treatment. Many (probably most) of these individuals were likely to be Toxo negative (~15% seropositive in the US), thereby eliminating a potential benefit of the drug in most people in the cohort. Finally, A-P is not a first line treatment for Toxo because of lower efficacy.

      (2) A-P exposure may be a marker of subtle demographic features not captured in the dataset such as wealth allowing for global travel and/or genetic predisposition to AD. This raises my suspicion of correlative rather than causal relationships between A-P exposure and AD reduction. The size of the cohort does not eliminate this issue, but rather narrows confidence intervals around potentially misleading odds ratios which have not been adjusted for the multitude of other variables driving incident AD.

      (3) The relationship between herpes virus reactivation and Toxo reactivation seems speculative.

      (4) A direct effect of A-P on AD lesions independent of infection is not considered as a hypothesis. Given the limitations above and effects on metabolic pathways, it probably should be. The Toxo hypothesis would be more convincing if the authors could demonstrate an enhanced effect of the drug in Toxo positive individuals with no effect in Toxo negative individuals.

      Minor:

      (5) "Clinically meaningful" should be eliminated from the discussion given that this is correlative evidence.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      The strength of the current study lies in their establishing the molecular mechanism through which PRMT1 could alter craniofacial development through regulation of the transcriptome, but the data presented to support the claim that a PRMT1-SFPQ axis directly regulates intron retention of the relevant gene networks should be robust and with multiple forms of clear validation. For example, elevated intron retention findings are based on the intron retention index, and according to the manuscript, are assessed considering the relative expression of exons and introns from a given transcript. However, delineating between intron retention and other forms of alternative splicing (i.e., cryptic splice site recognition) requires a more comprehensive consideration of the intron splicing defects that could be represented in data. A certain threshold of intron read coverage (i.e., the percent of an intron that is covered by mapped reads) is needed to ascertain if those that are proximal to exons could represent alternative intron ends rather than full intron retention events. In other words, intron retention is a type of alternative splicing that can be difficult to analyze in isolation given the confounding influence of cryptic splicing and cryptic exon inclusion. If other forms of alternative splicing were assessed and not detected, more confident retention calls can be made.

      This manuscript is a mechanistic exploration that follows previous work we published on the role of Prmt1 in craniofacial development, in which genetic deletion of Prmt1 in CNCCs leads to cleft palate and mandibular hypoplasia (PMID: 29986157).

      As the reviewer pointed out, a certain threshold of intron read coverage is needed to assess intron retention events. We employed IRTools to assess the collective changes in intron retention between cell states associated with a given biological function or pathway. IRTools accounts for intron read coverage by checking the evenness of the read distribution within an intron. Specifically, every constitutive intronic region (CIR) is divided into 10 equally sized bins and the proportion of reads that map to each bin is calculated. CIRs are then ranked according to the imbalance of their bin-wise read distribution, represented by the proportion of reads in the most populated bin. Those in the top 1% are considered to contain potentially false IR events and are excluded. We further addressed this question by developing another measure of intron retention, the intron retention coefficient (IRC), which assesses IR events using junction reads (Supplemental Figure-S8). Junction reads that straddle two exons are called exon-exon junction reads (spliced reads), and those that straddle an exon and a neighboring intron are called exon-intron junction reads (retained reads). The IRC of an intron is defined as the fraction of junction reads that are exon-intron junction reads: IRC = exon-intron read-count / (exon-exon read-count + exon-intron read-count), where exon-intron read-count = (5’ exon-intron read-count + 3’ exon-intron read-count) / 2. The IRC of a gene is defined as the exon-intron fraction of all junction reads overlapping or spanning the constitutive introns of this gene. In the calculation of the IRC, only exon-intron junction reads that cover the junction point and overlap each side by at least 8 bp were counted, and only exon-exon junction reads that span the relevant junction points and overlap each of the respective exons by at least 8 bp were counted. In this process, the evenness of the proportions of exon-intron junction reads that are 5’ or 3’ exon-intron junction reads is taken into account. As shown in Supplemental Figure S7A and S7B, IRC analysis generated results consistent with those obtained using IRI (Figure 3A and 3I).
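
      The two quantities defined above can be sketched in a few lines of Python. This is a minimal illustration written from the definitions in this response, not the IRTools implementation: the names `JunctionCounts`, `passes_overhang`, `intron_retention_coefficient` and `max_bin_fraction` are hypothetical, and read counts are assumed to have been extracted from the alignments beforehand.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

MIN_OVERHANG_BP = 8  # minimum overlap on each side of a junction, as stated above


@dataclass
class JunctionCounts:
    exon_exon: int       # spliced reads joining the two flanking exons
    exon_intron_5p: int  # reads crossing the 5' exon-intron boundary
    exon_intron_3p: int  # reads crossing the 3' exon-intron boundary


def passes_overhang(left_bp: int, right_bp: int, min_bp: int = MIN_OVERHANG_BP) -> bool:
    """A junction read is counted only if it overlaps both sides by min_bp or more."""
    return left_bp >= min_bp and right_bp >= min_bp


def intron_retention_coefficient(c: JunctionCounts) -> Optional[float]:
    """IRC = exon-intron / (exon-exon + exon-intron).

    The exon-intron count is the mean of the 5' and 3' boundary counts.
    Returns None when the intron has no usable junction reads.
    """
    retained = (c.exon_intron_5p + c.exon_intron_3p) / 2
    total = c.exon_exon + retained
    return retained / total if total > 0 else None


def max_bin_fraction(bin_counts: Sequence[int]) -> Optional[float]:
    """Imbalance score for a CIR split into equal bins (10 in the text):
    the share of reads that fall in the most populated bin."""
    total = sum(bin_counts)
    return max(bin_counts) / total if total > 0 else None


if __name__ == "__main__":
    counts = JunctionCounts(exon_exon=90, exon_intron_5p=12, exon_intron_3p=8)
    print(intron_retention_coefficient(counts))  # 0.1
    print(max_bin_fraction([10] * 10))           # 0.1, perfectly even coverage
```

      With perfectly even coverage across 10 bins the imbalance score is 0.1; values approaching 1 indicate reads piled up in a single bin, which is the pattern the top-1% exclusion is meant to remove.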

      In addition, as the reviewer pointed out, intron retention can be difficult to analyze in isolation. We followed the reviewer’s suggestion that “If other forms of alternative splicing were assessed and not detected, more confident retention calls can be made” and used rMATS to analyze other forms of alternative splicing for all ECM and GAG genes with a significant IRI increase (genes highlighted in Figure-3A and 3I) (Supplemental Figure-S9). Among these genes, only 5 (Cthcr1, Mmp23, Adamts10, Ccdc80 and Col25a1) showed statistically significant changes in skipped exons, 1 (Bmp7) showed significant changes in mutually exclusive exons, and none showed significant changes in alternative 5’ or 3’ splicing. The skipped exon (SE) and mutually exclusive exon (MXE) changes detected were marginal (Supplemental Figure S8), while the majority of matrix genes with significant intron retention did not exhibit other forms of alternative splicing, further supporting the confidence of the intron retention calls.

      While data presented to support the PRMT1-SFPQ activation axis is quite compelling, that this is directly responsible for the elevated intron retention remains enigmatic. First, in characterizing their PRMT1 knockout model, it is unclear whether the elevated intron retention events directly correspond to downregulated genes.

      In the revised manuscript, we demonstrate IR-triggered NMD as a mechanism for transcript decay and downregulation of matrix genes. When IR-triggered NMD was blocked by the chemical inhibitor NMDI14, the intron-retaining transcripts showed significant accumulation (new Figure-4). NMD is an RNA surveillance system that degrades aberrant RNAs. Intron retention-triggered NMD in cancer has both promotive and suppressive roles, and NMD inhibitors have been tested for cancer therapy, including immunotherapy. During embryonic development, the functional significance of the NMD machinery is suggested by human genetic findings and mouse genetic models. NMD is driven by a protein complex composed of SMG and UPF proteins. Smg6, Upf1, Upf2 and Upf3a knockout mice die at early embryonic stages (E5.5-E9.5), and Smg1 gene trap mutant mice die at E12.5 (PMID: 29272451). SMG9 mutation in human patients causes malformation in the face, hand, heart and brain (PMID: 27018474).

      We show that in CNCCs NMD functions both as a physiological mechanism and as a response invoked by molecular insult. Blocking NMD in CNCCs caused significant accumulation of intron-retaining Adamts2, Alpl, Eln, Matn2, Loxl1 and Bgn transcripts, suggesting a basal role for NMD in degrading intron-retaining transcripts (Figure-4Ba-4Bf). We further demonstrated the accumulation of Adamts2 and Fbln5 using semi-quantitative PCR with the detection of a longer product from Adamts2 intron 19 and Fbln5 intron 7 (Figure-4Ca-4Ch). In CNCCs and ST2 cells, NMD is further invoked by Prmt1 and Sfpq deficiency. In Prmt1-deficient CNCCs, NMD blockage led to higher accumulation of intron-retaining Adamts2 and Alpl transcripts, suggesting that Prmt1 deficiency triggers NMD to reduce intron-containing transcripts (Figure-4Aa, 4Ab). In Sfpq-depleted ST2 cells, blocking NMD caused accumulation of the intron-retaining transcripts Col4a2, St6galnac3 and Ptk7 (Figure-9B, 9C).

      Moreover, intron splicing is a well-documented node for gene regulation during embryogenesis and in other proliferation models, and craniofacial defects are known to be associated with 'spliceosomopathies'. However, reproduction of this phenotype does not suggest that the targets of interest are inherently splicing factors, and a more robust assessment is needed to determine the exact nature of alternative splicing in this system. Because there are several known splicing factors downstream of PRMT1 and presented in the supplemental data, the specific attribution of retention to SFPQ would be additionally served by separating its splicing footprint from that of other factors that are primed to cause alternative splicing.

      We have previously shown that a group of splicing factors depends on Prmt1 for arginine methylation, including SFPQ (PMID: 31451547). We tested additional splicing factors that are highly expressed in CNCCs and depend on PRMT1 for arginine methylation: SRSF1, EWSR1, TAF15, TRA2B and G3BP1 (Figure-5, 6 and 10). Among these factors, EWSR1 and TRA2B are both methylated in CNCCs and depend on PRMT1 for methylation (Fig. 5 and Supplemental Figure-S3B, S3C). We were not able to assess TAF15 methylation because of the lack of an efficient antibody for the PLA assay. We also demonstrated that their protein expression and subcellular localization were not altered by Prmt1 deletion in CNCCs, unlike SFPQ (Supplemental Figure-S4). To define their splicing footprint, we performed siRNA-mediated knockdown in ST2 cells, followed by RNA-seq and IRI analysis to define differentially regulated genes and introns, which revealed distinct biological pathways regulated by SFPQ, EWSR1, TRA2B and TAF15, but minimal roles of EWSR1, TRA2B and TAF15 in intron retention when compared to SFPQ (Fig. 10F-10S, Supplemental Figure S7A-S7F, Supplemental Tables S4-S6). ECM genes are significantly downregulated by all four splicing factors (Fig. 10F-10I), but EWSR1, TRA2B and TAF15 function through IR-independent mechanisms, such as exon skipping, as exemplified by Postn (Fig. 10J-10S).

      Clarifying the relationship between SFPQ and splicing regulation is important given that the observed splicing defects are incongruous with published data presented by Takeuchi et al., (2018) regarding SFPQ control of neuronal apoptosis in mice. In this system, SFPQ was more specifically attributed to the regulation of transcription elongation over long introns and its knockout did not result in significant splicing changes. Thus, to establish the specificity for the SFPQ in regulating these retention events, authors would need to show that the same phenotype is not achieved by mis-regulation of other splicing factors. That the authors chose SFPQ based on its binding profile is understandable but potentially confounding given its mechanism of action in transcription of long introns (Takeuchi 2018). Because mechanisms and rates of transcription can influence splicing and exon definition interactions, the role of SFPQ as a transcription elongation factor versus a splicing factor is inadequately disentangled by authors.

      To test whether SFPQ acts as a transcription elongation factor, we performed Pol II Cut&Tag in ST2 cells and demonstrated that depletion of SFPQ only caused marginal changes in either the promoter region or gene body of ECM genes, suggesting that the role of SFPQ as a transcriptional activator or elongation factor is minimal (Fig. 7G, 7H). This finding is distinct from SFPQ function in neurons (PMID: 29719248), suggesting that the activation or recruitment of SFPQ in transcriptional regulation may involve tissue-specific factors in neurons.

      Reviewer #2 (Public review):

      Summary:

      The manuscript by Lima et al examines the role of Prmt1 and SFPQ in craniofacial development. Specifically, the authors test the idea that Prmt1 directly methylates specific proteins that results in intron retention in matrix proteins. The protein SFPQ is methylated by Prmt1 and functions downstream to mediate Prmt1 activity. The genes with retained introns activate the NMD pathway to reduce the RNA levels. This paper describes an interesting mechanism for the regulation of RNA levels during development.

      Strengths:

      The phenotypes support what the authors claim that Prmt1 is involved in craniofacial development and splicing. The use of state-of-the-art sequencing to determine the specific genes that have intron retention and changes in gene expression is a strength.

      Weaknesses:

      Some of the data seems to contradict the conclusions. And it is unclear how direct the relationships are between Prmt1 and SFPQ.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      First, the claims regarding the effect of PRMT1 loss on splicing are unclear from the section title. In other words, does loss of PRMT1 change the incidence of baseline alternative splicing events, or does it introduce new retention events that are responsible for underwriting the craniofacial phenotype? Consistent with this idea, the narrative could benefit from more cellular and/or histological validations of the transcriptomic defects discovered in the RNAseq, which could help contextualize the bioinformatics data with the developmental defects. Moreover, the conclusions drawn about intron retention could be clarified in terms of how applicable the mechanism is likely to be outside of this tissue-specific set of responsive introns.

      Loss of Prmt1 did not cause a global shift in intron retention, as shown in Supplemental Figure S2. Instead, Prmt1 deletion caused an increase in intron retention specifically in genes enriched in cartilage development, glycosaminoglycan biology, dendrite and axon, and a decrease in intron retention in mitochondria and metabolism genes (Table S1). We also tested matrix protein expression by histology to confirm that the transcriptomic defects revealed at the RNA level resulted in lower protein production. The new data are in Figure 3E-3H.

      Additionally, invoking NMD to align splicing and differential gene expression data is understandable but lacks sufficient controls to be conclusive, such as positive control genes to confirm inhibition of NMD.

      To validate the blockage of NMD, glutathione peroxidase 1 (Gpx1) intron 1, a well-documented substrate for NMD, was tested as a positive control (Fig 4Ac, 4Ad, 9B).

      Additionally, it should be clarified whether NMD is a basal mechanism for the regulation of these introns or whether it is an induced mechanism that is invoked by the molecular insult.

      In CNCCs, NMD functions both as a physiological mechanism and as a response invoked by molecular insult. Please refer to responses to Reviewer 1’s public review for detailed explanations.

      Further, authors present data downstream of two siRNAs for the same gene target, but it remains unclear how siRNAs for the same gene target produce different effects. It may be helpful for authors to clarify how many of the transcriptomic defects are shared versus unique between the siRNAs.

      To address this question, we used bioinformatic analysis of the whole genome data to assess the similarity in changes caused by the two SFPQ-targeting siRNAs. As shown in the new Fig. 7Ba & 7Bb, transcriptomic and intron changes are consistent between the two siRNAs, suggesting that genes targeted by the two siRNAs predominantly overlap. This overlap is illustrated by scatter plot analysis of RNAseq DEG and IRI data from each siRNA against SFPQ.
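
One way to put a number on the agreement between two siRNAs is a correlation coefficient over per-gene changes. The following is only a minimal sketch and not the analysis reported in the response: the function name `pearson_r` and the toy log2 fold-change values are hypothetical, and the choice of Pearson correlation is an assumption.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of per-gene changes."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical log2 fold changes for the same genes under each siRNA.
sirna_1 = [-1.2, -0.8, 0.1, 0.9, -2.0]
sirna_2 = [-1.0, -0.9, 0.0, 1.1, -1.7]
print(round(pearson_r(sirna_1, sirna_2), 3))
```

A coefficient close to 1 would be consistent with the two siRNAs affecting largely the same genes in the same direction; the same helper could be applied to per-intron IRI changes.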

      Finally, we stress the importance of presenting the full conceptual basis for SFPQ's potential role in splicing and gene expression. It is significant to note that SFPQ has been previously studied as a splicing factor and was instead determined to function in support of transcription elongation rather than in splicing. Thus, if the authors are confident that SFPQ manifests directly in splicing changes, they bear the burden of proof to show that neither its role in transcription nor another splicing factor is driving the splicing changes.

      We demonstrated that depletion of SFPQ only caused marginal changes in either the promoter region or gene body of ECM genes, suggesting that the role of SFPQ as a transcriptional activator or elongation factor is minimal (Fig. 7G, 7H). Please refer to responses to Reviewer 1’s public review for detailed explanations.

      Reviewer #2 (Recommendations for the authors):

      (1) It is not clear why the authors focused on intron retention targets vs the other possibilities. Skipped Exon is much higher in terms of the number of changes, please clarify. For the intron retention how is this quantified? The traces are nice, but it is hard to tell which part is retained at this magnification. Also, because the focus is on extracellular matrix (ECM) and NMD it would be nice to show some of those targets here. In the tbx1 trace, some are up and some are down. What does that mean for the gene expression?

      We initially investigated SE and found that genes with significant changes in Prmt1 CKO CNCCs fall into diverse functional pathways. Among them, a few genes are critical for skeletal formation, including Postn and Fn, and the function of their exon skipping has been documented. For example, the two exons that are skipped in Postn, Exon17 and 21, have been shown to regulate craniofacial skeleton shape and mandibular condyle hypertrophic zone thickness using transgenic mouse models (PMID: 36859617). As illustrated by Figure 10, the skipped exon of Postn is regulated by multiple splicing factors that may perform overlapping functions in vivo.

      Intron retention of each gene is quantified by the ratio of the overall read density of its constitutive intronic regions (CIRs) to the overall read density of its constitutive exonic regions (CERs) and defined as the intron retention index (IRI). In the first section of Response to Reviewer 1’s comments, we explained additional bioinformatic analysis that was performed to address reviewers’ questions, support the confidence of intron event calls and rule out the possibility of other alternative splicing mechanisms, such as by SE, MXE, A5SS or A3SS (Supplemental Figure S5, S6, Table S7).
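
The IRI definition above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the pipeline used in the study: "read density" is assumed here to mean reads per base, and the function name, parameters and counts are hypothetical.

```python
def intron_retention_index(cir_reads, cir_length, cer_reads, cer_length):
    """IRI = read density over constitutive intronic regions (CIRs)
    divided by read density over constitutive exonic regions (CERs).

    Density is taken as reads per base (an assumption for this sketch).
    """
    if cir_length <= 0 or cer_length <= 0:
        raise ValueError("region lengths must be positive")
    if cer_reads == 0:
        # No exonic signal: the ratio is undefined for this gene.
        return float("nan")
    cir_density = cir_reads / cir_length
    cer_density = cer_reads / cer_length
    return cir_density / cer_density


# Hypothetical gene: 50 reads over 10 kb of CIRs, 400 reads over 2 kb of CERs.
print(intron_retention_index(50, 10_000, 400, 2_000))
```

Because both densities are normalized by length, long introns do not inflate the index simply by being long; a higher IRI in the mutant than in the control would indicate increased retention for that gene.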

      (2) RNA-Sequencing of Prmt1 mutants nicely shows gene expression changes, including in ECM and GAG genes. While validation of the sequencing results is not necessarily required, it would be very interesting to show the expression in situ. In addition, the heat map shows both downregulated but also upregulated transcripts. This is expected since this protein regulates many genes. However, the volcano plot shows a significant number of genes upregulated. It would be interesting to show what the upregulated genes are. And what is the proposed mechanism for Prmt1 regulation of upregulated genes?

      Validation for the transcriptomic changes is shown in Fig. 3E-3H using immunostaining.

      As for upregulated genes in the Prmt1 mutant, top pathways include cytokine-mediated signaling pathway, signal transduction by p53 signaling pathway and cell morphogenesis (Figure 2E), which is consistent with our previous reports that Prmt1 deletion induces cytokine production in oral epithelium and leads to p53 accumulation in embryonic epicardium (PMID: 32521264, 29420098). Besides these pathways, Prmt1 deletion also caused upregulation of genes involved in adult behavior, postsynaptic organization and apoptotic process, which is consistent with findings from other labs on PRMT1 function in neuronal and cancer cells (PMID: 34619150, 33127433).

      (3) Specific transcripts involved in the ECM and GAG pathway were shown to have elevated intron retention. However, Figure 3D seems to show the opposite, with intronic expression decreased and exonic expression increased. This is very important to the final conclusion of the paper. In addition, is there a direct relationship between increased intron retention and downregulation of expression of these specific genes? It seems a bit correlational, as it could also be an indirect mechanism. One way to test this is to do in vitro translation with and without the specific intron to test if it results in lower expression.

      We apologize for the mislabeling in the previous version of Figure 3D, which is now corrected. We also tried to test the direct relationship between intron retention and downregulation of matrix genes such as Adamts2 using in vitro experiments. However, the introns of matrix genes with high retention tend to be long, many 10 to 50 kb in length, making it challenging to generate mini-gene constructs for molecular analysis. We used a different approach and demonstrated that inhibition of NMD with the chemical inhibitor NMDI14 caused dramatic accumulation of the Adamts2, Alpl, Eln, Matn2, Loxl1 and Bgn transcripts, suggesting that retained introns trigger NMD to regulate gene expression and that this mechanism acts at a physiological level in CNCCs (Fig. 4). We also blocked NMD in control and Prmt1 null CNCCs, where NMD blockage led to higher accumulation of Adamts2 and Alpl transcripts, suggesting that upon Prmt1 deficiency, NMD is further utilized to degrade intron-containing transcripts (Fig. 4). Similarly, in Sfpq-depleted ST2 cells, blocking NMD caused accumulation of the intron-retaining transcripts Col4a2, St6galnac3 and Ptk7 (Fig. 9A, 9B).

      (4) While Figure 4 nicely shows that the methylation of SFPQ is reduced in Prmt1 CKO cells, it is unclear at which residue this methylation occurs. Also, the overall expression of SFPQ is down, so it is possible that the methylation is indirect, i.e., Prmt1 regulates some other methyltransferase that regulates SFPQ. Or that because the overall level of SFPQ is down, there is no protein to methylate. How do the authors differentiate between these possibilities?

      Previously, arginine methylation of SFPQ was characterized using in vitro reactions and cell lines with biochemical assays by Snijders et al. in 2015 (PMID: 25605962). Among all PRMTs that catalyze asymmetric arginine dimethylation (ADMA), SFPQ is methylated by only PRMT1 and PRMT3, with PRMT1 showing higher efficiency and PRMT3 showing lower efficiency. However, PRMT3 is mainly cytosolic, and its expression in CNCCs is about 100-fold lower than that of PRMT1 (Fig. 1). Based on this knowledge, PRMT1 is the primary arginine methyltransferase for SFPQ, a nuclear protein in CNCCs. We and others have shown in a previous publication that SFPQ methylation on arginine 7 and 9 depends on PRMT1 (PMID: 31451547).

      To investigate SFPQ protein degradation in CNCCs, we used MG132 to block proteasomal degradation and observed a partial rescue of SFPQ protein levels in Prmt1 mutant embryos, suggesting that SFPQ is degraded through a proteasome-mediated mechanism. To address the relationship between SFPQ methylation and protein expression, we assessed arginine methylation of the SFPQ that accumulated after MG132 treatment. The accumulated SFPQ was not methylated, confirming the absence of methylation even when SFPQ protein expression is restored.

      Snijders et al. also showed that citrullination induced by PADI4 regulates SFPQ stability (Snijders 2015). We considered this possibility and assessed the expression levels of PADIs. In E13.5 and E15.5 CNCCs, PADI1-4 mRNA expression levels are very low (TPM<5), suggesting that PADIs may not regulate SFPQ stability in CNCCs. A detailed mechanism as to how PRMT1-mediated SFPQ methylation controls stability awaits further investigation.

      (5) For the Sfpq knockdown experiment, it seems that the two knockdowns are not similar in their gene targets, and the GO terms are different except for Wnt signaling. This makes the data difficult to interpret. The genes identified as having intron retention are different from the ones identified in the Prmt1 deletion and are not reduced as much. How does this fit in with the Prmt1 story? If Prmt1 works through Sfpq, one would expect the targets to be similar and more than 8% to be in common.

      To address the first concern, we used bioinformatic analysis of the whole genome data to assess the similarity in changes caused by the two SFPQ-targeting siRNAs. As shown in the new Fig. 7Ba & 7Bb, transcriptomic and intron changes are consistent between the two siRNAs, suggesting that genes targeted by the two siRNAs predominantly overlap. This overlap is illustrated by scatter plot analysis of RNAseq DEG and IRI data from each siRNA against SFPQ.

      We have previously identified a group of splicing factors that depends on PRMT1 for arginine methylation, including SFPQ (PMID: 31451547). In the new data in Figures 5, 6 and 10, we tested an additional five PRMT1-dependent splicing factors that are highly expressed in CNCCs: SRSF1, EWSR1, TAF15, TRA2B and G3BP1. Among these factors, SRSF1 and G3BP1 are predominantly expressed in the cytosol of NCCs at E13.5. As pre-mRNA splicing requires splicing activity in the nucleus, we excluded these two and focused on the other three proteins. EWSR1 and TRA2B are both methylated in CNCCs and depend on PRMT1 for methylation (Fig. 5). We were not able to assess TAF15 methylation because of the lack of an efficient antibody for the PLA assay. We also demonstrated that their protein expression and subcellular localization were not altered by Prmt1 deletion in CNCCs, unlike SFPQ (Fig. S2). To define their splicing footprint, we performed siRNA-mediated knockdown in ST2 cells, followed by RNA-seq and IRI analysis to define differentially regulated genes and introns, which revealed distinct biological pathways regulated by SFPQ, EWSR1, TRA2B and TAF15, but minimal roles of EWSR1, TRA2B and TAF15 in intron retention when compared to SFPQ (Fig. 10F-10I, Supplemental Figure S7A-S7F). ECM genes are significantly downregulated by all four splicing factors (Fig. 10J-10M), but EWSR1, TRA2B and TAF15 regulate transcription or exon skipping instead of IR, as exemplified by Alpl and Postn (Fig. 10N-10T).

      (6) The addition of an NMD mechanism is interesting but not surprising that when inhibiting the pathway broadly, there is an increase in gene expression in the mesoderm cell line. How specific is this to craniofacial development?

      NMD is driven by a protein complex composed of SMG and UPF proteins. We show in the revised manuscript that NMD is both a physiological mechanism in CNCCs and triggered by genetic disturbance (Fig. 4). These data are in line with human patient reports in which SMG9 mutation causes malformation in the face, hand, heart and brain (PMID: 27018474). Mouse genetic studies also demonstrated roles of NMD components during embryonic development. Smg6, Upf1, Upf2 and Upf3a knockout mice die at early embryonic stages (E5.5-E9.5), and Smg1 gene trap mutant mice die at E12.5 (Han 2018). Additionally, intron retention-triggered NMD in cancer has both promotive and suppressive roles, and NMD inhibitors have been tested for cancer therapy and, recently, cancer immunotherapy. Our findings highlight matrix genes as one of the key targets for NMD during craniofacial development.

      Minor:

      (1) The supplemental figures are difficult to understand. In the first upload there are many figures and tables, some excel files that are separate uploads and some not. Please upload as separate files so it is clear. And also put them in order that they are in the manuscript.

      (2) For the heat map in Figure 2B, it would be good to show all the genes or none at all. It seems a bit like cherry-picking to highlight only a few. And they are not labeled where they are located in the graph. Are these the top lines? If so, please label.

      (3) Gene names in Figure 3A are difficult to read. I would also not consider BMP7 an ECM gene.

      (4) A summary diagram of the interactions proposed will help to make this more understandable.

      The supplemental figures have been reorganized and uploaded as separate Word and Excel documents. For the heat map in Fig. 2B, we have removed the gene names. For Fig. 3A, only the most significantly changed genes are labeled as red dots with names. We did not label all the genes because of the large number of genes. For the new Figure 3B, we have replaced BMP7. A schematic summary has also been added to Supplemental Fig. S9 to illustrate the PRMT1-SFPQ pathway.

    1. Reviewer #1 (Public review):

      Summary

      The authors determine the phylogenetic relation of the roughly two dozen wtf elements of 21 S. pombe isolates and show that none of them in the original S. pombe are essential for robust mitotic growth. It would be interesting to test their meiotic function by simply crossing each deletion mutant with the parent and analyzing spores for non-Mendelian inheritance. If this has been reported already, that information should be added to the MS. If not, I suggest the authors do these simple experiments and add this information.

      Strengths:

      The most interesting data (Fig. 4) show that one recombinant (wtfC4) between wtf18 and wtf23 produces in mitotic growth a poison counteracted by its own antidote but not by the parental antidotes. Again, it would be interesting to test this recombinant in a more natural setting - meiosis between it and each of the parents.

      Weaknesses:

      Some minor rewriting is needed.

      Comments on Revision:

      (1) The parameter for "maximum growth rate" in Figure 2D needs to be defined and put on the graph.

      (2) On page 8, line 182, the authors should consider testing the hybrid wtf in meiosis using strain 975 of Leupold, which is h+, or another standard h+ strain. I don't think the antidote allele is needed; rather, it seems to me it would counter the lethality of the poison protein and should be omitted to test drive of the hybrid wtf. This is a simple experiment and would add considerably to the paper.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      The authors determine the phylogenetic relation of the roughly two dozen wtf elements of 21 S. pombe isolates and show that none of them in the original S. pombe are essential for robust mitotic growth. It would be interesting to test their meiotic function by simply crossing each deletion mutant with the parent and analyzing spores for non-Mendelian inheritance. If this has been reported already, that information should be added to the manuscript. If not, I suggest the authors do these simple experiments and add this information.

      Thanks for the great summary! All the wtf genes have been tested for meiotic drive phenotypes previously by Bravo Nunez et al. (2020; http://doi.org/10.1371/journal.pgen.1008350). The reference was cited in our original manuscript, and we added the details in the revised manuscript.  

      Strengths:

      The most interesting data (Figure 4) show that one recombinant (wtfC4) between wtf18 and wtf23 produces in mitotic growth a poison counteracted by its own antidote but not by the parental antidotes. Again, it would be interesting to test this recombinant in a more natural setting - meiosis between it and each of the parents.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that only carries the wtf23<sup>antidote</sup>. Unfortunately, despite tens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins or due to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Weaknesses:

      In the opinion of this reviewer, some minor rewriting is needed.

      We did the rewriting as this reviewer suggested.

      Reviewer #2 (Public review):

      Summary:

      This important study provides a mechanism that can explain the rapid diversification of poison-antidote pairs (wtf genes) in fission yeast: recombination between existing genes.

      Thanks!

      Strengths:

      The authors analyzed the diversity of wtf in S. pombe strains, and found pervasive copy number variations. They further detected signals of recurrent recombination in wtf genes. To address whether recombination can generate novel wtf genes, the authors performed artificial recombination between existing wtf genes, and showed that indeed a new wtf can be generated: the poison cannot be detoxified by the antidotes encoded by parental wtf genes but can be detoxified by its own antidote.

      Thanks for the great summary!

      Weaknesses:

      The study can benefit from demonstrating that the novel poison-antidote constructed by the authors can serve as a meiotic driver.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that only carries the wtf23<sup>antidote</sup>. Unfortunately, despite tens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins or due to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Reviewer #3 (Public review):

      Summary:

      In this manuscript, Wang and colleagues explore factors contributing to the diversification of wtf meiotic drivers. wtf genes are autonomous, single-gene poison-antidote meiotic drivers that encode both a spore-killing poison (short isoform) and an antidote to the poison (long isoform) through alternative transcriptional initiation. There are dozens of wtf drivers present in the genomes of various yeast species, yet the evolutionary forces driving their diversification remain largely unknown. This manuscript is written in a straightforward and effective manner, and the analyses and experiments are easy to follow and interpret. While I find the research question interesting and the experiments persuasive, they do not provide any deeper mechanistic understanding of this gene family.

      Thanks! Please see the following for our point-by-point response.

      Strengths:

      (1) The authors present a comprehensive compendium and analysis of the evolutionary relationships among wtf genes across 21 strains of S. pombe.

      (2) The authors found that a synthetic chimeric wtf gene, combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves like a meiotic driver that could only be rescued by the chimeric antidote but neither of the parental antidotes. This is a very interesting observation that could account for their inception and diversification.

      Thanks for the great summary!

      Weaknesses:

      (1) Deletion strains

      The authors separately deleted all 25 wtf genes in the S. pombe reference strain. Next, the authors performed a spot assay to evaluate the effect of wtf gene knockout on yeast growth. They report no difference from the WT and conclude that the wtf genes might be largely neutral to the fitness of their carriers in the asexual life cycle, at least under normal growth conditions.

      The authors could have conducted additional quantitative growth assays in yeast, such as growth curves or competition assays, which would have allowed them to detect subtle fitness effects that cannot be quantified with a spot assay. Furthermore, the authors do not rule out simpler explanations, such as genetic redundancy. This could have been addressed by crossing mutants of closely related paralogs or editing multiple wtf genes in the same genetic background.

      Another concern is the lack of detailed information about the 25 knockout strains used in the study. There is no information provided on how these strains were generated or, more importantly, validated. Many of these wtf genes have close paralogs and are flanked by repetitive regions, which could complicate the generation of such deletion strains. As currently presented, these results would be difficult to replicate in other labs due to insufficient methodological details.

      We generated growth curves for all 25 wtf deletion strains and provided the details of the wtf gene knockouts. However, with 25 wtf genes, there are too many combinations for editing two genes, and it is technically challenging to knock out multiple wtf genes together. Nevertheless, our results suggest that single wtf genes have little effect on host fitness under normal conditions.

      (2) Lack of controls

      The authors found that a synthetic chimeric wtf gene, constructed by combining exons 1-5 of wtf23 and exon 6 of wtf18, behaves as a meiotic driver that can be rescued only by its corresponding chimeric antidote, but not by either of the parental antidotes (Figure 4F). In contrast, three other chimeric wtf genes did not display this property (Figure 4C-E). No additional experiments were conducted to explain these differences, and basic control experiments, such as verifying the expression of the chimeric constructs, were not performed to rule out trivial explanations. This should be at the very least discussed. Also, it would have been better to test additional chimeras.

      We verified the expression of the chimeric genes. The last exon of wtf18 is too small (128bp) to do more meaningful chimeras.

      (3) Statistical analyses

      In line 130 the authors state that: "Given complex phylogenetic mixing observed among wtf genes (Figure 1E), we tested whether recombination occurred. We detected signals of recombination in the 25 wtf genes of the S. pombe reference genome (p = 0) and in the wtf genes of the 21 S. pombe strains (p = 0) using pairwise homoplasy index (HPI) test." Reporting a p-value of 0 is not appropriate. Exact P-values should be reported. 

      Due to software limitations, the PHI test reports p-values of 0.0 for extremely significant results. We have therefore reported them as <0.0001 in the revised manuscript.
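
The reporting convention described above (a floor instead of a literal zero) can be sketched as a small formatting helper. This is an illustration only: the function name `format_p_value` is hypothetical, and the default floor of 0.0001 is simply the threshold quoted in the response, not a property of the PHI test software.

```python
def format_p_value(p, floor=1e-4):
    """Format a p-value for reporting, never printing a literal zero.

    Values below `floor` are reported as "<floor" instead of "0".
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p < floor:
        return f"<{floor:g}"
    return f"{p:.4g}"


print(format_p_value(0.0))     # <0.0001
print(format_p_value(0.0321))  # 0.0321
```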

      Recommendations for the authors:

      Reviewing Editor Comments:

      Regarding the synthetic chimeric wtf gene constructed by combining exons of wtf23 and wtf18, the authors did not explicitly test whether it acts as a meiotic driver in the natural context of a cross. Instead, they examined this possibility only through transgenic overexpression experiments. Given that this is arguably the most important claim of the paper, it is critical that the authors perform, report, and discuss such an experiment in a natural context, regardless of the outcome. It is not necessary to test other recombinants or other wtf loci.

      Thanks for this insightful comment! As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since strain 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that carries only the wtf23<sup>antidote</sup>. Unfortunately, despite dozens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins, or to the genetic background. Similarly, the drive activity of wtf13 has been shown to be specifically suppressed in certain backgrounds.

      Reviewer #1 (Recommendations for the authors):

      The paper is very well written, but some minor points should be corrected or checked.

      (1) Line 95: Why "Putative"? Is it not clear what a wtf pseudogene is?

      “Putative” was removed.

      (2) Line 105: Does "known functional" mean they are active (i.e., have been tested and shown to be active)? If so, a reference should be added.

      We used “known meiotic drivers” and added a reference here.

      (3) Line 135: "no recombination signal was tested". Do the authors mean no signal was inferred? 

      We changed “tested” to “detected”.

      (4) Line 147: References for "known functional meiotic drivers (wtf23) and artificially generated meiotic driver (wtf18)" should be given. A statement of how wtf18 was "artificially generated" is essential so the reader knows how that element differs from the wtfC4 generated here.

      A reference for wtf23 has been added. As for wtf18, we specify this in the following text, namely “we artificially introduced an in-frame ATG codon right before the start of exon 2, generating wtf18poison/-0M.”

      (5) Lines 154 and 424 say an ATG codon was introduced "right before the start of exon 2," but Figure 4B shows it before exon 1.

      We thank the reviewer. The introduced ATG is the second start codon in the long transcript and the first in the short transcript. The right panel of Figure 4B shows the short transcript, so the text and figure are consistent.

      (6) Line 159: The wtf18 mutant with this additional ATG codon should be tested in meiosis, to see if "putative" is correct.

      Thanks. As with wtfC4, we encountered technical challenges in showing the driver phenotype in a natural setting, and thus removed this statement.

      (7) Line 181: change "driver" to "drive".

      Driver is correct.

      (8) Line 184: insert to read "wtf genes tested". Also, what is the basis for proposing that "the last exon might be crucial for antidote function"?

      “Tested” was added, and the statement was removed.

      (9) Line 198: change to read "detects only large differences".

      Done as suggested.

      (10) Line 204: change "removed" to "removal".

      Done as suggested.

      (11) Lines 242 and 243: Are "Splittree4" and "SplitsTree4" different, or is this a misprint?

      Corrected!

      (12) Lines 274-5 and 412 -3 would read better as "strains were diluted in five 10-fold steps” and “...μL of each dilution spotted on” “…to assay for…"

      Done as suggested.

      (13) Line 284 says "No new data were generated." This is clearly wrong. Perhaps the authors mean there are no supplementary data files.

      Corrected!

      (14) Line 406: Change "is" to "are".

      Corrected!

      (15) Line 413: Surely, they were spotted onto YE agar medium, not liquid medium.

      Corrected!

      (16) Figure 3C: Define "Rho" and the scale used.

      The definition of Rho has been added to the Methods section in the revised manuscript.

      Reviewer #2 (Recommendations for the authors):

      The evidence is largely solid, but the study can benefit from demonstrating that the novel poison-antidote constructed by the authors can serve as a meiotic driver.

      As suggested, we have tried to test this recombinant in a more natural setting. We created a recombinant strain (wtfC4) based on the laboratory strain 972h-. Specifically, we replaced the last exon of the original wtf23 gene with the last exon of wtf18. However, we encountered a challenge: since 972h- has only one mating type and cannot undergo meiosis on its own, we had to mate the recombinant strain with a BN0 h⁺ strain that carries the wtf23<sup>antidote</sup>. Unfortunately, despite dozens of attempts over nearly a year, we did not observe the expected meiotic driver phenotype. This might be due to issues with the proper splicing and expression of the potential poison and antidote proteins.

      Reviewer #3 (Recommendations for the authors):

      I strongly recommend the authors provide all the details concerning the generation of the knock-out strains, including specific primers used (for both the deletion and validation), the result of these validations, and the specific genotype (and ID) of the strains generated.

      These details are now included in the Materials and Methods section and in Supplementary.

      Please also provide exact P-values (see point 3).

      Due to software limitations, the PHI test reports p-values of 0.0 for extremely significant results. We have therefore reported them as <0.0001 in the revised manuscript.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #2 (Public review):

      In this valuable manuscript, Lin et al attempt to examine the role of long non coding RNAs (lncRNAs) in human evolution, through a set of population genetics and functional genomics analyses that leverage existing datasets and tools. Although the methods are incomplete and at times inadequate, the results nonetheless point towards a possible contribution of long non coding RNAs to shaping humans, and suggest clear directions for future, more rigorous study.

      Comments on revisions:

      I thank the authors for their revision and changes in response to previous rounds of comments. As before, I appreciate the changes made in response to my comments, and I think everyone is approaching this in the spirit of arriving at the best possible manuscript, but we still have some deep disagreements on the nature of the relevant statistical approach and defining adequate controls. I highlight a couple of places that I think are particularly relevant, but note that given the authors disagree with my interpretation, they should feel free to not respond!

      (1) On the subject of the 0.034 threshold, I had previously stated: "I do not agree with the rationale for this claim, and do not agree that it supports the cutoff of 0.034 used below."

      In their reply to me, the authors state:

      "What we need is a gene number, which (a) indicates genes that effectively differentiate humans from chimpanzees, (b) can be used to set a DBS sequence distance cutoff. Since this study is the first to systematically examine DBSs in humans and chimpanzees, we must estimate this gene number based on studies that identify differentially expressed genes in humans and chimpanzees. We choose Song et al. 2021 (Song et al. Genetic studies of human-chimpanzee divergence using stem cell fusions. PNAS 2021), which identified 5984 differentially expressed genes, including 4377 genes whose differential expression is due to trans-acting differences between humans and chimpanzees. To the best of our knowledge, this is the only published data on trans-acting differences between humans and chimpanzees, and most HS lncRNAs and their DBSs/targets have trans-acting relationships (see Supplementary Table 2). Based on these numbers, we chose a DBS sequence distance cutoff of 0.034, which corresponds to 4248 genes (the top 20%), slightly fewer than 4377."

      I have some notes here. First, Agoglia et al, Nature, 2021, also examined the nature of cis vs trans regulatory differences between human and chimps using a very similar set up to Song et al; their Supplementary Table 4 enables the discovery of genes with cis vs trans effects although admittedly this is less straightforward than the Song et al data. Second, I can't actually tell how the 4377 number is arrived at. From Song et al, "Of 4,671 genes with regulatory changes between human-only and chimpanzee-only iPSC lines, 44.4% (2,073 genes) were regulated primarily in cis, 31.4% (1,465 genes) were regulated primarily in trans, and the remaining 1,133 genes were regulated both in cis and in trans (Fig. 2C). This final category was further broken down into a cis+trans category (cis- and transregulatory changes acting in the same direction) and a cis-trans category (cis- and trans-regulatory changes acting in opposite directions)." Even when combining trans-only and cis&trans genes that gives 2,598 genes with evidence for some trans regulation. I cannot find 4,377 in the main text of the Song et al paper.
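
      The counts quoted from Song et al. can be checked with a few lines of arithmetic. This is only an illustrative sketch; the variable names are hypothetical, and the numbers are those quoted in the paragraph above.

```python
# Gene counts quoted above from Song et al. 2021.
total = 4671
cis_only = 2073
trans_only = 1465
cis_and_trans = 1133

# The three categories should partition the total.
assert cis_only + trans_only + cis_and_trans == total

# Genes with any evidence of trans regulation (trans-only plus cis&trans).
any_trans = trans_only + cis_and_trans

print(f"cis only:   {cis_only / total:.1%}")
print(f"trans only: {trans_only / total:.1%}")
print(f"any trans:  {any_trans} genes")
```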

      Elsewhere in their response, the authors respond to my comment that 0.034 is an arbitrary threshold by repeating the analyses using a cutoff of 0.035. I appreciate the sentiment here, but I would not expect this to make any great difference, given how similar those numbers are! A better approach, and what I had in mind when I mentioned this, would be to test multiple thresholds, ranging from, eg, 0.05 to 0.01 <DBS dist =0.01 -> 0.034 -> 0.05> at some well-defined step size.

      (1) We sincerely thank the reviewer for this critical point. Our initial premise, based on DBS distances from the human genome to the chimpanzee and archaic genomes, was that genes with large DBS distances may have contributed more to human evolution. However, our ORA (overrepresentation analysis) explored only genes with large DBS distances (the legend of old Figure 2 was “1256 target genes whose DBSs have the largest distances from modern humans to chimpanzees and Altai Neanderthals are enriched in different Biological Processes GO terms”), using the cutoff (threshold) of 0.034 to define a large distance. The cutoff is not totally unreasonable (as our new results and the following sensitivity analysis indicate), but this approach was indirect and flawed.

      (2) We have now performed ORA using two methods. The first uses only DBS distances. Instead of using a cutoff, we now sort genes by DBS distance (human-chimpanzee and human-Altai Neanderthal distances, respectively; see Supplementary Table 5) and use the top 25% and bottom 25% of genes to perform ORA. This directly examines whether DBS distances alone indicate that genes with large DBS distances contribute more to human evolution than genes with small DBS distances. The second also explores the ASE genes (allele-specific expression; genes undergoing human/chimpanzee-specific regulation in the tetraploid human–chimpanzee hybrid iPS cells) reported by Agoglia et al. 2021. We select the top 50% and bottom 50% of genes with large and small DBS distances, intersect them with ASE genes from Agoglia et al. 2021 (their Supplementary Table 4), and apply ORA to the intersections. Both analyses show that (a) more GO terms are obtained from genes with large DBS distances, and (b) more human evolution-related GO terms are obtained from genes with large DBS distances (Supplementary Table 5,6,7; Figure 2; Supplementary Fig. 15). These results directly suggest that genes with large DBS distances contribute more to human evolution than genes with small DBS distances, which is a key theme of the study.
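
      The rank-based selection described in this point can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the gene names and distances are invented toy values, `split_by_distance` is a hypothetical helper, and the downstream ORA step is not shown.

```python
def split_by_distance(distances: dict[str, float], fraction: float = 0.25):
    """Return (top, bottom) gene lists after ranking genes by DBS distance.

    `top` holds the `fraction` of genes with the largest distances and
    `bottom` the `fraction` with the smallest distances.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must lie in (0, 0.5]")
    ranked = sorted(distances, key=distances.get, reverse=True)
    k = int(len(ranked) * fraction)
    return ranked[:k], ranked[len(ranked) - k:]


# Toy distances for eight hypothetical genes (not real data).
toy = {
    "geneA": 0.051, "geneB": 0.002, "geneC": 0.034, "geneD": 0.010,
    "geneE": 0.047, "geneF": 0.000, "geneG": 0.021, "geneH": 0.005,
}

top25, bottom25 = split_by_distance(toy, fraction=0.25)
top50, bottom50 = split_by_distance(toy, fraction=0.50)
print(top25, bottom25)
```

      Each resulting gene list would then be passed to the enrichment tool of choice; ranking rather than thresholding avoids committing to a single distance cutoff.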

      (3) Regarding Song et al. 2021, the statement “we differentiated…allotetraploid (H1C1a, H1C1b, H2C2a, H2C2b) lines into ectoderm, mesoderm, and endoderm” led us to assume that their differentiated hybrid cell lines cover more tissue types than those of Agoglia et al. 2021. Now, upon re-examining Supplementary Table 5 of Song et al. and Supplementary Table 4 of Agoglia et al. 2021, we find that the latter more clearly indicates significant ASE genes (p-adj<0.01 and |LFC|>0.5 in GRCh38 and PanTro5).

      (4) We have also performed two additional analyses in response to the suggestion to “test multiple thresholds, ranging from, eg, 0.05 to 0.01 <DBS dist =0.01 -> 0.034 -> 0.05> at some well-defined step size”. First, we performed a multi-threshold sensitivity analysis using a spectrum of cutoffs (0.03, 0.034, 0.04, 0.05), and tracked the number of genes identified and the enrichment significance of key GO terms (e.g., "neuron projection development," "behavior") across these thresholds. The result confirms that while the absolute number of genes varies with the cutoffs, the core biological conclusion (specifically, the significant enrichment of target genes in neurodevelopmental and cognitive functions) remains stable and significant. For instance, "behavior" maintains strong statistical significance (FDR<0.01) in both the human-chimpanzee and human-Altai Neanderthal comparisons across all tested cutoffs, and "neuron projection development" also remains significant across three (0.03, 0.034, 0.04) of the four cutoffs in the Altai comparison. This pattern suggests that our core findings regarding neurodevelopmental functions are robust across a range of cutoffs. Nevertheless, we did not extend the analysis to smaller cutoffs (e.g., 0.01 or 0.02) because such values would identify an excessively large number of genes (>10000) for ORA, which would render the GO-term enrichment analysis less meaningful due to a loss of specificity.
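
      The gene-counting part of such a sensitivity analysis can be sketched as below. The distances are invented toy values, `genes_per_cutoff` is a hypothetical helper, and treating a gene as selected when its distance is greater than or equal to the cutoff is an assumption; the enrichment test itself is not shown.

```python
def genes_per_cutoff(distances: dict[str, float], cutoffs) -> dict[float, int]:
    """Count genes whose DBS distance is at least each cutoff."""
    return {c: sum(d >= c for d in distances.values()) for c in cutoffs}


# Toy distances for eight hypothetical genes (not real data).
toy = {
    "geneA": 0.051, "geneB": 0.002, "geneC": 0.034, "geneD": 0.010,
    "geneE": 0.047, "geneF": 0.000, "geneG": 0.021, "geneH": 0.005,
}

# The four cutoffs named in the reply above.
counts = genes_per_cutoff(toy, cutoffs=(0.03, 0.034, 0.04, 0.05))
for cutoff, n_genes in counts.items():
    print(f"cutoff {cutoff}: {n_genes} genes")
```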

      Second, we have performed an additional validation to directly evaluate whether the 0.034 cutoff itself represents a stringent and biologically meaningful value. We sought to empirically determine how often a DBS sequence distance of 0.034 or greater might occur by chance in promoter regions, thereby testing its significance as a marker of potential evolutionary divergence. We randomly sampled 10,000 windows from annotated promoter regions across the hg38 genome, each with a size matching the average length of DBSs (147 bp). We then calculated the per-base sequence distances for these random windows between modern humans and chimpanzees, as well as between modern humans and the three archaic humans (Altai, Denisovan, Vindija). The analysis reveals that a distance of ≥0.034 is a rare event in random promoter sequences: for Human-Chimp, Human-Altai, Human-Denisovan, and Human-Vindija, 5.49% (549/10000), 0.31% (31/10000), 4.47% (447/10000), and 0.03% (3/10000) of random windows reach this distance, respectively. This empirical evidence suggests that 0.034 is a sufficiently strong cutoff for defining a large DBS distance: such a distance is very unlikely to occur in a random genomic background (P<0.1 for chimpanzee and P<0.05 for the archaic humans), and DBSs exceeding this cutoff are significantly enriched for sequences that have undergone substantial evolutionary change rather than random neutral variation.
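
      The empirical fractions quoted in this paragraph follow directly from the reported window counts. The sketch below only redoes that arithmetic; the variable names are hypothetical, and the sampling of windows and the distance calculation are not reproduced.

```python
# Number of random promoter windows sampled, as stated above.
n_windows = 10_000

# Reported counts of windows with per-base distance >= 0.034.
counts = {
    "Human-Chimp": 549,
    "Human-Altai": 31,
    "Human-Denisovan": 447,
    "Human-Vindija": 3,
}

# Empirical tail fraction: share of random windows reaching the cutoff.
fractions = {pair: hits / n_windows for pair, hits in counts.items()}

for pair, frac in fractions.items():
    print(f"{pair}: {frac:.2%}")
```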

      (5) We present new Figure 2, Supplementary Table 5,6,7, and Supplementary Fig. 15. We have substantially revised section 2.3, related sections in Results, Supplementary Note 3, and Supplementary Table 8. We have removed related descriptions and explanations in the main text and Supplementary Notes. The results of the above two analyses are presented here as Author response table 1 and Author response image 1.

      Author response table 1.

      Sensitivity analysis of GO-term enrichment across different DBS sequence distance cutoffs. The table shows the numbers of target genes identified and the false discovery rates (FDR) for the enrichment of three selected GO terms at four different distance cutoffs. Note that, unlike in the old Figure 2, the results for chimpanzees and Altai Neanderthals are not directly comparable here, as the numbers of target genes used for the enrichment analysis differ between them at each cutoff.

      Author response image 1.

      Distribution of per-base sequence distances for DBS size-matched random genomic windows in Ensembl-annotated promoter regions, calculated between modern humans and (A) chimpanzee, (B) Altai Neanderthal, (C) Denisovan, and (D) Vindija Neanderthal genomes.

      (2) The authors have introduced a new TFBS section, as a control for their lncRNAs - this is welcome, though again I would ask for caution when interpreting results. For instance, in their reply to me the authors state: "The number of HS TFs and HS lncRNAs (5 vs 66) <HS TF vs all HS lncRNAs> alone lends strong evidence suggesting that HS lncRNAs have contributed more significantly to human evolution than HS TFs (note that 5 is the union of three intersections between <many2zero + one2zero> and the three <human TF list>)."

      But this assumes the denominator is the same! There are 35899 lncRNAs according to the current GENCODE build; 66/35899 = 0.0018, so, 0.18% of lncRNAs are HS. The authors compare this to 5 TFs. There are 19433 protein coding genes in the current GENCODE build, which naively (5/19433) gives a big depletion (0.026%) relative to the lnc number. However, this assumes all protein coding genes are TFs, which is not the case. A quick search suggests that ~2000 protein coding genes are TFs (see, eg, https://pubmed.ncbi.nlm.nih.gov/34755879/); which gives an enrichment (although I doubt it is a statistically significant one!) of HS TFs over HS lncRNAs (5/2000 = 0.0025). Hence my emphasis on needing to be sure the controls are robust and valid throughout!
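
      The denominator argument above can be made concrete with a short calculation. This is an illustrative sketch only: the counts are those quoted in the comment, the ~2000 TF figure is the reviewer's rough estimate, and `hypergeom_upper_tail` is a hypothetical helper that computes the hypergeometric upper tail, which is equivalent to a one-sided Fisher's exact test on the corresponding 2×2 table.

```python
from math import comb

hs_lnc, all_lnc = 66, 35899   # HS lncRNAs / all lncRNAs
hs_tf, all_tf = 5, 2000       # HS TFs / approximate number of TFs
all_protein_coding = 19433

frac_lnc = hs_lnc / all_lnc
frac_tf_naive = hs_tf / all_protein_coding  # wrong denominator
frac_tf = hs_tf / all_tf                    # TF-specific denominator


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when drawing n of N items without replacement, K of them marked."""
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / comb(N, n)


# Are TFs over-represented among the 71 HS genes (66 lncRNAs + 5 TFs)?
p_enrichment = hypergeom_upper_tail(
    k=hs_tf, K=all_tf, n=hs_lnc + hs_tf, N=all_lnc + all_tf
)

print(f"HS lncRNA fraction:      {frac_lnc:.4f}")
print(f"HS TF fraction (naive):  {frac_tf_naive:.5f}")
print(f"HS TF fraction (TFs):    {frac_tf:.4f}")
print(f"one-sided p (TF excess): {p_enrichment:.3f}")
```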

      We thank the reviewer for this comment. While 5 vs 66 reveals a difference, a direct comparison is too simplified. The real take-home message of the new TFBS section is not the numbers but the distributions of HS TFs’ targets and HS lncRNAs’ targets across GTEx organs and tissues (Figure 3 and Supplementary Figures 24, 25) - correlated HS lncRNA-target transcript pairs are highly enriched in brain regions, but correlated HS TF-target transcript pairs are distributed broadly across GTEx tissues and organs. We have now removed the simple comparison of “5 vs 66” and more carefully explained our comparison in section 2.6.

      (3) In my original review I said: line 187: "Notably, 97.81% of the 105141 strong DBSs have counterparts in chimpanzees, suggesting that these DBSs are similar to HARs in evolution and have undergone human-specific evolution." I do not see any support for the inference here. Identifying HARs and acceleration relies on a far more thorough methodology than what's being presented here. Even generously, pairwise comparison between two taxa only cannot polarise the direction of differences; inferring human-specific change requires outgroups beyond chimpanzee.

      In their reply to me, the authors state:

      Here, we actually made an analogy but not an inference; therefore, we used such words as "suggesting" and "similar" instead of using more confirmatory words. We have revised the latter half sentence, saying "raising the possibility that these sequences have evolved considerably during human evolution".

      Is the aim here to draw attention to the ~2.2% of DBS that do not have a counterpart? In that case, it would be better to rewrite the sentence to emphasise those, not the ones that are shared between the two species? I do appreciate the revised wording, though.

      (1) Our original phrasing may be misleading, and we agree entirely that “pairwise comparison between two taxa only cannot polarise the direction of differences; inferring human-specific change requires outgroups beyond chimpanzee”. As explained in that reply, we know and think that DBSs and HARs are two different classes of sequences, and indeed, identifying HARs and acceleration relies on a far more thorough methodology. Yet, three factors prompted us to compare them. First, both suggest the importance of sequences outside genes. Second, both are quite “old” sequences and have undergone considerable evolution recently (although the references are different). Third, both have contributed greatly to human brain evolution.  

      (2) Here, our emphasis is on the 97.81%, not the 2.2%, and we have made this analogy more clearly and cautiously. Relevant revisions have been made in the Results, Discussion, and Methods sections.

      (3) We have also determined whether the 2.2% of DBSs are human-specific gains by analyzing them using the UCSC Multiz Alignments of 100 Vertebrates. The result confirms that all 2248 DBSs are present in the human genome but are absent from the chimpanzee genome and all other aligned vertebrate genomes. We have added this result to the manuscript.

      (4) Finally, Line 408: "Ensembl-annotated transcripts (release 79)" Release 79 is dated to March 2015, which is quite a few releases and genome builds ago. Is this a typo? Both the human and the chimpanzee genome have been significantly improved since then!

      (1) We thank the reviewer for this comment, which prompts us to provide further explanation and additional data. First, we began predicting HS lncRNAs’ DBSs when Ensembl release 79 was available, but did not re-predict DBSs when new Ensembl releases were published because (a) these new Ensembl releases are also based on hg38, (b) we did not find any fault in the LongTarget program during our use, nor did we receive any fault reports from users, and (c) predicting lncRNAs’ DBSs using the LongTarget program is highly time-consuming.

      (2) Second, to assess the influence of newer Ensembl releases, we compared the promoters annotated in release 79 and in release 115. We found that the vast majority (87.3%) of promoters newly annotated in release 115 belong to non-coding genes. Thus, using release 115 may predict more DBSs in non-coding genes, but downstream analyses based on protein-coding genes would be essentially the same (meaning that all figures and tables would be the same).

      (3) Third, a key element of this study is GTEx data analysis, and these data were also published years ago.  

      (4) Finally, some lncRNA genes have new gene symbols in new Ensembl releases. To allow researchers to use our data conveniently, we have added a new column titled "Gene symbol (Ensembl release115)" to Supplementary Tables 2A and 2B.  

      Summary:

      Major changes based on Reviewer’s comments:

      (1) The following revisions are made to address the comment on “the 0.034 threshold”: (a) Section 2.3, section 2.4, Supplementary Note 3, and related contents in Discussion and Methods are revised, (b) new Figure 2, Supplementary Figure 15, new Supplementary Table 5,6,7, (c) Table 2 and Supplementary Table 8 are revised.

      (2) To address the comment on “new TFBS section”, section 2.6 and section 4.13 are revised.  

      (3) To address the comment on “97.81% and 2.2% of DBSs”, section 2.3 is revised.

      (4) The following revisions are made to address the comment on “release 79”: (a) the old Supplementary Table 2, 3 are merged into Supplementary Table 2AB, and the new column "Gene symbol (Ensembl release115)" is added to Supplementary Table 2AB, (b) accordingly, Supplementary Table 4,5 are renamed to Supplementary Table 3,4.

      Additional revisions:

      (1) Section 2.5 “Young weak DBSs may have greatly promoted recent human evolution” is moved into Supplementary Note 3 (which now has the subtitle “Target genes with specific DBS features are enriched in specific functions”), because this section is short and lacks sufficient cross-validation.

      (2) Considerable minor revisions of sentences have been made.

      (3) Since there are many supplementary figures, the main text now cites only Supplementary Notes, as the reader can easily access supplementary figures in Supplementary Notes.

    1. Author response:

      The following is the authors’ response to the previous reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study evaluates the role of visual experience in shaping functional correlations between human extrastriate visual cortex and frontal regions. The authors used fMRI to assess "resting-state" temporal correlations in three groups: sighted adults, congenitally blind adults, and neonates. Previous research has already demonstrated differences in functional correlations between visual and frontal regions in sighted compared to early blind individuals. The novel contribution of the current study lies in the inclusion of an infant dataset, which allows for an assessment of the developmental origins of these differences.

      The main results of the study reveal that correlations between prefrontal and visual regions are more prominent in the blind and infant groups, with the blind group exhibiting greater lateralization. Conversely, correlations between visual and somato-motor cortices are more prominent in sighted adults. Based on these data, the authors conclude that visual experience plays an instructive role in shaping these cortical networks. This study provides valuable insights into the impact of visual experience on the development of functional connectivity in the brain.

      Strengths:

      The dissociations in functional correlations observed among the sighted adult, congenitally blind, and neonate groups provide strong support for the main conclusion regarding postnatal experience-driven shaping of visual-frontal connectivity.

      The inclusion of neonates offers a unique and valuable developmental anchor for interpreting divergence between blind and sighted adults. This is a major advance over prior studies limited to adult comparisons.

      Convergence with prior findings in the blind and sighted adult groups reinforces the reliability and external validity of the present results.

      The split-half reliability analysis in the infant data increases confidence in the robustness of the reported group differences.

      Weaknesses:

      The manuscript risks overstating a mechanistic distinction between sighted and blind development by framing visual experience as "instructive" and blindness as "reorganizing." Similarly, the binary framing of visual experience and blindness as independent may oversimplify shared plasticity mechanisms.

      The interpretation of changes in temporal correlations as altered neural communication does not adequately consider how shifts in shared variance across networks may influence these measures without reflecting true biological reorganization.

      The discussion does not substantively engage with the longstanding debate over whether sensory experience plays an instructive or permissive role in cortical development.

      The relationship between resting-state and task-based findings in blindness remains unclear.

      Reviewer #2 (Public review):

      Summary:

      Tian et al. explore the developmental origins of cortical reorganization in blindness. Previous work has found that a set of regions in the occipital cortex show different functional responses and patterns of functional correlations in blind vs. sighted adults. Here, Tian et al. explore how this organization arises over development. Is the "starting state" more like the blind pattern, or more like the adult pattern? Their analyses reveal that the answer depends on the particular networks investigated. Some functional connections in infants look more like blind than sighted adults; other functional connections look more like sighted than blind adults; and others fall somewhere in the middle, or show an altogether different pattern in infants compared with both sighted and blind adults.

      Strengths:

      The paper addresses very important questions about the starting state in the developing visual cortex, and how cortical networks are shaped by experience. Another clear strength lies in the unequivocal nature of many results. Many results have very large effect sizes, critical interactions between regions and groups are tested and found, and infant analyses are replicated in split halves of the data.

      Weaknesses:

      While potential roles of experience (e.g., visual, cross-modal) are discussed in detail, little consideration is given to the role of experience-independent maturation. The infants scanned are extremely young, only 2 weeks old. It is possible then that the sighted adult pattern may still emerge later in infancy or childhood, regardless of infant visual experience. If so, the blind adult pattern may depend on blindness-related experience only (which may or may not reflect "visual" experience per se). In short, it is not clear that birth, or the first couple weeks of life, are a clear cut "starting point" for development, after which all change can be attributed to experience.

      Reviewer #3 (Public review):

      Summary

      This study aimed to investigate whether the differences observed in the organization of visual brain networks between blind and sighted adults result from a reorganization of an early functional architecture due to blindness, or whether the early architecture is immature at birth and requires visual experience to develop functional connections. This question was investigated through the comparison of 3 groups of subjects with resting-state functional MRI (rs-fMRI). Based on convincing analyses, the study suggests that: 1) secondary visual cortices showed higher connectivity to prefrontal cortical regions (PFC) than to non-visual sensory areas (S1/M1 and A1) in infants like in blind adults, in contrast to sighted adults; 2) the V1 connectivity pattern of infants lies between that of sighted adults (showing stronger functional connectivity with non-visual sensory areas than with PFC) and that of blind adults (showing stronger functional connectivity with PFC than with non-visual sensory areas); 3) the laterality of the connectivity patterns of infants resembled those of sighted adults more than those of blind adults, but infants showed a less differentiated fronto-occipital connectivity pattern than adults.

      Strengths

      - The question investigated in this article is important for understanding the mechanisms of plasticity during typical and impaired development, and the approach considered, which compares different groups of subjects including neonates/infants and blind adults, is highly original.

      - Overall, the presented analyses are solid and well detailed, and the results and discussion are convincing.

      Weaknesses

      - While it is informative to compare the "initial" state (close to birth) and the "final" states in blind and sighted adults to study the impact of post-natal and visual experience, this study does not analyze the chronology of this development and when the specialization of functional connections is completed. This would require investigating the evolution of functional connectivity of the visual system as a function of visual experience and thus as a function of age, at least during toddlerhood given the early and intense maturation of the visual system after birth. This could be achieved by analyzing different developmental periods using open databases such as the Baby Connectome Project.

      - The rationale for grouping full-term neonates and preterm infants (scanned at term-equivalent age) is not clear when the aim is to perform comparisons with adults. Even if the study results do not show differences between full-terms and preterms in terms of functional connectivity differences between regions and of connectivity patterns, the preterm group had different neurodevelopment and post-natal (including visual) experiences (even a few weeks might have an impact). Indeed, preterms show systematically reduced connectivity strength for all regions compared with full-terms (Sup Fig 7). Considering a more homogeneous group of neonates would have strengthened the study design.

      - The rationale for presenting results on the connectivity of secondary visual cortices before the one of primary cortices (V1) could be clarified.

      - The authors acknowledge the methodological difficulties in defining regions of interest (ROIs) in infants in the same way as in adults. Since brain development is not homogeneous and synchronous across brain regions (in particular with the frontal and parietal lobes showing a delayed growth), this poses major problems for registration. This raises the question of whether the study findings could be biased by differences in ROI positioning across groups.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors are appropriately cautious in many parts of the discussion and include several helpful control analyses. Nonetheless, additional clarification of key assumptions and potential confounds would strengthen the paper.

      (1) The current framing labels vision as "instructive" and blindness as "reorganizing," but it is unclear why these two experiential factors are characterized differently. Both involve activity-dependent changes to functional architecture from a shared immature scaffold. Labeling them differently risks conflating divergent outcomes with distinct underlying mechanisms. Just because sighted and blind adults show different patterns of functional connectivity does not mean they reflect separate processes. While the discussion briefly acknowledges the possibility of shared plasticity mechanisms, much of the framing across the manuscript, including in the abstract and introduction, implies a dichotomy. A clearer articulation of the criteria used to assign these labels, or reconsideration of whether such a distinction is warranted, would improve conceptual clarity. The current framing appears analogous to saying that "heat causes expansion" and "cold causes contraction" as if these were separate mechanisms, when they are actually two directions of change along a single factor: temperature. A more parsimonious framework, such as activity-dependent reweighting of pre-existing connectivity, may better capture the nature of plasticity at play in both sighted and blind development.

      Following the reviewer’s suggestion, we have revised the manuscript to clarify that both vision and blindness can be understood as manifestations of a common framework of experience-driven plasticity. We removed all mentions of reorganization and modified the wording throughout.

      Specifically:

      Abstract: “Are infant visual cortices functionally like those of sighted adults, with blindness leading to functional change? We find that, on the contrary, secondary visual cortices of infants are functionally more like those of blind adults: stronger coupling with PFC than with nonvisual sensory-motor networks, suggesting that visual experience modifies elements of the sighted-adult long-range functional connectivity profile. Infant primary visual cortices are in-between blind and sighted adults i.e., more balanced PFC and sensory-motor connectivity than either adult group. The lateralization of occipital-to-frontal connectivity in infants resembles the sighted adults, consistent with the idea that blindness leads to functional change. These results suggest that both vision and blindness modify functional connectivity through experience-driven (i.e., activity-dependent) plasticity.” (Page 1, Line 13)

      Introduction: We replaced “blindness leads to functional reorganization” with “blindness modifies this functional connectivity” (Page 2, Line 52), and the following sentence has also been modified to: “lifetime visual experience shapes connectivity toward the sighted-adult pattern” (Page 2, Line 54). For the lateralization patterns, we now describe them as “blindness-related modification” rather than “reorganization”, to keep the interpretation descriptive rather than mechanistic (Page 4, Line 114).

      (2) In interpreting the functional correlation differences, the discussion should more explicitly consider how statistical interdependence between areas could influence the observed results. For example, an increase in shared variance between visual and motor areas, such as might result from visually guided action, could result in a reduction in the apparent strength of visual-prefrontal temporal correlation (at the resolution of fMRI) without any true biological change in communication between visual-prefrontal cortex. This possibility is not ruled out by reporting groupwise patterns of relative connectivity. A more cautious systems-level framing could help clarify the distinction between neural plasticity and statistical redistribution of variance.

      We thank the reviewer for raising this important point. We agree that resting-state fMRI provides a measure of statistical synchrony in BOLD signals rather than direct causal interactions between regions. This is a fundamental limitation of resting-state fMRI, which we now note in the Discussion section. Such changes in correlation are consistent with a variety of underlying biological mechanisms. The task performed during scanning is one factor that influences cross-region correlations. In the current study, both blind and sighted groups were measured while blindfolded and were not performing visually guided actions during the resting-state fMRI scans. It is possible that past experience with visually guided action changes the resting-state correlations of sighted participants. Indeed, this is one interesting hypothesis.

      In the revised Discussion, we now explicitly note this limitation and clarify that differences in FC do not by themselves establish whether or how underlying neurophysiological mechanisms are changed. We also emphasize that future work will need to investigate whether FC changes are accompanied by alterations in structural connectivity and to probe causal interactions and mechanistic underpinnings as follows:

      “Resting-state functional connectivity captures synchrony in BOLD signal fluctuations rather than causal interactions and differences in functional connectivity cannot on their own reveal how underlying neurophysiological mechanisms are modified.” (page 13, line 342)

      “Future studies will be needed to determine whether these functional changes are accompanied by alterations in structural connectivity, and to probe causal interactions and mechanistic underpinnings.” (page 13, line 350)
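
      The reviewer's concern about statistical redistribution of variance can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not an analysis from the manuscript: the three-signal generative model, the function name `visual_pfc_correlation`, and the `motor_weight` parameter are all hypothetical. The visual-PFC coupling weight is held fixed while the amount of variance shared between the visual and motor signals is increased.

```python
import numpy as np

def visual_pfc_correlation(motor_weight, n_timepoints=20000, seed=0):
    """Pearson r between a toy 'visual' and 'PFC' signal (hypothetical model).

    visual = pfc_source + motor_weight * motor_source + noise
    pfc    = pfc_source + noise
    The weight on pfc_source is always 1.0, so the true visual-PFC
    coupling never changes; only the visual-motor shared variance does.
    """
    rng = np.random.default_rng(seed)
    pfc_source = rng.standard_normal(n_timepoints)    # shared by visual and PFC
    motor_source = rng.standard_normal(n_timepoints)  # shared by visual and motor
    visual = pfc_source + motor_weight * motor_source + rng.standard_normal(n_timepoints)
    pfc = pfc_source + rng.standard_normal(n_timepoints)
    return float(np.corrcoef(visual, pfc)[0, 1])

# Same visual-PFC coupling, different amounts of visual-motor shared variance.
r_without_motor = visual_pfc_correlation(motor_weight=0.0)
r_with_motor = visual_pfc_correlation(motor_weight=2.0)
print(r_without_motor, r_with_motor)
```

      In this toy model the expected correlation is 1 / sqrt(2 * (2 + w^2)) for motor weight w, so it falls from 0.5 at w = 0 to about 0.29 at w = 2 even though the visual-PFC coupling term is identical in both cases. The sketch only shows that such an effect is possible in principle; it says nothing about whether it occurs in the data.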

      (3) The mechanistic interpretation of group differences in visual-motor coupling would benefit from stronger network-level justification. Direct connections between these areas are sparse in primates. If effects reflect indirect polysynaptic interactions or shared thalamic input, as the authors suggest, one might expect corresponding group differences in intermediate regions (e.g., parietal cortex, thalamus) that mediate these interactions. Is there any evidence for this in the data?

      We thank the reviewer for raising this point. We agree and, as noted above, resting-state fMRI cannot distinguish between direct causal interactions between two regions and ones in which a mediating region is involved. This is a fundamental limitation of resting-state fMRI. The current study further focused on testing a specific hypothesis motivated by previously observed group differences between blind and sighted adults, and our analyses focused on ROI-to-ROI connectivity between occipital, frontal, and sensory-motor cortices and did not include these additional regions. In prior work, we and others have looked at effects in parietal cortices (Abboud & Cohen, 2019; Bedny et al., 2009; Deen et al., 2015; Kanjlia et al., 2016, 2021; Sen et al., 2022). In blindness, parietal networks show increased correlations with some visual areas, rather than decreased. Regarding the thalamus, the evidence is less clear, and there is some ongoing work trying to address this question. A couple of studies suggest that there is indeed increased connectivity between some parts of the thalamus and visual cortex in blindness. Although the anatomical information is limited, some of the work suggests that this increase is with higher-cognitive nuclei of the thalamus (Bedny et al., 2011; Liu et al., 2007).

      We agree that this is an important direction for future work. To acknowledge this point, we have revised the manuscript to highlight the potential role of cortical and subcortical hub regions in mediating connectivity changes. The text has been modified as follows:

      “Connectivity changes between two areas could be mediated by ‘third-party’ hub regions. For example, posterior parietal cortex serves as a cortical hub for multisensory integration and visuo-motor coordination and could mediate occipital-to-sensory-motor communication (Rolls et al., 2023; Sereno & Huang, 2014). Subcortical structures such as the thalamus could also play a mediating role (Vega-Zuniga et al., 2025).” (page 13, line 345)

      (4) The discussion would benefit from deeper engagement with prior work on experience-dependent plasticity, particularly the longstanding distinction between instructive and permissive roles of experience. While the authors briefly define these concepts and reference their historical use, a more explicit consideration of how their findings relate to this broader literature would help clarify whether such distinctions are necessary or appropriate.

      We thank the reviewer for this thoughtful suggestion to engage more explicitly with the longstanding literature on instructive versus permissive roles of experience. However, most of this literature comes from animal models, where experimental manipulations of the anatomical structure, of experience itself (e.g., controlled rearing studies) and sometimes of neural activity patterns allow clear tests of these mechanisms. Such manipulations are not feasible in humans. The terminology in the animal literature does not directly map onto the methods and data available in the present study or in other work with humans. For this reason, the current data do not allow us to fully engage with the debates in the animal literature, and doing so risks overinterpreting our findings.

      Nevertheless, we agree that once the instructive/permissive framework has been introduced, it is important to clarify how our results relate to it, rather than only providing definitions. We have therefore added the following text to the discussion:

      “In humans, such manipulations are not feasible, leaving us to study only the consequences of the presence or absence of vision. Under an instructive account, visual and multisensory experience could strengthen coupling between visual and other non-visual sensory-motor cortices through coordinated activity, thereby establishing the sighted-adult connectivity pattern. In the absence of visual input, by contrast, the lack of such coordinated activity may prevent these couplings from being established. Alternatively, vision may act permissively, indirectly enabling maturational processes that shift connectivity toward the sighted-adult configuration.” (page 14, line 362)

      (5) The revised discussion acknowledges the divergence between resting-state and task-based findings, but does not fully frame the theoretical implications of this discrepancy. Although this study cannot resolve the issue with its own data, a more integrative discussion could help clarify whether these measures reflect distinct functional states, developmental trajectories, or mechanisms of plasticity. Without such framing, readers are left without clear guidance on how to reconcile the present results with prior work on cross-modal recruitment in blindness.

      We thank the reviewer for this thoughtful comment. We agree that knowing how resting-state evidence relates to task-based evidence is a fundamentally important issue. We now discuss this more in the Introduction as well as in the Discussion.

      There is a sizable literature of both task-based and resting state studies. Some prior studies have measured resting state and task-based data within the same participants and found relationships (Kanjlia et al., 2016, 2021; Lane et al., 2015). We now clarify this in the introduction. These studies find that within visual cortices of blind people, the task-based profile of a cortical area is related to its resting state connectivity pattern (Abboud & Cohen, 2019; Deen et al., 2015; Kanjlia et al., 2016, 2021). This suggests that these two measures are related. However, the timecourse of this relationship, the developmental trajectory and mechanism of plasticity are not known. We note this now in the introduction on page 2. Primarily this is because there is very little relevant developmental evidence. For example, in the current study we find that the resting state profile of secondary visual networks in infants is similar to that of blind adults. However, we do not know whether the visual cortices of infants show task-based cross modal responses. To our knowledge nobody has tested this question. We agree with the reviewer that raising this question in the paper is better than not commenting on the relationship at all.

      To address the reviewer’s comment, we have expanded the discussion to situate our results within a developmental framework, highlighting how early intrinsic connectivity may scaffold alternative trajectories shaped by either visual experience or blindness. The revised text now reads as follows:

      “Conversely, for people who remain blind throughout life, visual-PFC connectivity could enable recruitment of visual cortices for higher-order non-visual functions, such as language and executive control (Bedny et al., 2011; Kanjlia et al., 2021). Our results suggest that blind adults may build on connectivity patterns already present in infancy: like blind adults, sighted infants show stronger occipital–PFC than occipital–sensory–motor coupling. Repeated engagement of occipital networks during higher cognitive tasks in early development could in turn enhance connectivity and specialization of visual networks for non-visual higher-order functions.

      Some prior studies have measured resting-state and task-based functional profiles in the same participants. These studies find that within visual cortices of blind people, the task-based profile of a cortical area is related to its resting state connectivity pattern (citations.) This suggests that these two measures are related. However, the timecourse of this relationship, the developmental trajectory and mechanism of plasticity is not known. Primarily this is because there is very little relevant developmental evidence. For example, in the current study we find that the resting state profile of secondary visual networks in infants is similar to that of blind adults. However, we do not know whether the visual cortices of infants show enhanced task-based cross modal responses, relative to sighted adults and how this compares to responses observed in blind adults. Future work with infants and children would be able to address this question.

      In the current study, the clearest evidence for functional change driven by blindness was observed for laterality. Connectivity lateralization in sighted infants resembles that of sighted adults, in both V1 and secondary visual cortices. Relative to both sighted infants and sighted adults, blind adults show more lateralized connectivity patterns between occipital and prefrontal cortices. Previous studies suggest that in people born blind occipital and non-occipital language responses are co-lateralized (Lane et al., 2017; Tian et al., 2023). We speculate that habitual activation of visual cortices by higher-cognitive tasks, such as language, which are themselves highly lateralized, contributes to this biased connectivity pattern of occipital cortex in blindness. Taken together, these results suggest a developmental framework in which intrinsic connectivity present in infancy provides a scaffold that is subsequently shaped and reinforced by experience-dependent recruitment, through either visual experience or the lifelong absence of vision in blindness. Longitudinal work across successive developmental stages will be crucial to test how the alternative trajectories shaped by visual experience versus blindness unfold over development.” (page 14-15)

      (6) The split-half reliability analysis is a valuable control. Additional details would clarify what these noise ceilings reflect. Were the rsFC patterns for each ROI calculated only for the ROIs included in the current study or was a broader assessment across the whole brain performed? It also would be helpful to report whether reliability differed for individual ROIs within and between groups. Even if global reliability is matched, selective differences could influence group comparisons. Several infants in the dHCP dataset were scanned twice. Were any second scans included in the current analyses? Comparing first versus second scans directly could strengthen the claim that several weeks of visual experience are insufficient to shift connectivity toward a sighted adult profile.

      We thank the reviewer for these comments on the reliability of the current study.

      In the present study, the noise ceiling was computed from the reliability of the ROI-wise FC profiles used across all analyses. Reliability was estimated using a split-half procedure: each rs-fMRI time series was divided into two equal halves, FC among all ROIs included in the study was computed separately for each half, and the noise ceiling for each ROI was defined as the Pearson correlation between its two FC profiles. We then averaged these ROI-wise noise ceilings to evaluate group-level reliability, which exceeded 0.70 in all three groups and did not differ significantly across groups. This provides an estimate of the upper bound on explainable variance for the exact FC features subjected to statistical testing (Lage-Castellanos et al., 2019). A brief description has been added to the manuscript (page 19, line 518).
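
      The split-half procedure described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the function name `split_half_noise_ceiling`, the synthetic two-network data, and the choice to drop each ROI's self-correlation from its FC profile are assumptions, and the response does not say whether correlations were Fisher-transformed before comparison.

```python
import numpy as np

def split_half_noise_ceiling(timeseries):
    """Per-ROI split-half reliability of FC profiles.

    timeseries: array of shape (n_timepoints, n_rois) for one participant.
    Returns an array of n_rois Pearson correlations, one per ROI.
    """
    n_timepoints, n_rois = timeseries.shape
    half = n_timepoints // 2
    # ROI-by-ROI FC matrices computed separately on each half of the run.
    fc_first = np.corrcoef(timeseries[:half], rowvar=False)
    fc_second = np.corrcoef(timeseries[half:2 * half], rowvar=False)
    ceilings = np.empty(n_rois)
    for roi in range(n_rois):
        # Assumption: exclude the diagonal entry, which is always 1.
        others = np.arange(n_rois) != roi
        ceilings[roi] = np.corrcoef(fc_first[roi, others], fc_second[roi, others])[0, 1]
    return ceilings

# Synthetic example: 8 hypothetical ROIs, 4 driven by each of two networks.
rng = np.random.default_rng(0)
n_timepoints = 400
network_a = rng.standard_normal((n_timepoints, 1))
network_b = rng.standard_normal((n_timepoints, 1))
signal = np.hstack([np.repeat(network_a, 4, axis=1), np.repeat(network_b, 4, axis=1)])
timeseries = signal + 0.5 * rng.standard_normal((n_timepoints, 8))

roi_ceilings = split_half_noise_ceiling(timeseries)
mean_ceiling = roi_ceilings.mean()  # averaged across ROIs for this participant
print(roi_ceilings.round(2), round(float(mean_ceiling), 2))
```

      Averaging `mean_ceiling` across participants within a group would give a group-level summary of the kind reported in the response.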

      Regarding the reviewer’s question about the scope of rsFC features used in the noise-ceiling analysis: we computed noise ceilings only for the ROIs included in the present study, because all analyses in this work were conducted at the ROI–ROI level and did not involve voxelwise whole-brain FC. Thus, the noise-ceiling estimates correspond directly to the full set of FC features on which all statistical comparisons were based.

      As suggested by the reviewer, we examined noise ceilings for each ROI separately. All ROIs showed high absolute reliability (noise ceiling > 0.80) across the three groups, indicating that the ROI-wise FC estimates are generally robust across participants. Although many ROIs exhibited statistically significant group differences in noise ceiling (one-way ANOVA, p < 0.05), the effect sizes were small to moderate (partial η² < 0.14). These differences indicate that reliability may vary modestly across groups at the ROI level, and we cannot fully determine whether such variability contributes to the observed differences in FC patterns across groups. We have included this point in the revised manuscript (page 19, line 525), along with the full statistical results for the ROI-wise noise ceilings in the Supplementary Table S2.
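
      For readers less familiar with the effect size quoted above, the sketch below computes the F statistic and η² for a one-way design from sums of squares. In a one-way ANOVA there is only one effect, so partial η² and η² coincide. The function name `one_way_anova` and the three groups of noise-ceiling values are hypothetical toy numbers chosen for easy checking, not values from the study.

```python
import numpy as np

def one_way_anova(groups):
    """Return (F, eta_squared) for a list of 1-D samples, one per group."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    # Between-group and within-group sums of squares.
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = all_values.size - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # One-way design: partial eta squared equals eta squared.
    eta_squared = ss_between / (ss_between + ss_within)
    return float(f_stat), float(eta_squared)

# Hypothetical noise ceilings for one ROI in three groups (toy values).
infants = [0.80, 0.82, 0.84]
sighted_adults = [0.82, 0.84, 0.86]
blind_adults = [0.84, 0.86, 0.88]
f_stat, eta_squared = one_way_anova([infants, sighted_adults, blind_adults])
print(f_stat, eta_squared)
```

      The toy groups are deliberately well separated, so the resulting η² is far larger than the values below 0.14 reported in the response; the example is meant only to show how the quantity is defined.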

      Last, we fully agree that longitudinal comparisons across multiple time points can provide important insights into how early visual experience shapes connectivity. At the same time, in the present dataset, the first scan occurred at a preterm age and the second at term-equivalent age. The differences between the first and second scans would reflect not only additional weeks of visual input, but also differences in prematurity status and overall neurodevelopmental maturity, which would make the interpretation of such comparisons difficult in the context of our current aims. We have clarified in the revised manuscript that only term-equivalent (second) scans were included. We see careful longitudinal work as an important avenue for addressing this question more directly.

      (7) The signal dropout assessment in the infant dataset is a valuable quality control step. Applying the same metric to the adult datasets would help harmonize preprocessing across groups and increase confidence in group-level comparisons.

      Thank you for this valuable suggestion. Following your comment, we applied the same signal dropout assessment to the adult datasets. One participant in the sighted adult group and two participants in the blind adult group showed signal dropout in one ROI each. The corresponding results are now included in the Supplementary Materials (Figure S13). The findings remain unchanged after this additional control analysis. We have also added the relevant text to the Methods section as follows:

      “The same signal dropout assessment was also applied to the blind and sighted adults to ensure consistent quality control across groups. One participant in the sighted adult group and two participants in the blind adult group exhibited signal dropout in one ROI each. Excluding these participants did not alter the group-level results (see Figure S13).” (page 16, line 449)

      Minor:

      (8) The authors added accurate anatomical descriptions to the methods but a less precise characterization remains in the introduction: "Anatomically, these regions correspond roughly to the location of areas such as motion area V5/MT+, the lateral occipital complex (LO), V3a and V4v in sighted people."

      We thank the reviewer for this helpful comment. We have revised the Introduction to provide a fuller anatomical description, consistent with the Methods. The text now reads:

      “Anatomically, these regions in sighted people approximately correspond to the locations of motion-sensitive V5/MT+ and the lateral occipital complex (LO), as well as ventral portions of occipito-temporal cortex including V4v and dorsal portions including V3a. The occipital ROI also extends ventrally into the middle portion of the ventral temporal lobe and dorsally into the intraparietal sulcus and superior parietal lobule.” (page 3, line 88)

      (9) Typo: "lager effect" should be "larger effect."

      Secondary visual cortices showed a significant within > between difference in both groups, with a lager effect in the blind group (post-hoc tests, Bonferroni-corrected paired: t-test: sighted adults within hemisphere > between hemisphere: t (49) = 7.441, p = 0.012; blind adults within hemisphere > between hemisphere: t (29) = 10.735, p < 0.001; V1: F(1, 78) =87.211, p < 0.001).

      We thank the reviewer for catching this typo. We have corrected “lager effect” to “larger effect” in the revised manuscript. (page 9, line 214)

      Reviewer #2 (Recommendations for the authors):

      All of my other concerns were adequately addressed.

      We thank the reviewer for their positive evaluation, and we are glad that our revisions have addressed their concerns.

      Reviewer #3 (Recommendations for the authors):

      In my view, qualifying infants as "sighted" is confusing and unnecessary: why not simplify and homogenize the wording throughout the manuscript and figures?

      We thank the reviewer for this suggestion. We agree and have revised the manuscript to use consistent wording, avoiding the qualification of infants as “sighted.”

      l188, I don't understand the sentence "By contrast, in sighted adults, this cross-hemisphere difference is weak or absent."

      We thank the reviewer for noting that this sentence was unclear. We have revised the text to provide a more precise explanation. The text now reads:

      “By contrast, in sighted adults this lateralized pattern is weaker: visual areas in each hemisphere show only a modest preference for ipsilateral prefrontal cortices, and connectivity with the contralateral PFC remains comparatively strong.” (page 8, line 207)

      l193: "Secondary visual cortices showed a significant within > between difference in both groups, with a lager effect in the blind group": providing effect sizes for the 2 groups would strengthen this result (+ note the typo laRger).

      - Figure S7, S11: Please add titles of y-axes.

      Thank you for this helpful suggestion. We have corrected the typo and added the effect sizes for both groups in the revised text. The revised sentence now reads as follows:

      “Secondary visual cortices showed a significant within > between difference in both groups, with a larger effect in the blind group (post-hoc tests, Bonferroni-corrected paired: t-test: sighted adults within hemisphere > between hemisphere: t (49) = 7.441, p = 0.012, Cohen’s d = 0.817; blind adults within hemisphere > between hemisphere: t (29) = 10.735, p < 0.001, Cohen’s d = 1.96).” (page 9, line 214)

      Titles of the y-axes have also been added to Figures S7 and S11.
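
      As background for the effect sizes requested above, one common definition of Cohen's d for a paired comparison (often written d_z) divides the mean of the paired differences by their standard deviation. The sketch below uses that definition as an assumption: the response does not state which variant of Cohen's d was computed, and the function name `paired_effect_size` and the connectivity values are hypothetical.

```python
import numpy as np

def paired_effect_size(condition_a, condition_b):
    """Return (d_z, t) for paired samples.

    d_z = mean(diff) / sd(diff), with the sample standard deviation (ddof=1);
    the paired t statistic is then d_z * sqrt(n).
    """
    diff = np.asarray(condition_a, dtype=float) - np.asarray(condition_b, dtype=float)
    n = diff.size
    d_z = diff.mean() / diff.std(ddof=1)
    t_stat = d_z * np.sqrt(n)
    return float(d_z), float(t_stat)

# Hypothetical within- vs between-hemisphere connectivity for four participants.
within_hemisphere = [0.5, 0.6, 0.7, 0.8]
between_hemisphere = [0.4, 0.4, 0.5, 0.5]
d_z, t_stat = paired_effect_size(within_hemisphere, between_hemisphere)
print(d_z, t_stat)
```

      Other conventions, such as standardizing by the pooled or averaged standard deviation of the two conditions, give different values for the same data, so the definition used should be reported alongside the number.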

    1. Reviewer #1 (Public review):

      Summary:

      Lesser et al. provide a comprehensive description of Drosophila wing proprioceptive sensory neurons at the electron microscopy resolution. This "tour-de-force" provides a strong foundation for future structural and functional research aimed at understanding wing motor control in Drosophila with implications for understanding wing control across other insects.

      Strengths:

      (1) The authors leverage previous research that described many of the fly wing proprioceptors, and combine this knowledge with EM connectome data such that they now provide a near-complete morphological description of all wing proprioceptors.

      (2) The authors cleverly leverage genetic tools and EM connectome data to tie the location of proprioceptors on the wings with axonal projections in the connectome. This enables them to both align with previous literature as well as make some novel claims.

      (3) In addition to providing a full description of wing proprioceptors, the authors also identified a novel population of sensors on the wing tegula that make direct connections with the B1 wing motor neurons, implicating the role of the tegula in wing movements that was previously underappreciated.

      (4) Despite being the most comprehensive description so far, it is reassuring that the authors clearly state the missing elements in the discussion.

      Weaknesses:

      (1) The authors do their main analysis on data from the FANC connectome but provide corresponding IDs for sensory neurons in the MANC connectome. I wonder how the connectivity matrix compares across FANC and MANC if the authors perform a similar analysis to the one they have done in Fig. 2. This could be a valuable addition and potentially also pick up any sexual dimorphism.

      (2) The authors speculate about the presence of gap junctions based on the density of mitochondria. I'm not convinced about this, given that mitochondrial densities could reflect other things that correlate with energy demands in sub-compartments.

      Overall, I consider this an exceptional analysis which will be extremely valuable to the community.

    2. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Lesser et al provide a comprehensive description of Drosophila wing proprioceptive sensory neurons at the electron microscopy resolution. This “tour-de-force” provides a strong foundation for future structural and functional research aimed at understanding wing motor control in Drosophila with implications for understanding wing control across other insects.

      Strengths:

      (1) The authors leverage previous research that described many of the fly wing proprioceptors, and combine this knowledge with EM connectome data such that they now provide a near-complete morphological description of all wing proprioceptors.

      (2) The authors cleverly leverage genetic tools and EM connectome data to tie the location of proprioceptors on the wings with axonal projections in the connectome. This enables them to both align with previous literature as well as make some novel claims.

      (3) In addition to providing a full description of wing proprioceptors, the authors also identified a novel population of sensors on the wing tegula that make direct connections with the B1 wing motor neurons, implicating the role of the tegula in wing movements that was previously underappreciated.

      (4) Despite being the most comprehensive description so far, it is reassuring that the authors clearly state the missing elements in the discussion.

      Weaknesses:

      (1) The authors do their main analysis on data from the FANC connectome but provide corresponding IDs for sensory neurons in the MANC connectome. I wonder how the connectivity matrix compares across FANC and MANC if the authors perform a similar analysis to the one they have done in Figure 2. This could be a valuable addition and potentially also pick up any sexual dimorphism.

      We agree that systematic comparisons will provide valuable insights as more connectome datasets become available. However, the primary goal of this study was to link central axon morphology with peripheral structures in the wing. We deliberately omitted more detailed and quantitative analyses of the downstream VNC circuitry, apart from providing a global view of the connectivity matrix and using it to cluster the sensory axon types. A more detailed and systematic comparison of wing sensorimotor circuit connectivity across different connectome datasets (FANC, MANC, BANC, IMAC) is the subject of ongoing work in our lab, which we feel is beyond the scope of this study. Here, we chose to match the wing proprioceptors to axons in MANC to demonstrate their stereotypy across individuals and to make them more accessible to other researchers. We found no obvious sexual dimorphism at the level of wing sensory neurons. We now note this in the Discussion.

      (2) The authors speculate about the presence of gap junctions based on the density of mitochondria. I’m not convinced about this, given that mitochondrial densities could reflect other things that correlate with energy demands in sub-compartments.

      We have moved speculation about mitochondria and gap junctions to the Discussion.

      (3) I’m intrigued by how the tegula CO is negative for iav. I wonder if the authors tried other CO-labeling genes like nompC, and what this means for the nature of this CO. Some more discussion on this anomaly would be helpful.

      Based on this suggestion, we have added an image showing that tegula CO neurons are labeled by nompC-Gal4.

      (4) The authors conclude there are no proprioceptive neurons in sclerite pterale C based on Chat-Gal4 expression analysis. It would be much more rigorous if authors also tried a pan-neuronal driver like nsyb/elav or other neurotransmitter drivers (Vglut, GAD, etc) to really rule this out. (I hope I didn’t miss this somewhere.)

      To address this, we imaged OK371-GFP, which labels glutamatergic neurons, in the wing and wing hinge. We saw expression in the wing, as others have reported (Neukomm et al., 2014), but we saw no expression at the wing hinge. Apart from a handful of glutamatergic gustatory neurons in the leg, we are not aware of any other sensory neurons in the fly that are not labeled by Chat-Gal4.

      Overall, I consider this an exceptional analysis that will be extremely valuable to the community.

      We sincerely appreciate the reviewer’s positive feedback.

      Reviewer #2 (Public review):

      Summary:

      Lesser et al. present an atlas of Drosophila wing sensory neurons. They proofread the axons of all sensory neurons in the wing nerve of an existing electron microscopy dataset, the female adult fly nerve cord (FANC) connectome. These reconstructed sensory axons were linked with light microscopy images of full-scale morphology to identify their origin in the periphery of the wing and encoded sensory modalities. The authors described the morphology and postsynaptic targets of proprioceptive neurons as well as previously unknown sensory neurons.

      Strengths:

      The authors present a valuable catalogue of wing sensory neurons, including previously undescribed sensory axons in the Drosophila wing. By providing both connectivity information with linked genetic drive lines, this research facilitates future work on the wing motor-sensory network and applications relating to Drosophila flight. The findings were linked to previous research as well as their putative role in the proprioceptive and nerve cord circuitry, providing testable hypotheses for future studies.

      Weaknesses:

      (1) With future use as an atlas, it should be noted that the evidence is based on sensory neurons on only one side of the nerve cord. Fruit flies have stereotyped left/right hemispheres in the brain and left/right hemisegments in the nerve cord. The comparison of left and right neurons of the nervous system can give a sense of how robust the morphological and connectivity findings are. Here, the authors have not compared the left and right side sensory axons from the wing nerve, leaving potential for developmental variability across samples and left/right hemisegments.

      The right ADMN nerve in the FANC dataset is partially severed, making left/right comparisons unreliable (see Azevedo 2024, Extended Data Figure 4). We have updated the text to explain this within the Methods section of the paper.

      (2) Not all links between the EM reconstructions and driver lines are convincing. To strengthen these, for all EM-LM matches in Figures 3-7, rotated views of the driver line (matching the rotated EM views) should be shown to provide a clearer comparison of the data. In particular, Figure 3G and Figure 7B are not very convincing based on the images shown. MCFO imaging of the driver lines in Figure 3G and 7B would make this position stronger if a clone that matches the EM reconstruction could be identified.

      Many of the z-stack images in the paper are from the Janelia FlyLight collection, and unfortunately their imaging parameters were not optimized for orthogonal views. Rotated views are blurry and not especially helpful for comparison to EM reconstruction. We now point out in the text that interested readers can access the z-stacks from FlyLight to see the dorsal-ventral projections.

      Regarding Figure 3G and 7B, we have added markers to the image with corresponding descriptions in the legend to guide the reader through the image of the busy driver line. Although these lines label many cells in the VNC as a whole, they sparsely label cells in the ADMN, making them nonetheless useful for identifying peripheral sensory neurons.

      (3) Figure 7B looks like the driver line might have stochastic expression in the sensory neuron, which further reduces confidence in the result shown in Figure 7C. Is this expression pattern in the wing consistently seen? Many split-GAL4s have stochastic expressions. The evidence would be strengthened if the authors presented multiple examples (~4-5) of each driver line’s expression pattern in the supplement.

      Figure 7B shows sparse labeling of the driver line using the MCFO technique, as specified in the legend. Its unilateral expression is therefore not due to stochastic expression of the Gal4 line. We have added the “MCFO” label to the image to clarify.

      (4) Certain claims in this work lack quantitative evidence. On line 128, for instance, “Overall, our comprehensive reconstruction revealed many morphological subgroups with overlapping postsynaptic partners, suggesting a high degree of integration within wing sensorimotor circuits.” If a claim of subgroups having shared postsynaptic partners is being made, there should be quantitative evidence. For example, the cosine similarity amongst members of each group could be compared to the cosine similarity of shuffled/randomised sets of axons from different groups. The heat map of cosine similarity in Figure 2B alone is not sufficient.

      We agree that illustrating the extent of shared postsynaptic partners across subgroups strengthens this point. We added a visualization showing pairwise similarity scores for within- and between-cluster neuron pairs (Figure 2B inset). We also performed a permutation test to determine that within-cluster similarity is significantly higher than between clusters, and we report the test in the results as well as the figure legend. This analysis provides a more quantitative summary of the qualitative trends in connectivity that are summarized in Figure 2B.
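      As an illustration only (not the authors' analysis code), the kind of permutation test described above can be sketched as follows. The toy connectivity rows, the cluster labels, the function names, and the choice of test statistic (mean within-cluster minus mean between-cluster cosine similarity) are all assumptions made for this example.

```python
import math
import random
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two output-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def within_minus_between(sim, labels):
    """Mean within-cluster similarity minus mean between-cluster similarity."""
    within, between = [], []
    for (i, j), s in sim.items():
        (within if labels[i] == labels[j] else between).append(s)
    return sum(within) / len(within) - sum(between) / len(between)

def permutation_test(rows, labels, n_perm=2000, seed=0):
    """One-sided test: shuffle cluster labels to build the null distribution."""
    rng = random.Random(seed)
    pairs = combinations(range(len(rows)), 2)   # each unordered pair once
    sim = {(i, j): cosine(rows[i], rows[j]) for i, j in pairs}
    observed = within_minus_between(sim, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if within_minus_between(sim, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: rows are axons, columns are postsynaptic partners.
rows = [
    [5, 3, 4, 0, 0, 0], [4, 4, 6, 0, 0, 0], [6, 2, 5, 1, 0, 0],
    [3, 5, 4, 0, 0, 0], [5, 4, 3, 0, 1, 0],
    [0, 0, 0, 5, 4, 3], [0, 0, 1, 4, 6, 2], [0, 0, 0, 3, 5, 5],
    [0, 1, 0, 6, 3, 4], [0, 0, 0, 4, 4, 6],
]
labels = [0] * 5 + [1] * 5
observed, p_value = permutation_test(rows, labels)
```

      Shuffling the labels keeps the cluster sizes fixed, so the null distribution reflects only the assignment of axons to clusters.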

      (5) Similarly, claims about putative electrical connections to b1 motor neurons are very speculative. The authors state that “their terminals contain very densely packed mitochondria compared to other cells”, without providing a quantitative comparison to other sensory axons. There is also no quantitative comparison to the one example of another putative electrical connection from the literature. Further, it should be noted that this connection from Trimarchi and Murphey, 1997, is also stated as putative on line 167, which further weakens this evidence. Quantification would strongly strengthen this position. Identification of an example of high mitochondrial density at a confirmed electrical connection would be even better. In the related discussion section “A potential metabolic specialization for flight circuitry”, it should be more clearly noted that the dense mitochondria could be unrelated to a putative electrical connection. If the authors have an alternative hypothesis about the mitochondria density, this should be stated as well.

      We agree with the reviewer that the link between mitochondrial density and metabolic specialization is purely speculative in this context. Based on reviewer feedback, we have moved all mention of the relationship between mitochondrial density and gap junction coupling to the Discussion. We acknowledge that this may seem like a somewhat random and not quantitatively supported observation. However, we found the coincidence striking and worthy of mention, though it is only tangentially relevant to the rest of the paper. From conversations with colleagues, we have also heard that this relationship is consistent with as yet unpublished work in other model organisms (e.g., zebrafish, mouse).

      The electrical coupling to b1 motor neurons is well-established (Fayyazuddin and Dickinson, 1999), and we have updated the text to state this more clearly. However, we agree that whether the specific neurons we have identified based on their anatomy are the same ones functionally identified through whole-nerve recordings remains unknown.

      (6) It would be appropriate to cite previous work using a similar strategy to match sensory axons to their cell bodies/dendrites at the periphery using driver lines and connectomics (see Figure 5 for example in the following paper: https://doi.org/10.7554/eLife.40247 ).

      At this point, there are now dozens of papers that match the axons of sensory neurons to their cell bodies/dendrites in the periphery by comparing light microscopy and connectomics. When we dug in, we found examples in C. elegans, Ciona intestinalis, zebrafish, and mouse, all published prior to the study cited above. For basically every animal for which scientists have acquired EM volumes of neural tissue, they have used other anatomical labeling methods to determine cell types inside and outside the imaged volume. In summary, we found it difficult to establish a single primary citation for this approach. In lieu of this, we have added a citation to an earlier review by a pioneer in EM connectomics that discusses the general approach of matching cells across different labeling/imaging modalities (Meinertzhagen et al., 2009).

      The methods section is very sparse. For the sake of replicability, all sections should be expanded upon.

      We have expanded the methods section and added a STAR methods table.

      Reviewer #3 (Public review):

      Summary:

      The authors aim to identify the peripheral end-organ origin in the fly’s wing of all sensory neurons in the anterior dorsomedial nerve. They reconstruct the neurons and their downstream partners in an electron microscopy volume of a female ventral nerve cord, analyse the resulting connectome, and identify their origin with a review of the literature and imaging of genetic driver lines. While some of the neurons were already known through previous work, the authors expand on the identification and create a near-complete map of the wing mechanosensory neurons at synapse resolution.

      Strengths:

      The authors elegantly combine electron microscopy, neuron morphology, connectomics, and light microscopy methods to bridge the gap between fly wing sensory neuron anatomy and ventral nerve cord morphology. Further, they use EM ultrastructural observations to make predictions on the signaling modality of some of the sensory neurons and thus their function in flight.

      The work is as comprehensive as state-of-the-art methods allow to create a near-complete map of the wing mechanosensory neurons. This work will be of importance to the field of fly connectomics and modelling of fly behavior, as well as a useful resource to the Drosophila research community.

      Through this comprehensive mapping of neurons to the connectome, the authors create a lot of hypotheses on neuronal function, partially already confirmed with the literature and partially to be tested in the future. The authors achieved their aim of mapping the periphery of the fly’s wing to axonal projections in the ventral nerve cord, beautifully laying out their results to support their mapping.

      The authors identify the neurons in a previously published connectome of a male fly ventral nerve cord to enable cross-individual analysis of connections. Further, together with their companion paper, Dhawan et al. 2025, describing the haltere sensory neurons in the same EM dataset, they cover the entire mechanosensory space involved in Drosophila flight.

      Weaknesses:

      The connectomic data are only available upon request; the inclusion of a connectivity table of the reconstructed neurons would aid analysis reproducibility and cross-dataset comparisons.

      We have added a connectivity table as well as analysis scripts to the GitHub repository for the paper (https://github.com/EllenLesser/Lesser_eLife_2025).

      Recommendations for the authors:

      Reviewer #2 (Recommendations for the authors):

      The methods section should be expanded in every aspect. Most pressing sections are:

      (1) Data and Code availability: All code should be included as a Zenodo database, the suggestion to ask authors for code upon request is inappropriate.

      We have added all code to a public GitHub repository, which is now linked in the Methods section.

      (2) Samples: Standard cornmeal and molasses medium should have a reference, as many institutes use different recipes.

      The recipe used by the University of Washington fly kitchen is based on the Bloomington standard Cornmeal, Molasses and Yeast Medium recipe, which can be found at https://bdsc.indiana.edu/information/recipes/molassesfood.html. The UW recipe is slightly modified for different antifungal ingredients and includes tegosept, propionic acid, and phosphoric acid.

      (3) Table 3: Driver lines labelling wing sensory neurons: The genetic driver lines should have associated Bloomington stock centre numbers. Additionally, relevant information for effector lines used should be included in the methods.

      We now include the Bloomington stock numbers and more information on effector lines in the STAR methods table.

      Minor corrections:

      (1) Lines 119-120: “Notably, many of the axons do not form crisp cluster boundaries, suggesting that multimodal sensory information is integrated at early stages of sensory processing.” We do not follow the logic of this statement and suspect it is a bit too speculative.

      We removed this sentence from the manuscript.

      (2) Figure 1: The ADMN is missing in the schematics and would be helpful to depict for non-experts. Is this what is highlighted in Figure 1D?

      Yes, and we now label 1D as the ADMN wing nerve.

      (3) Figure 1B: Which driver lines are being depicted here? Looking at Table 3 does not clarify. It should be specified at least in the figure legend.

      As stated in the legend, we include a table of all of the driver lines we screened and which sensory structures they label.

      (4) Figure 1C: There are some minor placement issues with the text in the schematic. There is an arrow very close to the “CO” on the top right, which makes the “O” look like the symbol for male. “ax ii” is a bit too close to the wing hinge

      We updated the figure to address this issue.

      (5) Figure 1D: The outlined grey masks are not clear. The use of colour would be very useful for the reader to help understand what the authors are referring to here

      We now use color for the masks.

      (6) Figure 2A: It is unclear if the descending neuron and non-motor efferent neuron are not shown because they are under the described threshold, or to simplify the plot. They should be included in the plot if over the threshold.

      We have updated the legend to specify that the descending and non-motor efferent neurons are excluded to visually simplify the plot. We include the percentage of sensory output to each of these neurons in the legend, and they are included in the connectivity matrix data in the public GitHub repository associated with the paper, which is linked in the Methods.

      (7) Figure 2B: What clustering is used specifically? The method says it’s from Scikit-learn, but there are many types of clustering available in this package.

      We now include the specific clustering type used in the Methods section, which is agglomerative clustering.

      (8) Figure 3A: What does the green box behind the plot represent?

      The green box represents the tegula CO axons, which we now specify in the legend.

      (9) Figure 3C: the “C” is clipped at the top.

      We updated the figure to address this issue.

      (10) Figure 4A: the main text says a “group of four axons” (line 203) while the figure says 5 axons.

      We updated the text to address this issue.

      (11) Line 360: “We found that the campaniform sensilla on the tegula provide the most direct feedback onto wing steering motor neurons”. We struggled to find where this was directly shown, because several sensory axon types directly synapse onto motor neurons.

      We now specify in the text that this finding is shown in Figure 3.

      Reviewer #3 (Recommendations for the authors):

      I would like to congratulate the authors on their beautiful, easy-to-read, and easy-to-comprehend manuscript, with clear figures and nice visualizations. This work provides a valuable resource that will contribute to the interpretability of connectomic data and further to connectome-based modeling of fly behavior.

      We sincerely appreciate the reviewer’s positive feedback.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      This article deals with the chemotactic behavior of E coli bacteria in thin channels (a situation close to 2D). It combines experiments and simulations.

      The authors show experimentally that, in 2D, bacteria swim up a chemotactic gradient much more effectively when they are in the presence of lateral walls. Systematic experiments identify an optimum for chemotaxis for a channel width of ~8µm, close to the average radius of the circle trajectories of the unconfined bacteria in 2D. It is known that these circles are chiral and impose that the bacteria swim preferentially along the right-side wall when there is no chemotactic gradient. In the presence of a chemotactic gradient, this larger proportion of bacteria swimming on the right wall yields chemotaxis. This effect is backed by numerical simulations and a geometrical analysis.

      If the conclusions drawn from the experiments presented in this article seem clear and interesting, I find that the key elements of the mechanism of this wall-directed chemotaxis are not sufficiently emphasized. Moreover, the paper would be clearer with more details on the hypotheses and the essential ingredients of the analyses.

      We thank the reviewer for these constructive suggestions. We agree that emphasizing the underlying mechanism is crucial for the clarity of our findings. In the revised manuscript, we have now explicitly highlighted the critical roles of chiral circular motion and the alignment effect following side-wall collisions in both the Abstract (lines 25-27) and the Discussion (lines 391-393). Furthermore, we have added a new analysis of bacterial trajectories post-collision (Fig. S2), which demonstrates that cells predominantly align with and swim along the sidewalls. We have also clarified the assumptions in our numerical simulations, specifically how the radius of circular trajectories and the alignment effect are incorporated into the equations of motion. Please refer to our detailed responses in the "Recommendations for the authors" section for further specifics.

      Reviewer #2 (Public review):

      Summary:

      In this study, the authors investigated the chemotaxis of E. coli swimming close to the bottom surface in gradients of attractant in channels of increasingly smaller width but fixed height = 30 µm and length ~160 µm. In relatively large channels, they find that on average the cells drift in response to the gradient, despite cells close to the surface away from the walls being known to not be chemotactic because they swim in circles.

      They find that this average drift is due to the cell localization close to the side walls, where they slide along the wall. Whereas the bacteria away from the walls have no chemotaxis (as shown before), the ones on the left side wall go down-gradient on average, but the ones on the right-side wall go up-gradient faster, hence the average drift. They then study the effect of reducing channel width. They find that chemotaxis is higher in channels with a width of about 8 µm, which approximately corresponds to the radius of the circular swimming R. This higher chemotactic drift is concomitant to an increased density of cells on the RSW. They do simulations and modeling to suggest that the disruption of circular swimming upon collision with the wall increases the density of cells on the RSW, with a maximal effect at w = ~ 2/3 R, which is a good match for their experiments.

      Strengths:

      The overall result that confinement at the edge stabilises bacterial motion and allows chemotaxis is very interesting although not entirely unexpected. It is also important for understanding bacterial motility and chemotaxis under ecologically relevant conditions, where bacteria frequently swim under confinement (although its relevance for controlling infections could be questioned). The experimental part of the study is nicely supported by the model.

      Weaknesses:

      Several points of this study, in particular the interpretation of the width effect, need better clarification:

      (1) Context:

      There are a number of highly relevant previous publications that should have been acknowledged and discussed in relation to the current work:

      https://pubs.rsc.org/en/content/articlehtml/2023/sm/d3sm00286a

      https://link.springer.com/article/10.1140/epje/s10189-024-00450-7

      https://doi.org/10.1016/j.bpj.2022.04.008

      https://doi.org/10.1073/pnas.1816315116

      https://www.pnas.org/doi/full/10.1073/pnas.0907542106

      https://doi.org/10.1038/s41467-020-15711-0

      http://doi.org/10.1039/c5sm00939a

      We appreciate the reviewer bringing these important publications to our attention. We have now cited and discussed these works in the Introduction (lines 55-62 and 76-85) to better contextualize our study regarding bacterial motility and chemotaxis in confined geometries.

      (2) Experimental setup:

      a) The channels are built with asymmetric entrances (Figure 1), which could trigger a ratchet effect (because bacteria swim in circle) that could bias the rate at which cells enter into the channel, and which side they follow preferentially, especially for the narrow channel. Since the channel is short (160 µm), that would reflect on the statistics of cell distribution. Controls with straight entrances or with a reversed symmetry of the channel need to be performed to ensure that the reported results are not affected by this asymmetry.

      We appreciate the reviewer's insight regarding the potential ratchet effect caused by asymmetric entrances. To rule this out, we fabricated a control device with straight entrances and repeated the measurements. As shown in Figure S3, the chemotactic drift velocity follows the same trend as observed in the original setup, confirming an optimal width of ~9 µm. These results demonstrate that the entrance geometry does not bias the reported statistics. We have updated the manuscript text at lines 233-235.

      b) The authors say the motile bacteria accumulate mostly at the bottom surface. This is strange, for a small height of 30 µm, the bacteria should be more-or-less evenly spread between the top and bottom surface. How can this be explained?

      We apologize for not explaining this clearly in the text. As shown by Wei et al., Phys. Rev. Lett. 135, 188401 (2025), significant surface accumulation occurs in channels with heights exceeding 20 µm. In our specific experimental setup, we did not use Percoll to counteract gravity. Therefore, the bacteria accumulated mostly at the bottom surface under the combined influence of gravity and hydrodynamic attraction. This bottom-surface localization is supported by our observation that the bacterial trajectories were predominantly clockwise (characteristic of the bottom surface) rather than counter-clockwise (characteristic of the top surface). We have added this explanation to Line 141.

      c) At the edge, some of the bacteria could escape up in the third dimension (http://doi.org/10.1039/c5sm00939a). What is the magnitude of this phenomenon in the current setup? Does it have an effect?

      We thank the reviewer for raising this important point regarding 3D escape. We have quantified this phenomenon and found the escape rate from the edge into the third dimension to be 0.127 s⁻¹. This corresponds to a mean residence time that allows a cell moving at 20 µm/s to travel approximately 157.5 µm along the edge. Since this distance is comparable to the full length of our lanes (~160 µm), most cells traverse the entire edge without escaping. Furthermore, our analysis is based on the average drift of the surface trajectories per unit of time; this metric is independent of the absolute number of cells present. Therefore, the escape phenomenon does not significantly impact our conclusions. We have added a statement clarifying this at line 154.
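      The arithmetic behind this estimate can be checked in a few lines. This is a sketch with variable names chosen for illustration. It takes the lengths in micrometres, consistent with the ~160 µm channel length given by Reviewer #2, and takes the mean residence time as the reciprocal of the reported escape rate.

```python
escape_rate = 0.127   # s^-1, escape rate from the edge reported in the response
speed = 20.0          # µm/s, swimming speed used in the response
lane_length = 160.0   # µm, approximate lane length

mean_residence_time = 1.0 / escape_rate       # about 7.9 s
mean_distance = speed * mean_residence_time   # about 157.5 µm

# Ratio of the mean travel distance to the lane length (close to 1).
distance_to_length_ratio = mean_distance / lane_length
```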

      d) What is the cell density in the device? Should we expect cell-cell interactions to play a role here? If not, I would suggest to de-emphasize the connection to chemotaxis in the swarming paper in the introduction and discussion, which doesn't feel very relevant here, and rather focus on the other papers mentioned in point 1.

      The cell density in our experiments was approximately 1.3×10⁻³ µm⁻². Given this low density, we do not expect cell-cell interactions to play a role in the observed behaviors.

      Regarding the connection to swarming chemotaxis: We agree that our low-density setup differs from a high-density swarm; however, we believe the comparison remains relevant for two reasons. First, it provides a necessary contrast to studies showing surface inhibition of chemotaxis. Second, while we eliminate cell-cell interactions, we isolate the geometric aspect of swarming. In a swarm, cells move within narrow lanes created by their neighbors. Our device mimics this specific physical confinement by replacing neighboring cells with PDMS sidewalls. This allows us to decouple the effects of physical confinement from cell-cell interactions. We have added text (line 370) to clarify this rationale and have incorporated the additional references in the Introduction as suggested in point 1.

      e) We are not entirely convinced by the interpretation of the results in narrow channels. What is the causal relationship between the increased density on the RSW and the higher chemotactic drift? The authors seem to attribute higher drift to this increased RSW density, which emerges due to the geometric reasons. But if there is no initial bias, the same geometric argument would induce the same increased density of down-gradient swimmers on the LSW, and so, no imbalance between RSW and LSW density. Could it be the opposite that the increased RSW density results from chemotaxis (and maybe reinforces it), not the other way around? Confinement could then deplete one wall due to the proximity of the other, and/or modify the swimming pattern - 8 µm is very close to the size of the body + flagellum. To clarify this point, we suggest measuring the bacterial distributions in the absence of a gradient for all channel widths as a control.

      We thank the reviewer for this insightful comment regarding the causal relationship between cell density and chemotactic drift. We apologize if the initial explanation was unclear.

      Regarding the no-gradient control: Without an attractant gradient (and no initial bias), there is no breaking of symmetry and the labels "LSW" and "RSW" are arbitrary. Therefore, there will be no asymmetry between the bacterial distributions on the two sides (within experimental fluctuations) in the absence of a gradient for any channel width.

      Regarding the causality and density imbalance: We agree that the increased RSW density is a result of chemotaxis, which is then reinforced by the lane geometry especially at narrow lane width. The mechanism relies on the coupling of chemotactic bias with surface circularity. The angle ranges that lead to RSW-UG accumulation (Fig. 6A-C) coincide with the up-gradient direction. Because these cells experience suppressed tumbling (longer runs), they can maintain the steady circular trajectories required to reach and align with the RSW. Conversely, while pure geometric analysis suggests a similar potential for LSW-DG accumulation, these trajectories coincide with the down-gradient direction. These cells experience enhanced tumbling, which distorts the circular trajectories. This prevents them from effectively reaching the LSW and also increases the probability of them leaving the wall. Therefore, the causality is indeed a positive feedback loop: the attractant gradient creates an initial bias that allows the RSW-UG fraction to form stable trajectories; the optimal lane width (matching the swimming radius) then maximizes this capture efficiency, further enriching the RSW fraction and enhancing the overall drift.

      We have added clarifications regarding these points in the revised manuscript (the last paragraph of “Results”).

      (3) Simulations:

      The simulations treat the wall interaction very crudely. We would suggest treating it as a mechanical object that exerts elastic or "hard sphere" forces and torques on the bacteria for more realistic modeling.

      We appreciate the reviewer's suggestion to incorporate more detailed mechanical interactions, such as elastic or hard-sphere forces, for the wall collisions. While we agree that a full hydrodynamic or mechanical model would offer higher fidelity, our experimental observations suggest that a simplified kinematic approach is sufficient for the specific phenomena studied here.

      As shown in the new Fig. S2, our analysis of cell trajectories in the 44-µm-wide channels reveals that cells colliding with the sidewalls tend to align with the surface almost instantaneously. The timescale required for this alignment is negligible compared to the typical wall residence time (see also Ref. 6). Consequently, to maintain computational efficiency without sacrificing the essential physics of the accumulation effect, we employed a coarse-grained phenomenological model where a bacterium immediately aligns parallel to the wall upon contact, similar to approaches used previously (Ref. 43). We have added relevant text to the manuscript on lines 168-171.

      Notably, the simulations have a constant (chemotaxis independent) rate of wall escape by tumbling. We would expect that reduced tumbling due to up-gradient motility induces a longer dwell time at the wall.

      We apologize for the confusion. The chemotaxis effect is indeed fully integrated into our simulation. Specifically, the simulated cells sense the chemical gradient and adjust their motor CW bias (B) accordingly. This adjustment directly modulates the tumble rate (k), calculated as k = B/0.31 s⁻¹. Consequently, the wall escape rate is not constant but varies with the chemotactic response. We also imposed a maximum detention time limit which, when combined with the variable tumble rate, results in an average wall residence time of approximately 2 s, consistent with our experimental observations (Fig. S6B). We have clarified these details in the final section of 'Materials and Methods'.
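      To make the dependence of wall dwell time on CW bias concrete, here is a minimal sketch rather than the authors' simulation code. It assumes that a cell leaves the wall at its first tumble, that tumbles occur as a Poisson process with rate k = B/0.31 s⁻¹, and that the residence time is capped at a maximum detention time. The function names, the bias values, and the cap of 5 s are hypothetical.

```python
import math

RUN_TIME_SCALE = 0.31  # s, the constant in k = B / 0.31 s^-1 from the response

def tumble_rate(cw_bias):
    """Tumble rate k (s^-1) for a given motor CW bias B."""
    return cw_bias / RUN_TIME_SCALE

def mean_wall_residence(cw_bias, t_max):
    """Mean of min(T, t_max), with T exponentially distributed at rate k.

    Assumption for this sketch: the first tumble ends the wall visit, and
    t_max is a hard cap on the detention time.
    """
    k = tumble_rate(cw_bias)
    if k == 0.0:
        return t_max
    return (1.0 - math.exp(-k * t_max)) / k

# Hypothetical biases: lower when swimming up-gradient, higher down-gradient.
dwell_up = mean_wall_residence(cw_bias=0.10, t_max=5.0)
dwell_down = mean_wall_residence(cw_bias=0.20, t_max=5.0)
```

      Under these assumptions a lower CW bias gives a longer mean dwell time at the wall, which is the effect the reviewer expected to see.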

      Reviewer #3 (Public review):

      This paper addresses through experiment and simulation the combined effects of bacterial circular swimming near no-slip surfaces and chemotaxis in simple linear gradients. The authors have constructed a microfluidic device in which a gradient of L-aspartate is established to which bacteria respond while swimming while confined in channels of different widths. There is a clear effect that the chemotactic drift velocity reaches a maximum in channel widths of about 8 microns, similar in size to the circular orbits that would prevail in the absence of side walls. Numerical studies of simplified models confirm this connection.

      The experimental aspects of this study are well executed. The design of the microfluidic system is clever in that it allows a kind of "multiplexing" in which all the different channel widths are available to a given sample of bacteria.

      While the data analysis is reasonably convincing, I think that the authors could make much better use of what must be voluminous data on the trajectories of cells by formulating the mathematical problem in terms of a suitable Fokker-Planck equation for the probability distribution of swimming directions. In particular, I would like to see much more analysis of how incipient circular trajectories are interrupted by collisions with the walls and how this relates to enhanced chemotaxis. In essence, there needs to be a much clearer control analysis of trajectories without sidewalls to understand the mechanism in their presence.

      We thank the reviewer for this insightful suggestion. We agree that understanding how circular trajectories are interrupted by wall collisions is central to explaining the enhanced chemotaxis. While we did not explicitly formulate a Fokker-Planck equation, we have addressed the reviewer's core point by employing two complementary mathematical approaches that model the probability distribution of swimming directions and wall interactions:

      (1) Stochastic simulations (Langevin approach): As detailed in the "Simulation of E. coli chemotaxis within lane confinements" subsection of “Results” and Figure 5, we modeled cells as self-propelled particles performing random walks. This model explicitly accounts for the "interruption" of circular trajectories by incorporating a constant angular velocity (circular swimming) and an alignment effect upon collision with sidewalls. These simulations successfully reproduced the experimental trends, confirming that the interplay between circular radius and lane width determines the optimal drift velocity.

      (2) Geometric probability analysis: To provide the "intuitive understanding", we included a specific Geometrical Analysis section (the last subsection of “Results”) and Figure 6. This analysis mathematically formulates the problem by calculating the exact proportion of swimming angles that allow a cell to transition from a circular trajectory in the bulk to an up-gradient trajectory along the Right Sidewall (RSW). By integrating over the possible swimming directions, we derived the probability of wall interception as a function of lane width (w) and swimming radius (r). This analysis reveals that the interruption of circular paths is most favorable for chemotaxis when w ≈ (0.7-0.8)×r.

      (3) Control analysis: regarding the "control analysis of trajectories without sidewalls," we utilized the cells in the Middle Area (MA) of the wide lanes as an internal control. As shown in Fig. 2B and 4A, these cells exhibit typical surface-associated circular swimming (Fig. 3B) but generate zero net drift. This serves as the baseline "no sidewall" condition, demonstrating that the chemotactic enhancement is strictly driven by the rectification of circular swimming into wall-aligned motion at the boundaries.

      The authors argue that these findings may have relevance to a number of physiological and ecological contexts. Yet, each of these would be characterized by significant heterogeneity in pore sizes and geometries, and thus it is very unclear whether or how the findings in this work would carry over to those situations.

      We thank the reviewer for this important observation regarding environmental heterogeneity. We agree that we should be cautious about directly extrapolating to complex ecological contexts without qualification. We have revised the last sentence of the abstract to adopt a more measured tone: "Our results may offer insights into bacterial navigation in complex biological environments such as host tissues and biofilms, providing a preliminary step toward exploring microbial ecology in confined habitats and potential strategies for controlling bacterial infections."

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Key elements of the mechanism of wall-directed chemotaxis are not sufficiently emphasized:

      For instance, the chirality of the trajectories is an essential part of the analysis but is mentioned only briefly in the introduction. In the geometrical analysis, I understand that one of the critical parameters is the angle at which bacteria "collide" with the walls. But, again, this remains largely implicit in the discussion. This comes to the point that these ideas are not even mentioned in the abstract which doesn't provide any hint of a mechanism. An analysis of the actual trajectories of the cells after they hit the walls, as a function of their initial angle would be helpful in comparison with the simulations and the geometrical analysis.

      We appreciate the reviewer's insightful comment regarding the need to better emphasize the mechanism of wall-directed chemotaxis. We agree that the chirality of trajectories and the geometry of wall collisions are central to our analysis and were previously under-emphasized.

      To address this, we have made the following revisions:

      (1) We have revised the Abstract (lines 25-27) and the Discussion (lines 391-393) to explicitly highlight the crucial role of chiral circular motion and the alignment effect following sidewall collisions.

      (2) We further analyzed bacterial trajectories at different collision angles. Typical examples are shown in Supplementary Fig. S2. We observed that cells tend to align with and swim along the sidewalls regardless of their initial collision angles. This finding is now described in the main text at lines 168-171.

      The motion of the bacteria is modelled as run-and-tumble at several places in the manuscript, and in particular in the simulations. Yet, the trajectories of the bacteria seem to be smooth in this almost 2D geometry, except of course when they directly interact with the walls (I hardly see tumbles in the MA region in Figure 1B). Can the authors elaborate on the assumptions made in the numerical simulations? In particular, how is the radius of the trajectories included in these equations of motion (line 514)?

      We apologize for the lack of clarity regarding the bacterial motion model. It has been established that while bacteria do tumble near solid surfaces, they exhibit a smaller reorientation angle compared to bulk fluids; in fact, the most probable reorientation angle on a surface is zero (Ref. 41). Consequently, tumbles are often difficult to distinguish from runs with the naked eye. Additionally, the trajectories in Figure 1B are plotted on a 44 µm × 150 µm canvas with unequal coordinate scales, which may further obscure the visual distinctness of tumbling events.

      Regarding the equations of motion: We modeled the bacteria as self-propelled particles governed by the internal chemotaxis pathway, alternating between run and tumble states. As noted in the equations on lines 286 & 578, we incorporated the circular motion by introducing a constant angular velocity, −ν<sub>0</sub>/r, during the run state. Here, ν<sub>0</sub> represents the swimming speed, r denotes the radius of circular swimming, and the negative sign indicates clockwise chirality. Furthermore, to model the hydrodynamic interaction with the boundaries, we assumed that when a cell collides with a sidewall, its velocity vector instantly aligns parallel to that wall.
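      The run-state update described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it covers only the constant angular velocity and the instantaneous wall alignment, and omits tumbling, rotational noise, and the chemotaxis pathway. The function name `step`, the explicit Euler scheme, and all numerical values (speed, radius, lane width, time step) are assumptions chosen for illustration.

```python
import math

def step(x, y, theta, v0, r, w, dt):
    """Advance a point swimmer by one run-state step in a lane 0 <= y <= w.

    Hypothetical helper: x is the along-lane coordinate, y the across-lane
    coordinate, theta the heading angle in radians.
    """
    # Constant angular velocity -v0/r gives clockwise circles of radius r.
    theta += (-v0 / r) * dt
    x += v0 * math.cos(theta) * dt
    y += v0 * math.sin(theta) * dt
    # Sidewall collision: clamp to the wall and align the heading parallel
    # to it, keeping the direction of travel along the lane.
    if y <= 0.0 or y >= w:
        y = min(max(y, 0.0), w)
        theta = 0.0 if math.cos(theta) >= 0.0 else math.pi
    return x, y, theta

# Illustrative values only (not taken from the manuscript).
v0, r, w, dt = 20.0, 10.0, 8.0, 0.01  # um/s, um, um, s
x, y, theta = 0.0, 0.0, 0.0           # start on the y = 0 wall, heading +x
for _ in range(100):
    x, y, theta = step(x, y, theta, v0, r, w, dt)
print(x, y, theta)
```

      In this sketch a clockwise swimmer heading in +x at the y = 0 wall is turned back into the wall at every step and therefore slides along it, whereas the same heading at the y = w wall curves away into the lane. This asymmetry is the rectification that the alignment rule is meant to capture.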

      The comparison of Figure 5B (simulations) with Figure 4B (experiments) does not strike me as so "similar". Why are the points at small widths so noisy (Figure 5AB)? Figure 5C is cut at these widths, it should be plotted over the entire scale.

      We acknowledge that the agreement between simulation and experiment is less robust in the narrowest channels. The discrepancy and "noise" at small widths in Figure 5 arise from the limitations of the self-propelled particle model in highly confined geometries. Specifically, our simulation treats bacteria as point particles and does not explicitly calculate the physical exclusion (steric effects) caused by the finite size of the flagella and cell body.

      In the experimental setup, steric constraints within narrow channels (comparable to the cell size) restrict the cells' ability to turn freely, effectively stabilizing their motion. However, because our model allows particles to reorient more freely than actual cells would in such confined spaces, it produces fluctuations and an overestimation of the drift velocity at small widths. If these confinement effects were fully incorporated, the cell density mismatch between the left and right sidewalls would be reduced, leading to lower drift velocities that match the experimental data more closely.

      Regarding Figure 5C: Since the "active particle" assumption loses physical validity in channels narrower than the scale of the bacterium, the simulation results in this regime are not representative of biological reality. Plotting these non-physical points would distort the analysis. Therefore, we have maintained the truncation of Figure 5C at 4 µm to ensure the data presented is physically meaningful. We have added a clear discussion of these model limitations to the manuscript at lines 310-314.

      These important precisions should be added to the text or in a supplementary section. A validated mechanism describing in detail the impact of the walls on the cell trajectories would greatly improve the conclusions.

      We thank the reviewer for the suggestions. As noted in the responses above, we have incorporated the details concerning the simulation assumptions and the model limitations at narrow widths into the revised manuscript. We have also performed further analysis of the collision trajectories between bacteria and the sidewalls. As illustrated in the new Fig. S2, the data confirm that cells tend to align with and swim along the sidewalls following a collision, regardless of the initial impact angle.

      Reviewer #2 (Recommendations for the authors):

      Minor points

      (1) Related to swimming in 3D: The authors should specify the depth of field of the objective in their setup.

      We thank the reviewer for pointing this out. We have calculated the depth of field (DOF) of our objective to be approximately 3.7 µm. This estimate is based on the standard formula:

      where λ = 610 nm (emission wavelength), n = 1.0 (refractive index), NA = 0.45 (numerical aperture), M = 20 (magnification), and e = 6.5 µm (camera resolution). We have added this specification to the "Microscopy and Data Acquisition" section of “Materials and Methods”.
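      As a check on the quoted value, the short calculation below assumes that the "standard formula" is the widely used sum of a diffraction term and a detector term, d = (wavelength × n)/NA² + (n × e)/(M × NA). That identification is an assumption; with the parameters listed above it gives about 3.7 µm. The function name is hypothetical.

```python
def depth_of_field_um(wavelength_um, n, na, magnification, detector_um):
    """Depth of field in micrometres (assumed standard two-term formula)."""
    diffraction_term = wavelength_um * n / na**2
    detector_term = n * detector_um / (magnification * na)
    return diffraction_term + detector_term

# Parameters as stated in the response (610 nm = 0.610 um).
dof = depth_of_field_um(0.610, 1.0, 0.45, 20, 6.5)
print(f"{dof:.2f} um")
```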

      (2) Related to the interpretation of the width effect: We think plotting the cell enrichment, ie the probabilities P in Figure 4B normalized to the expected value if cells were homogeneously distributed ((3µm)/w for the side walls, (w - 6µm)/w for the middle) would help understand the strength of the wall 'siphoning' effect.

      We thank the reviewer for the suggestion. We have calculated the cell enrichment by normalizing the observed probabilities against the expected values for a homogeneous distribution, as suggested. The resulting relationship between cell enrichment and lane width is presented in Figure S4.
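      The normalization suggested by the reviewer can be sketched as below, assuming 3-µm-wide sidewall regions as in the reviewer's expression. The function names and the observed probability in the example are hypothetical and are not data from the study.

```python
def expected_fraction(region, w_um, wall_um=3.0):
    """Fraction of cells expected in a region if cells were spread uniformly."""
    if region in ("LSW", "RSW"):
        return wall_um / w_um
    if region == "MA":
        return (w_um - 2.0 * wall_um) / w_um
    raise ValueError(f"unknown region: {region}")

def enrichment(p_observed, region, w_um, wall_um=3.0):
    """Observed probability divided by the uniform-distribution expectation."""
    return p_observed / expected_fraction(region, w_um, wall_um)

# Hypothetical example: 30% of cells found at the right sidewall of a 44-um lane.
print(enrichment(0.30, "RSW", 44.0))
```

      An enrichment above 1 indicates accumulation relative to a homogeneous distribution, and a value below 1 indicates depletion.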

      Related to simulations:

      (1) Showing vd for the 3 regions in Figure S5 would be helpful also to understand the underlying mechanism.

      We thank the reviewer for the suggestion. The V<sub>d</sub> values for the three regions are shown in Fig. S5.

      (2) Figure 5B vs 4B: There is a mismatch in the right vs left side density at w=6µm in the simulations that is not here in the experiments. What could explain this difference?

      We appreciate the reviewer pointing this out. The mismatch in the simulations is due to the simplified treatment of cells as self-propelled particles, which overlooks the physical volume of the cell body and flagella. In narrow channels (w = 6 µm), these physical constraints would restrict the cells' ability to change direction freely, a factor not fully captured in the simulation. Accounting for these steric effects would trap cells more effectively against the walls, reducing the density asymmetry between the LSW and RSW and lowering the drift velocity. This would bring the simulation results closer to the experimental observations. We have added a discussion of these limitations and effects to the revised manuscript (lines 310-314).

      (3) The simulations essentially assume that the density of motile cells is homogeneous and equal at both x=0 and x=L open ends of the channel. Is it the case in the experiments, even with the gradient, and the walls creating some cell transport?

      We thank the reviewer for pointing this out. The simulation assumption is consistent with our experimental observations. Our data were recorded within 160-μm-long lanes located in the center of the wider (400 μm) cell channel. In this central region, the cells maintain a continuous flux. Furthermore, experiments were performed within 8 min of flow, limiting the time for significant cell density gradients to establish. As illustrated in Author response image 1, the inhomogeneity in the measured cell density distribution is insignificant across the length of the observation window, indicating that the walls and gradient do not create significant heterogeneity at the boundaries of the region of interest.

      Author response image 1.

      The cell density distribution along the gradient field from the data of the 44-μm-wide lane.

      (4) Line 506: There is something strange with the definition of the bias. B cannot be the tumbling bias if k=B/0.31 s<sup>-1</sup> and the tumble-to-run rate is 5/s, because then the tumbling bias is B/0.31 / (B/0.31 + 5). Please clarify.

      We apologize for the confusion caused by the notation. In our model, B represents the CW bias of the individual flagellar motor, not the macroscopic tumbling bias of the cell. We assume the run-to-tumble rate is equivalent to the motor CCW-to-CW switching rate (k). Previous studies have shown that this rate increases linearly with the motor CW bias according to k=B/t, where t is a characteristic time (Ref. 50).

      Based on experimental data for wildtype cells, the average run time in the near-surface region is ~2.0 s (corresponding to a run-to-tumble rate of ~0.5 s<sup>-1</sup>) (Ref. 11), and the steady-state wildtype CW bias is ~0.15. Using these values, we determined t ~ 0.31 s. Consequently, the switching rate is defined as k=B/0.31 s<sup>-1</sup>. Since the tumble duration is constant (0.2 s) (Ref. 51), the tumble-to-run rate is fixed at 5 s<sup>-1</sup>. We have clarified these definitions and parameter values in lines 569-573.
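      The distinction between the motor CW bias B and the cell-level tumbling bias can be made concrete with the parameter values quoted above. The sketch below uses k = B/0.31 s⁻¹ for the run-to-tumble rate, a fixed tumble-to-run rate of 5 s⁻¹, and the reviewer's expression k/(k + 5) for the tumbling bias. The function and constant names are hypothetical.

```python
TAU_S = 0.31              # characteristic time (s), as quoted in the response
TUMBLE_TO_RUN_RATE = 5.0  # s^-1, i.e. 1 / (0.2 s tumble duration)

def run_to_tumble_rate(cw_bias):
    """Run-to-tumble rate k = B / tau (s^-1) for motor CW bias B."""
    return cw_bias / TAU_S

def tumbling_bias(cw_bias):
    """Fraction of time spent tumbling in a two-state run/tumble process."""
    k = run_to_tumble_rate(cw_bias)
    return k / (k + TUMBLE_TO_RUN_RATE)

k_wt = run_to_tumble_rate(0.15)  # wildtype steady-state CW bias
print(f"k = {k_wt:.3f} s^-1, mean run = {1.0 / k_wt:.2f} s, "
      f"tumbling bias = {tumbling_bias(0.15):.3f}")
```

      With B = 0.15 the run-to-tumble rate is close to the quoted ~0.5 s⁻¹, while the resulting tumbling bias is roughly 0.09, which shows that the two quantities are not interchangeable.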

      Other minor comments:

      (1) Line 20 and lines 34-35: We think that the connection to infection is questionable here and should be toned down.

      Thank you for the suggestion. We have revised Line 20 to read: “Understanding bacterial behavior in confined environments is helpful to elucidating microbial ecology and developing strategies to manage bacterial infections.” Additionally, we modified lines 34-35 to state: “Our results may offer insights into bacterial navigation in complex biological environments such as host tissues and biofilms, providing a preliminary step toward exploring microbial ecology in confined habitats and potential strategies for controlling bacterial infections.”

      (2) Line 49: Consider highlighting the change in the sense of rotation at the air-liquid interface.

      Thank you for the suggestion. We have now highlighted the difference in chirality between trajectories at the air-liquid interface and those at the liquid-solid interface. The text has been updated to read: “For example, E. coli swim clockwise when observed from above a solid surface, whereas Caulobacter crescentus move in tight, counter-clockwise circles when viewed from the liquid side.”

      (3) Lines 58-59: The sentence should be better formulated, explaining what is CheY-P and that its concentration changes because of a change in phosphorylation (P).

      Thank you for the suggestion. We have reformulated this section to explicitly define CheY-P and explain how its concentration is regulated through phosphorylation. The revised text reads: “The transmembrane chemoreceptors detect attractants or repellents and transmit signals into the cell by modulating the autophosphorylation of the histidine kinase CheA. Attractant binding suppresses CheA autophosphorylation, while repellent binding promotes it. This modulation alters the concentration of the phosphorylated response regulator protein, CheY-P.”

      (4) Lines 63-64: CheR CheB do a bit more than "facilitating" adaptation, they mediate it. The notation CheB(p) may be confusing, since "-P" was used above for CheY.

      Thank you for pointing this out. We have corrected the notation and strengthened the description of the enzymes' roles. The revised text is: “The adaptation enzymes CheR and CheB methylate and demethylate the receptors, respectively, mediating sensory adaptation.”

      (5) Line 130: there must be a typo in the formula.

      We have replaced the ambiguous lag time variable in Fig. 1C with _n_Δt to ensure mathematical consistency.

      (6) Additionally, \Delta t is both the time between the frame here and the lag time in Figure 1.

      Thank you for highlighting this ambiguity. We have updated the notation to distinguish these two values. The lag time in Figure 1 is now explicitly denoted as _n_Δt, while Δt remains the time interval between individual frames.

      (7) Line 162: "Consistent with previous reports," a reference to said reports is missing.

      Thank you for pointing this out. We have now added the reference (Ref. 41) to support this statement.

      (8) Figure 1B: Are these tracks in the presence of a gradient? Same as used in panel C? This needs to be explained.

      Thank you for this question. We confirm that the tracks shown in Figure 1B were indeed recorded in the presence of a gradient and represent a subset of the data used in Figure 1C. We have clarified this in the figure legend as follows: "Thirty bacterial trajectories selected from the data of the 44-μm-wide lane in gradient assays. These represent a subset of the trajectories analyzed in panel C."

      (9) Simulations: the equation for x(t) should also be given for completeness.

      Thank you for the suggestion. For completeness, we have added the position updating equations for the run state to the Materials and Methods section (lines 579-580). The equations are defined as:

      (10) Figure S2: For the swimming directions that are more unstable due to the surface friction torque, RSW-DG, and LSW-UG, one would have expected that the Up-gradient motion is more persistent than the down gradient one. It seems to be the opposite. Is it significant, and what could be the reason for this?

      We apologize for the lack of clarity in our original explanation. While we would generally expect up-gradient motion to be more persistent than down-gradient motion in bulk fluid, our measurements near the surface show a different trend due to the specific contributions of run and tumble states to the escape rate. Cells swimming up-gradient (UG) in the LSW experience higher probability of running. Consequently, they are subjected to the destabilizing surface friction torque for a greater proportion of time compared to cells swimming down-gradient (DG) in the RSW. This can be explained mathematically. The escape rates for RSW-DG and LSW-UG can be expressed as:

      Here, B<sup>+</sup> and B<sup>−</sup> represent the tumble bias (probability of tumbling) when swimming up-gradient and down-gradient, respectively, and k<sub>T</sub> and k<sub>R</sub> denote the escape rates during a tumble and a run, respectively. Due to the chemotactic response, 0 ≤ B<sup>+</sup> < B<sup>−</sup> ≤ 1. Crucially, our system is characterized by k<sub>R</sub> > k<sub>T</sub> (the escape rate is higher during a run than during a tumble). Therefore, the lower tumble bias during up-gradient swimming (B<sup>+</sup> < B<sup>−</sup>) increases the weight of the run-state escape term ((1−B<sup>+</sup>)k<sub>R</sub>), leading to a higher overall escape rate for LSW-UG compared to RSW-DG. We have added an intuitive understanding of k<sub>R</sub> > k<sub>T</sub> in the Supplemental text.
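      The argument can be illustrated numerically. The sketch assumes, from the terms named above, that each escape rate is the bias-weighted average B·k<sub>T</sub> + (1−B)·k<sub>R</sub>, with B = B<sup>+</sup> for LSW-UG and B = B<sup>−</sup> for RSW-DG. The numerical values are hypothetical and chosen only to satisfy B<sup>+</sup> < B<sup>−</sup> and k<sub>R</sub> > k<sub>T</sub>.

```python
def escape_rate(tumble_bias, k_tumble, k_run):
    """Bias-weighted escape rate: B * k_T + (1 - B) * k_R (assumed form)."""
    return tumble_bias * k_tumble + (1.0 - tumble_bias) * k_run

# Hypothetical values, not measurements from the study.
b_up, b_down = 0.05, 0.20   # tumble bias up-gradient / down-gradient
k_t, k_r = 0.1, 0.5         # escape rates during tumble / run (s^-1)

lsw_ug = escape_rate(b_up, k_t, k_r)    # up-gradient along the left sidewall
rsw_dg = escape_rate(b_down, k_t, k_r)  # down-gradient along the right sidewall
print(lsw_ug, rsw_dg, lsw_ug - rsw_dg)
```

      Under this form the difference between the two rates equals (B<sup>−</sup> − B<sup>+</sup>)(k<sub>R</sub> − k<sub>T</sub>), which is positive whenever both inequalities hold, consistent with the higher escape rate stated for LSW-UG.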

    1. Reviewer #3 (Public review):

      Summary:

      In this study, the authors investigate how the structural state of the microtubule lattice influences the accessibility of the α-tubulin C-terminal tail (CTT). By developing and applying new biosensors, they reveal that the tyrosinated CTT is largely inaccessible under normal conditions but becomes more accessible upon changes to the tubulin conformational state induced by taxol treatment, MAP expression, or GTP-hydrolysis-deficient tubulin. The combination of live imaging, biochemical assays, and simulations suggests that the lattice conformation regulates the exposure of the CTT, providing a potential mechanism for modulating interactions with microtubule-associated proteins. The work addresses a highly topical question in the microtubule field and proposes a new conceptual link between lattice spacing and tail accessibility for tubulin post-translational modification. Future work is required to determine whether CTT exposure in the microtubule lattice is sensitive to additional factors present in vivo but not in vitro.

      Strengths:

      (1) The study targets a highly relevant and emerging topic: the structural plasticity of the microtubule lattice and its regulatory implications.

      (2) The biosensor design represents a methodological advance, enabling direct visualization of CTT accessibility in living cells.

      (3) Integration of imaging, biochemical assays, and simulations provides a multi-scale perspective on lattice regulation.

      (4) The proposed conceptual framework, in which lattice conformation is a determinant of post-translational modification accessibility, is novel and potentially impactful for understanding microtubule regulation.

      [Editors' note: the authors have responded to the reviewers and this version was assessed by the editors.]

    2. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This is a careful and comprehensive study demonstrating that effector-dependent conformational switching of the MT lattice from compacted to expanded deploys the alpha tubulin C-terminal tails so as to enhance their ability to bind interactors.

      Strengths:

      The authors use 3 different sensors for the exposure of the alpha CTTs. They show that all 3 sensors report exposure of the alpha CTTs when the lattice is expanded by GMPCPP, or KIF1C, or a hydrolysis-deficient tubulin. They demonstrate that expansion-dependent exposure of the alpha CTTs works in tissue culture cells as well as in vitro.

      Weaknesses:

      There is no information on the status of the beta tubulin CTTs. The study is done with mixed isotype microtubules, both in cells and in vitro. It remains unclear whether all the alpha tubulins in a mixed isotype microtubule lattice behave equivalently, or whether the effect is tubulin isotype-dependent. It remains unclear whether local binding of effectors can locally expand the lattice and locally expose the alpha CTTs.

      Appraisal:

      The authors have gone to considerable lengths to test their hypothesis that microtubule expansion favours deployment of the alpha tubulin C-terminal tail, allowing its interactors, including detyrosinase enzymes, to bind. There is a real prospect that this will change thinking in the field. One very interesting possibility, touched on by the authors, is that the requirement for MAP7 to engage kinesin with the MT might include a direct effect of MAP7 on lattice expansion.

      Impact:

      The possibility that the interactions of MAPs and motors with a particular MT or region feed forward to determine its future interaction patterns is made much more real. Genuinely exciting.

      We thank the reviewer for their positive response to our work. We agree that it will be important to determine if the βCTT is subject to regulation similar to the αCTT. However, this will first require the development of sensors that report on the accessibility of the βCTT, which is a significant undertaking for future work.

      We also agree that it will be important to examine whether all tubulin isotypes behave equivalently in terms of exposure of the αCTT in response to conformational switching of the microtubule lattice.

      We thank the reviewer for the comment about local expansion of the microtubule lattice. We believe that Figure 3 does show that local binding of effectors can locally expand the lattice and locally expose the alpha-CTTs. We have added text to clarify this.

      Reviewer #2 (Public review):

      The unstructured α- and β-tubulin C-terminal tails (CTTs), which differ between tubulin isoforms, extend from the surface of the microtubule, are post-translationally modified, and help regulate the function of MAPs and motors. Their dynamics and extent of interactions with the microtubule lattice are not well understood. Hotta et al. explore this using a set of three distinct probes that bind to the CTTs of tyrosinated (native) α-tubulin. Under normal cellular conditions, these probes associate with microtubules only to a limited extent, but this binding can be enhanced by various manipulations thought to alter the tubulin lattice conformation (expanded or compact). These include small-molecule treatment (Taxol), changes in nucleotide state, and the binding of microtubule-associated proteins and motors. Overall, the authors conclude that microtubule lattice "expanders" promote probe binding, suggesting that the CTT is generally more accessible under these conditions. Consistent with this, detyrosination is enhanced. Mechanistically, molecular dynamics simulations indicate that the CTT may interact with the microtubule lattice at several sites, and that these interactions are affected by the tubulin nucleotide state.

      Strengths:

      Key strengths of the work include the use of three distinct probes that yield broadly consistent findings, and a wide variety of experimental manipulations (drugs, motors, MAPs) that collectively support the authors' conclusions, alongside a careful quantitative approach.

      Weaknesses:

      The challenges of studying the dynamics of a short, intrinsically disordered protein region within the complex environment of the cellular microtubule lattice, amid numerous other binders and regulators, should not be understated. While it is very plausible that the probes report on CTT accessibility as proposed, the possibility of confounding factors (e.g., effects on MAP or motor binding) cannot be ruled out. Sensitivity to the expression level clearly introduces additional complications. Likewise, for each individual "expander" or "compactor" manipulation, one must consider indirect consequences (e.g., masking of binding sites) in addition to direct effects on the lattice; however, this risk is mitigated by the collective observations all pointing in the same direction.

      The discussion does a good job of placing the findings in context and acknowledging relevant caveats and limitations. Overall, this study introduces an interesting and provocative concept, well supported by experimental data, and provides a strong foundation for future work. This will be a valuable contribution to the field.

      We thank the reviewer for their positive response to our work. We are encouraged that the reviewer feels that the Discussion section does a good job of putting the findings, challenges, and possibility of confounding factors and indirect effects in context. 

      Reviewer #3 (Public review):

      Summary:

      In this study, the authors investigate how the structural state of the microtubule lattice influences the accessibility of the α-tubulin C-terminal tail (CTT). By developing and applying new biosensors, they reveal that the tyrosinated CTT is largely inaccessible under normal conditions but becomes more accessible upon changes to the tubulin conformational state induced by taxol treatment, MAP expression, or GTP-hydrolysis-deficient tubulin. The combination of live imaging, biochemical assays, and simulations suggests that the lattice conformation regulates the exposure of the CTT, providing a potential mechanism for modulating interactions with microtubule-associated proteins. The work addresses a highly topical question in the microtubule field and proposes a new conceptual link between lattice spacing and tail accessibility for tubulin post-translational modification.

      Strengths:

      (1) The study targets a highly relevant and emerging topic: the structural plasticity of the microtubule lattice and its regulatory implications.

      (2) The biosensor design represents a methodological advance, enabling direct visualization of CTT accessibility in living cells.

      (3) Integration of imaging, biochemical assays, and simulations provides a multi-scale perspective on lattice regulation.

      (4) The proposed conceptual framework, in which lattice conformation is a determinant of post-translational modification accessibility, is novel and potentially impactful for understanding microtubule regulation.

      Weaknesses:

      There are a number of weaknesses in the paper, many of which can be addressed textually. Some of the supporting evidence is preliminary and would benefit from additional experimental validation and clearer presentation before the conclusions can be considered fully supported. In particular, the authors should directly test in vitro whether Taxol addition can induce lattice exchange (see comments below).

      We thank the reviewer for their positive response to our work. We have altered the text and provided additional experimental validation as requested (see below).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) The resolution of the figures is insufficient.

      (2) The provision of scale bars is inconsistent and insufficient.

      (3) Figure 1E, the scale bar looks like an MT.

      (4) Figure 2C, what does the grey bar indicate?

      (5) Figure 2E, missing scale bar.

      (6) Figure 3 C, D, significance brackets misaligned.

      (7) Figure 3E, consider using the same alpha-beta tubulin / MT graphic as in Figure 1B.

      (8) Figure 5E, show cell boundaries for consistency?

      (9) Figure 6D, stray box above the y-axis.

      (11) Figure S3A, scale bar wrong unit again.

      (12) S3B "fixed" and mount missing scale bar in the inset.

      (13) S4 scale bars without scale, inconsistency in scale bars throughout all the figures.

      We apologize for issues with the figures. We have corrected all of the issues indicated by the reviewer.

      (10) Figure 6F, surprising that 300 mM KCl washes out rigor binding kinesin.

      We thank the reviewer for this important point. To address the reviewer’s concern, we have added a new supplementary figure (new Figure 6 – Figure Supplement 1) which shows that the washing step removes strongly-bound (apo) KIF5C(1-560)-Halo<sup>554</sup> protein from the microtubules. In addition, we have made a correction to the Materials and Methods section noting that ATP was added in addition to the KCl in the wash buffer. We apologize for omitting this detail in the original submission. We also added text noting that the wash out step was based on Shima et al., 2018 where the observation chamber was washed with either 1 mM ATP and 300 mM K-Pipes or with 10 mM ATP and 500 mM K-Pipes buffer. In our case, the chamber was washed with 3 mM ATP and 300 mM KCl. It is likely that the addition of ATP facilitates the detachment of strongly-bound KIF5C.

      (14) Supplementary movie, please identify alpha and beta tubulins for clarity. Please identify residues lighting up in interaction sites 1, 2 & 3.

      Thank you for the suggestions. We have made the requested changes to the movie.

      Reviewer #2 (Recommendations for the authors):

      There appear to have been some minor issues (perhaps with .pdf conversion) that leave some text and images pixelated in the .pdf provided, alongside some slightly jarring text and image positioning (e.g., Figure 5E panels). The authors should carefully look at the figures to ensure that they are presented in the clearest way possible.

      We apologize for these issues with the figures. We have reviewed the figures carefully to ensure that they are presented in the clearest way possible.

      The authors might consider providing a more definitive structural description of compact vs expanded lattice, highlighting what specific parameters are generally thought to change and by what magnitude. Do these differ between taxol-mediated expansion or the effects of MAPs?

      Thank you for the suggestion. We have added additional information to the Introduction section.

      Reviewer #3 (Recommendations for the authors):

      (1) Figure 1 should include a schematic overview of all constructs used in the study. A clear illustration showing the probe design, including the origin and function of each component (e.g., tags, domains), would improve clarity.

      Thank you for the suggestion. We have added new illustrations to Figure 1 showing the origin and design (including domains and tags) of each probe.

      (2) Add Western blot data for the 4×CAP-Gly construct to Figure 1C for completeness.

      We thank the reviewer for this suggestion. We carried out a far-western blot using the purified 4xCAPGly-mEGFP protein to probe GST-Y, GST-ΔY, and GST-ΔC2 proteins (new Figure 1 – Figure Supplement 1C). We note that some bleed-through signal can be seen in the lanes containing GST-ΔY and GST-ΔC2 protein due to the imaging requirements and exposure needed to visualize the 4xCAPGly-mEGFP protein. Nevertheless, the blot shows that the purified CAPGly sensor specifically recognizes the native (tyrosinated) CTT sequence of TUBA1A.

      (3) Essential background information on the CAP-Gly domain, SXIP motif, and EB proteins is missing from the Introduction. These concepts appear abruptly in the Results and should be properly introduced.

      Thank you for the suggestion. We have added additional information to the Introduction section about the CAP-Gly domain. However, we feel that introducing the SXIP motif and EB proteins at this point would detract from the flow of the Introduction and we have elected to retain this information in the Results section when we detail development of the 4xCAPGly probe.

      (4) In Figure 2E, it remains possible that the CAP-Gly domain displacement simply follows the displacement of EB proteins. An experiment comparing EB protein localization upon Taxol treatment would clarify this relationship.

      We thank the reviewer for raising this important point. To address the reviewer’s concern, we utilized HeLa cells stably expressing EB3-GFP. We performed live-cell imaging before and after Taxol addition (new Figure 2 – Figure Supplement 1C). EB3-EGFP was lost from the microtubule plus ends within minutes and did not localize to the now-expanded lattice.

      (5) Statements such as "significantly increased" (e.g., line 195) should be replaced with quantitative information (e.g., "1.5-fold increase").

      We have made the suggested changes to the text.

      (6) Phrases like "became accessible" should be revised to "became more accessible," as the observed changes are relative, not absolute. The current wording implies a binary shift, whereas the data show a modest (~1.5-fold) increase.

      We have made the suggested changes to the text.

      (7) Similarly, at line 209, the terms "minimally accessible" versus "accessible" should be rephrased to reflect the small relative change observed; saturation of accessibility is not demonstrated.

      We have made the suggested changes to the text.

      (8) Statements that MAP7 "expands the lattice" (line 222) should be made cautiously; to my knowledge, that has not been clearly established in the literature.

      We thank the reviewer for this important comment. We have added text indicating that MAP7’s ability to induce or preserve an expanded lattice has not been clearly established.

      (9) In Figures 3 and 4, the overexpression of MAP7 results in a strikingly peripheral microtubule network. Why is there this unusual morphology?

      The reviewer raises an interesting question. We are not sure why the overexpression of MAP7 results in a strikingly peripheral microtubule network, but we suspect this is unique to the HeLa cells we are using. We have observed a more uniform MAP7 localization in other cell types [e.g. COS-7 cells (Tymanskyj et al. 2018)], consistent with the literature [e.g. BEAS-2B cells (Shen and Ori-McKenney 2024), HeLa cells (Hooikaas et al. 2019)].

      (10) In Supplementary Figure 5C, the Western blot of detyrosination levels is inconsistent with the text. Untreated cells appear to have higher detyrosination than both wild-type and E254A-overexpressing cells. Do you have any explanation?

      We thank the reviewer for this important comment. We do not have an explanation at this point but plan to revisit this experiment. Unfortunately, the authors who carried out this work recently moved to a new institution and it will be several months before they are able to get the cell lines going and repeat the experiment. We thus elected to remove what was Supp Fig 5C until we can revisit the results. We believe that the important results are in what is now Figure 5 - Figure Supplement 1A,B, which shows that the expression levels of the WT and E254A proteins are similar to each other.

      (11) The image analysis method in Figures 5B and 5D requires clarification. It appears that "density" was calculated from skeletonized probe length over total area, potentially using a strict intensity threshold. It looks like low-intensity binding has been excluded; otherwise, the density would be the same from the images. If so, this should be stated explicitly. A more appropriate analysis might skeletonize and integrate total fluorescence intensity relative to the overall microtubule network.

      We have added additional information to the Materials and Methods section to clarify the image analysis. We appreciate the reviewer’s valuable feedback and the suggestion to use the integrated total fluorescence intensity, which is a theoretically sound approach. While we agree that integrated intensity is a valid metric for specific applications, its appropriate use depends on two main preconditions:

      (1) Consistent microscopy image acquisition conditions.

      (2) Consistent probe expression levels across all cells and experiments.

      We successfully maintained consistent image acquisition conditions (e.g., exposure time) throughout the experiment. However, despite generating stably expressing sensor cell lines to minimize variation, there remains an inherent biological variability in probe expression levels between individual cells. Integrated intensity is highly susceptible to this cell-to-cell variability. Relying on it would lead to a systematic error in which differences in the total amount of expressed probe would be mistaken for differences in Y-aCTT accessibility.

      The density metric (skeletonized probe length / total cell area) was deliberately chosen as it serves as a geometric measure rather than an intensity-based normalization. The density metric quantifies the proportion of the microtubule network that is occupied by Y-aCTT-labeled structures, independent of fluorescence intensity. Thus, the density metric provides a more robust and interpretable measure of Y-aCTT accessibility under the variable expression conditions inherent to our experimental system. Therefore, we believe that this geometric approach represents the most appropriate analysis for our image dataset.
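
      The contrast between the two metrics can be illustrated with a minimal sketch. It is not the analysis pipeline used in the study: the function names and the toy 4×4 masks are hypothetical, skeleton length is approximated by a pixel count, and the skeleton is assumed to have been segmented identically in both cells, which in practice depends on the thresholding step.

```python
def probe_density(skeleton, cell_mask):
    """Skeleton pixels inside the cell divided by the number of cell pixels."""
    skeleton_px = sum(
        1
        for skel_row, mask_row in zip(skeleton, cell_mask)
        for s, m in zip(skel_row, mask_row)
        if s and m
    )
    cell_px = sum(1 for row in cell_mask for m in row if m)
    return skeleton_px / cell_px


def integrated_intensity(image, cell_mask):
    """Sum of pixel intensities inside the cell."""
    return sum(
        v
        for img_row, mask_row in zip(image, cell_mask)
        for v, m in zip(img_row, mask_row)
        if m
    )


# Hypothetical toy data: one labeled filament crossing a 4x4 cell.
cell_mask = [[1, 1, 1, 1] for _ in range(4)]
skeleton = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# Same geometry, but the second cell expresses three times more probe.
dim_cell = [[10 * s for s in row] for row in skeleton]
bright_cell = [[3 * v for v in row] for row in dim_cell]

print(probe_density(skeleton, cell_mask))
print(integrated_intensity(dim_cell, cell_mask))
print(integrated_intensity(bright_cell, cell_mask))
```

      In this toy case the density is the same for both cells, whereas the integrated intensity scales with the expression level.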

      (12) In Figure 5D, the fold-change data are difficult to interpret due to the compressed scale. Replotting is recommended. The text should also discuss the relative fold changes between E254A and Taxol conditions, Figure 2H.

      We appreciate the reviewer's insightful comment. We agree that the presence of significant outliers led to a compressed Y-axis scale in Figure 5D, obscuring the clear difference between the WT-tubulin and E254A-tubulin groups. As suggested, we have replotted Figure 5D using a broken Y-axis to effectively expand the relevant lower range of the data while still accurately representing all data points, including the outliers. We believe that the revised graph significantly enhances the clarity and interpretability of these results. For Figure 2, we have added the relative fold changes to the text as requested.

      (13) Figure 6. The authors should directly test in vitro whether Taxol addition can induce lattice exchange, for example, by adding Taxol to GDP-microtubules and monitoring probe binding. Including such an assay would provide critical mechanistic evidence and substantially strengthen the conclusions. I was waiting for this experiment since Figure 2.

      We thank the reviewer for this suggestion. As suggested, we generated GDP-MTs from HeLa tubulin and added them to two flow chambers. We then flowed the YL1/2<sup>Fab</sup>-EGFP probe into the chambers in the presence of DMSO (vehicle control) or Taxol. Static images were taken and the fluorescence intensity of the probe on microtubules in each chamber was quantified. There was a slight but not statistically significant difference in probe binding between control and Taxol-treated GDP-MTs (Author response image 1). While disappointing, these results underscore our conclusion (Discussion section) that microtubule assembly in vitro may not produce a lattice state resembling that in cells, either due to differences in protofilament number and/or buffer conditions and/or the lack of MAPs during polymerization.

      Author response image 1.

      References

      Hooikaas, P. J., Martin, M., Muhlethaler, T., Kuijntjes, G. J., Peeters, C. A. E., Katrukha, E. A., Ferrari, L., Stucchi, R., Verhagen, D. G. F., van Riel, W. E., Grigoriev, I., Altelaar, A. F. M., Hoogenraad, C. C., Rudiger, S. G. D., Steinmetz, M. O., Kapitein, L. C. and Akhmanova, A. (2019). MAP7 family proteins regulate kinesin-1 recruitment and activation. J Cell Biol, 218, 1298-1318.

      Shen, Y. and Ori-McKenney, K. M. (2024). Microtubule-associated protein MAP7 promotes tubulin posttranslational modifications and cargo transport to enable osmotic adaptation. Dev Cell, 59, 1553-1570.

      Tymanskyj, S. R., Yang, B. H., Verhey, K. J. and Ma, L. (2018). MAP7 regulates axon morphogenesis by recruiting kinesin-1 to microtubules and modulating organelle transport. eLife, 7.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This manuscript uses primarily simulation tools to probe the pathway of cholesterol transport with the smoothened (SMO) protein. The pathway to the protein and within SMO is clearly discovered, and interactions deemed important are tested experimentally to validate the model predictions.

      Strengths:

      The authors have clearly demonstrated how cholesterol might go from the membrane through SMO for the inner and outer leaflets of a symmetrical membrane model. The free energy profiles, structural conformations, and cholesterol-residue interactions are clearly described.

      We thank the reviewer for their kind words.

      (1) Membrane Model: The authors decided to use a rather simple symmetric membrane with just cholesterol, POPC, and PSM at the same concentration for the inner and outer leaflets. This is not representative of asymmetry known to exist in plasma membranes (SM only in the outer leaflet and more cholesterol in this leaflet). This may also be important to the free energy pathway into SMO. Moreover, PE and anionic lipids are present in the inner leaflet and are ignored. While I am not requesting new simulations, I would suggest that the authors should clearly state that their model does not consider lipid concentration leaflet asymmetry, which might play an important role.

      We thank the reviewer for their comment. Membrane asymmetry is inherent in endogenous systems; we acknowledge that as a limitation of our current model. We have addressed the comment by adding this limitation to our discussion in the manuscript.

      Added lines: (End of paragraph 6, Results subsection 2):

      “One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (2) Statistical comparison of barriers: The barriers for pathways 1 and 2 are compared in the text, suggesting that pathway 2 has a slightly higher barrier than pathway 1. However, are these statistically different? If so, the authors should state the p-value. If not, then the text in the manuscript should not state that one pathway is preferred over the other.

      We thank the reviewer for their comment. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.0013)”
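
      For readers who want to see how such a comparison works from summary statistics, the following is a minimal sketch of Welch's unequal-variance t-test. It is not the authors' calculation: the response does not state the number of independent estimates behind each barrier, so `N_HYPOTHETICAL` is a placeholder, and the ± values are assumed here to be standard deviations. The resulting t statistic and degrees of freedom would then be converted to a p-value with a t-distribution table or a statistics library.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's t-test computed from summary statistics."""
    var1 = sd1 ** 2 / n1  # squared standard error of group 1
    var2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(var1 + var2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var1 + var2) ** 2 / (var1 ** 2 / (n1 - 1) + var2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical sample size; NOT taken from the manuscript.
N_HYPOTHETICAL = 10

# Barriers quoted in the response: 5.8 ± 0.7 and 6.5 ± 0.8 kcal/mol.
t_stat, dof = welch_t(5.8, 0.7, N_HYPOTHETICAL, 6.5, 0.8, N_HYPOTHETICAL)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

      Because the p-value depends strongly on the sample size, the output for the placeholder n should not be compared with the p-value reported in the manuscript.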

      (3) Barrier of cholesterol (reasoning): The authors on page 7 argue that there is an enthalpy barrier between the membrane and SMO due to the change in environment. However, cholesterol lies in the membrane with its hydroxyl interacting with the hydrophilic part of the membrane and the other parts in the hydrophobic part. How is the SMO surface any different? It has both characteristics and is likely balanced similarly to uptake cholesterol. Unless this can be better quantified, I would suggest that this logic be removed.

      We thank the reviewer for this suggestion. We have removed the line to avoid confusion.

      Reviewer #2 (Public review):

      Summary:

      In this work, the authors applied a range of computational methods to probe the translocation of cholesterol through the Smoothened receptor. They test whether cholesterol is more likely to enter the receptor straight from the outer leaflet of the membrane or via a binding pathway in the inner leaflet first. Their data reveal that both pathways are plausible but that the free energy barriers of pathway 1 are lower, suggesting this route is preferable. They also probe the pathway of cholesterol transport from the transmembrane region to the cysteine-rich domain (CRD).

      Strengths:

      (1) A wide range of computational techniques is used, including potential of mean force calculations, adaptive sampling, dimensionality reduction using tICA, and MSM modelling. These are all applied rigorously, and the data are very convincing. The computational work is an exemplar of a well-carried out study.

      (2) The computational predictions are experimentally supported using mutagenesis, with an excellent agreement between their PMF and mRNA fold change data.

      (3) The data are described clearly and coherently, with excellent use of figures. They combine their findings into a mechanism for cholesterol transport, which on the whole seems sound.

      (4) The methods are described well, and many of their analysis methods have been made available via GitHub, which is an additional strength.

      Weaknesses:

      (1) Some of the data could be presented a little more clearly. In particular, Figure 7 needs additional annotation to be interpretable. Can the position of the cholesterol be shown on the graph so that we can see the diameter change more clearly?

      We thank the reviewer for this suggestion. We have added the cholesterol positions as requested.

      Changes made: (Caption, Figure 7)

      “The tunnel profile during cholesterol translocation in SMO. (a) Free energy plot of the z-coordinate v/s the tunnel diameter when cholesterol is present in the core TMD. The tunnel shows a spike in the radius in the TMD domain, indicating the presence of a cholesterol-accommodating cavity. (b) Representative figure for the tunnel when a cholesterol molecule is in the TMD. (c) Same as (a), when cholesterol is at the TMD-CRD interface. (d) Same as (b), when cholesterol is at the TMD-CRD interface. (e) Same as (a), when cholesterol is at the CRD binding site. (f) Same as (b), when cholesterol is at the CRD binding site. Tunnel diameters shown as spheres. Cholesterol positions marked on plots using dotted lines. All snapshots presented are frames taken from MD simulations.”

      (2) In Figure 3C, it doesn’t look like the Met is constricting the tunnel at all. What residue is constricting the tunnel here? Can we see the Ala and Met panels from the same angle to compare the landscapes? Or does the mutation significantly change the tunnel? Why not A283 to a bulkier residue? Finally, the legend says that the figure shows that cholesterol can still pass this residue, but it doesn’t really show this. Perhaps if the HOLE graph was plotted, we could see the narrowest point of the tunnel and compare it to the size of cholesterol.

      We thank the reviewer for this suggestion. A283 was mutated to methionine as it presents with a longer heavy tail containing sulfur. We have plotted the tunnel radii for both WT and A283M mutants and added them as a supplemental figure. As shown in the figure, the presence of methionine doesn’t completely block the tunnel, but occludes it, thereby increasing the barrier for cholesterol transport slightly.

      Changes made: (End of Results subsection 1)

      “When we calculated the PMF for cholesterol entry, A<sup>2.60f</sup>M mutant showed restricted tunnel but it did not fully block the tunnel (Figure 3—figure Supplement 3).”

      (3) The PMF axis in 3b and d confused me for a bit. Looking at the Supplementary data, it’s clear that, e.g., the F455I change increases the energy barrier for chol entering the receptor. But in 3d this is shown as a -ve change, i.e., favourable. This seems the wrong way around for me. Either switch the sign or make this clearer in the legend, please.

      We thank the reviewer for this suggestion. We measured ∆PMF as PMF<sub>WT</sub> - PMF<sub>mutant</sub>, hence the negative values. We have added additional text to the legend to clarify this.
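
      The sign convention can be made concrete with a short sketch. The wild-type value is the Pathway 1 barrier quoted elsewhere in this response; the mutant value is hypothetical and chosen only to illustrate a mutant with a higher barrier.

```python
def delta_pmf(pmf_wt, pmf_mutant):
    """Difference of peak PMF, defined as wild type minus mutant (kcal/mol)."""
    return pmf_wt - pmf_mutant


PMF_WT = 5.8       # kcal/mol, Pathway 1 barrier quoted in the response
PMF_MUTANT = 7.0   # kcal/mol, hypothetical mutant with a higher barrier

# A negative value means the mutation raised the barrier relative to wild type.
print(round(delta_pmf(PMF_WT, PMF_MUTANT), 2))
```

      Under this convention, a negative ∆PMF indicates a less favourable (higher) barrier in the mutant, not a more favourable one.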

      Changes made: (Caption, Figure 3)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆ PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for the mutants in Pathway 1. (c) Example mutant A<sup>2.60f</sup>M shows that cholesterol can enter SMO through Pathway 1 even on a bulky mutation. (d) Same as (b) but for Pathway 2. (e) Example mutant L<sup>5.62f</sup>A shows that cholesterol can enter SMO through Pathway 2 due to lesser steric hindrance. All snapshots presented are frames taken from MD simulations.”

      Changes made: (Caption, Figure 6)

      “(b) ∆Gli1 mRNA fold change (high SHH vs untreated) and ∆ PMF (difference of peak PMF, calculated as PMF<sub>WT</sub> - PMF<sub>mutant</sub>) plotted for mutants along the TMD-CRD pathway. (c, d) Example mutants Y<sup>LD</sup>A and F<sup>5.65f</sup>A show that cholesterol is unable to translocate through this pathway because of the loss of crucial hydrophobic contacts provided by Y207 and F484 and along the solvent-exposed pathway.”

      (4) The impact of G280V is put down to a decrease in flexibility, but it could also be a steric hindrance. This should be discussed.

      We thank the reviewer for this suggestion. We have added it as a possible mechanism of the decrease in activity of SMO.

      Changes made: (Paragraph 5, Results subsection 1)

      “We mutated G280<sup>2.57f</sup> to valine - G<sup>2.57f</sup>V to test whether reducing the flexibility of TM2 prevents cholesterol entry into the TMD. Consequently, the activity of mSMO showed a decrease. However, this decrease could also be attributed to steric hindrance added by the presence of a bulky isopropyl group in valine.”

      (5) Are the reported energy barriers of the two pathways (5.8 ± 0.7 and 6.5 ± 0.8 kcal/mol) significantly and/or substantially different enough to favour one over the other? This could be discussed in the manuscript.

      We thank the reviewer for this suggestion. We have added statistical t-tests for the barriers.

      Changes made: (Paragraph 6, Results subsection 2)

      “However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol v/s 6.5 ± 0.8 kcal/mol, p = 0.001)”

      (6) Are the energy barriers consistent with a passive diffusion-driven process? It feels like, without a source of free energy input (e.g., ion or ATP), these barriers would be difficult to overcome. This could be discussed.

      We thank the reviewer for this suggestion. We have added a discussion to further clarify this point.

      Discussion: (Paragraph 6, Results subsection 2)

      “These values are comparable to ATP-Binding Cassette (ABC) transporters of membrane lipids, which use ATP hydrolysis (-7.54 ± 0.3 kcal/mol) (Meurer et al., 2017) to drive lipid transport from the membrane to an extracellular acceptor. Some of these transporters share the same mechanism as SMO, where the lipid from the inner leaflet is flipped and transported to the extracellular acceptor protein (Tarling et al., 2013). Additionally, for secondary active transporters that do not use ATP for the transport of substrates, a thermodynamic barrier of 5-6 kcal/mol has been reported in the literature (Chan et al., 2022; Selvam et al., 2019; McComas et al., 2023; Thangapandian et al., 2025).”
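
      As a rough back-of-the-envelope aid, the barriers can be expressed in units of thermal energy. This sketch is not part of the manuscript's analysis: the temperature of 310 K is an assumption, and the Boltzmann factor is only a relative weight, not a rate, because no kinetic prefactor is included.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
T_ASSUMED = 310.0      # K; assumed physiological temperature, not from the source


def barrier_in_rt(barrier_kcal, temperature=T_ASSUMED):
    """Express a free energy barrier as a multiple of RT."""
    return barrier_kcal / (R_KCAL * temperature)


def boltzmann_ratio(barrier_low, barrier_high, temperature=T_ASSUMED):
    """Relative Boltzmann weight of crossing the lower vs the higher barrier."""
    return math.exp((barrier_high - barrier_low) / (R_KCAL * temperature))


# Barriers quoted in the response for Pathway 1 and Pathway 2 (kcal/mol).
print(f"Pathway 1: {barrier_in_rt(5.8):.1f} RT")
print(f"Pathway 2: {barrier_in_rt(6.5):.1f} RT")
print(f"Weight ratio (1 vs 2): {boltzmann_ratio(5.8, 6.5):.1f}")
```

      The calculation uses only the mean barrier values and ignores the quoted uncertainties, so it is a comparison of scale rather than a statement about which pathway is used.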

      (7) Regarding the kinetics from MSM, it is stated that the values seen here are similar to MFS transporters, but this then references another MSM study. A comparison to experimental values would support this section a lot.

      We thank the reviewer for this suggestion. We have added a discussion of the millisecond-scale timescales measured experimentally for MFS transporters.

      Changes made: (Paragraph 2, Results subsection 5)

      “These timescales are comparable to the substrate transport timescales of Major Facilitator Superfamily (MFS) transporters (Chan et al., 2022). Furthermore, several experimental studies have also resolved the millisecond-scale kinetics of MFS transporters (Blodgett and Carruthers, 2005; Körner et al., 2024; Bazzone et al., 2022; Smirnova et al., 2014; Zhu et al., 2019), further corroborating the results from our study.”

      Reviewer #2 (Recommendations for the authors):

      (1) The heatmaps in Figures 2a and 4a are great. On these, an arrow denotes what looks like a minimum energy path. Is it possible to see this plotted, as this might show the height of the energy barriers more clearly?

      We thank the reviewer for this suggestion. We have computed the minimum energy paths for both pathways and presented them in a supplementary figure.

      Added lines: (Paragraph 4, Results subsection 1):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3a,b).

      Added lines: (Paragraph 4, Results subsection 2):

      For further clarity, we have plotted the minimum energy path taken by cholesterol as it translocates along this pathway (Figure 2—figure Supplement 3c,d).
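
      One generic way to read a barrier off a two-dimensional free energy grid is a lowest-barrier (minimax) path search. The sketch below is only an illustration of that idea and is not the procedure the authors used: the function name and the 3×3 grid of energies are hypothetical, and a minimax path minimizes the highest point crossed, which is not necessarily the same as a minimum energy path obtained by other methods.

```python
import heapq


def minimax_barrier(grid, start, goal):
    """Lowest achievable peak energy on any 4-connected path from start to goal."""
    rows, cols = len(grid), len(grid[0])
    best = {start: grid[start[0]][start[1]]}
    heap = [(best[start], start)]
    while heap:
        peak, (r, c) = heapq.heappop(heap)
        if (r, c) == goal:
            return peak
        if peak > best[(r, c)]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # The cost of a path is the highest energy it visits.
                new_peak = max(peak, grid[nr][nc])
                if new_peak < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_peak
                    heapq.heappush(heap, (new_peak, (nr, nc)))
    return None


# Hypothetical free energy grid (kcal/mol); not data from the manuscript.
free_energy = [
    [0.0, 3.0, 9.0],
    [2.0, 8.0, 4.0],
    [1.0, 5.0, 0.5],
]
peak = minimax_barrier(free_energy, (0, 0), (2, 2))
print(peak - free_energy[0][0])  # barrier height relative to the starting basin
```

      Plotting the energies along the recovered route, rather than only reporting the peak, would give the kind of one-dimensional profile the reviewer asked for.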

      (2) The tiCA data in S15 is first referred to on line 137, but the technique isn’t introduced until line 222. This makes understanding the data a little confusing. Reordering this might improve readability.

      We thank the reviewer for this suggestion. We have reordered the text to make it clearer.

      Changes made: (Paragraph 2, Results subsection 1)

      “This provides evidence for multiple stable poses along the pathway as observed in the multiple stable poses of cholesterol in cryo-EM structures of SMO bound to sterols (Deshpande et al., 2019; Qi et al., 2019b, 2020). A reliable estimate of the barriers comes from using the time-lagged Independent Components (tICs), which project the entire dataset along the slowest kinetic degrees of freedom. Overall, the highest barrier along Pathway 1 is 5.8 ± 0.7 kcal/mol, and it is associated with the entry of cholesterol into the TMD (Figure 2—Figure Supplement 2).”

      Changes made: (Paragraph 3, Results subsection 2)

      “On plotting the first two components of tICs, (Figure 2—Figure Supplement 2), we observe that the energetic barrier between η and θ is ∼6.5 ± 0.8 kcal/mol.”

      (3) Missing bracket on line 577.

      We thank the reviewer for this suggestion. The typo has been fixed.

      (4) Line 577: Fig. S2nd?

      We thank the reviewer for this suggestion. This typo has been fixed.

      Reviewer #3 (Public review):

      Summary:

      This manuscript presents a study combining molecular dynamics simulations and Hedgehog (Hh) pathway assays to investigate cholesterol translocation pathways to Smoothened (SMO), a G protein-coupled receptor central to Hedgehog signal transduction. The authors identify and characterize two putative cholesterol access routes to the transmembrane domain (TMD) of SMO and propose a model whereby cholesterol traverses through the TMD to the cysteine-rich domain (CRD), which is presented as the primary site of SMO activation. The MD simulations and biochemical experiments are carefully executed and provide useful data.

      Weaknesses:

      However, the manuscript is significantly weakened by a narrow and selective interpretation of the literature, overstatement of certain conclusions, and a lack of appropriate engagement with alternative models that are well-supported by published data, including data from prior work by several of the coauthors of this manuscript. In its current form, the manuscript gives a biased impression of the field and overemphasizes the role of the CRD in cholesterol-mediated SMO activation. Below, I provide specific points where revisions are needed to ensure a more accurate and comprehensive treatment of the biology.

      (1) Overstatement of the CRD as the Orthosteric Site of SMO Activation

      The manuscript repeatedly implies or states that the CRD is the orthosteric site of SMO activation, without adequate acknowledgment of alternative models. To give just a few examples (of many in this manuscript):

      (a) “PTCH is proposed to modulate the Hh signal by decreasing the ability of membrane cholesterol to access SMO’s extracellular cysteine-rich domain (CRD)” (p. 3).

      (b) “In recent years, there has been a vigorous debate on the orthosteric site of SMO” (p. 3).

      (c) “cholesterol must travel through the SMO TMD to reach the orthosteric site in the CRD” (p. 4).

      (d) “we observe cholesterol moving along TM6 to the TMD-CRD interface (common pathway, Fig. 1d) to access the orthosteric binding site in the CRD” (p. 6).

      While the second quote in this list at least acknowledges a debate, the surrounding text suggests that this debate has been entirely resolved in favor of the CRD model. This is misleading and not reflective of the views of other investigators in the field (see, for example, a recent comprehensive review from Zhang and Beachy, Nature Reviews Molecular and Cell Biology 2023, which makes the point that both the CRD and 7TM sites are critical for cholesterol activation of SMO as well as PTCH-mediated regulation of SMO-cholesterol interactions).

      In contrast, a large body of literature supports a dual-site model in which both the CRD and the TMD are bona fide cholesterol-binding sites essential for SMO activation. Examples include:

      (a) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs, but does not abolish, SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study).

      (b) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig. 4 Byrne et al, Nature 2016).

      (c) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019, is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules).

      Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in Kinnebrew et al. 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      Recommendation: The authors should revise the manuscript to provide a more balanced overview of the field and explicitly acknowledge that the CRD is not the sole activation site. Instead, a dual-site model is more consistent with available structural, mutational, and functional data. In addition, the authors should reframe their interpretation of their MD studies to reflect this broader and more accurate view of how cholesterol binds and activates SMO.

      We thank the reviewer for this comprehensive overview of the existing literature. We agree that cholesterol binding to both the TMD and CRD sites is required for full activation of SMO. As described below in responses to comments, we have made changes to the manuscript to make this point clear. For instance, in the revised manuscript, we refrain from calling the CRD cholesterol binding site the “orthosteric site”. Instead, we highlight that the goal of the manuscript is not to resolve the debate over whether the TMD or CRD site is more important for regulation of SMO by PTCH1 but rather to use molecular dynamics to understand the fascinating question of how cholesterol in the membrane can reach the CRD, located at a significant distance above the outer leaflet of the membrane. We believe that this is an important goal since there is an abundance of evidence that supports the view that PTCH1 inhibits SMO by reducing cholesterol access to the CRD. This evidence is now summarized succinctly in the introduction:

      Changes made: (Paragraph 4, Introduction)

      “While cholesterol binding to both the TMD and CRD sites is required for full SMO activation, our work focuses on how cholesterol gains access to the CRD site, perched above the outer leaflet of the membrane (Luchetti et al., 2016; Kinnebrew et al., 2022). Multiple lines of evidence suggest that PTCH1-regulated cholesterol binding to the CRD plays an instructive role in SMO regulation both in cells and animals. Mutations in residues predicted to make hydrogen bonds with the hydroxyl group of cholesterol bound to the CRD reduced both the potency and efficacy of SHH in cellular signaling assays (Kinnebrew et al., 2022; Byrne et al., 2016) and, more importantly, eliminated HH signaling in mouse embryos (Xiao et al., 2017). Experiments using both covalent and photocrosslinkable sterol probes in live cells directly show that PTCH1 activity reduces sterol access to the CRD (Kinnebrew et al., 2022; Xiao et al., 2017). Notably, our simulations evaluate a path of cholesterol translocation that includes both the TMD and CRD sites: cholesterol first enters the 7-transmembrane domain bundle from the membrane; it then engages the TMD site before continuing along a conduit to the CRD site. Thus, we analyze translocation energetics and residue-level contacts along a path that includes both the TMD and the CRD.”

      However, Reviewer 3 makes several comments below that are biased, inaccurate, or selective. We feel it is important to address these so readers can approach the literature from a balanced perspective. Indeed, the eLife review forum provides an ideal venue to present contrasting views on a scientific model. We encourage the editors to publish both Reviewer 3’s comments and our response in full so readers can read the original papers and reach their own conclusions. It is important to note these issues are not relevant to the quality of the computational and experimental data presented in this paper.

      We have now removed the term “orthosteric” to describe the CRD site throughout the paper and clearly state in the introduction that “both the CRD and TMD sites are required for SMO activation” but that our focus is on how cholesterol moves from the membrane to the CRD site. There is no doubt that cholesterol binding to the CRD plays a key role in SMO activation; our focus on this path is justified and does not devalue the importance of the TMD site. Our prior models (see Figure 7 of Kinnebrew 2022) explicitly include contributions of both sites.

      Now we respond to some of the concerns outlined, individually:

      (1) Byrne et al., Nature 2016: point mutation of the CRD cholesterol binding site impairs, but does not abolish, SMO activation by cholesterol (SMO D99A, Y134F, and combination mutants - Fig 3 of the 2016 study)

      The fact that a point mutation dramatically diminishes (but does not abolish) signaling does not mean that the CRD cholesterol binding site is not important for SMO regulation. Indeed, the reviewer fails to mention that Song et al. (Molecular Cell, 2017) found that a subtle mutation at D99 (D95/99N, a residue that makes a hydrogen bond with the cholesterol hydroxyl) completely abolishes SMO signaling in mouse embryos. Thus, the CRD site is critical for SMO activation in an intact animal, justifying our focus on evaluating the path of cholesterol translocation to the CRD site.

      (2) Myers et al., Dev Cell 2013 and PNAS 2017: CRD deletion mutants retain responsiveness to PTCH regulation and cholesterol mimetics (similar Hh responsiveness of a CRD deletion mutant is also observed in Fig 4 Byrne et al, Nature 2016).

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal (i.e. ligand-independent) activity compared to full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-100-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. We encourage the reviewer to read our previous paper (Kinnebrew et al. 2022), which presents a unified view of how the TMD and CRD sites together regulate SMO activation.

      A more physiological experiment, reported in Kinnebrew et al. 2022, tested mutations in residues that make hydrogen bonds with cholesterol at the CRD and TMD sites in the context of full-length SMO. These mutants were stably expressed at moderate levels in Smo<sup>−/−</sup> cells. Mutations at the CRD site reduced the fold-increase in signaling output in response to SHH, as would be expected for a PTCH1-regulated site. In contrast, analogous mutations in the TMD site reduced the magnitude of both basal and maximal signaling, without affecting the fold-change in response to SHH. In signaling assays, the key parameter in evaluating the impact of a mutation is whether it impacts the change in output in response to a signal (in this case PTCH1 inactivation by SHH). A mutation in SMO that affects PTCH1 regulation is expected to decrease the fold-change in signaling in response to SHH, a criterion that is fulfilled by mutations in the CRD site. Accordingly, mutations in the CRD site abolish SMO signaling in mouse embryos (Xiao et al., 2017).

      (3) Deshpande et al., Nature 2019: mutation of residues in the TMD cholesterol binding site blocks SMO activation entirely, strongly implicating the TMD as a required site, in contrast to the partial effects of mutating or deleting the CRD site.

      Bulky mutations at the TMD site (V333F) that abolish SMO activity were first reported by Byrne et al. 2016 and were used to markedly increase the stability of SMO for protein expression. These mutations indeed stabilize the inactive state of SMO, increasing protein abundance and completely preventing its localization at primary cilia. SMO variants carrying such bulky mutations cannot be used to infer the importance of the TMD site since they do not distinguish between the following possibilities: (1) SMO is inactive because the sterol cannot bind, or (2) SMO is inactive because it is locked in an inactive conformation, or (3) SMO is inactive because it cannot localize to primary cilia (where it must be localized to activate downstream signaling).

      As described in Response 3.3, a better evaluation of the importance of the TMD site is the use of mutations in residues that make hydrogen bonds with the hydroxyl group of TMD cholesterol. These mutations do not markedly increase protein stability or prevent ciliary localization (Kinnebrew 2022, Fig. S2). While a TMD site mutation decreases the magnitude of maximal (and basal) SMO signaling, it does not impact the fold-increase in signal output in response to Hh ligands (the key parameter that should be used to evaluate PTCH1 activity).

      (4) Qi et al., Nature 2019, and Deshpande et al., Nature 2019, both reported cholesterol binding at the TMD site based on high-resolution structural data. Oddly, Deshpande et al., Nature 2019 is not cited in the discussion of TMD binding on p. 3, despite being one of the first papers to describe cholesterol in the TMD site and its necessity for activation (the authors only cite it regarding activation of SMO by synthetic small molecules)

      The reference has now been added at this location in the manuscript.

      (5) Kinnebrew et al., Sci Adv 2022 report that CRD deletion abolished PTCH regulation, which is seemingly at odds with several studies above (e.g., Byrne et al, Nature 2016; Myers et al, Dev Cell 2013); but this difference may reflect the use of an N-terminal GFP fusion to SMO in the Kinnebrew et al 2022, which could alter SMO activation properties by sterically hindering activation at the TMD site by cholesterol (but not synthetic SMO agonists like SAG); in contrast, the earlier work by Byrne et al is not subject to this caveat because it used an untagged, unmodified form of SMO.

      The reviewer fails to note that CRD-deleted versions of SMO have markedly (>10-fold) higher basal activity than full-length SMO. The response to SHH is minimal (∼2-fold), compared to >50-fold with full-length SMO. Thus, CRD-deleted SMO is likely in a non-native conformation. Local changes in cholesterol accessibility caused by PTCH1 inactivation or cholesterol loading can cause small fluctuations in delta-CRD activity, but this cannot be used to infer meaningful insights about how native, full-length SMO (with >10-fold lower basal activity) is regulated. Please see Response 3.3 for further details.

      Reviewer 3 presents an incomplete picture of the extensive experiments reported in Kinnebrew et al. to establish the functionality of YFP-tagged delta-CRD SMO. Most importantly, a TMD-selective sterol analog (KK174) can fully activate YFP-tagged delta-CRD, showing conclusively that the YFP fusion does not block sterol access to the TMD site. The fact that this protein is nearly unresponsive to SHH highlights the critical role of the CRD-bound cholesterol in SMO regulation by PTCH1. Indeed, the YFP-tagged, CRD-deleted SMO was made purposefully to test the requirement of the CRD in a construct that had normal basal activity. Again, these data justify the value of investigating the path of cholesterol movement from the membrane via the TMD site to the CRD.

      (6) Although overexpression of PTCH1 and SMO (wild-type or mutant) has been noted as a caveat in studies of CRD-independent SMO activation by cholesterol, this reviewer points out that several of the studies listed above include experiments with endogenous PTCH1 and low-level SMO expression, demonstrating that SMO can clearly undergo activation by cholesterol (as well as regulation by PTCH1) in a manner that does not require the CRD.

      This comment is inaccurate. The data presented in Deshpande et al. (and prior work in Myers et al.) used transient transfection to overexpress SMO in Smo<sup>−/−</sup> cells. At the individual cell level, transient transfection produces expression levels that are markedly higher (10-1000-fold) than stable expression (in addition to being more variable). Most scientists would agree that stable expression (as used in Kinnebrew 2022) at a moderate expression level is a better system to compare mutant phenotypes, assess basal and activated signaling, and provide an accurate measure of the fold-change in signal output in response to SHH. Notably, introduction of a mutation in the CRD cholesterol binding site at the endogenous mouse Smo locus (an even better experiment than stable expression) leads to complete loss of SMO activity (PMID 28344083). This result again justifies our investigation of the pathway of cholesterol movement from the membrane to the CRD site.

      We have changed the initial discussion to reflect a more general outlook.

      Changes made: (Paragraph 1, Introduction)

      “PTCH modulates the availability of accessible cholesterol at the primary cilium and thereby regulates SMO, with models invoking effects on both the CRD and 7TM pockets.”

      Changes made: (Results subsection 3, paragraph 1)

      “According to the dual-site model, to reach the binding site in the CRD (ζ), cholesterol translocate along the TMD-CRD interface from the TM binding site (α∗) is required.”

      Added lines: (Paragraph 5, Results subsection 3):

      “The computational investigation showed here covers the dual-site model, where cholesterol reaches the CRD site via binding to the TM binding site first. In comparison to the CRD site, the TM site is more stable by ∼ 2 kcal/mol (Figure 2—Figure Supplement 3b, d).”

      Added lines: (Paragraph 2, Conclusions):

      “Here we have explored the role the CRD-site plays in SMO activation. In addition, through simulating the CRD site-dependent SMO activation hypothesis, we have also simulated the TMD site-dependent activation. We show that the overall stability of cholesterol is higher than the CRD site by ∼ 2 kcal/mol.”

      (2) Bias in Presentation of Translocation Pathways

      The manuscript presents the model of cholesterol translocation through SMO to the CRD as the predominant (if not sole) mechanism of activation. Statements such as: "Cholesterol traverses SMO to ultimately reach the CRD binding site" (p. 6) suggest an exclusivity that is not supported by prior literature in the field. Indeed, the authors’ own MD data presented here demonstrate more stable cholesterol binding at the TMD than at the CRD (p 17), and binding of cholesterol to the TMD site is essential for SMO activation. As such, it is appropriate to acknowledge that cholesterol may activate SMO by translocating through the TM5/6 tunnel, then binding to the TMD site, as this is a likely route of SMO activation in addition to the CRD translocation route they highlight in their discussion.

      The authors describe two possible translocation pathways (Pathway 1: TM2/3 entry to TMD; Pathway 2: TM5/6 entry and direct CRD transfer), but do not sufficiently acknowledge that their own empirical data support Pathway 2 as more relevant. Indeed, because their experimental data suggest Pathway 2 is more strongly linked to SMO activation, this pathway should be weighted more heavily in the authors’ discussion. In addition, Pathway 2 is linked to cholesterol binding to both the TMD and CRD sites (the former because the TMD binding site is at the terminus of the hydrophobic tunnel, the latter via the translocation pathway described in the present manuscript), so it is appropriate that Pathway 2 figures more prominently than Pathway 1 in the authors’ discussion.

      The authors also claim that "there is no experimental structure with cholesterol in the inner leaflet region of SMO TMD" (p 16). However, a structural study of apo-SMO from the Manglik and Cheng labs (Zhang et al., Nat Comm, 2022) identified a cholesterol molecule docked at the TM5/6 interface and also proposed a "squeezing" mechanism by which cholesterol could enter the TM5/6 pocket from the membrane. The authors do not consider this SMO conformation in their models, nor do they discuss the possibility that conformational dynamics at the TM5/6 interface could facilitate cholesterol flipping and translocation into the hydrophobic conduit, despite both possibilities having precedent in the 2022 empirical cryoEM structural analysis.

      Recommendation: The authors should avoid oversimplifying the SMO cholesterol activation process, either by tempering these claims or broadening their discussion to better reflect the complexity and multiplicity of cholesterol access and activation routes for SMO. They should also consider the 2022 apo-SMO cryoEM structure in their analysis of the TM5/6 translocation pathway.

      We thank the reviewer for this comprehensive overview of the existing literature and for pointing out the parts we had omitted from the discussion. We agree with the reviewer, since our data show that both pathways are probable. Throughout our manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. While we agree that experimental evidence suggests the inner leaflet pathway is possible, we cannot discount the observations made in previous studies that support the outer leaflet pathway, particularly Hedger et al. (2019), Bansal et al. (2023), and Kinnebrew et al. (2021). Therefore, considering the reviewer’s comments, we have made the following changes:

      (1) Added lines: (Paragraph 3, Conclusions):

      “We show that the barriers associated with the pathway starting from the outer leaflet are lower by ∼0.7 kcal/mol (p = 0.0013). We also provide evidence that cholesterol can enter SMO via both leaflets, considering that multiple computational and experimental studies have found cholesterol entry sites and activation modulation via the outer leaflet, between TM2-TM3. This is countered by evidence from multiple experimental and computational studies corroborating entry via the inner leaflet, between TM5-TM6, including this study. Overall, we posit that cholesterol translocation from either pathway is feasible.”

      (2) Changes made: (Paragraph 6, Results subsection 2)

      “Based on our experimental and computational data, we conclude that cholesterol translocation can happen via either pathway. This is supported on the basis of the following observations: mutations along pathway 2 affect SMO activity more significantly, and the presence of a direct conduit that connects the inner leaflet to the TMD binding site. In addition, a resolved structure of SMO in the presence of cholesterol shows a cholesterol situated at the entry point from the membrane into the protein between TM5 and TM6, in the inner leaflet. However, we also observe that pathway 1 shows a lower thermodynamic barrier (5.8 ± 0.7 kcal/mol vs. 6.5 ± 0.8 kcal/mol, p = 0.0013). Additionally, PTCH1 controls cholesterol accessibility in the outer leaflet. This shows that there is a possibility for transport from both leaflets. One possibility that might alter the thermodynamic barriers is native membrane asymmetry, particularly the anionic lipid-rich inner leaflet. This presents as a limitation of our current model.”

      (3) Changes made: (Paragraph 1, Results subsection 2)

      “In a structure resolved in 2022, cholesterol was observed at the interface between the protein and the membrane, in the inner leaflet, between TMs 5 and 6. However, cholesterol in the inner leaflet has a downward orientation, with the polar hydroxyl group pointing intracellularly (η). A striking observation is that this cholesterol binding site pose was never used as a starting point for simulations and was discovered independent of the pose described in Zhang et al. (2022) (Figure 4—Figure Supplement 1).”
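
      To put the barrier difference quoted in change (2) above (5.8 vs. 6.5 kcal/mol) in perspective, the following minimal sketch converts it into a relative crossing rate. It assumes a simple Arrhenius-type picture with identical prefactors for both pathways and a temperature of 310 K; neither assumption comes from the response, and the function name is hypothetical. The sketch also ignores the reported uncertainties (± 0.7 and ± 0.8 kcal/mol), which are as large as the difference itself.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 0.0019872041


def rate_ratio(barrier_a, barrier_b, temperature=310.0):
    """Return k_a / k_b for two free-energy barriers given in kcal/mol.

    Assumption (illustrative only): both pathways share the same
    prefactor, so the ratio depends only on the barrier difference.
    The default temperature of 310 K is also an assumption.
    """
    return math.exp((barrier_b - barrier_a) / (R_KCAL * temperature))


# Barrier values quoted in the response: pathway 1 vs. pathway 2.
ratio = rate_ratio(5.8, 6.5)
print(f"pathway 1 / pathway 2 rate ratio: {ratio:.2f}")
```

      Under these assumptions, a 0.7 kcal/mol difference corresponds to roughly a threefold difference in rate, which is consistent with the response's position that neither pathway can be excluded on energetic grounds alone.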

      (3) Alternative Possibility: Direct Membrane Access to CRD

      The possibility that the CRD extracts cholesterol directly from the membrane outer leaflet is not considered. While the crystal structures place the CRD in a stable pose above the membrane, multiple cryo-EM studies suggest that the CRD is dynamic and adopts a variety of conformations, raising the possibility that the stability of the CRD in the crystal structures is a result of crystal packing and that the CRD may be far more dynamic under more physiological conditions.

      Recommendation: The authors should explicitly acknowledge and evaluate this potential mechanism and, if feasible, assess its plausibility through MD simulations.

      We thank the reviewer for the suggestion. We have addressed this comment by calculating the distance from the lipid headgroups for each lipid in the membrane to the cholesterol binding site. We show that in our study, we do not observe any bending of the CRD over the membrane, precluding any cholesterol from being extracted from the membrane directly.

      Added lines: (Paragraph 3, Conclusions):

      “An alternative possibility states that the flexibility associated with the CRD would allow it to directly access the membrane, and consequently, cholesterol. In the extensive simulations reported in this study, the binding site of cholesterol in the CRD remains at least 20 Å away from the nearest lipid head group in the membrane, suggesting that such direct extraction and the bending of the CRD do not occur within the timescales sampled (Appendix 2 – Figure 6).

      The mechanistic details of this process are still unexplored and form the basis of future work.”
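
      The distance check described above (nearest lipid head group to the CRD binding site in each frame) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: coordinates are plain (x, y, z) tuples in Å, the function names are hypothetical, and the toy trajectory is invented purely to show the logic. Only the 20 Å threshold comes from the quoted text.

```python
import math


def min_headgroup_distance(site_xyz, headgroup_xyz):
    """Smallest distance (in Å) from the binding site to any lipid head group."""
    return min(math.dist(site_xyz, h) for h in headgroup_xyz)


def frames_within_cutoff(site_traj, headgroup_traj, cutoff=20.0):
    """Indices of frames in which the site comes closer than `cutoff` Å
    to at least one head group (hypothetical helper)."""
    return [
        i
        for i, (site, heads) in enumerate(zip(site_traj, headgroup_traj))
        if min_headgroup_distance(site, heads) < cutoff
    ]


# Toy two-frame trajectory (invented numbers, for illustration only).
site_traj = [(0.0, 0.0, 30.0), (0.0, 0.0, 15.0)]
headgroup_traj = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)],
]
close_frames = frames_within_cutoff(site_traj, headgroup_traj)
print(close_frames)
```

      An empty result over a whole trajectory would correspond to the statement that the binding site never approaches the membrane within the sampled timescale; it would not by itself rule out such motion on longer timescales.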

      (4) Inconsistent Framing of Study Scope and Limitations

      The discussion contains some contradictory and misleading language. For example, the authors state that "In this study we only focused on the cholesterol movement from the membrane to the CRD binding site," and then several sentences later state that "We outline the entire translocation mechanism from a kinetic and thermodynamic perspective." These statements are at odds. The former appropriately (albeit briefly) notes the limited scope of the modeling, while the latter overstates the generality of the findings.

      In addition, the authors’ narrow focus on the CRD site constitutes a major caveat to the entire work. It should be acknowledged much earlier in the manuscript, preferably in the introduction, rather than mentioned as an aside in the penultimate paragraph of the conclusion.

      Recommendation: The authors should clarify the scope of the study and expand the discussion of its limitations. They should explicitly acknowledge that the study models one of several cholesterol access routes and that the findings do not rule out alternative pathways.

      We thank the reviewer for the suggestion. We have addressed this comment by explicitly mentioning the scope of the study.

      Changes made: (Paragraph 3, Conclusions)

      “We outline the entire translocation mechanism from a kinetic and thermodynamic perspective for one of the leading hypotheses for the activation mechanism of SMO.”

      (5) Summary:

      This study has the potential to make a useful contribution to our understanding of cholesterol translocation and SMO activation. However, in its current form, the manuscript presents an overly narrow and, at times, misleading view of the literature and biological models; as such, it is not nearly as impactful as it could be. I strongly encourage the authors to revise the manuscript to include:

      (1) A more balanced discussion of the CRD vs. TMD binding sites.

      (2) Acknowledgment of alternative cholesterol access pathways.

      (3) More comprehensive citation of prior structural and functional studies.

      (4) Clarification of assumptions and scope.

      Of note, the above suggestions require little to no additional MD simulations or experimental studies, but would significantly enhance the rigor and impact of the work.

      We thank the reviewer for the suggestions. We have taken into account the literature and diverse viewpoints. We have changed the initial discussion and reflected a more general outlook. In the revised version of the manuscript, we have refrained from referring to the CRD site as the orthosteric site. Instead, we refer to it as the CRD sterol-binding site. To better represent the dual-site model, we add further discussion in the Introduction. Through our manuscript, we have avoided using a competitive approach (that one pathway dominates over the other). Instead, we have evaluated both pathways independently and presented a comparative rather than competitive overview of both pathways from our observations. We explicitly mention the scope of the study.

    1. Reviewer #1 (Public review):

      Willeke et al. hypothesize that macaque V4, like other visual areas, may exhibit a topographic functional organization. One challenge to studying the functional (tuning) organization of V4 is that neurons in V4 are selective for complex visual stimuli that are hard to parameterize. Thus, the authors leverage an approach comprising digital twins and most exciting images (MEIs) that they have pioneered. This data-driven, deep-learning framework can effectively handle the difficulty of parametrizing relevant stimuli. They verify that the model-synthesized MEIs indeed drive V4 neurons more effectively than matched natural image controls. They then performed psychophysics experiments (on humans) along with the application of contrastive learning to illustrate that anatomically neighboring neurons often care about similar stimuli. Importantly, the weaknesses of the approach are clearly appreciated and discussed.

      Comments:

      (1) The correlation between predictions and data is 0.43. I'd agree with the authors that this is "reliable" and would recommend that they discuss how the fact that performance is not saturated influences the results.

      (2) Modeling V4 using a CNN and claiming that the identified functional groups look like those found in artificial vision systems may be a bit circular.

      (3) No architecture other than ResNet-50 was tested. This might be a major drawback, since the MEIs could very well be reflections of the architecture and also the statistics of the dataset, rather than intrinsic biological properties. Do the authors find the same result with different architectures as the basis of the goal-driven model?

      (4) The closed-loop analysis seems to be using a much smaller sample of the recorded neurons - "resulting in n=55 neurons for the analysis of the closed-loop paradigm".

      (5) A discussion on adversarial machine learning and the adversarial training that was used is lacking.

    2. Reviewer #2 (Public review):

      This is an ambitious and technically powerful study, investigating a long-standing question about the functional organization of area V4. The project combined large-scale single-unit electrophysiology in macaque V4 with deep learning-based activation maximization to characterize neuronal tuning in natural image space. The authors built predictive encoding models for V4 neurons and used these models to synthesize most exciting images (MEIs), which are subsequently validated in vivo using a closed-loop experimental paradigm.

      Overall, the manuscript advances three main claims:

      (1) Individual V4 neurons showed complex and highly structured selectivity for naturalistic visual features, including textures, curvatures, repeating patterns, and apparently eye-like motifs.

      (2) Neurons recorded along the same linear probe penetration tended to have more similar MEIs than neurons recorded at different cortical locations (this similarity was supported by human psychophysics and by distances in a learned, contrastive image embedding space).

      (3) MEIs clustered into a limited number of functional groups that resembled feature visualizations observed in deep convolutional neural networks.

      Strengths:

      (1) The study is important in that it is the first to apply activation maximization to neurons sampled at such fine spatial resolution. The authors used 32-channel linear silicon probes, spanning approximately 2 mm of cortical depth, with inter-contact spacing of roughly 60 µm. This enabled fine sampling across most of the cortical thickness of V4, substantially finer resolution than prior Utah-array or surface-biased approaches.

      (2) A key strength is the direct in vivo validation of model-derived synthetic images by stimulating the same neurons used to build the models, a critical step often absent in other neural network-based encoding studies.

      (3) More broadly, the study highlights the value of probing neuronal selectivity with rich, naturalistic stimulus spaces rather than relying exclusively on oversimplified stimuli such as Gabors.

      Weaknesses:

      (1) A central claim is that neurons sampled within the same penetration shared MEI tuning properties compared to neurons sampled in different penetrations because of functional organization. I am concerned that correlations in activity could arise from methodological factors (for example, shared reference or grounding) rather than from functional organization alone. These recordings were obtained with linear silicon probes, and there have been observations that neuronal activity along this type of probe (including Neuropixels probes) may be more correlated than prior work using manually advanced single electrodes showed. For example, Fujita et al. (1992) showed finer micro-domains and systematic changes in selectivity along a cortical penetration, and it is not clear if that is true or detectable here. I think that the manuscript would be strengthened by a more thorough and explicit characterization of lower-level response correlations (at the neuronal electrophysiology level) before fitting models. In particular, the authors could examine noise correlations along the electrode shaft (using the repeated test images, for example), as well as signal correlations in tuning, both within and across sessions. It would also be helpful to clarify whether these correlations depended on penetration day, recording chamber hole (how many were used?), or spatial separation between penetrations, and whether repeated use of the same hole yielded stable or changing correlations. Illustrations of the peristimulus time histogram changes across the shaft and across penetrations would also help. All of this would help us understand if the reports of clustering were technically inevitable due to the technique.

      (2) It is difficult to understand a story of visual cortex neurons without more information about their receptive field locations and widths, particularly given that the stimulus was full-screen. I understand that there was a sparse random dot stimulus used to find the population RF, so it should be possible to visualize the individual and population RFs. Also, the investigators inferred the locations of the important patches using a masking algorithm, but where were those masks relative to the retinal image, and how distributed were they as a function of the shaft location? This would help us understand how similar each contact was.

      (3) A major claim is that V4 MEIs formed groups that were comparable to those produced by artificial vision systems, "suggesting potential shared encoding strategies." The issue is that the "shared encoding strategy" might be the authors' use of this same class of models in the first place. It would be useful to know if different functional groups arise as a function of other encoding neural network models, beyond the robust-trained ResNet-50. I am unsure to what extent the reported clustering, depth-wise similarity, and correspondence to artificial features depended on architectural and training bias. It would substantially strengthen the manuscript to test whether a similar organizational structure would emerge using alternative encoding models, such as attention-based vision transformers, self-supervised visual representations, or other non-convolutional architectures. Another important point of contrast would be to examine the functional groups encoded by the ResNet architecture before its activations were fit to V4 neuronal activity: put simply, is ResNet just re-stating what it already knows?

      (4) Several comparisons to prior work are presented largely at a qualitative level, without quantitative support. For example, the authors state that their MEIs are consistent with known tuning properties of macaque V4, such as selectivity for shape, curvature, and texture. However, this claim is not supported by explicit image analyses or metrics that would substantiate these correspondences beyond appeal to visual inspection. Incorporating quantitative analyses, for instance, measures of curvature, texture statistics, or comparisons to established stimulus sets, would strengthen these links to prior literature and clarify the relationship between the synthesized MEIs and previously characterized V4 tuning properties.
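
      The signal- and noise-correlation analysis suggested in weakness (1) could, for a single pair of units, look roughly like the sketch below. It assumes responses are stored as one list of repeated-trial spike counts per stimulus; the function names and toy numbers are hypothetical and not taken from the manuscript. Signal correlation is computed on trial-averaged responses across stimuli, and noise correlation on the trial-by-trial residuals after subtracting each stimulus mean.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def signal_and_noise_correlation(resp_a, resp_b):
    """resp_a, resp_b: lists indexed [stimulus][trial] for two units
    recorded simultaneously (same stimuli, same trials)."""
    mean_a = [sum(trials) / len(trials) for trials in resp_a]
    mean_b = [sum(trials) / len(trials) for trials in resp_b]
    # Signal correlation: similarity of tuning across stimuli.
    signal = pearson(mean_a, mean_b)
    # Noise correlation: shared trial-to-trial variability around the mean.
    resid_a = [r - m for trials, m in zip(resp_a, mean_a) for r in trials]
    resid_b = [r - m for trials, m in zip(resp_b, mean_b) for r in trials]
    noise = pearson(resid_a, resid_b)
    return signal, noise


# Toy data: 3 stimuli x 2 repeats per unit (invented numbers).
unit_a = [[1, 3], [5, 7], [9, 11]]
unit_b = [[4, 2], [6, 4], [12, 10]]
signal_corr, noise_corr = signal_and_noise_correlation(unit_a, unit_b)
print(signal_corr, noise_corr)
```

      In this toy example the two units have similar tuning but opposite trial-to-trial fluctuations, illustrating that the two measures are separable. Comparing noise correlations as a function of contact distance along the shaft is one way shared-reference artifacts might be distinguished from genuine tuning similarity.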

    3. Author response:

      We thank the reviewers for their careful reading and constructive feedback. We were glad to see that they recognized both the technical scope of the study and its contribution as the first to apply activation maximization with such fine spatial sampling. Their appreciation for the critical in vivo validation of model-derived stimuli is very encouraging.

      The reviewers raised several important points that we plan to address in the revised manuscript. These center on:

      Model Architecture and Potential Circularity:

      Both reviewers raised the concern that using a CNN-based model could introduce circularity when comparing V4 functional groups to artificial vision systems, and questioned whether similar results would emerge with alternative architectures. We believe that the in vivo verification provides a critical control for this concern: the MEIs synthesized by our model were empirically validated to elicit significantly higher responses than matched natural image controls, demonstrating that the model captures genuine biological tuning properties rather than architectural artifacts. This means that even if these features emerged from the particular architectural choice, the biological neurons seem to prefer the same features. We will clarify this point in the respective section in the revised manuscript.

      Recording locations and spike sorting contamination:

      Reviewer #2 raised concerns about potential correlation artefacts along the silicon probe. Unfortunately, assessing functional correlations across sessions proved challenging because neurons recorded at different penetration sites had non-overlapping receptive fields, precluding direct comparison of responses to identical stimuli across recording sites. We will make this limitation explicit in the manuscript. Furthermore, we maintain conservative standards for spike sorting to minimize the risk of multi-unit activity (MUA) "smearing" across unit definitions. Our primary analyses are restricted to well-isolated single units that meet all isolation metrics. Due to our low-impedance ground placed on the bone, shared-reference contamination as a source of tuning similarity is also mitigated.

      Quantitative Comparisons to Prior Literature:

      Reviewer #2 also noted that our comparisons between MEIs and known V4 tuning properties (e.g., shape, curvature, texture selectivity) were presented qualitatively, and suggested that explicit image analyses or metrics would strengthen these links to prior literature. We will revise the text to more carefully frame these comparisons as qualitative observations consistent with prior findings.

      Alternative Similarity Metrics:

      We will expand our justification for the Böhm et al. contrastive embedding approach in the Methods section. However, we believe that a systematic comparison of multiple clustering and similarity methods is beyond the scope of the current study.

      In the revised manuscript, we will address these points primarily through clarifications and expanded discussion. Specifically, we will: (1) strengthen our discussion of model architecture choice emphasizing that in vivo verification serves as a critical control against architectural artifacts; (2) clarify the stringent matching criteria underlying our closed-loop sample size and its consistency with the larger population analyses; (3) explicitly describe the recording geometry, including the use of multiple grid holes, and explain why direct functional comparisons across penetrations were precluded by non-overlapping receptive fields; (4) better characterize the spatial relationship between receptive fields and MEI masks; (5) reframe comparisons to prior V4 literature as qualitative observations rather than quantitative validations; and (6) expand our justification for the contrastive embedding approach. We believe these revisions will improve the clarity and rigor of the manuscript while appropriately scoping the claims to what the current data support.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Evidence, reproducibility and clarity

      Authors should be commended for the availability of data/code and detailed methods. Clarity is good. Authors have clearly spent a lot of time thinking about the challenges of metabolomics data analysis.

      Significance

      Schmidt et al. present MetaProViz, a comprehensive and modular platform for metabolomics data analysis. The tool provides a full suite of processing capabilities spanning metabolite annotation, quality control, normalization, differential analysis, integration of prior knowledge, functional enrichment, and visualization. The authors also include example datasets, primarily from renal cancer studies, to demonstrate the functionality of the pipeline. The MetaProViz framework addresses several long-standing challenges in metabolomics data analysis, particularly issues of reproducibility, ambiguous metabolite annotation, and the integration of metabolite features with pathway knowledge. The platform is likely to be a valuable addition for the community, but the reviewer has some comments that need to be addressed prior to publication.

      We thank the reviewer for this positive feedback.

      Comments:

      (1) (Planned)

      The section "Improving the connection between prior knowledge and metabolomics features" could benefit from additional clarification. It is not entirely clear to the reader what specific steps were taken beyond using RaMP-DB to translate metabolite identifiers. For example, how exactly were ambiguous mappings ("different scenarios") handled in practice, and to what extent does this process "fix" or merely flag inconsistencies? A more explicit description or example of how MetaProViz resolves these cases would help readers better understand the improvements claimed.

      We thank the reviewer for pointing this out and agree that this section needs to be extended for clarity. Beyond using RaMP-DB, we characterise the mapping ambiguity (one-to-none, one-to-many, many-to-one, many-to-many) within and across metabolite-sets (i.e. pathways) and return this information to the user together with the translated identifiers. This is important for understanding potential inflation or deflation of metabolite-sets caused by the translation. Moreover, we offer a manually curated amino-acid collection to ensure that the L-, D- and zwitterion (without chirality) IDs are assigned for amino acids (Fig. 2b). Ambiguous mappings are handled based on the measured data (Fig. 2e). Indeed, many translation cases that deflate (many-to-one mapping) or inflate (one-to-many mapping) the metabolite-sets are resolved when merging the prior knowledge with the actual measured data (e.g. Fig. 2e, one-to-many in scenario 1, which becomes obsolete as only one or none of the many potential metabolite IDs is detected). By sorting each mapping into one of those scenarios, we only flag those cases rather than resolving them automatically. The reason for this decision is that in many cases multiple decisions are valid (e.g. Fig. 2e, Scenario 5: here the values of the two detected metabolites could be summed, or the metabolite value with the larger Log2FC could be kept), and it should be up to the user to make those decisions based on their knowledge of the biological system and the analytical LC-MS method used.

      Since these points have not been clear enough, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data. This has also been suggested by Reviewer 3 (Minor Comment 7 and 8), so feel free to also see the responses below.
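      The mapping categories named above (one-to-none, one-to-many, many-to-one, many-to-many) can be illustrated with a minimal Python sketch. It is not the MetaProViz implementation: the function name `classify_mappings`, the input layout (a list of source/target ID pairs) and the example IDs are assumptions made for illustration only.

```python
from collections import defaultdict


def classify_mappings(pairs, source_ids):
    """Label each source ID with the cardinality of its translation.

    pairs:      iterable of (source_id, target_id) translation pairs
    source_ids: all source IDs of the metabolite-set, including untranslated ones
    (hypothetical helper, not part of MetaProViz)
    """
    targets_of = defaultdict(set)  # source ID -> translated IDs
    sources_of = defaultdict(set)  # translated ID -> source IDs
    for src, tgt in pairs:
        targets_of[src].add(tgt)
        sources_of[tgt].add(src)

    labels = {}
    for src in source_ids:
        targets = targets_of.get(src, set())
        if not targets:
            labels[src] = "one-to-none"
            continue
        many_targets = len(targets) > 1
        shared_target = any(len(sources_of[t]) > 1 for t in targets)
        if many_targets and shared_target:
            labels[src] = "many-to-many"
        elif many_targets:
            labels[src] = "one-to-many"   # inflates the metabolite-set
        elif shared_target:
            labels[src] = "many-to-one"   # deflates the metabolite-set
        else:
            labels[src] = "one-to-one"
    return labels


# Made-up identifiers, for illustration only.
pairs = [
    ("L-Alanine", "Alanine_zwitterion"),
    ("D-Alanine", "Alanine_zwitterion"),
    ("Glucose", "alpha-Glucose"),
    ("Glucose", "beta-Glucose"),
    ("Citrate", "Citrate_translated"),
]
labels = classify_mappings(
    pairs, ["L-Alanine", "D-Alanine", "Glucose", "Citrate", "Water"]
)
```

      In this sketch the labels are only reported, mirroring the flag-only behaviour described in the response; any resolution step is left to the user.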

      (2) (Planned)

      The introduction of MetSigDB is intriguing, but its construction and added value are not sufficiently described. It would be helpful to clarify what specific advantages MetSigDB provides over directly using existing pathway resources such as KEGG, Reactome, or WikiPathways. For example, how many features, interactions, or metabolite-set relationships are included, and in what way are these pathways improved or extended compared to those already available in public databases?

      We thank the reviewer for this valuable comment and we apologise that this was not described sufficiently. One of the major advantages is that all the resources are available in one place following the same table format without the need to visit the different original resources and perform data wrangling prior to enrichment analysis. In addition, where applicable, we have removed metabolites that are not detectable by LC-MS (i.e. ions, H2O, CO2) to circumvent pathway inflation with features that are never within the data and hence impacting the statistical testing in enrichment analysis workflows.

      During the revision, we will compile an Extended Data Table listing all the resources present in MetSigDB, their number of features and interactions. We will also extend the methods section "Prior Knowledge access" about MetSigDB and how we removed metabolites.
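      As a rough illustration of the filtering step described above, the following Python sketch removes entries that are assumed not to be detectable by LC-MS from each metabolite-set. The exclusion list, the function name `remove_undetectable` and the example pathway are hypothetical; the actual list used for MetSigDB is not given in this response.

```python
# Hypothetical exclusion list; the real MetSigDB list is not reproduced here.
NOT_DETECTABLE = {"H2O", "CO2", "H+"}


def remove_undetectable(pathways, excluded=NOT_DETECTABLE):
    """Return a copy of the metabolite-sets without the excluded entries."""
    return {
        name: [m for m in members if m not in excluded]
        for name, members in pathways.items()
    }


# Made-up pathway content, for illustration only.
pathways = {"Example pathway": ["Citrate", "H2O", "CO2", "Succinate"]}
filtered = remove_undetectable(pathways)
```

      Removing such entries shrinks the pathway size that enters the enrichment statistics, which is the effect the response refers to as avoiding pathway inflation.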

      (3)

      Figure 1D/1E: The reviewer appreciates the inclusion of the visualizations illustrating the different mapping scenarios, as these effectively convey the complexity of metabolite ID translation. However, it took some time to interpret what each scenario represented. It would be helpful to include brief annotations or explanatory text directly on the figures to clarify what each scenario depicts and how it relates to the underlying issue being addressed.

      We think the reviewer refers to Fig. 2D/E and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 2 (Minor Comment 1), who asked to extend the figure legend description of what the different scenarios display.

      We have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242):

      "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8) highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      We have also rearranged the Scenarios in Fig. 2e. We hope that together with the extended figure legend this is now clear.
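      The two options named for scenario 5 (summing the detected features, or keeping the one with the highest Log2FC) can be sketched in Python as follows. This is an illustrative sketch only: the function name, the record keys (`id`, `value`, `log2fc`) and the example numbers are assumptions, and ranking by signed rather than absolute Log2FC is one possible reading of "highest".

```python
def resolve_multiple_detected(features, strategy="sum"):
    """Collapse several detected features that map to one prior-knowledge entry.

    features: list of dicts with the hypothetical keys 'id', 'value', 'log2fc'
    strategy: 'sum' adds the values; 'max_log2fc' keeps the feature with the
              highest (signed) Log2FC. Ranking by abs(log2fc) is an alternative.
    """
    if not features:
        return None
    if strategy == "sum":
        return {
            "ids": [f["id"] for f in features],
            "value": sum(f["value"] for f in features),
        }
    if strategy == "max_log2fc":
        best = max(features, key=lambda f: f["log2fc"])
        return {"ids": [best["id"]], "value": best["value"]}
    raise ValueError(f"unknown strategy: {strategy!r}")


# Made-up measurements, for illustration only.
detected = [
    {"id": "ID_A", "value": 10.0, "log2fc": 0.5},
    {"id": "ID_B", "value": 4.0, "log2fc": 1.2},
]
summed = resolve_multiple_detected(detected, "sum")
kept = resolve_multiple_detected(detected, "max_log2fc")
```

      Which strategy is appropriate depends on the biological system and the analytical method, which is why the choice is exposed as a parameter in this sketch.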

      (4) (Planned)

      "By assigning other potential metabolite IDs and by translating between the present ID types, we not only increase the number of features within all ID types but also increase the feature space with HMDB and KEGG IDs (Fig. 2a, right, SFig. 2 and Supplementary Table 1)". The reviewer would appreciate additional clarification on how this was done. It is not clear what specific steps or criteria were used to assign additional metabolite IDs or to translate between identifier types. The reviewer also appreciates the inclusion of the UpSet plots. However, simply having the plots side-by-side makes it difficult to determine the specific differences. An alternative visualization, such as stacked bar plots, scatter plots summarizing the changes in feature counts, or other representation that more clearly highlights the deltas, might make these results easier to interpret.

      The main Fig. 2a shows the original (left) metabolite ID availability per detected metabolite feature in the ccRCC data and the adapted (right) metabolite IDs. The individual steps taken to extend the metabolite ID coverage of the measured features and obtain Fig. 2a (right) are shown in SFig. 2 for HMDB (SFig. 2a) and KEGG (SFig. 2b). We did not include the plots for the PubChem IDs as they follow the same principle. The individual steps we are showcasing with SFig. 2 are (I) how many of the detected features (577) have an HMDB ID (341, red bar + grey bar), (II) how this distribution changed after equivalent amino-acid IDs were added, which does not change the number of features with an HMDB ID, but the number of features with a single HMDB ID, and (III) how this distribution changed after translating from the other available ID types (KEGG and PubChem) to HMDB IDs using RaMP-DB's knowledge, which leads to 430 detected features with one or multiple HMDB IDs. The exact numbers can be extracted from Supplementary Table 1, Sheet "Feature metadata", where for example N-methylglutamate had no HMDB ID assigned in the original publication (see column HMDB_Original), yet by translating to HMDB from KEGG (see column hmdb_from_kegg) and PubChem (see column hmdb_from_pubchem) we obtain in both cases the same HMDB ID "HMDB0062660". In order to clarify this in the manuscript, we have extended the figure legend of SFig. 2: "a-b) Bar graphs showing the frequency at which a certain number of metabolite IDs per integrated peak are available as per ccRCC patients feature metadata provided in the original publication (left), after potential equivalent IDs for amino-acid and amino-acid-related features were assigned (middle), which increases the number of features with multiple IDs (middle: grey bars), and after IDs were translated from the other available ID types (right). a) Of 577 detected features, 341 had at least one HMDB ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from KEGG-to-HMDB and from PubChem-to-HMDB increased the number of features with an HMDB ID from 341 to 430 (left). b) Of 577 detected features, 306 had at least one KEGG ID assigned (left graph, red + grey bar) according to the original publication (left). Translating from HMDB-to-KEGG and from PubChem-to-KEGG did not increase the total number of features with a KEGG ID (left)."

      We like the suggestion of the reviewer to provide representations of the deltas and will add additional plots to SFig. 2 as part of our planned revision.
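      To make the counting behind SFig. 2 concrete, the following Python sketch merges original and translated HMDB IDs per feature and compares the coverage before and after translation. The column names (HMDB_Original, hmdb_from_kegg, hmdb_from_pubchem) and the N-methylglutamate example are taken from the response above; storing IDs as lists, the helper name `merge_hmdb_ids` and the second record are assumptions for illustration.

```python
def merge_hmdb_ids(feature):
    """Return the sorted union of original and translated HMDB IDs of a feature."""
    ids = set()
    for column in ("HMDB_Original", "hmdb_from_kegg", "hmdb_from_pubchem"):
        ids.update(feature.get(column) or [])
    return sorted(ids)


features = [
    {
        "name": "N-methylglutamate",
        "HMDB_Original": [],                   # no ID in the original publication
        "hmdb_from_kegg": ["HMDB0062660"],     # translated from KEGG
        "hmdb_from_pubchem": ["HMDB0062660"],  # translated from PubChem
    },
    {
        # Hypothetical feature that stays without any HMDB ID.
        "name": "feature_without_ids",
        "HMDB_Original": [],
        "hmdb_from_kegg": [],
        "hmdb_from_pubchem": [],
    },
]

# Number of features with at least one HMDB ID, before and after translation.
before = sum(1 for f in features if f["HMDB_Original"])
after = sum(1 for f in features if merge_hmdb_ids(f))
```

      Applied to the full feature metadata, the same two counts would correspond to the 341 and 430 features reported above; the difference between them is the delta the reviewer asked to see.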

      (5) (Planned)

      MetaboAnalyst is mentioned several times in the manuscript. The reviewer is familiar with some of the limitations and practical challenges associated with using MetaboAnalyst and its R package. Given that MetaboAnalyst already offers some overlapping functionality with MetaProViz (and offers it in the form of an interactive website and a sometimes functional R package), a more explicit comparison between the two tools would help readers fully understand the unique advantages and improvements provided by MetaProViz.

      This is a good point the reviewer raises. As part of the revisions, we plan to create a supplementary data table that includes both tools and their respective features. We will refer to this table within the manuscript text.

      (6)

      Page 11: The authors state that they used limma for statistical testing, including for the analysis of exometabolomics data, where the values appear to represent log2-transformed distances or ratios rather than normally distributed intensities. Since limma assumes approximately normal residuals, please provide evidence or justification that this assumption holds for these data types. If the distributions deviate substantially from normality, a non-parametric alternative might be more appropriate.

      For exometabolomics data we use data normalised to media blank and growth factor (formula (1)). Limma is performed on those data, not on the log2-transformed distances. The Log2(Distance) is calculated separately to the statistical results using the normalised exometabolomics data. In addition, we always perform the Shapiro-Wilk test as part of MetaProViz differential analysis function on each metabolite to understand the distribution. In this particular case we have the following distributions:

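      | Cell line | Metabolites normal distribution [%] | Metabolites not-normal distribution [%] |
      |-----------|-------------------------------------|-----------------------------------------|
      | HK2       | 82.35                               | 17.65                                   |
      | 786-O     | 95.71                               | 4.29                                    |
      | 786-M1A   | 97.14                               | 2.86                                    |
      | 786-M2A   | 88.57                               | 11.43                                   |
      | OSRC2     | 92.86                               | 7.14                                    |
      | OSLM1B    | 85.71                               | 14.29                                   |
      | RFX631    | 97.14                               | 2.86                                    |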

      If a user would have distributions that deviate substantially from normality, non-parametric alternatives are also available in MetaProViz (see methods section for all options).

      (7)

      Page 13: why were young and old defined this way? Authors should provide their reasoning and/or citations for this grouping.

      We thank the reviewer for pointing this out. The explanation of our choices of the age groups is purely based on the literature:

      First, ccRCC can be sporadic (>96%) or familial (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3308682/pdf/nihms362390.pdf). This was also observed in other cohorts, where of 1233 patients only 93 were under 40 years of age, whilst 1140 were older than 40 years (https://www.europeanurology.com/article/S0302-2838(06)01316-9/fulltext). Second, given the high frequency of sporadic cases, it is unsurprising that ccRCC incidences were found to peak in patients aged 60 to 79 years, with more male than female incidences (https://journals.lww.com/md-journal/Fulltext/2019/08020/Frequency,_incidence_and_survival_outcomes_of.49.aspx). Third, it was shown that sex impacts renal cancer-specific mortality and that this effect is modified by age, which is a proxy for hormonal status, with the premenopausal period below 42 years and the postmenopausal period above 58 years (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4361860/pdf/srep09160.pdf). Putting all of this information together, we decided on our age groups of young (<42 years) and old (>58 years), following the hormonal periods, in order to account for the impact of sex. Additionally, our young age group is representative of the age of familial ccRCC, whilst our old age group summarises the age group where incidences were found to peak.

      To make this clear in the manuscript we have extended the method section of the manuscript (Line 547-548):

      "For the patient's ccRCC data, we compared tumour versus normal of two patient subsets, "young" (<42 years) and "old" (>58 years)."

      (8)

      Figure 4e: It may help with interpretation to have these Sankey-like graph edges be proportional to the number of metabolites.

      We thank the reviewer for this suggestion, which we also pondered. When we tested this visualisation, the plot became convoluted, hard to interpret and not all potential flows exist in the data. This is why we have opted to create an overview graph of each potential flow, with each edge representing a potentially existing flow. The number of times a flow exists is shown in Fig. 4f.

      (9)

      Figure 4h: The values appear to be on an intensity scale (e.g., on the order of 3e10), yet some of them are negative, which would not be expected for raw or log-transformed mass spectrometry intensities. It is unclear whether these represent normalized abundance values, distances, or some other transformation. In addition, for the comparison of tumour versus normal tissue, it is not specified what statistical test was applied. Since mass spectrometry data are typically log2-transformed to approximate a log-normal distribution before performing t-tests or similar parametric methods, clarification is needed on how these data were processed.

      Thanks for pointing this out; it made us realize that we need to extend our figure legend for Fig. 4h for clarity (Line 343-345). In both cases we show normalized intensities following the workflow described in Fig. 3a. In the case of the left graph labelled "CoRe", we are plotting an exometabolomics experiment, where the data were additionally normalised using both media blanks (samples in which no cells were cultured) and growth factor (which accounts for cell growth during the experiment), as the growth rate (which accounts for variations in cell proliferation) was not available (see also formula (1) in the methods section). A result has a negative value if the metabolite has been consumed from the media, or a positive value if the metabolite has been released from the cell into the culture media.

      In addition, the reviewer refers to the comparison of tumour versus normal (Fig. 4a and 4d) and the missing description of the chosen statistical test. We have added the details to the figure legend (Lines 334 and 345).

      Adapted legend Fig. 4: "a) Differential metabolite analysis results for exometabolomics data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. b) Heatmap of mean consumption-release of the measured metabolites across cell lines. c) Heatmap of normalised ccRCC cell line exometabolomics data for the selected metabolites of amino acid metabolism for a sample subset. d) Differential metabolite analysis results for intracellular data comparing 786-O versus HK2 cells using ANOVA and false discovery rate (FDR) for p-value adjustment. e) Schematics of bioRCM process to integrate exometabolomics with intracellular metabolomics and f) number of metabolites by their combined change patterns in intracellular- and exometabolomics in 786-M1A versus HK2. g) Heatmap of the metabolite abundances in the "Both_DOWN (Released/Comsumed)" cluster. h) Bar graphs of normalised methionine intensity for exometabolomics (CoRe: negative value, if the metabolite has been consumed from the media, or a positive value, if the metabolite has been released from the cell into the culture media) and intracellular metabolomics (Intra)."
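      The sign convention of the CoRe values (negative for consumption, positive for release) can be illustrated with a short Python sketch. Formula (1) of the manuscript is not reproduced in this response, so the calculation below is an assumption: it subtracts the mean media blank from the sample intensity and divides by the growth factor. The function names and the example numbers are hypothetical.

```python
from statistics import mean


def consumption_release(sample_intensity, blank_intensities, growth_factor):
    """Assumed CoRe calculation: blank-corrected intensity scaled by growth factor.

    This is an illustrative guess at the normalisation, not formula (1) itself.
    """
    return (sample_intensity - mean(blank_intensities)) / growth_factor


def direction(core_value):
    """Interpret the sign of a CoRe value as described in the legend."""
    if core_value < 0:
        return "consumed"
    if core_value > 0:
        return "released"
    return "unchanged"


# Made-up intensities: the sample is below the media blank, so the
# metabolite was taken up from the media.
core = consumption_release(80.0, [100.0, 100.0], growth_factor=2.0)
status = direction(core)
```

      Under this assumption, a negative bar in the "CoRe" panel arises whenever the conditioned media contain less of the metabolite than the media blank.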


      (10)

      Figure 5: "Tukey's p.adj

      We thank the reviewer for pointing this out. We have used the TukeyHSD (Tukey's Honestly Significant Difference) test in R on the ANOVA results. We have added more details into the figure legend (Line 384): "(Tukey's post-hoc test after ANOVA p.adj

      (11)

      The potential for multi-omics is mentioned. Please clarify how generalizable this framework is. Can it readily accommodate transcriptomics, proteomics, or fluxomics data, or does it require custom logic or formatting for each new data type?

      Thanks for raising this question. MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis using for example MetalinksDB metabolite-receptor pairs. Yet, MetaProViz does not support modelling fluxomics data into metabolic networks. We state in the discussion that this could be future development ("Beyond current capabilities, future developments could also incorporate mechanistic modeling to capture metabolic fluxes, subcellular compartmentalization, enzyme kinetics, regulatory feedback loops, and thermodynamic constraints to dissect metabolic response under perturbations."). To clarify on the availability of multi-omics integration for combined enrichment analysis, we have added some more details into the discussion section.

      Line 467-469: "In addition, providing knowledge of receptor-, transporter- and enzyme-metabolite pairs, MetaProViz can readily accommodate transcriptomics and proteomics data for combined enrichment analysis."

      (12)

      Please clarify if/how enrichment analyses account for varying set sizes and redundant metabolite memberships across pathways, which can bias over-representation analysis results.

      This is a very relevant point, which we have already been working on. We agree that enrichment results can be biased by varying set sizes and redundant metabolite memberships across pathways. MetaProViz explicitly accounts for varying set sizes when running over-representation analysis (functions standard_ora() and cluster_ora()), which uses a model that computes the p-value under a hypergeometric distribution. Thereby, larger pathways are penalized unless the overlap is proportionally large, while smaller pathways can be significant with fewer overlaps. Hence, the test quantifies whether the observed overlap between the query set and a pathway is larger than would be expected under random sampling. In addition, we explicitly filter by gene-set size using min_gssize/max_gssize, which further controls for extremely small or large sets. So both the statistical test itself and the size filters incorporate gene-set size variation.
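      The effect of set size on a hypergeometric over-representation test can be checked with a small, self-contained Python sketch. It is not the code behind standard_ora() or cluster_ora(); the function names, the size bounds and the example numbers are assumptions chosen for illustration.

```python
from math import comb


def ora_pvalue(n_overlap, n_query, n_pathway, n_background):
    """Upper-tail hypergeometric p-value P(X >= n_overlap).

    X counts pathway members among n_query metabolites drawn without
    replacement from a background of n_background metabolites, of which
    n_pathway belong to the pathway.
    """
    upper = min(n_query, n_pathway)
    total = comb(n_background, n_query)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def filter_by_size(pathways, min_size, max_size):
    """Keep only metabolite-sets whose size lies within [min_size, max_size]."""
    return {
        name: members
        for name, members in pathways.items()
        if min_size <= len(members) <= max_size
    }


# Same overlap (3 of 10 query metabolites, background of 100),
# but pathways of very different size.
p_small = ora_pvalue(n_overlap=3, n_query=10, n_pathway=5, n_background=100)
p_large = ora_pvalue(n_overlap=3, n_query=10, n_pathway=50, n_background=100)
```

      With identical overlap, the small pathway receives a much lower p-value than the large one, which is the size penalty described in the response above.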

      Regarding the redundant metabolite-set (i.e. pathways) memberships, we have now implemented a new function (cluster_pk()) to cluster metabolite-sets like pathways based on overlapping metabolites. Thereby we allow investigation of enrichment results in regard to redundancy and similarity. For given metabolite-sets, the function calculates pathway similarities via either overlap- or correlation-based metrics. After optional thresholding to remove weak similarities, we implemented three clustering algorithms (connected-components clustering, Louvain community detection and hierarchical clustering) to group similar pathways. We then visualize the clustering results as a network graph using the new function viz_graph based on igraph. We have added all information into our methods section "Metabolite-set clustering" (Lines 656-671). In addition, we have also added the results of the clustering into Fig. 5f.
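      The clustering steps described above (overlap-based similarity, thresholding, connected-components clustering) can be sketched in plain Python. This is an illustrative sketch, not the cluster_pk() implementation: the Jaccard index is assumed as the overlap metric, the threshold value is arbitrary, and the function names and example pathways are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index of two sets (assumed here as the overlap-based metric)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sets(metabolite_sets, threshold=0.5):
    """Group metabolite-sets whose similarity reaches the threshold.

    Sets are nodes, similarities >= threshold are edges, and each connected
    component of the resulting graph forms one cluster (union-find).
    """
    names = list(metabolite_sets)
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the component root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if jaccard(metabolite_sets[a], metabolite_sets[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


# Made-up pathway content, for illustration only.
example_sets = {
    "Glycolysis": {"glucose", "pyruvate", "lactate"},
    "Gluconeogenesis": {"glucose", "pyruvate", "oxaloacetate"},
    "Urea cycle": {"ornithine", "citrulline", "arginine"},
}
clusters = cluster_sets(example_sets, threshold=0.5)
```

      Connected components is the simplest of the three algorithms named above; Louvain community detection or hierarchical clustering would replace the union-find step while keeping the same similarity matrix as input.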

      New Fig. 5f: "f) Network graph of top enriched pathways (p.adjusted

      Reviewer #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

      We thank the reviewer for this positive feedback on our package. We appreciate that there are no major comments from the reviewer.

      Minor comments:

      (1)

      Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.

      We thank the reviewer for pointing this out and we acknowledge that this is a complex problem we try to convey. We received a similar comment from Reviewer 1 (Comment 3), so please see the extensive response there. In brief, we have extended the figure legend and specifically explained each displayed case and its meaning (Line 222-242) and extended the Figure itself by adding additional categories to Fig. 2e.

      Extended legend Fig. 2d-e: "d-e) Schematics of possible mapping cases between metabolite IDs (= each circle corresponds to one ID) of a pathway-metabolite set (e.g. KEGG) to metabolite IDs of a different database (e.g. HMDB) with (d) showing many-to-many mappings that can occur within and across pathway-metabolite sets and (e) additionally showing the mapping to metabolite IDs that were assigned to the detected peaks within and across pathway-metabolite sets. (d) Translating the metabolite IDs of a pathway-metabolite set can lead to special cases such as many-to-one mappings (Pathway 1), where for example the original resource used the ID for L-Alanine (Pathway 1, green) and D-Alanine (Pathway 1, yellow) in the amino-acid pathway, whilst the translated resource only has an entry for Alanine zwitterion (Pathway 1, blue). Additionally, many-to-one mappings can also occur across pathways (Pathway 2-4), where this mapping is only detected when mappings are analysed taking all pathways into account. Both of these cases deflate the pathways, which can also happen for one-to-none mappings (Pathway 1, white). There are also cases that inflate the pathway such as one-to-many mappings (e.g. Pathway 2-4, orange mapping to pink and violet). (e) Showcasing the different scenarios when merging measured data (detected) based on the translated metabolites within pathways (scenario 1-5) and across pathways (scenario 6-8) highlighting problematic scenarios (4-7) that require further actions. Unproblematic scenarios (1-3 and 8) can include special cases between original and translated (i.e. one-to-many in scenario 1), which become obsolete as only one/none of the many potential metabolite IDs is detected. Yet, if multiple metabolites are detected action is required (scenario 5), which can include building the sum of the multiple detected features or only keeping the one with the highest Log2FC between two conditions. Other special cases between original and translated (i.e. many-to-one in scenario 4 and 6) also depend on what has been mapped to the measured features. If features have been measured in those scenarios, pathway deflation (i.e. only one original entry remains) or measured feature duplication (the same measurement is mapped to many features in the prior knowledge) are the possible results within and across pathways. Those scenarios should be addressed on a case-by-case basis as they also require biological information to be taken into account."

      (2) Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.

      We think the reviewer refers to this section in the text (Line 363-370):

      "Next, we focused on the cluster "Both_DOWN (Released-Consumed)" and found that several amino acids are consumed by the ccRCC cell line 786-M1A but released by healthy HK2 cells. At the same time, intracellular levels are significantly lower than in HK2 (Log2FC = -0.9, p.adj = 4.4e-5) (Fig. 4g). To explore the role of these metabolites in signaling, we queried the prior knowledge resource MetalinksDB, which includes metabolite-receptor, metabolite-transporter and metabolite-enzyme relationships, for their known upstream and downstream protein interactors for the measured metabolites (Supplementary Table 5). This approach is especially valuable for exometabolomics, as it allows us to generate hypotheses about cell-cell communication. Notably, we identified links involving methionine (Fig. 4h), enzymes such as BHMT, and transporters such as SLC43A2 that were previously shown to be important in ccRCC25,42 (Supplementary Table 5)."

      We have now extended this part to clearly state that MetalinksDB is the prior knowledge resource we used to identify the links for methionine (Line 363-364). In addition, we have extended our summary statement to make clear to the reader that we combine the biological clustering, which revealed the amino acid changes, with prior knowledge for the mechanistic insight (Line 380-381):

      "In summary, calculating consumption-release and combining it with intracellular metabolomics via biological regulated clustering reveals metabolites of interest. Further combining these results with prior knowledge using the MetaProViz toolkit facilitates biological interpretation of the data."

      (3)

      Given the functional diversity among metabolites -central to diverse pathways, are key signaling molecules, restricted functions, co-variation within a pathway - I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      The reviewer is correct about the functional diversity of metabolites, which is also why prior knowledge is needed to add mechanistic interpretation to the findings from the metadata analysis (as we showcased by focusing on the separation of age (Fig. 5c-d)). We think that approaches such as PCA or enrichment can be helpful, even if admittedly limited. For example, in the metadata analysis presented in Fig. 5b and the subsequent enrichment analysis presented in Fig. 5, we used PCA to extract the eigenvectors and loadings, which act as weights indicating the contribution of each original metabolite to the separation along a specific principal component (PC). Hence, the eigenvector of a PC shows which metabolites drive the separation. This does not necessarily mean that those metabolites are drivers of a (patho)physiological state: the (patho)physiological state can equally be the reason those metabolites drive the separation along the eigenvector. Thus, the metadata analysis presented in Fig. 5b enables us to extract the metadata variables (i.e. (patho)physiological states) that are separated on a PC, together with the explained variance. This can also reveal co-variation, when multiple (patho)physiological states are separated on the same PC, as the reviewer correctly points out. Regarding the enrichment analysis, we provide different types of prior knowledge for classical mapping, as well as the prior knowledge we used to create the biological regulated clustering. Together these help to identify key metabolic groups, as we can first cluster the metabolites and afterwards perform functional enrichment. Yet, this does not account for the technical issues of enrichment analysis. In this context, multi-omics integration that builds metabolic-centric networks could further elucidate the diversity of metabolic pathways and their connection to signalling and co-variation, yet this is beyond the scope of MetaProViz.
      To sum up, we are aware of the limitations of this analysis and the constraints on the downstream interpretation.
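
      The role of PCA loadings as per-metabolite weights can be illustrated with a minimal sketch. It is not MetaProViz code: the function names, the toy intensity matrix and the use of power iteration are assumptions made purely for illustration, and the example only recovers the first principal component.

```python
import math

def center_columns(rows):
    """Subtract the column mean from every value (samples in rows, metabolites in columns)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of already centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_component(cov, iterations=200):
    """Approximate the first eigenvector (loadings) and eigenvalue by power iteration.

    The fixed start vector is an arbitrary choice; it would fail only if it were
    exactly orthogonal to the leading eigenvector.
    """
    p = len(cov)
    v = [1.0 / (i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical toy data: six samples, three metabolites.
metabolites = ["metabolite_A", "metabolite_B", "metabolite_C"]
intensities = [
    [1.0, 5.0, 2.0],
    [2.0, 5.1, 2.1],
    [3.0, 4.9, 1.9],
    [10.0, 5.0, 2.0],
    [11.0, 5.1, 2.2],
    [12.0, 4.9, 1.8],
]

cov = covariance_matrix(center_columns(intensities))
loadings, eigenvalue = leading_component(cov)
# Fraction of total variance explained by the first component.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
# Rank metabolites by the absolute size of their loading on PC1.
ranked = sorted(zip(metabolites, loadings), key=lambda t: abs(t[1]), reverse=True)
```

      Ranking by absolute loading identifies which metabolites contribute most to the separation on that component. As the response stresses, this ranking alone does not establish that those metabolites cause the underlying state.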

      To capture the functional diversity amongst metabolites, which leads to metabolites being present in multiple metabolite-sets such as pathways, we have implemented a new function that clusters metabolite-sets based on overlapping metabolites and visualizes redundant metabolite-set (i.e. pathway) memberships (Fig. 5f). For more details, see also our response to Reviewer 1, Comment 12. We hope this will prevent mis- and over-interpretation of the enrichment results.

      In addition, we have extended the text to include the analysis pitfalls explicitly (Line 416-419): "Another variable explaining the same amount of variance in PC1 is the tumour stage, which could point to adjacent normal tissue metabolic rewiring that happens in relation to stage and showcases that biological data harbour co-variations, which can not be disentangled by this method."

      Reviewer #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post annotation), aiming to solve a poor-analysis-choices problem and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also helps to solve previously unsolved problems: e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. They also created exometabolomics analysis and the needed steps to visualise intra-cell / media processes. The authors demonstrated their new package via a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving the biological interpretability of omics data analysis.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

      Major comments affecting conclusions

      none.

      We thank the reviewer for this positive feedback on the evidence, reproducibility and clarity as well as the significance of our work, particularly given the reviewer's stated experience with metabolomics data analysis. We appreciate that there are no major comments from the reviewer.

      Minor comments

      Minor comments, important issues that could be addressed and possibly improve the clarity or general presentation of the tool. Please see all below.

      (1)

      1- You start with separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.

      Thanks, that's a good note and we have removed it from the abstract and introduction.

      (2)

      2- You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.

      We fully agree with the reviewer on imputation handling. The manuscript we cite, Wei et al. (https://doi.org/10.1038/s41598-017-19120-0), compared a multitude of missing value imputation methods and made this comparison strategy available as a web-based tool rather than as a code-based package such as an R package. Yet the reviewer is right that the web tool is no longer reachable. Hence, we have adapted the statement in our introduction (Line 61-62): "Moreover, there are tools that focus on specific steps of the pre-processing of feature intensities, which encompasses feature selection, missing value imputation (MVI)9 and data normalisation. For example, MetImp4 is a web-tool that includes and compares multiple MVI methods9."

      (3)

      3- The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.

      We appreciate the comment and have highlighted this in the abstract and introduction: "MetaProViz operates on annotated intensity values..." (Line 29 and 88).

      Given the newest advancements in metabolite identification using AI-based methods, the MetaProViz toolkit, with its focus on connecting metabolite IDs to prior knowledge, becomes increasingly valuable. We added this to our discussion (Line 484-488): "Given the imminent shift in metabolite identification through AI-based approaches, including language model-guided48 methods and self-supervised learning49, the growing number of identified metabolites will make the MetaProViz toolkit increasingly valuable for the community to gain functional insights."

      With regard to the introduction, where we mention some tools for peak annotation: we included this paragraph to set the basis by (I) listing the different steps of metabolomics data analysis and (II) pointing to well-known tools for those steps. We also have a dedicated paragraph on pathway-analysis challenges.

      (4)

      4- I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.

      We thank the reviewer for this positive feedback.

      (5)

      5- It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?

      We specifically chose the ccRCC cell line data as example since, for a multitude of cell lines, both media (exometabolomics) and intracellular metabolomics had been performed. The combination of both data types is only used in the biological regulated clustering (Fig. 5e-g), all other analyses do not require additional data modalities. We have not specifically tested how performance differs for this particular case as it would require multiple paired data (exometabolomics and intracellular metabolomics) taken at the same time and at different times.

      (6)

      6- Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.

      Thanks for raising this point. We had initially ordered the bars by intersection size, but we agree with the reviewer that, for our point, it makes sense to fix the order in the adapted plot to match the order of the original plot. We have done this (Fig 2a) and also extended the figure legend text of SFig. 2, which shows the individually performed adaptations summarized in Fig 2a.

      (7) (Planned)

      7- In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.

      We thank the reviewer for this suggestion. Since this is a complex problem, we will add a more explicit description to the results section by showcasing more details on how we exactly tackled this problem in the ccRCC example data.

      With regard to D- and L-alanine: even though chirality is an important biological feature, in a standard experiment we cannot distinguish whether we detect the L- or the D-amino acid. This is why we try to assign all possible IDs to increase the overlap with the prior knowledge. In Fig. 2b we showcase that this can potentially lead to mappings of the same measured feature to multiple pathways. For example, suppose we measure alanine, assign the PubChem IDs for L-Alanine, D-Alanine and Alanine, and try to map to metabolite-sets that include both L-Alanine and D-Alanine. This could fall into Scenario 6 (Fig. 2e), where across pathways there is a D-Alanine-specific one (Pathway 1) and an L-Alanine-specific one (Pathway 2). We can then decide whether to allow both mappings (many-to-one) or to exclude D-Alanine because we know our biological system is human and should primarily contain L-Alanine.

      (8) (Planned)

      8- In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?

      We have quantified the frequency for the example of translating the KEGG metabolite-set into HMDB IDs (Fig. 2c, left panel). Yet, we are not showcasing the quantification across the KEGG metabolite-sets with this plot. During the revision we will make the full results available in Extended Data Table 2, which currently only includes the results displayed in Fig. 2c.

      (9)

      9- QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.

      Yes, we totally agree with the reviewer on this. For this reason, we have applied CV only in instances where this is not leading to any condition-driven CV differences, but is truly feature-focused: (1) Function pool_estimation performs CV on the pool samples only, which are a homogeneous mixture of all samples, and hence can be used to assess feature variability. (2) Function processing performs CV on exometabolomics media samples (=blanks), which are also not impacted by different conditions.
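
      The idea of computing the CV only on homogeneous samples can be sketched as follows. This is an illustrative sketch, not the pool_estimation implementation: the function names, the toy intensities and the 30% cut-off are assumptions.

```python
import statistics

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        return float("inf")
    return statistics.stdev(values) / mean * 100.0

def flag_high_cv(pool_intensities, threshold_percent=30.0):
    """Return the CV per feature and the features whose CV exceeds the threshold.

    Only pool samples are passed in, so condition-driven differences cannot
    inflate the CV; the threshold is an assumed value.
    """
    cvs = {feature: coefficient_of_variation(values)
           for feature, values in pool_intensities.items()}
    flagged = sorted(f for f, cv in cvs.items() if cv > threshold_percent)
    return cvs, flagged

# Hypothetical intensities measured in four pool samples.
pool = {
    "alanine": [100.0, 102.0, 98.0, 101.0],
    "methionine": [50.0, 90.0, 20.0, 70.0],
}
cvs, flagged = flag_high_cv(pool)
```

      Because pool samples are a mixture of all samples, a high CV here points to measurement variability of the feature rather than to biology.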

      (10)

      10- Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).

      We have decided to only offer support for MNAR, since we would recommend MVI only if there is a biological basis for it.

      As mentioned in the response to your minor comment 2, Wei et al. (https://doi.org/10.1038/s41598-017-19120-0) compared a multitude of missing value imputation methods. They compared six imputation methods (i.e., QRILC, Half-minimum, Zero, RF, kNN, SVD) for MNAR and systematically measured their performance. They showed that QRILC and Half-minimum produced much smaller SOR values, showing consistently good performance on data with different numbers of missing variables. This was the reason for us to provide only Half-minimum.
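
      Half-minimum imputation itself is simple to state. The sketch below assumes that missing values are encoded as None and that the minimum is taken per feature across the observed samples; both choices are assumptions for illustration and may differ from the MetaProViz implementation.

```python
def half_minimum_impute(feature_values):
    """Replace missing values (None) with half of the smallest observed value.

    If no value was observed for the feature, the input is returned unchanged,
    because there is no minimum to derive a fill value from.
    """
    observed = [v for v in feature_values if v is not None]
    if not observed:
        return list(feature_values)
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in feature_values]

# Hypothetical intensities of one metabolite across four samples.
imputed = half_minimum_impute([10.0, None, 4.0, 8.0])
```

      The fill value sits below every observed intensity, which matches the MNAR assumption that values are missing because they fall under the detection limit.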

      (11) (Planned)

      11- In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.

      This is a good suggestion and refers to the steps described in Fig. 3a. We will create an overview table for this, add it into the Extended Data Table and refer to it in the results section.

      (12)

      12- Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.

      We decided to use PCA because it is the standard combination with Hotelling's T² outlier testing. Since PCA is a linear dimensionality reduction technique that preserves the overall variance in the data and has a clear mathematical foundation linked to the covariance structure, it fits the assumptions of Hotelling's T² outlier testing. Indeed, Hotelling's T² relies on the properties of the covariance matrix and the assumption of a multivariate Gaussian distribution. UMAP is a non-linear dimensionality reduction technique that prioritizes preserving local and global structure in a way that often results in good clustering visualization, but it distorts distances between clusters and does not have the same rigorous statistical underpinnings as PCA. PLS-DA focuses on maximizing the covariance between the variables and the class labels; although this is not commonly done, one could use the optimal latent variables for discrimination and apply Hotelling's T² to them. Yet, PLS-DA is supervised and actively tries to separate data points in the latent space, which can be misleading for outlier detection, where unbiased, unsupervised methods such as PCA that preserve global variance are advantageous.

      (13)

      13- Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.

      Yes, definitely. In fact, we have used the metadata analysis strategy also with proteomics data and it will work equally with any omics data type.

      (14)

      14- While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.

      We thank the reviewer for the suggestion of the aPEAR graphs. Following this suggestion, we have implemented a new function to enable clustering of the pathways based on overlapping metabolites (cluster_pk()). For more details regarding the method see also our response to Reviewer 1 (Comment 12) and our extended method section "Metabolite-set clustering" (Lines 656-671). We visualize the clustering results as a network graph, which we also included into Fig. 5f.
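
      One common way to group metabolite-sets by overlapping members is to link sets whose Jaccard similarity exceeds a cut-off and to treat connected sets as one cluster. The sketch below illustrates that general idea only; it is not the cluster_pk() implementation, and the similarity measure, the 0.5 threshold and the pathway contents are assumptions.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity: shared members divided by all members of both sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_by_overlap(sets, threshold=0.5):
    """Group metabolite-sets into connected components of an overlap graph."""
    names = sorted(sets)
    neighbours = {name: set() for name in names}
    for x, y in combinations(names, 2):
        if jaccard(sets[x], sets[y]) >= threshold:
            neighbours[x].add(y)
            neighbours[y].add(x)
    clusters, seen = [], set()
    for name in names:
        if name in seen:
            continue
        # Depth-first search collects every set reachable from this one.
        stack, component = [name], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters

# Hypothetical metabolite-sets.
pathways = {
    "pathway_A": {"alanine", "glutamate", "aspartate"},
    "pathway_B": {"alanine", "glutamate", "aspartate", "glycine"},
    "pathway_C": {"methionine", "cysteine"},
}
clusters = cluster_by_overlap(pathways)
```

      Sets that end up in the same cluster share most of their metabolites, so significant hits within one cluster are better read as one finding than as several independent ones.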

      The complete result of the KEGG enrichment can be found in Extended Data Table 1, Sheet 13 (Pathway enrichment analysis using KEGG on Young patient subset). The pathways are ranked by p.adjusted value and also include a score (FoldEnrichment) from Fisher's exact test (similar to NES scores in GSEA). Here one can find a total of seven pathways with a p.adjusted value For Fig. 5e we narrowed down to these two pathways based on the previous findings of dysregulated dipeptides (Fig. 5d), as we searched for a potential explanation of this observation.
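
      For readers unfamiliar with over-representation analysis, the one-sided Fisher's exact test and a fold enrichment score can be computed from four counts. The sketch below uses the hypergeometric tail and one common definition of fold enrichment; the function name, the counts and that definition are assumptions, not taken from MetaProViz.

```python
from math import comb

def over_representation(k, n, K, N):
    """One-sided Fisher's exact test for enrichment.

    k: selected metabolites that belong to the pathway
    n: all selected metabolites
    K: pathway members in the background
    N: size of the background
    Returns the p-value P(X >= k) and the fold enrichment (k/n) / (K/N).
    """
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    p_value = tail / comb(N, n)
    fold_enrichment = (k / n) / (K / N)
    return p_value, fold_enrichment

# Hypothetical counts: 6 of 20 selected metabolites fall into a pathway
# that contains 10 of the 100 background metabolites.
p_value, fold = over_representation(k=6, n=20, K=10, N=100)
```

      Pathway-level p-values obtained this way still need multiple-testing adjustment before they are ranked.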

      (15)

      15- Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?

      Downloading and parsing databases takes significant time; large ones such as RaMP or HMDB in particular might take minutes on a standard laptop. Our local cache speeds up the process by eliminating the need for repeated downloads. In the future, database access will be even faster: according to our plans, all prior knowledge will be accessible in an already parsed format through our own API (omnipathdb.org). The ambiguity analysis, which is a complex data transformation pipeline, and plotting by ggplot2, another key component of MetaProViz, are the slowest parts, especially when an analysis is performed for the first time and no cache can be used. This means there are a few slow operations, which complete in at most a few dozen seconds. However, the implementation and speed of these solutions do not fall behind what we commonly find in bioinformatics packages and, most importantly, the speed of MetaProViz does not pose an obstacle to its efficient use in analysis pipelines.

      (16)

      16- I clap to the authors for automated checks if selected methods are appropriate!

      Thank you, this is something we think is important to ensure correct analysis and circumvent misinterpretation.

      (17)

      17- My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.

      We fully agree that power calculations are very important. Yet, this should ideally happen prior to the user's experiment. MetaProViz analysis starts at a later time-point and power calculations should have been done before. With regard to the p-value histogram, we have implemented a similar measure, namely a density plot, which is plotted as a quality control measure within the MetaProViz differential analysis function. The density plot is a smoothed version of a histogram that represents the distribution as a continuous probability density function and can be used to assess whether the p-values follow a uniform distribution.
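
      The check behind a p-value histogram can be sketched with simple binning. The function name, the number of bins and the example p-values are assumptions, and the sketch covers the unsmoothed histogram the reviewer mentions rather than the density plot in MetaProViz.

```python
def pvalue_histogram(pvalues, bins=10):
    """Count p-values in equal-width bins over [0, 1]; p = 1.0 goes into the last bin."""
    counts = [0] * bins
    for p in pvalues:
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    return counts

# Hypothetical p-values from a differential analysis.
counts = pvalue_histogram([0.001, 0.01, 0.04, 0.5, 0.95, 1.0])
```

      Roughly equal counts across bins are what a test with no true effects would produce, whereas an excess in the first bin suggests that some features carry real signal.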

      (18)

      18- Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Regarding the clarity of the pathway enrichment and its functional insights, we have extended the figure legends of Fig. 4 and 5, clearly stated that, for the functional interpretation, MetalinkDB is the prior knowledge resource we used to identify the links for methionine (Line 367-368), and extended our summary statement to highlight that we combine the biological clustering with prior knowledge for the mechanistic insight (Line 380-381).

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      This manuscript introduces an R package, MetaProViz, for metabolomics data analysis (post annotation), aiming to solve a poor-analysis-choices problem and enable more people to do the analysis. MetaProViz not only guides people to select the best statistical method, but also helps to solve previously unsolved problems: e.g. multiple and variable metabolite names in different databases and their connections to prior knowledge. They also created exometabolomics analysis and the needed steps to visualise intra-cell / media processes. The authors demonstrated their new package via a kidney cancer (clear-cell renal cell carcinoma) dataset, stepping one step closer to improving the biological interpretability of omics data analysis.

      Major comments affecting conclusions: none.

      Minor comments, important issues that could be addressed and possibly improve the clarity or general presentation of the tool. Please see all below.

      1. You start with separating and talking about metabolomics and lipidomics, but lipidomics quickly disappears (especially beyond abstract/intro) - no real need to discuss lipidomics.
      2. You refer to the MetImp4 imputation web tool, but I cannot find an active website, manuscript, or R package for it, and the cited link does not load. This raises doubts about whether the tool is currently usable. Additionally, imputation choice should be guided by biological context and study design, not just by testing a few methods and selecting the one that performs best.
      3. The authors address key metabolomics issues such as ambiguous metabolite names and isoforms, and their focus on resolving mapping ambiguities and translating between database identifiers is highly valuable. However, the larger challenge of de novo identification and the "dark matter" of unannotated metabolites remains unresolved (initiatives as MassIVE might help in the future https://massive.ucsd.edu/ProteoSAFe/ ), and readers may benefit from clearer acknowledgement that MetaProViz does not operate on raw spectral data. The introduction currently emphasizes annotation, but since MetaProViz requires already annotated metabolite tables (and then deals with all the messiness), this space might be better used to frame the interpretability and pathway-analysis challenges that the tool directly addresses.
      4. I also really enjoyed you touching on the point of user-friendly but then inflexible and problem of reproducibility. We truly need well working packages for other bioinformaticians, rather than expecting wet-lab scientists to do all the analysis within the user interface.
      5. It would be helpful to explain why the authors chose cancer/RCC samples for the demonstration. Was it because the dataset included both media and cell measurements? Does the tool perform best when multiple layers of information are available from the same experiment?
      6. Figure 2B: The upset plots effectively show increased overlap after adaptation, but it would be easier to compare changes if the order of the intersection bars in the "adapted" plot matched the original. For example, while total intersections increased (251→285), the PubChem+KEGG overlap decreased (24→5), likely due to reallocation to the full intersection.
      7. In your example of D-alanine and L-alanine - you mention how chirality is important biological feature, but up to this point it's not clear how do you do translation exactly and in which situations this would be treated just as "alanine" and when the more precise information would be retained? You mention RaMP-DB knowledge and one to X mappings as well as your general guidance in the "methods" part, but it would be useful to describe in this publication how you exactly tackled this problem in the ccRCC case.
      8. In one to many mappings, it would be interesting to see quantification how frequently it was happening within a pathway or across pathways. I.e. Would going into pathway analysis "solve" the issue of "lost in translation" or not really?
      9. QC: the coefficient of variation (CV) helps identify features with high variability and thus low detection accuracy. Here it's important to acknowledge that if the feature is very variable between groups it can be extremely important, but if the feature is very variable within the group - only then one would have low trust in the accuracy.
      10. Missing value imputation - while missing not at random is a great way to deal with missingness, it would be great to have options for others (not just MNAR), as missingness is of a complex nature. If a pretty strong decision has been made, it would be good to support this by some supplementary data (i.e. how results change while applying various combinations of missingness and why choosing MNAR seems to be the most robust).
      11. In the pre-processing and imputation stages - it would be interesting to see a summary table of how many features are left after each stage.
      12. Is there a reason not to do UMAP or PLS-DA graphs for outlier detection? Doing more than PCA would help to have more confidence in removing or retaining outliers in the cases where biological relevance is borderline.
      13. Metadata vs metabolite features - can this be used beyond metabolomics (i.e. proteomics, transcriptomics, etc)? It can be always very useful when there are many metadata features and it's hard to pre-select beforehand which ones are the most biologically relevant.
      14. While authors discussed what KEGG pathways were significantly deregulated, it would be interesting to see all the pathways that were affected (e.g. aPEAR "bubble" graphs can show this (https://github.com/kerseviciute/aPEAR) , or something similar to NES scores). I appreciate the trickiness of it, but it would be quite interesting to see how authors e.g. Figure5e narrowed it down to the two pathways and how all the others looked like.
      15. Could you comment on the runtime of the pipeline? In particular, do the additional translation steps and use of multiple databases substantially affect computational speed?
      16. I clap to the authors for automated checks if selected methods are appropriate!
      17. My suggestion would be to also look into power calculation or p-value histogram. In your example you saw some clear signal, but very frequently research studies are under-sampled and while effect can be clearly seen, there are just not enough samples to have statistically significant hits.
      18. Overall functional parts are novel and next step in helping with data interpretability, but I still found it hard to read into functionally clear insights (re to pathways / functional groupings of metabolites) - especially as you have e.g. enzyme-metabolite databases etc. I think clarity there could be improved and would help to get your message more widely across.

      Significance

      This is a great tool and I can't wait to use it on many upcoming metabolomics projects! Authors tackle multiple ongoing issues within the field: from poor selection of statistical methods (they provide guidance or have default safer options) to the messiness of data annotation between databases and improving data interpretability. The field is still evolving quickly, and it's impossible to solve all problems with one package; thus some limitations within the package could be seen as a bit rigid. Nonetheless, this fully steps toward filling an existing methodological gap. All bioinformaticians doing metabolomic analysis, or those learning how to do it, will greatly benefit from this knowledge.

      I myself lead a team of 6 bioinformaticians, and we do analysis for researchers, clinicians, drug discovery, and various companies. We run internal metabolomics pipelines every day and fully sympathise with the problems addressed by the authors.

    3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Schmidt et al report the development of MetaProViz, an integrated R package to process, analyze and visualize metabolomics data, including integration with prior knowledge. The authors then go on to demonstrate utility by analyzing several metabolomes of cell lines, media and patient samples from kidney cancer. The manuscript provides a concise description of key challenges in metabolomics that the authors identify and address in their software. The examples are helpful and illustrative, although I should point out that I lack the expertise to evaluate the R package itself. I only have a few very minor comments.

      Minor comments:

      1. Figure 2D, E: While the schematics are fairly intuitive, a brief figure legend description of what the different scenarios etc. represent would make this easier to grasp.
      2. Fig. 4: The authors briefly state that they integrate prior knowledge to identify the changes in methionine metabolism in kidney cancer, but it is not clear how exactly they contribute to this conclusion. It could be helpful to expand a bit on this to better illustrate how MetaProViz can be used to integrate prior knowledge into the analysis workflow.
      3. Given the functional diversity among metabolites -central to diverse pathways, are key signaling molecules, restricted functions, co-variation within a pathway - I wonder how informative approaches such as PCA or enrichment analyses are for identifying metabolic drivers of a (patho)physiological state. To some extent, this can be addressed by integrating prior knowledge, and it would be helpful if the authors could comment on (and if applicable explain) whether/how this is integrated into MetaProViz.

      Significance

      This is a very significant advance from one of the leading groups in the field that is likely to enhance metabolomics data analysis in the wider community.

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      The author presents a new method for microRNA target prediction based on (1) a publicly available pretrained Sentence-BERT language model that the author fine-tunes using MeSH information and (2) downstream classification analysis for microRNA target prediction. In particular, the author's approach, named "miRTarDS", attempts to solve the microRNA target prediction problem by utilizing disease information (i.e., semantic similarity scores) from their language model. The author then compares the prediction performance with other sequence- and disease-based methods and attempts to show that miRTarDS is superior or at least comparable to existing methods. The author's general approach to this microRNA target prediction problem seems promising, but fails to demonstrate concrete computational evidence that miRTarDS outperforms other existing methods. The author's claim that disease information-based language models are sufficient is unfounded. The manuscript requires substantial rewriting and reorganization for readers with a strong background in biomedical research.

      We appreciate the reviewer’s careful examination of modeling, benchmarking, and interpretation, and we are particularly encouraged that they found the proposed method promising. We will make corresponding revisions to the manuscript based on the reviewer’s comments.

      A major issue related to the author's claim of computational advance of miRTarDS: The author does not introduce existing biomedical-specific language models, and does not compare them against miRTarDS's fine-tuned model. The performance of miRTarDS is largely dependent on the semantic embedding of disease terms. The author shows in Figure 5 that MeSH-based fine-tuning leads to a substantial improvement in MeSH-based correlation compared to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1" without sacrificing a large amount of BIOSSES-based correlation. However, the author does not compare the performance of MeSH- and BIOSSES-based correlation with existing language models such as ChatGPT, BioBERT, PubMedBERT, and more. Also, the substantial improvement in MeSH-based correlation is a mere indication that the MeSH-based fine-tuning strategy was reasonable and not that it's superior to the publicly available pretrained SBERT model "multi-qa-MiniLM-L6-cos-v1".

      We thank the reviewer for the constructive suggestions regarding the benchmarking of language models. We acknowledge that the performance of miRTarDS largely depends on the semantic embeddings of disease terms. In the revision, we will therefore: 1) conduct a literature review to introduce existing biomedical-specific language models, and 2) perform a side-by-side comparison between our fine-tuned model and these existing models, to evaluate the model’s capabilities more comprehensively.

      Another major issue is in the author's claim that disease-information from miRTarDS's language model is "sufficient" for accurate microRNA target prediction. Available microRNA targets with experimental evidence are largely biased for those with disease implications that have been reported in the biomedical literature. It's possible that their language model is biased by existing literature that has also been used to build microRNA target databases. Therefore, it is important that the author provides strong evidence that excludes the possibility of data leakage and circularity. Similar concerns are prevalent across the manuscript, and so I highly recommend that the author reassess the evaluation frameworks and account for inflated performance, biased conclusions, and self-confirming results.

      We thank the reviewer for the comment. We recognize that existing experimentally validated microRNA targets may be biased toward those reported in the biomedical literature as disease-related. To mitigate this bias, we used the K-Nearest Neighbors (KNN) method to extract predicted microRNA targets whose numbers of miRNA-disease and gene-disease entries closely match those of the experimentally validated microRNA targets. We then applied Positive-Unlabeled (PU) learning to classify the two groups. PU learning is designed for scenarios in which only a subset of the training data is explicitly labeled as positive while the remaining data are unlabeled, with the unlabeled set containing both potential positives and true negatives, which makes it highly suitable for the application context of this manuscript [1]. Preliminary results show that after applying the new data extraction and classification approach, model performance drops to around F1=0.73 (the MISIM method also shows a decline, with F1 around 0.58; detailed code is available on GitHub). The specific reasons for this drop require further investigation.
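
      The matching step described above can be illustrated with a minimal sketch. It assumes each miRNA-target pair is summarised by two counts (miRNA-disease entries and gene-disease entries) and that a predicted pair is matched to each validated pair by Euclidean distance in that count space, without replacement. The function name, the pair identifiers, and the counts are hypothetical; the response does not specify the exact procedure, and the PU-learning classifier itself is not shown.

```python
import math


def match_by_disease_counts(positives, candidates, k=1):
    """For each validated pair, pick the k unused predicted pairs whose
    disease-entry counts are closest (nearest neighbours in count space).

    positives / candidates: dict mapping a pair id to a tuple
    (n_mirna_disease_entries, n_gene_disease_entries).
    Returns a dict: validated pair id -> list of matched predicted pair ids.
    """
    unused = dict(candidates)
    matches = {}
    for pos_id, pos_counts in positives.items():
        # Rank the remaining candidates by Euclidean distance to this pair.
        ranked = sorted(unused, key=lambda cid: math.dist(pos_counts, unused[cid]))
        chosen = ranked[:k]
        matches[pos_id] = chosen
        for cid in chosen:
            del unused[cid]  # match without replacement
    return matches


# Hypothetical toy data: (miRNA-disease entries, gene-disease entries).
validated = {"miR-A|GENE1": (12, 30), "miR-B|GENE2": (3, 5)}
predicted = {"miR-A|GENE9": (11, 28), "miR-C|GENE4": (4, 6), "miR-D|GENE7": (40, 2)}
matched = match_by_disease_counts(validated, predicted)
```

      Matching on entry counts is meant to keep the unlabeled set from being separable simply because validated pairs are better annotated; whether it fully removes that bias is the open question raised in the review.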

      Last but not least, the manuscript requires a deeper and more careful description and computational encoding of microRNA biology. I'd advise the author to include an expert in microRNA biology to improve the quality of this manuscript. For example, the author uses the pre-miRNA notation in place of the mature miRNA notation to maintain computational encoding consistency across databases. However, the mature microRNA notation (the "-3p" or "-5p" suffix) is critical, as the 3p and 5p mature microRNAs have different seed sequences and thus different mRNA targets. The 3p mature microRNA would most likely not target an mRNA targeted by the 5p mature microRNA.

      We thank the reviewer for the critique and suggestion. We fully agree with the reviewer that the distinction between the 3p and 5p mature strands is critical for determining mRNA targeting, as they possess distinct seed sequences. In our study, we relied on the miRNA–disease associations provided by the HMDD database, which annotates interactions at the pre-miRNA level: “… the enriched functions of each mature miRNA are aggregated to the corresponding miRNA precursor.” [2] Furthermore, existing literature suggests that the pre-miRNA level can be appropriate and informative for disease association analyses: “Compared with the mature miRNA method, the pre-miRNA method is more useful for studying disease association.” [3] We also find that, in some cases, both strands cooperate to regulate the same or complementary pathways [4]. We acknowledge the reviewer’s point as an important consideration for future revision. We plan to consult or collaborate with biologists to enhance the quality of the manuscript in biology.

      Reviewer #2 (Public review):

      This study introduces a novel knowledge-driven approach, miRTarDS, which enables microRNA-Target Interaction (MTI) prediction by leveraging the disease association degree between a miRNA and its target gene. The core hypothesis is that this single feature is sufficient to distinguish experimentally validated functional MTIs from computationally predicted MTIs in a binary classification setting. To quantify the disease association, the authors fine-tuned a Sentence-BERT (SBERT) model to generate embeddings of disease descriptions and compute their semantic similarity. Using only this disease association feature, miRTarDS achieved an F1 score of 0.88 on the test set.
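
      As a minimal sketch of the similarity step mentioned in this summary, cosine similarity between two embedding vectors can be computed as below. The three-dimensional vectors are hypothetical stand-ins for SBERT embeddings of disease descriptions, and how the manuscript aggregates pairwise similarities into a single disease association degree is not specified here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings for one miRNA-linked and one gene-linked disease.
mirna_disease_vec = [1.0, 0.0, 1.0]
gene_disease_vec = [1.0, 1.0, 0.0]
similarity = cosine_similarity(mirna_disease_vec, gene_disease_vec)
```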

      We thank the reviewers for their positive feedback, especially for their recognition of the novelty of this manuscript.

      Strengths:

      The primary strength is the innovative use of the disease association degree as an independent feature for MTI classification. In addition, this study successfully adapts and fine-tunes the Sentence-BERT (SBERT) model to quantify the semantic similarity between biomedical texts (disease descriptions). This approach establishes a critical pathway for integrating powerful language models and the vast growth in clinical/disease data into biochemical discovery, like MTI prediction.

      We would like to thank the reviewer again for their positive feedback. We appreciate their recognition of the novelty of our work, as well as their acknowledgment that the proposed method paves the way for integrating language models with clinical/disease data into biochemical discovery.

      Weaknesses:

      The main weakness lies in its definition of the ground-truth dataset, which serves as a foundation for methodological evaluation. The study defines the Negative Set as computationally predicted MTIs that lack experimental evidence. However, the absence of experimental validation does not equate to non-functionality. Similarly, the miRAW sets are classified by whether the target and miRNA could form a stable duplex structure according to RNA structure prediction. This definition is biologically irrelevant, as duplex stability does not fully encapsulate the complex in vivo binding of miRNAs within the AGO protein complex.

      We thank the reviewers for their constructive feedback. We have realized that treating predicted MTI as a negative class may pose some issues. Therefore, we have decided to adopt Positive Unlabeled (PU) Learning in subsequent updates. This classification method can be applied to datasets such as ours, which contain only positive classes and lack negative ones [1]. We used the miRAW dataset to enable a horizontal comparison of our method with traditional sequence-based prediction approaches. We acknowledge that miRAW may overlook some biological insights, and we plan to optimize the construction of test datasets in the future. Some preliminary explorations have already been conducted, and the relevant code is available on GitHub.

      Furthermore, we will make the following revisions: 1) clearly specify the version of miRBase and incorporate more miRNA-related databases; 2) conduct a further literature review on miRNA biological mechanisms to strengthen the biological content of the manuscript; 3) perform a more comprehensive evaluation of the model’s performance; and 4) attempt to identify representative MTIs that have been overlooked by existing prediction tools but can be predicted by our proposed method.

      References

      (1) Li, F., Dong, S., Leier, A., Han, M., Guo, X., Xu, J., ... & Song, J. (2022). Positive-unlabeled learning in bioinformatics and computational biology: a brief review. Briefings in Bioinformatics, 23(1), bbab461.

      (2) Huang, Z., Shi, J., Gao, Y., Cui, C., Zhang, S., Li, J., ... & Cui, Q. (2019). HMDD v3.0: a database for experimentally supported human microRNA–disease associations. Nucleic Acids Research, 47(D1), D1013-D1017.

      (3) Wang, H., & Ho, C. (2023). The human pre-miRNA distance distribution for exploring disease association. International Journal of Molecular Sciences, 24(2), 1009.

      (4) Mitra, R., Adams, C. M., Jiang, W., Greenawalt, E., & Eischen, C. M. (2020). Pan-cancer analysis reveals cooperativity of both strands of microRNA that regulate tumorigenesis and patient survival. Nature Communications, 11(1), 968.

    1. Reviewer #2 (Public review):

      Summary:

      The manuscript describes a combined computational and experimental approach to investigate the ABHD5 binding to and insertion into membranes.

      Strengths:

      Mutational experiments support computational findings obtained on ABHD5 membrane insertion with enhanced-sampling atomistic simulations.

      Weaknesses:

      While the addressed problem is interesting, I have several concerns, which fall into two categories:

      (A) I see statements throughout the manuscript, e.g. on PNPLA activation, that are not supported by the results.

      (B) The presentation of the computational and experimental results lacks in part clarity and detail.

      Comments and questions on (A):

      (1) I think the following statements in the abstract, which go beyond ABHD5 membrane binding, are not supported by the presented data:

      the addition "to control lipolytic activation" in the 3rd sentence of the abstract.

      further below ".... transforming ABHD5 into an active and membrane-localized regulator".

      (2) The authors state in the Introduction (page numbers and line numbers are missing to be more specific):

      "We hypothesize that binding of ABHD5 alters the nanoscale chemical and biophysical properties of the LD monolayer, which, combined with direct protein-protein interactions, enables PNPLA paralogs to access membrane-restricted substrates. This regulatory mechanism represents a paradigm shift from conventional enzyme-substrate interactions to sophisticated allosteric control systems that operate at membrane interfaces."

      This hypothesis and the suggested paradigm shift are not supported by the data. Protein-protein interactions are not considered. What is meant by "sophisticated allosteric control"?

      (3) The authors state in the Results section:

      "We hypothesize that this TAG nanodomain is critical for ABHD5-activated TAG hydrolysis by PNPLA2." In previous pages, the authors state the location of the nanodomain: "TAG nanodomain under ABHD5".

      If the nanodomain is located under ABHD5, how can it be accessible to PNPLA2? To my understanding, ABHD5 then sterically blocks access of PNPLA2 to the TAG nanodomain.

      (4) Another statement: "Our findings suggest that ABHD5-mediated membrane remodeling regulates lipolysis in part by regulating PNPLA2 access to its TAG substrate."

      I don't see how the reported results support this statement (see point 3 above).

      Comments and questions on (B):

      (1) The authors state that the GaMD simulations started "from varying conformations observed during CGMD".

      What is missing is a clear description of the CGMD simulation conformations, and the CG simulations as a whole, prior to the results section on GaMD. The authors use standard secondary and tertiary constraints in the Martini CG simulations. Do the authors observe some (constrained) conformational changes of ABHD5 already in the CG simulations (depending on the strength of the constraints)? Or do the conformational changes occur exclusively in the GaMD simulations? Both are fine, but this needs to be described.

      (2) The authors write: "Three replicas of GaMD were performed."

      Do these replicas lead to similar, or statistically identical, membrane-bound ABHD5 conformations? Is this information, i.e. a statistical analysis of differences in the replica runs, already included in the manuscript?

      (3) The authors state on the hydrogen exchange results:

      "HDX-MS provided orthogonal experimental evidence for the dynamics of the lid. In solution, a peptide (residues 200-226) spanning the lid helix displayed a bimodal isotopic distribution (Fig. S4), indicating the coexistence of different conformations. Upon LD binding, this distribution shifted to a single, low-exchange peak, demonstrating stabilization of the membrane-bound conformation with reduced solvent accessibility. These experimental observations corroborate our MD simulations."

      I find this far too short to be understandable. Also, there are no computational results of ABHD5 in solution that show a bimodal conformational distribution of the lid helix, which is observed in the hydrogen exchange experiments. Which aspects of the MD simulations are corroborated?

    1. Reviewer #1 (Public review):

      Summary:

      The goal of the study was to address the question of the degree to which social position in a group is a stable trait that persists across conditions. Reinwald et al. use a custom-built cage system with automated tracking and continuous testing for social dominance that does not require intervention by the experimenter. Remixing of individuals from different groups revealed that social position was rather stable and not really predictable from other measures that were taken. The authors conclude that social position is multifaceted but dependent on characteristics like personality traits.

      Strengths:

      (1) Reductionistic, highly controlled setting that allows for the control of many confounding variables.

      (2) Very interesting and important question.

      (3) Confirms the emergence of inter-individual behavior-driven differences in inbred mice in a shared environment.

      (4) Innovative paradigm and experimental setup.

      (5) Fresh perspective on an old question that makes the best use of modern technology.

      (6) Intelligent use of behavioral and cognitive covariables to generate a non-social context.

      (7) Bold and almost provocative conclusion, inviting discussion and further elaboration.

      Weaknesses:

      (1) Reductionistic, highly controlled setting that blends out much of the complexity of social behavior in a community.

      (2) The motivation to enter the test tube is not "trait" (or at least not solely a trait) but the basic need to reach food and water; chasing behavior would be less dependent on this stimulus.

      (3) Dominance is only one aspect of sociality; social structure is reduced to rank. The information that might lie in the chasing behavior is not optimally used to explain social behavior beyond the rank measure.

      (4) Focus on rank bears the risk of overgeneralization for readers not familiar with the context.

      (5) Conclusion only valid for the reductionistic setting, in which the environment, social and non-social, changes only within narrow limits, and in which the mouse population does not face challenges.

      (6) Animals are not naive at the beginning of the experiment, but are already several weeks old.

      In summary, this is a wonderful study, but not one that is easy to interpret. The bold conclusion is valid only within the constraints of the study, but nevertheless points in an important direction. The paradigm is clever and could be used for many interesting follow-ups.

      To define social position as a personality trait will elicit strong opposition and much debate; the nuances of the paper might be lost on many readers and call for the (re)-consideration of many concepts that are touched. I find this attitude a strength of the paper, but the approach bears the risk of misunderstanding.

    2. Reviewer #2 (Public review):

      Summary:

      This manuscript presents the "NoSeMaze", a novel automated platform for studying social behavior and cognitive performance in group-housed male mice. The authors report that mice form robust, transitive dominance hierarchies in this environment and that individual social rank remains largely stable across multiple group compositions. They further demonstrate that social dominance and aggressive behaviors, like chasing, are partially dissociable and that dominance traits are independent of non-social cognitive performance. The study includes a genetic manipulation of oxytocin receptor expression in the anterior olfactory nucleus, which showed only transient effects on social rank.

      Strengths:

      (1) Innovative Methodology: The NoSeMaze platform is a technically elegant and conceptually well-integrated system that enables fully automated, long-term monitoring of both social and cognitive behaviors in large groups of group-housed mice. It combines tube-test-like dominance contests, voluntary chase-escape interactions, and an embedded operant olfactory discrimination task within a single, ethologically relevant environment. This modular design allows for high-throughput, minimally invasive behavioral assessment without the need for repeated handling or artificial isolation.

      (2) Experimental Scale and Rigor: The study includes 79 male mice and over 4,000 mouse-days of observation across multiple group reshufflings. The use of RFID-based identification, automated data logging, and longitudinal design enables robust quantification of individual trait stability and group-level social structure.

      (3) Multidimensional Behavioral Profiling: The integration of social (tube dominance, proactive chasing), physical (body weight), and cognitive (olfactory learning task) measures offers a rich, multi-dimensional profile of each individual mouse. The authors' finding that social dominance traits and non-social cognitive performance are largely uncorrelated reinforces emerging models of orthogonal behavioral trait axes or "animal personalities".

      (4) Clarity and Data Analysis: The analytical framework is well-suited to the study's complexity, with appropriate use of dominance metrics, mixed-effects models, and permutation tests. The analyses are clearly explained, statistically rigorous, and supported by transparent supplementary materials.

      Weaknesses:

      (1) Conceptual Novelty and Prior Work: While the study is carefully executed and methodologically innovative, several of its core findings reaffirm concepts already established in the literature. The emergence of stable, transitive social hierarchies, the persistence of individual differences in social behavior, and the presence of non-despotic social structures have all been previously reported in mice, including under semi-naturalistic conditions (e.g., Fan et al., 2019; Forkosh et al., 2019). Although this work extends those findings with greater behavioral resolution and scale, the manuscript would benefit from a clearer articulation of what is genuinely novel at the conceptual level, beyond the technological advance.

      (2) Role of OXTR Deletion: The inclusion of the OXTR manipulation feels somewhat disconnected from the manuscript's central aims. The effects were minimal and transient, and the authors defer full interpretation to a separate study.

      (3) Scope Limitations (Sex and Age): The study is limited to male mice, and although this is acknowledged, the title and overall framing imply broader generalizability. This sex-specific focus represents a common but problematic bias. Additionally, results from the older mouse cohort are under-discussed; if age had no effect, this should be explicitly stated.

      (4) Ambiguity of Dominance as a Construct: While the study robustly quantifies social rank and hierarchy structure, the broader functional meaning of "dominance" remains unclear. As in prior work (e.g., Varholick et al., 2019), dominance rank here shows only weak associations with physical attributes (e.g., body weight), cognitive strategy, or neuromodulatory manipulation (OXTR deletion). This recurring pattern, where rank metrics are reliably established yet poorly predictive of other behavioral or biological traits, raises important questions about what such measures actually capture. In particular, it challenges the assumption that outcomes in paradigms like the tube test or chase frequency necessarily reflect dominance per se, rather than other constructs.

    3. Reviewer #3 (Public review):

      Reinwald et al. present the NoSeMaze, a semi-natural behavioral system designed to track social behaviors alongside reinforcement-learning in large groups of mice. Accumulating more than 4,000 days of behavioral monitoring, the authors demonstrate that social rank (determined by tube competitions) is a stable trait across shuffled cohorts and correlated with active chasing behaviors. The system also provides a solid platform for long-term measurements of reinforcement learning, including flexibility, response adaptation, and impulsiveness. Yet, the authors show that social ranking and chasing are mostly independent of these cognitive traits, and both seem mostly independent of oxytocin signaling in the AON.

      Strengths:

      (1) The neuroethological approach for automated tracking of several mice under semi-natural conditions is still rare in social behavioral research and should be encouraged.

      (2) The assessment of dominance by two independent measures, i.e., spontaneous tube competitions and proactive chasing, is innovative and valuable.

      (3) The integration of a long-term reinforcement-learning module into the semi-natural system provides novel opportunities to combine cognitive traits into social personality assessments.

      (4) The open-source system provides a valuable resource for the scientific community.

      Limitations:

      (1) Apparent ambiguity and inconsistency in age structure and cohort participation across rounds, raising concerns about uncontrolled confounds.

      (2) Chasing behavior appears more stable than tube-test competitions (Figure 4D vs. Figure 3D), which challenges the authors' decision to treat tube competitions as the primary basis for hierarchy determination.

      Major concerns:

      (1) Unclear and inconsistent handling of age groups and repeated sampling. The manuscript repeatedly refers to "younger" and "older" adults, but it is unclear whether age was ever controlled for or included in models. Some mice completed only one round, others 2-5 rounds, without explanation of the criteria or balancing.

      (2) Stability of chasing appears stronger than the stability of tube competitions. Figure 4D shows highly consistent chasing behavior across weeks, while Figure 3D shows weaker and more variable correlations for tube-based David scores. This is also evident from Figure 5A-B,D. Thus, it appears that chasing, which serves to quantify dominance in similar semi-natural setups, may be a more reliable and behaviorally meaningful measure of dominance than the incidental tube competitions.

      (3) Unbalanced participation across rounds compromises stability analyses. Stability analyses (e.g., ICCs, round-to-round correlations) assume comparable sampling across individuals. However, some mice contribute 1 round, others 2, 3, 4, and even 5 rounds. This imbalance may inflate stability estimates or confound group reshuffling effects, and the rationale for variable participation is not explained.
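
      For readers unfamiliar with the David scores referred to in concern (2), the sketch below computes the standard uncorrected David's score from a matrix of pairwise wins: DS = w + w2 - l - l2, where w and l sum each animal's win and loss proportions and w2 and l2 weight them by the opponents' own w and l. The win counts are hypothetical, and whether the manuscript uses this variant or a chance-corrected one is not stated in this review.

```python
def davids_scores(wins):
    """wins[i][j] = number of contests animal i won against animal j."""
    ids = list(wins)
    # P[i][j]: proportion of i-vs-j contests won by i (0 if they never met).
    P = {i: {} for i in ids}
    for i in ids:
        for j in ids:
            if i == j:
                continue
            total = wins[i].get(j, 0) + wins[j].get(i, 0)
            P[i][j] = wins[i].get(j, 0) / total if total else 0.0
    # w / l: summed win and loss proportions for each animal.
    w = {i: sum(P[i].values()) for i in ids}
    l = {i: sum(P[j][i] for j in ids if j != i) for i in ids}
    # w2 / l2: the same sums weighted by each opponent's own w or l.
    w2 = {i: sum(P[i][j] * w[j] for j in P[i]) for i in ids}
    l2 = {i: sum(P[j][i] * l[j] for j in ids if j != i) for i in ids}
    return {i: w[i] + w2[i] - l[i] - l2[i] for i in ids}


# Hypothetical tube-contest outcomes for three mice.
tube_wins = {
    "A": {"B": 3, "C": 4},
    "B": {"A": 1, "C": 2},
    "C": {"A": 0, "B": 2},
}
scores = davids_scores(tube_wins)
```

      Because the score depends on how many contests each dyad contributes, sparse or unevenly sampled tube encounters make it noisier, which is relevant to the stability comparison with chasing raised above.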

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Thank you so much for your comprehensive and insightful assessment of our manuscript. We appreciate your recognition of the novelty of our experimental design and the utility of our computational framework for interpreting visual remapping across the lifespan and in clinical populations. We are very grateful for your suggestions regarding the narrative flow, which have helped us to improve the manuscript's focus and coherence. Our responses to your specific concerns are detailed below.

      (1) Relevance of the figure-copy results (pp. 13-15). Is it necessary to include the figure-copy task results within the main text? The manuscript already presents a clear and coherent narrative without this section. The figure-copy task represents a substantial shift from the LOCUS paradigm to an entirely different task that does not measure the same construct. Moreover, the ROCF findings are not fully consistent with the LOCUS results, which introduces confusion and weakens the manuscript's coherence. While I understand the authors' intention to assess the ecological validity of their model, this section does not effectively strengthen the manuscript and may be better removed or placed in the Supplementary Materials.

      We thank the reviewer for their perspective regarding the narrative flow and the transition between the LOCUS paradigm and the ROCF results. However, we remain keen to retain these findings in the main text, as they provide critical ecological and clinical validation for the computational mechanisms identified in our study.

      We think these results strengthen the manuscript for the following main reasons:

      (1) The ROCF we used is a standard neuropsychological tool for identifying constructional apraxia. Our results bridge the gap between basic cognitive neuroscience and clinical application by demonstrating that specific remapping parameters—rather than general memory precision—predict real-world deficits in patients.

      (2) The finding that our winning model explains approximately 62% of the variance in ROCF copy scores across all diagnostic groups further indicates that these parameters from the LOCUS task represent core computational phenotypes that underpin complex, real-life visuospatial construction (copying drawings).

      (3) Previous research has often observed only a weak or indirect link between drawing ability and traditional working memory measures, such as digit span (Senese et al., 2020). This was previously attributed to “deictic” strategies—like frequent eye and hand movements—that minimise the need to hold large amounts of information in memory (Ballard et al., 1995; Cohen, 2005; Draschkow et al., 2021). While our study was not exclusively designed to catalogue all cognitive contributions to drawing, the findings provide significant and novel evidence indicating that transsaccadic integration is a critical driver of constructional (copying drawing) ability. By demonstrating this link, the results provide evidence to stimulate a new direction for future research, shifting the focus from general memory capacity toward the precision of spatial updating across eye movements.

      In summary, by including the ROCF results in the main text, we provide evidence for a functional role for spatial remapping that extends beyond perceptual stability into the domain of complex visuomotor control. We have expanded on these points throughout the revised manuscript:

      In the Introduction: p.2:

      “The clinical relevance of these spatial mechanisms is underscored by significant disruptions to visuospatial processing and constructional apraxia—a deficit in copying and drawing figures—observed in neurodegenerative conditions such as Alzheimer's disease (AD) and Parkinson's disease (PD).[20,21] This raises a crucial question: do clinical impairments in complex visuomotor tasks stem from specific failures in transsaccadic remapping? If so, the computational parameters that define normal spatial updating should also provide a mechanistic account of these clinical deficits, differentiating them from general age-related decline.”

      p.3: "Finally, by linking these mechanistic parameters to a standard clinical measure of constructional ability (the Rey-Osterrieth Complex Figure task), we demonstrate that transsaccadic updating represents a core computational phenotype underpinning real-world visuospatial construction in both health and neurodegeneration."

      In the Results:

      “To assess whether the mechanistic parameters derived from the LOCUS task represent core phenotypes of real-world visuospatial abilities, we also instructed all participants to complete the Rey-Osterrieth Complex Figure copy task (ROCF; Figure 7A) on an Android tablet using a digital pen (see examples in Figure 7B; all Copy data are available in the open dataset: https://osf.io/95ecp/). The ROCF is a gold-standard neuropsychological tool for identifying constructional apraxia.[29] Historically, drawing performance has shown only weak or indirect correlations with traditional working memory measures.[30] This disconnect has been attributed to active visual-sampling strategies—frequent eye movements that treat the environment as an external memory buffer, minimising the necessity of holding large volumes of information in internal working memory.[3–5]

      We hypothesised that drawing accuracy is primarily constrained by the precision of spatial updating across frequent saccades rather than raw memory capacity. To evaluate the ecological validity of the identified saccade-updating mechanism, we modelled individual ROCF copy scores across all four groups using the estimated (maximum a posteriori) parameters from the winning “Dual (Saccade) + Interference” model (Model 7; Figure 8) as regressors in a Bayesian linear model. Prior to inclusion, each regressor was normalised by dividing by the square root of its variance.

      This model successfully explained 61.99% of the variance in ROCF copy scores, indicating that these computational parameters are strong predictors of real-world constructional ability (Figure 8A). … This highlights the critical role of accurate remapping based on saccadic information; even if the core saccadic update mechanism is preserved across groups (as shown in previous analyses), the precision of this updating process is crucial for complex visuospatial tasks. Moreover, worse ROCF copy performance is associated particularly with higher initial angular encoding error. This indicates that imprecision in the initial registration of angular spatial information contributes to difficulties in accurately reproducing complex visual stimuli.”
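
      The two numerical steps in the quoted Results passage, scaling each regressor by the square root of its variance and reporting the proportion of variance explained, can be sketched as follows. This is an illustration only: the values are hypothetical, the sample variance is assumed (the passage does not say which variance estimator was used), and the coefficient of determination is computed from given fitted values rather than from the Bayesian linear model used in the manuscript.

```python
import math
import statistics


def scale_by_sd(values):
    """Divide each value by the square root of the sample variance."""
    sd = math.sqrt(statistics.variance(values))
    return [v / sd for v in values]


def r_squared(observed, predicted):
    """Proportion of variance in `observed` explained by `predicted`."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical parameter estimates for five participants.
saccade_error = [0.8, 1.1, 1.9, 2.4, 3.0]
scaled = scale_by_sd(saccade_error)  # now has unit sample variance

# Hypothetical ROCF copy scores and fitted values from some linear model.
observed = [34.0, 33.0, 29.0, 27.0, 22.0]
predicted = [35.0, 32.0, 29.0, 26.0, 23.0]
fit = r_squared(observed, predicted)
```

      Scaling by the standard deviation without centring puts the regression coefficients on a common scale, so their magnitudes can be compared across parameters.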

      In the Discussion:

      “Importantly, our computational framework establishes a direct mechanistic link between transsaccadic updating and real-world constructional ability. Specifically, higher saccade and angular encoding errors contribute to poorer ROCF copy scores. By mapping these mechanistic estimates onto clinical scores, we found that the parameters derived from our winning model explain approximately 62% of the variance in constructional performance across groups. These findings suggest that the computational parameters identified in the LOCUS task represent core phenotypes of visuospatial ability, providing a mechanistic bridge between basic cognitive theory and clinical presentation.

      This relationship provides novel insights into the cognitive processes underlying drawing, specifically highlighting the role of transsaccadic working memory. Previous research has primarily focused on the roles of fine motor control and eye-hand coordination in this skill.[4,50–55] This is partly because of a consistent failure to find a strong relation between traditional memory measures and copying ability.[4,31] For instance, common measures of working memory, such as digit span and Corsi block tasks, do not directly predict ROCF copying performance.[31,56] Furthermore, in patients with constructional apraxia, these memory performance measures often remain relatively preserved despite significant drawing impairments.[56–58] In the literature, this lack of association has often been attributed to “deictic” visual-sampling strategies, characterised by frequent eye movements that treat the environment as an external memory buffer, thereby minimising the need to maintain a detailed internal representation.[4,59] In a real-world copying task, the ROCF requires a high volume of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified here. Recent eye-tracking evidence confirms that patients with AD exhibit significantly more saccades and longer fixations during figure copying compared to controls, potentially as a compensatory response to transsaccadic working memory constraints.[56] This high-frequency sampling (averaging between 150 and 260 saccades for AD patients compared to approximately 100 for healthy controls) renders the task highly dependent on the precision of dynamic remapping signals.[56] To ensure this relationship was not driven by a general "g-factor" or non-spatial memory impairment, we further investigated the role of broader cognitive performance using the ACE-III Memory subscale. We found that the relationship between transsaccadic working memory and ROCF performance remains highly significant, even after controlling for age, education, and ACE-III Memory subscore. This suggests that transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.

      In other words, even when visual information is readily available in the world, the act of copying depends critically on working memory across saccades. This reveals a fundamental computational trade-off: while active sampling strategies (characterised by frequent eye-hand movements) effectively reduce the load on capacity-limited working memory, they simultaneously increase the demand for precise spatial updating across eye movements. By treating the external world as an "outside" memory buffer, the brain minimises the volume of information it must hold internally, but it becomes entirely dependent on the reliability with which that information is remapped after each eye movement. This perspective aligns with, rather than contradicts, the traditional view of active sampling, which posits that individuals adapt their gaze and memory strategies based on specific task demands.[3,60] Furthermore, this perspective provides a mechanistic framework for understanding constructional apraxia; in these clinical populations, the impairment may not lie in a reduced memory "span," but rather in the cumulative noise introduced by the constant spatial remapping required during the copying process.[58,61]

      Beyond constructional ability, these findings suggest that the primary evolutionary utility of high-resolution spatial remapping lies in the service of action rather than perception. While spatial remapping is often invoked to explain perceptual stability,[11–13,15] the necessity of high-resolution transsaccadic memory for basic visual perception is debated.[13,62–64] A prevailing view suggests that detailed internal models are unnecessary for perception, given the continuous availability of visual information in the external world.[13,44] Our findings support an alternative perspective, aligning with the proposal that high-resolution transsaccadic memory primarily serves action rather than perception.[13] This is consistent with the need for precise localisation in eye-hand coordination tasks such as pointing or grasping.[65] Even when unaware of intrasaccadic target displacements, individuals rapidly adjust their reaching movements, suggesting direct access of the motor system to remapping signals.[66] Further support comes from evidence that pointing to remembered locations is biased by changes in eye position,[67] and that remapping neurons reside within the dorsal “action” visual pathway, rather than the ventral “perception” visual pathway.[13,68,69] By demonstrating a strong link between transsaccadic working memory and drawing (a complex fine motor skill), our findings suggest that precise visual working memory across eye movements plays an important role in complex fine motor control.”

      (2) Model fitting across age groups (p. 9).

      It is unclear whether it is appropriate to fit healthy young and healthy elderly participants' data to the same model simultaneously. If the goal of the model fitting is to account for behavioral performance across all conditions, combining these groups may be problematic, as the groups differ significantly in overall performance despite showing similar remapping costs. This suggests that model performance might differ meaningfully between age groups. For example, in Figure 4A, participants 22-42 (presumably the elderly group) show the best fit for the Dual (Saccade) model, implying that the Interference component may contribute less to explaining elderly performance.

      Furthermore, although the most complex model emerges as the best-fitting model, the manuscript should explain how model complexity is penalized or balanced in the model comparison procedure. Additionally, are Fixation Decay and Saccade Update necessarily alternative mechanisms? Could both contribute simultaneously to spatial memory representation? A model that includes both mechanisms, e.g., Dual (Fixation) + Dual (Saccade) + Interference, could be tested to determine whether it outperforms Model 7 to rule out the sole contribution of complexity.

      We thank you for the opportunity to expand upon and clarify our modelling approach. Our decision to use a common generative model for both young and older adults was grounded in the empirical finding that there was no significant interaction between age group and saccade condition for either location or colour memory. While older adults demonstrated lower baseline precision, the specific "saccade cost" remained remarkably consistent across cohorts. On this basis, we used a common model to assess quantitative differences in parameter estimates while maintaining a consistent mechanistic framework for comparison.

      Moreover, our winning model nests simpler models as special cases, providing the flexibility to naturally accommodate groups where certain components—such as interference—might play a reduced role. This ultimately confirms that the mechanisms for age-related memory deficits in this task reflect more general decline rather than a qualitative failure of the saccadic remapping process.

      This approach is further supported by the properties of the Bayesian model selection (BMS) procedure we used, which inherently penalises the inclusion of unnecessary parameters. Unlike maximum likelihood methods, BMS compares marginal likelihoods, representing the evidence for a model integrated over its entire parameter space. This follows the principle of Bayesian Occam’s Razor, where a model is only favoured if the improvement in fit justifies the additional parameter space; redundant parameters instead "dilute" the probability mass and lower the model evidence.
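
      The dilution argument above can be illustrated with a toy example. This is a minimal sketch, not the authors' models: it assumes two hypothetical coin-flip models, one with the bias fixed at 0.5 (no free parameter) and one with a free bias under a uniform prior, and all function names are invented for illustration. The marginal likelihood of the flexible model is obtained by integrating over its parameter, so the extra parameter is only rewarded when the data call for it.

```python
from math import factorial


def evidence_fixed(n_flips: int) -> float:
    """Marginal likelihood of one specific flip sequence under a fair coin.

    The model has no free parameter, so the evidence is just the likelihood.
    """
    return 0.5 ** n_flips


def evidence_free(n_heads: int, n_flips: int) -> float:
    """Marginal likelihood of the same sequence when the bias p is free.

    Assumes a uniform prior on p. Integrating p**k * (1 - p)**(n - k)
    over p in [0, 1] gives k! * (n - k)! / (n + 1)!.
    """
    n_tails = n_flips - n_heads
    return factorial(n_heads) * factorial(n_tails) / factorial(n_flips + 1)


def bayes_factor_free_vs_fixed(n_heads: int, n_flips: int) -> float:
    """Evidence ratio: values above 1 favour the more flexible model."""
    return evidence_free(n_heads, n_flips) / evidence_fixed(n_flips)


# Balanced data: the extra parameter is not needed, so the simpler model wins.
balanced = bayes_factor_free_vs_fixed(5, 10)
# Strongly biased data: the improvement in fit outweighs the added flexibility.
biased = bayes_factor_free_vs_fixed(9, 10)
```

      With 5 heads in 10 flips the ratio falls below 1, so the flexible model is penalised for spreading its prior mass over biases the data do not support; with 9 heads in 10 it rises above 1.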

      Consequently, we contend that a hybrid model combining fixation and saccade mechanisms is unnecessary, as we have already adjudicated between alternative mechanisms of equal complexity. Specifically, Model 6 (Dual Fixation + Interference) and Model 7 (Dual Saccade + Interference) possess an identical number of parameters. The fact that Model 7 emerged as the clear winner—providing substantial evidence against Model 6 with a Bayes Factor of 6.11—demonstrates that our model selection is driven by the specific mechanistic account of the data rather than a simple preference for complexity.
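
      To put the reported Bayes factor on more familiar scales, the sketch below converts it to a log-evidence difference and to a posterior model probability. It assumes that the value 6.11 is a simple ratio of the two model evidences and that both models are equally probable a priori; these are illustrative assumptions, since the random-effects procedure used in the manuscript is more involved, and the helper name is hypothetical.

```python
from math import log


def posterior_prob_from_bf(bayes_factor: float, prior_odds: float = 1.0) -> float:
    """Posterior probability of the favoured model in a two-model comparison.

    prior_odds = 1.0 encodes the assumption of equal prior model probabilities.
    """
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


bf_model7_vs_model6 = 6.11  # value quoted in the response
log_evidence_diff = log(bf_model7_vs_model6)  # about 1.81 nats
prob_model7 = posterior_prob_from_bf(bf_model7_vs_model6)  # about 0.86
```

      Under these assumptions, a Bayes factor of 6.11 corresponds to roughly 86% posterior probability for Model 7 over Model 6, which sits in the range (roughly 3 to 10) that is conventionally described as substantial evidence.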

      We have revised the Results and Discussion sections of the manuscript to state these points more explicitly for readers and have included references to established literature regarding the robustness of marginal likelihoods in guarding against overfitting.

      In the Results,

      “By fitting these models to the trial-by-trial response data from all healthy participants (N=42), we adjudicated between competing mechanisms to determine which best explained participant performance (Figure 4). We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[25–27] The analysis yielded a strong result: the “Dual (Saccade) + Interference” model (Model 7 in Table 1) emerged as the winning model, providing substantial evidence against the next best alternative with a Bayes Factor of 6.11.”

      In the Discussion:

      “Our framework employs Variational Laplace, a method used to recover computational phenotypes in clinical populations like those with substance use disorders,[34,35] and the models we fit using this procedure feature time-dependent parameterisation of variance—conceptually similar to the widely-used Hierarchical Gaussian Filter.[36–39] Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[25–27,40] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      Minor point: On p. 9, line 336, Figure 4A does not appear to include the red dashed vertical line that is mentioned as separating the age groups.

      Thank you for pointing out this inconsistency. We apologise for the oversight; upon further review, we concluded that the red dashed vertical line was unnecessary for the clear presentation of the data. We have therefore removed the line from Figure 4A and deleted the corresponding sentence in the figure caption.

      (3) Clarification of conceptual terminology.

      Some conceptual distinctions are unclear. For example, the relationship between "retinal memory" and "transsaccadic memory," as well as between "allocentric map" and "retinotopic representation," is not fully explained. Are these constructs related or distinct? Additionally, the manuscript uses terms such as "allocentric map," "retinotopic representation," and "reference frame" interchangeably, which creates ambiguity. It would be helpful for the authors to clarify the relationships among these terms and apply them consistently.

      Thank you for pointing this out. We have revised the manuscript to ensure that these terms are applied with greater precision and consistency. Our revisions standardise the terminology based on the following distinctions:

      Reference frames: We distinguish between the eye-centred reference frame (coordinate systems that shift with gaze) and the world-centred reference frame (coordinate systems anchored to the environment).

      Retinotopic representation vs. allocentric map: We clarify that retinotopic representations are encoded within an eye-centred reference frame and are updated with every ocular movement. Conversely, the allocentric map is anchored to stable environmental features, remaining invariant to the observer’s gaze direction or position.

      Retinotopic memory vs. transsaccadic memory: We have removed the term "retinal memory" to avoid ambiguity. We now consistently use retinotopic memory to describe the persistence of visual information in eye-centred coordinates within a single fixation. In contrast, transsaccadic memory refers to the higher-level integration of visual information across saccades, which involves the active updating or remapping of representations to maintain stability.

      To incorporate these clarifications, we have implemented the following changes:

      In the Introduction, the second paragraph has been entirely rewritten to establish these definitions at the outset, providing a clearer theoretical framework for the study.

      “Central to this enquiry is the nature of the coordinate system used for the brain's internal spatial representation. Does the brain maintain a single, world-centred (allocentric) map, or does it rely on a dynamic, eye-centred (retinotopic) representation?[11,13,15,16] In the latter system, retinotopic memory preserves spatial information within a fixation, whereas transsaccadic memory describes the active process of updating these representations across eye movements to achieve spatiotopic stability—the perception of a stable world despite eye movements.[11,16–18] If spatial stability is indeed reconstructed through such remapping, the mechanism remains unresolved: do we retain memories of absolute fixation locations, or do we reconstruct these positions from noisy memories of the intervening saccade vectors? We can test these hypotheses by analysing when and where memory errors occur. Assuming that memory precision declines over time,[19] the resulting error distributions should reveal the specific variables that are represented and updated across each saccade.”

      In the Results, the opening section of the Results has been reorganised to align with this terminology. We have ensured that the hypotheses and behavioural data—specifically the definition of "saccade cost"—are introduced using this consistent conceptual vocabulary to improve the overall coherence of the narrative.

      (4) Rationale for the selective disruption hypothesis (p. 4, lines 153-154). The authors hypothesize that "saccades would selectively disrupt location memory while leaving colour memory intact." Providing theoretical or empirical justification for this prediction would strengthen the argument.

      We have revised the Results to state the hypothesis more explicitly and expanded the Discussion to provide a robust theoretical and empirical rationale:

      In the Results,

      “This design allowed us to isolate and quantify the unique impact of saccades on spatial memory, enabling us to test competing hypotheses regarding spatial representation. If spatial memory were solely underpinned by an allocentric mechanism, precision should remain comparable across all conditions as the representation would be world-centred and unaffected by eye movements. Thus, performance in the no-saccade condition should be comparable to the two-saccade condition. Conversely, if spatial memory relies on a retinotopic representation requiring active updating across eye movements, the two-saccade condition was anticipated to be the most challenging due to cumulative decay in the memory traces used for stimulus reconstruction after each saccade.[22] Critically, we hypothesised that this saccade cost would be specific to the spatial domain; while location requires active remapping via noisy oculomotor signals, non-spatial features like colour are not inherently tied to coordinate transformations and should therefore remain stable (see more in Discussion below).

      Meanwhile, the no-saccade condition was expected to yield the most accurate localisation, relying solely on retinotopic information (retinotopic working memory). These predictions were confirmed in young healthy adults (N = 21, mean age = 24.1 years, ranged between 19 and 34). A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(2.2,43.9)=33.2, p<0.001, partial η²=0.62), indicating substantial impairment after eye movements (Figure 2A). In contrast, colour memory remained remarkably stable across all saccade conditions (Figure 2B; F(2.2, 44.7) = 0.68, p=0.53, partial η² =0.03).

      This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.

      Critically, our comparison between spatial and colour memory does not rely on the absolute magnitude of errors, which are measured in different units (degrees of visual angle vs. radians). Instead, we assessed the relative impact of the same saccadic demand on each feature within the same trial. While location recall showed a robust saccade cost, colour recall remained statistically unchanged. To ensure this null effect was not due to a lack of measurement sensitivity, we examined the recency effect; recall performance for the second item was predicted to be better than for the first stimulus in each condition.[23,24] As expected, colour memory for Item 2 was significantly more accurate than for Item 1 (F(1,20) = 6.52, p = 0.02, partial η² = 0.25), demonstrating that the task was sufficiently sensitive to detect standard working memory fluctuations despite the absence of a saccade-induced deficit.”
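
      The effect sizes quoted in this passage can be cross-checked against the reported F statistics. The sketch below uses the standard identity relating partial eta squared to F and its degrees of freedom; the function name is illustrative, and the inputs are the values quoted above.

```python
def partial_eta_squared(f_value: float, df_effect: float, df_error: float) -> float:
    """Partial eta squared recovered from an F statistic.

    Uses eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values as reported in the quoted Results text.
location_effect = partial_eta_squared(33.2, 2.2, 43.9)  # reported as 0.62
colour_effect = partial_eta_squared(0.68, 2.2, 44.7)    # reported as 0.03
recency_effect = partial_eta_squared(6.52, 1, 20)       # reported as 0.25
```

      All three values agree with the reported effect sizes to two decimal places.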

      In the Discussion (p. 18), we now write:

      “A clear finding was the specificity of the saccade cost to spatial features; it was not observed for non-spatial features like colour, even in neurodegenerative conditions. This discrepancy challenges notions of fixed visual working memory capacity unaffected by saccades.[16,44–46] The differential impact on spatial versus non-spatial features in transsaccadic memory aligns with the established "what" and "where" pathways in visual processing.[32,33] For objects to remain unified, object features must be bound to stable representations of location across saccades.[19] One possibility is that remapping updates both features and location through a shared mechanism, predicting equal saccadic interference for both colour and location in the present study.

      However, our findings suggest otherwise. One potential concern is whether this dissociation simply reflects the inherent spatial noise introduced by fixational eye movements (FEMs), such as microsaccades and drifts.[47] Because locations are stored in a retinotopic frame, fixational instability necessarily shifts retinal coordinates over time. However, the "saccade cost" here was defined as the error increase relative to a no-saccade baseline of equal duration; because both conditions are subject to the same fixational drift, any FEM-induced noise is effectively subtracted out. Thus, despite the ballistic and non-Gaussian nature of FEMs,[48] they cannot account for the presence of a saccade cost in spatial memory alongside its total absence in the colour domain. Another possibility is that this dissociation reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.

      The fact that identical eye movements, executed simultaneously and with identical vectors, systematically degraded spatial precision while sparing colour suggests a feature-specific susceptibility to transsaccadic remapping. This supports the view that the computational process of updating an object’s location involves a vector-subtraction mechanism, incorporating noisy oculomotor commands (efference copies), that introduces specific spatial variance. Because this remapping is a coordinate transformation, the resulting sensorimotor noise does not functionally propagate to non-spatial feature representations. Consequently, features like colour may be preserved or automatically remapped without the precision loss associated with spatial updating.[11,49] Our paradigm thus provides a refined tool to investigate the architecture of transsaccadic working memory across distinct object features.”
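
      The vector-subtraction account described above can be made concrete with a toy simulation. This is a sketch under stated assumptions, not the authors' generative model: it assumes independent Gaussian noise on each sensed saccade vector, arbitrary spatial units, and hypothetical function and parameter names. A colour value would never pass through the update step, so in this sketch it acquires no added variance.

```python
import random
from math import hypot


def remap(stored_xy, saccade_xy, rng, efference_sd):
    """Update a retinotopic location by subtracting a noisy copy of the saccade vector."""
    sensed_x = saccade_xy[0] + rng.gauss(0.0, efference_sd)
    sensed_y = saccade_xy[1] + rng.gauss(0.0, efference_sd)
    return (stored_xy[0] - sensed_x, stored_xy[1] - sensed_y)


def rms_location_error(n_saccades, efference_sd=0.5, n_trials=20000, seed=0):
    """Root-mean-square location error (arbitrary units) after n_saccades updates."""
    rng = random.Random(seed)
    total_squared_error = 0.0
    for _ in range(n_trials):
        target = (rng.uniform(-5, 5), rng.uniform(-5, 5))  # world position
        eye = (0.0, 0.0)
        stored = (target[0] - eye[0], target[1] - eye[1])  # retinotopic memory
        for _ in range(n_saccades):
            saccade = (rng.uniform(-5, 5), rng.uniform(-5, 5))
            eye = (eye[0] + saccade[0], eye[1] + saccade[1])
            stored = remap(stored, saccade, rng, efference_sd)
        # Ground truth: where the target really is relative to the final gaze.
        true_xy = (target[0] - eye[0], target[1] - eye[1])
        total_squared_error += hypot(stored[0] - true_xy[0],
                                     stored[1] - true_xy[1]) ** 2
    return (total_squared_error / n_trials) ** 0.5


errors_by_condition = {n: rms_location_error(n) for n in (0, 1, 2)}
```

      In this sketch the error is zero without a saccade and grows with the square root of the number of saccades, because each update adds an independent noise term. That is the qualitative pattern of a cumulative, location-specific saccade cost.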

      (5) Relationship between saccade cost and individual memory performance (p. 4, last paragraph).

      The authors report that larger saccades were associated with greater spatial memory disruption. It would be informative to examine whether individual differences in the magnitude of saccade cost correlate with participants' overall/baseline memory performance (e.g. their memory precision in the no-saccade condition). Such analyses might offer insights into how memory capacity/ability relates to resilience against saccade-induced updating.

      We have now conducted the correlation analysis to determine whether baseline memory capacity (no-saccade condition) predicts resilience to saccade-induced updating. The results indicate that these two factors are independent.

      To clarify the nature of the saccade-induced impairment, we have updated the text as follows:

      p.4: “This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.”

      p.5: “Further analysis examined whether individual differences in baseline memory precision (no-saccade condition) predicted resilience to saccadic disruption. Crucially, individual saccade costs (defined as the precision loss relative to baseline) did not correlate with baseline precision (rho = 0.20, p = 0.20). This suggests that the noise introduced by transsaccadic remapping acts as an independent, additive source of variance that is not modulated by an individual’s underlying memory capacity. These findings imply a functional dissociation between the mechanisms responsible for maintaining a representation and those involved in its coordinate transformation.”

      (6) Model fitting for the healthy elderly group to reveal memory-deficit factors (pp. 11-12). The manuscript discusses model-based insights into components that contribute to spatial memory deficits in AD and PD, but does not discuss components that contribute to spatial memory deficits in the healthy elderly group. Given that the EC group also shows impairments in certain parameters, explaining and discussing these outcomes of the EC group could provide additional insights into age-related memory decline, which would strengthen the study's broader conclusions.

      This is a very good point. We rewrote the corresponding results section (p.12-13):

      “Modelling reveals the sources of spatial memory deficits in healthy aging and neurodegeneration - To understand the source of the observed deficits, we applied the winning ‘Dual (Saccade) + Interference’ model to the data from all participants (YC, EC, AD, and PD). By fitting the model to the entire dataset, we obtained estimates of the parameters for each individual, which then formed the basis for our group-level analysis. To formally test for group differences, we used Parametric Empirical Bayes (PEB), a hierarchical Bayesian approach that compares parameter estimates across groups while accounting for the uncertainty of each estimate [28]. This allowed us to identify which specific cognitive mechanisms, as formalised by the model parameters, were affected by age and disease.

      The Bayesian inversion used here allows us to quantify the posterior mode and variance for each parameter and the covariance between parameters. From these, we can compute the probabilities that pairs of parameters differ from one another, which we report as P(A>B), meaning the posterior probability that the parameter for group A was greater than that for group B.

      We first examined the specific parameters differentiating healthy elderly (EC) from young controls (YC) to isolate the factors contributing to non-pathological, age-related decline. The analysis revealed that healthy ageing is primarily characterised by a significant increase in Radial Decay (P(EC > YC) = 0.995), a heightened susceptibility to Interference (P(EC > YC) = 1.000), and a reduction in initial Angular Encoding precision (P(YC < EC) = 0.002; Figure 6). These results suggest that normal ageing degrades the fidelity of the initial memory trace and its resilience over time, while the core computational process of updating information across saccades remains intact.

      Beyond these baseline ageing effects, our clinical cohorts exhibited more severe and condition-dependent impairments. Radial decay showed a clear, graded impairment: AD patients had a greater decay rate than PD patients (P(AD > PD) = 1.000), who in turn were more impaired than the EC group (P(PD > EC) = 0.996). A similar graded pattern was observed for Interference, where AD patients were most susceptible (P(AD > PD) = 0.999), while the PD and EC groups did not significantly differ (P(PD > EC) = 0.532).

      Patients with AD also showed a tendency towards greater angular decay than controls (P(AD > EC) = 0.772), although this fell below the 95% probability threshold. This effect was influenced by a lower decay rate in the PD group compared to the EC group (P(PD < EC) = 0.037). In contrast, group differences in encoding were less pronounced. While YC exhibited significantly higher precision than all other groups, AD patients showed significantly higher angular encoding error than PD patients (P(AD > PD) = 0.985), though neither group differed significantly from the EC group.

      Crucially, parameters related to the saccade itself—saccade encoding and saccade decay—did not differentiate the groups. This indicates that neither healthy ageing nor the early stages of AD and PD significantly impair the fundamental machinery for transsaccadic remapping. Instead, the visuospatial deficits in these conditions arise from specific mechanistic failures: a faster decay of radial position information and increased susceptibility to interference, both of which are present in healthy ageing but significantly amplified by neurodegeneration.”
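
      The P(A>B) values quoted above can be understood through a short sketch. It assumes that the posterior over the two group parameters is jointly Gaussian, as is typical of a Laplace approximation, so that their difference is also Gaussian; the function and argument names are hypothetical and this is not necessarily the exact computation used in the manuscript.

```python
from math import erf, sqrt


def prob_greater(mean_a, mean_b, var_a, var_b, cov_ab=0.0):
    """Posterior probability that parameter A exceeds parameter B.

    Assumes a jointly Gaussian posterior, so A - B is Gaussian with
    mean (mean_a - mean_b) and variance (var_a + var_b - 2 * cov_ab).
    """
    diff_var = var_a + var_b - 2.0 * cov_ab
    if diff_var <= 0.0:
        raise ValueError("variance of the difference must be positive")
    z = (mean_a - mean_b) / sqrt(diff_var)
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Hypothetical numbers for illustration only (not taken from the manuscript).
p_equal = prob_greater(1.0, 1.0, 0.2, 0.2)
p_shifted = prob_greater(2.0, 0.0, 1.0, 1.0)
p_shifted_correlated = prob_greater(2.0, 0.0, 1.0, 1.0, cov_ab=0.5)
```

      Values near 0.5 indicate no evidence for a difference in either direction, while values near 1 or near 0 indicate that one group's parameter is credibly larger or smaller. A positive covariance between the two estimates reduces the variance of their difference and so sharpens the comparison.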

      In the Discussion, we added:

      “Although saccade updating was an essential component of the winning model, its two key parameters—initial encoding error and decay rate during maintenance—did not significantly differ across groups. This indicates that the core computational process of updating spatial information based on eye movements is largely preserved in healthy aging and neurodegeneration.

      Instead, group differences were driven by deficits in angular encoding error (precision of initial angle from fixation), angular decay, radial decay (decay in memory of distance from fixation), and interference susceptibility. This implies a functional and neuroanatomical dissociation: while the ventral stream (the “what” pathway) shows an age-related decline in the quality and stability of stored representations, the dorsal-stream (the “where” pathway) parietal-frontal circuits responsible for coordinate transformations remain functionally robust.[31–34] These spatial updating mechanisms appear resilient to the normal ageing trajectory and only break down when challenged by the specific pathological processes seen in Alzheimer’s or Parkinson’s disease.”

      (7) Presentation of saccade conditions in Figure 5 (p. 11). In Figure 5, it may be clearer to group the four saccade conditions together within each patient group. Since the main point is that saccadic interference on spatial memory remains robust across patient groups, grouping conditions by patient type rather than intermixing conditions would emphasize this interpretation.

      There are several valid ways to present these plots, but we chose this format because it allows for a direct visual comparison of the post-hoc group differences within each specific task demand. This arrangement clearly illustrates the graded impairment from young controls through to patients with Alzheimer’s disease across every condition. This structure also directly mirrors our two-way ANOVA, which identified significant main effects for both Group and Condition, but crucially, no significant Group x Condition interaction. We felt that grouping the data by participant group would force readers to look across four separate clusters to compare the slopes, making the stability of the saccadic remapping mechanism much harder to grasp at a glance.

      Reviewer #1 (Recommendations for the authors):

      (1) Formatting of statistical parameters.

      The formatting of statistical symbols should be consistent throughout the manuscript. Some instances of F, p, and t are italicized, while others are not. All statistical symbols should be italicized.

      Thank you for pointing this out. We have audited the manuscript. While we have revised the text to address these instances throughout the Results and Methods sections, any remaining minor formatting inconsistencies will be corrected during the final typesetting stage.

      (2) Minor typographical issues.

      (a) Line 532: "are" should be "be."

      (b) Line 654: "cantered" should be "centered."

      (c) Line 213: In "(p(bonf) < 0.001, |t| ≥ 5.94)," the t value should be reported with its degrees of freedom, and t should be reported before p. The same applies to line 215.

      Thank you for your careful reading. All corrected.

      Reviewer #2 (Public review):

      We thank you for your positive feedback regarding our eye-tracking methodology and computational approach. We appreciate your critical insights into the feature-specific disruption hypothesis and the task structure. We have substantially revised the results and discussion about the saccadic interference on colour memory. Below we will answer your suggestions point-by-point:

      Reviewer #2 (Recommendations for the authors):

      (1) The study treats colour and location errors as comparable when arguing that saccades selectively disrupt spatial but not colour memory. However, these measures are defined in entirely different units (degrees of visual angle vs radians on a colour wheel) and are not psychophysically or statistically calibrated. Baseline task difficulty, noise level, or dynamic range do not appear to be calibrated or matched across features. As a result, the null effect of saccades on colour could reflect lower sensitivity or ceiling effects rather than implicit feature-specific robustness.

      We agree that direct comparisons of absolute error magnitudes across different dimensions are not appropriate. Our argument for feature-specific disruption relies not on the scale of errors, but on the presence or absence of a saccade cost within identical trials. In our within-subject design, the same saccade vectors produced a systematic increase in location error while leaving colour error statistically unchanged. To address sensitivity, we observed that colour memory was sufficiently precise to show a significant recency effect (p = 0.02). To further quantify the evidence for the null effect, we performed Bayesian repeated measures ANOVAs, which yielded a BF10 = 0.22. This provides substantial evidence that saccades do not disrupt colour precision, regardless of baseline sensitivity.
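
      For readers less used to Bayes factor notation, the reported value can be inverted to express the evidence in favour of the null hypothesis. This is plain arithmetic on the figure quoted above; the verbal label attached to it assumes a Jeffreys-style scale, on which values between roughly 3 and 10 count as substantial.

```latex
\mathrm{BF}_{01} = \frac{1}{\mathrm{BF}_{10}} = \frac{1}{0.22} \approx 4.5
```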

      We have substantially revised this in Results, Methods and Discussion:

      In the Results:

      “This design allowed us to isolate and quantify the unique impact of saccades on spatial memory, enabling us to test competing hypotheses regarding spatial representation. If spatial memory were solely underpinned by an allocentric mechanism, precision should remain comparable across all conditions as the representation would be world-centred and unaffected by eye movements. Thus, performance in the no-saccade condition should be comparable to the two-saccade condition. Conversely, if spatial memory relies on a retinotopic representation requiring active updating across eye movements, the two-saccade condition was anticipated to be the most challenging due to cumulative decay in the memory traces used for stimulus reconstruction after each saccade.[22] Critically, we hypothesised that this saccade cost would be specific to the spatial domain; while location requires active remapping via noisy oculomotor signals, non-spatial features like colour are not inherently tied to coordinate transformations and should therefore remain stable (see more in Discussion below).

      Meanwhile, the no-saccade condition was expected to yield the most accurate localisation, relying solely on retinotopic information (retinotopic working memory). These predictions were confirmed in young healthy adults (N = 21, mean age = 24.1 years, ranged between 19 and 34). A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(2.2,43.9)=33.2, p<0.001, partial η²=0.62), indicating substantial impairment after eye movements (Figure 2A). In contrast, colour memory remained remarkably stable across all saccade conditions (Figure 2B; F(2.2, 44.7) = 0.68, p=0.53, partial η² =0.03).

      This “saccade cost”—the loss of memory precision following an eye movement—indicates that spatial representations require active updating across saccades rather than being maintained in a static, world-centred reference frame.

      Critically, our comparison between spatial and colour memory does not rely on the absolute magnitude of errors, which are measured in different units (degrees of visual angle vs. radians). Instead, we assessed the relative impact of the same saccadic demand on each feature within the same trial. While location recall showed a robust saccade cost, colour recall remained statistically unchanged. To ensure this null effect was not due to a lack of measurement sensitivity, we examined the recency effect; recall performance for the second item was predicted to be better than for the first stimulus in each condition.[23,24] As expected, colour memory for Item 2 was significantly more accurate than for Item 1 (F(1,20) = 6.52, p = 0.02, partial η² = 0.25), demonstrating that the task was sufficiently sensitive to detect standard working memory fluctuations despite the absence of a saccade-induced deficit.”

      In the Methods, at the beginning of “Statistical Analysis”, we added

      “Because location and colour recall involve different scales and units, all analyses were performed independently for each feature to avoid cross-dimensional magnitude comparisons.” (p25)

      In the Discussion, we added:

      “A potential concern is whether the observed dissociation between colour and location reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.”

      (2) Colour and then location are probed serially, without a counter-balanced order. This fixed response order could introduce a systematic bias because location recall is consistently subject to longer memory retention intervals and cognitive interference from the colour decision. The observed dissociation (saccades impair location but not colour) may therefore reflect task structure rather than implicit feature-specific differences in trans-saccadic memory.

      Thank you for the insightful observation regarding our fixed response order. We acknowledge that a counterbalanced design is typically preferred to mitigate potential order effects. However, we chose this consistent sequence to ensure the task remained accessible for cognitively impaired patients (i.e., the Alzheimer’s disease (AD) and Parkinson’s disease (PD) cohorts). Conducting an eye-tracking memory task with cognitively impaired patients is challenging, as they may struggle with task engagement or forget complex instructions. During the design phase, we prioritised a consistent structure to reduce the cognitive load and task-switching demands that typically challenge these cohorts.

      Critically, because the saccade cost is a relative measure calculated by comparing conditions with identical timings, any bias from the fixed order is present in both the baseline and saccade trials. The disruption we report is therefore a specific effect of eye movements that goes beyond the noise introduced by the retention interval or the preceding colour report.
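
      The subtraction logic in this argument can be illustrated with a minimal sketch. It assumes, purely for illustration, that any bias from the fixed response order adds the same amount of error to every condition; the function name and all numbers are hypothetical and are not taken from the study data.

```python
def saccade_cost(error_saccade: float, error_baseline: float) -> float:
    """Saccade cost: increase in location error relative to the no-saccade baseline."""
    return error_saccade - error_baseline


# Hypothetical mean location errors (degrees of visual angle) for one participant.
baseline_error = 2.0      # no-saccade condition
two_saccade_error = 3.5   # two-saccade condition

# Hypothetical extra error caused by the fixed response order. It is added to
# every condition because retention timing is matched across conditions.
order_bias = 0.4

cost_without_bias = saccade_cost(two_saccade_error, baseline_error)
cost_with_bias = saccade_cost(two_saccade_error + order_bias,
                              baseline_error + order_bias)

print(cost_without_bias, cost_with_bias)
```

      Under the additive assumption the two costs coincide, which is the sense in which a condition-independent bias cancels. A bias that interacted with saccade demands would not cancel in this way.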

      We added the following text in the Methods – experimental procedure (p.22):

      “Recall was performed in a fixed order, with colour reported before location. This sequence was primarily chosen to minimise cognitive load and task-switching demands for the two neurological patient cohorts, ensuring the paradigm remained accessible for individuals with AD and PD. While this order results in a slightly longer retention interval for location recall, the saccade cost was identified by comparing location error across experimental conditions with similar timings but varying saccadic demands.”

      (3) Relatedly, because spatial representations are retinotopic, fixational eye movements (FEMs - microsaccades and drift) displace the retinal coordinates of encoded positions, increasing apparent spatial noise with time delays. Colour memory, however, is feature-based and unaffected by small retinal translations. Thus, any between-condition or between-group differences in FEMs could selectively inflate location error and the associated model parameters (encoding noise, decay, interference), while leaving colour error unchanged. Note that FEMs tend to be slightly ballistic [1,2], hence not well modelled with a Gaussian blur.

      This is a very insightful point. We have now addressed this in detail within the discussion:

      “However, our findings suggest otherwise. One potential concern is whether this dissociation simply reflects the inherent spatial noise introduced by fixational eye movements (FEMs), such as microsaccades and drifts.[46] Because locations are stored in a retinotopic frame, fixational instability necessarily shifts retinal coordinates over time. However, the "saccade cost" here was defined as the error increase relative to a no-saccade baseline of equal duration; because both conditions are subject to the same fixational drift, any FEM-induced noise is effectively subtracted out. Thus, despite the ballistic and non-Gaussian nature of FEMs,[47] they cannot account for the presence of a saccade cost in spatial memory alongside its total absence in the colour domain. Another possibility is that this dissociation reflects differences in baseline task difficulty or dynamic range. Yet, the presence of a robust recency effect in colour memory (Figure 2B) confirms that our paradigm was sensitive to memory-dependent variance and was not limited by floor or ceiling effects.”

      (4) There is no in silico demonstration that the modelling framework can recover the true generating model from synthetic data or recover accurate parameters under realistic noise levels, which can be challenging in generative models with a hierarchical structure (as per [3], for example). Figure 8b shows that the parameters possess substantial posterior covariance, which raises concerns as to whether they can be reliably disambiguated.

      Many thanks for this comment. We have added a simple recovery analysis as detailed below but are also keen to ensure we fully answer your question—which has more to do with empirical rather than simulated data—and make clear the rationale for this analysis in this instance.

      We added this in Supplementary Materials:

      “Model validation and recovery analysis

      The following section provides a detailed technical assessment of the model inversion scheme, focusing on the discriminability of the model space and the identifiability of individual parameters.

      Recovery analyses of this sort are typically used prior to collecting data to allow one to determine whether, in principle, the data are useful in disambiguating between hypotheses. In this sense, they have a role analogous to a classical power calculation. However, their utility is limited when used post-hoc when data have already been collected, as the question of whether the models can be disambiguated becomes one of whether non-trivial Bayes factors can be identified from those data.

      The reason for including a recovery analysis here is not to identify whether the model inversion scheme identifies a ‘true’ model. The concept of ‘true generative models’ commits to a strong philosophical position which is at odds with the ‘all models are wrong, but some are useful’ perspective held by many in statistics, e.g., (So, 2017). Of note, one can always confound a model recovery scheme by generating the same data in a simple way, and in (one of an infinite number of) more complex ways. A good model inversion scheme will always recover the simple model and therefore would appear to select the ‘wrong’ model in a recovery analysis. However, it is still the best explanation for the data. For these reasons, we do not necessarily expect ‘good’ recoverability in all parameter ranges. This is further confounded by the relationship between the models we have proposed—e.g., an interference model with very low interference will look almost identical to a model with no interference. The important question here is whether they can be disambiguated with real data.

      Instead, the value of a post-hoc recovery analysis here is to evaluate whether there was a sensible choice of model space—i.e., that it was not a priori guaranteed that a single model (and, specifically, the model we found to be the best explanation for the data) would explain the results of all others. To address this, for each model, we simulated 16 datasets, each of which relied upon parameters sampled from the model priors, which included examples of each of the experimental conditions. We then fit each of these datasets to each of the 7 models to construct the confusion matrix shown in the lower panel of Supplementary Figure 3, by accumulating evidence over each of the 16 participants generated according to each ‘true’ model (columns) for each of the possible explanatory models (rows). This shows that no one model, for the parameter ranges sampled here, explains all other datasets. Interestingly, our ‘winning’ model in the empirical analysis is not the best explanation for any of the datasets simulated (including its own). This is reassuring, in that it implies this model winning was not a foregone conclusion and is driven by the data—not just the choice of model space.”
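
      The recovery procedure described above can be sketched with a toy example. This is not the authors' model space: it assumes two hypothetical Gaussian models with closed-form evidence (M0 with a fixed mean of zero; M1 with a free mean under a standard normal prior), 16 simulated datasets per generating model as in the text, and fixed-effects accumulation by summing log evidence. The trial count and all names are illustrative assumptions.

```python
import math
import random

N_DATASETS = 16   # synthetic "participants" per generating model, as in the text
N_TRIALS = 40     # hypothetical number of observations per dataset


def log_evidence_null(y):
    """Log marginal likelihood of M0: y_i ~ N(0, 1), no free parameters."""
    n = len(y)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * sum(v * v for v in y)


def log_evidence_mean(y):
    """Log marginal likelihood of M1: y_i ~ N(mu, 1) with prior mu ~ N(0, 1)."""
    n = len(y)
    total = sum(y)
    # Quadratic form y' (I + 11')^-1 y, using the Sherman-Morrison identity.
    quad = sum(v * v for v in y) - total * total / (n + 1)
    return -0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(n + 1) - 0.5 * quad


def simulate(model_index, rng):
    """Draw one dataset; the parameter is sampled from the generating model's prior."""
    mu = 0.0 if model_index == 0 else rng.gauss(0.0, 1.0)
    return [rng.gauss(mu, 1.0) for _ in range(N_TRIALS)]


def confusion_matrix(seed=0):
    """Rows: explanatory model; columns: generating model; entries: summed log evidence."""
    rng = random.Random(seed)
    evidence_fns = [log_evidence_null, log_evidence_mean]
    matrix = [[0.0, 0.0], [0.0, 0.0]]
    for true_model in range(2):
        for _ in range(N_DATASETS):
            y = simulate(true_model, rng)
            for fitted_model, fn in enumerate(evidence_fns):
                matrix[fitted_model][true_model] += fn(y)
    return matrix


matrix = confusion_matrix()
for true_model in range(2):
    column = [matrix[row][true_model] for row in range(2)]
    best = column.index(max(column))
    print(f"data from M{true_model}: best explanatory model is M{best}")
```

      In this toy case each generating model is also its own best explanation, which need not hold for the seven models and prior ranges considered in the text; the sketch only shows how such a confusion matrix is assembled.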

      Your point about the posterior covariance is well founded. As we describe in Supplementary Materials, this is an inherent feature of inverse problems (analogous to EEG source localisation). However, the fact that our posterior densities move significantly away from the prior expectations demonstrates that the data are indeed informative. By adopting a Bayesian framework, we are able to explicitly quantify this uncertainty rather than ignoring it, providing a more transparent account of parameter identifiability. We have added the following in the same section of Supplementary Materials:

      “This problem is an inverse problem—inferring parameters from a non-linear model. We therefore expect a degree of posterior covariance between parameters and, consequently, that they cannot be disambiguated with complete certainty. While some degree of posterior covariance is inherent to inverse models—including established methods like EEG source localisation—the fact that many of the parameters are estimated with posterior densities that do not include their prior expectations implies the data are informative about these.

      The advantage of the Bayesian approach we have adopted here is that we can explicitly quantify posterior covariance between these parameters, and therefore the degree to which they can be disambiguated. While the posterior covariance matrices from empirical data are the relevant measure here, we can better understand the behaviour of the model inversion scheme in relation to the specific models used using the model recovery analysis reported in Supplementary Figure 3.

      The middle panel of the figure is key, along with the correlation coefficients reported in the figure caption. Here, we see at least a weak positive correlation (in some cases much stronger) for almost all parameters and limited movement from prior expectations for those parameters that are less convincingly recovered. This reinforces that the ability of the scheme to recover parameters is best assessed in terms of the degree of movement of posterior from prior values following fitting to empirical data.”

      (5) The authors employ Bayes factors (BFs) to disambiguate models, but BFs would also strengthen the claims that location, but not colour, is impacted by saccades. Despite colour being a circular variable, colour error is analysed using ANOVA on linearised differences (radians). The authors should also arguably use circular statistics, such as the von Mises distribution, for the analysis of colour.

      Regarding the use of circular statistics, you are correct that such error distributions are not suitable for ANOVA, and it is better to use circular statistics. However, for the present dataset, we used the mean absolute angular error per condition (ranging from 0 to π radians), which represents the shortest distance on the colour wheel between the target and the response.

      This approach effectively linearises the measure by removing the 2π wrap-around boundary. Because the observed errors were relatively small and did not cluster near the π boundary, even in the patient cohorts (Figure 5B), the "wrap-around" effect of circular space is negligible. Moreover, by analysing the mean error across trials for each condition, rather than trial-wise data, we invoke the Central Limit Theorem. This ensures that the distribution of these means is approximately normal, satisfying the fundamental assumptions of ANOVA. For these reasons, we adopted simpler linear models. We confirmed that the data did not violate the assumptions of linear statistics. In this low-noise regime, linear and circular models converge on the same conclusions. This has been revised in Methods:

      “For colour memory, we calculated the absolute angular error, defined as the shortest distance on the colour wheel between the target and the reported colour (range 0 to π radians). For the primary statistical analyses, we utilised the mean absolute error per condition for each participant. By analysing these condition-wise means rather than trial-wise raw data, we invoke the Central Limit Theorem, which ensures that the sampling distribution of these means approximates normality. Because the absolute errors in this paradigm were relatively small and did not approach the π boundary (Figure 5B) even in the clinical cohorts, the data were treated as a continuous measure in our linear ANOVAs and regression models. Moreover, because location and colour recall involve different scales and units, all analyses were performed independently for each feature to avoid cross-dimensional magnitude comparisons.”
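
      The error measure defined above can be sketched as follows. This is a minimal illustration, assuming target and response colours are expressed as angles in radians on the colour wheel; the function names and trial values are hypothetical.

```python
import math


def absolute_angular_error(target: float, response: float) -> float:
    """Shortest distance on the colour wheel between two angles, in [0, pi] radians."""
    # Wrap the signed difference into [-pi, pi) before taking the absolute value.
    signed = (response - target + math.pi) % (2 * math.pi) - math.pi
    return abs(signed)


def mean_absolute_error(targets, responses):
    """Condition-wise mean of the absolute angular errors (one value per condition)."""
    errors = [absolute_angular_error(t, r) for t, r in zip(targets, responses)]
    return sum(errors) / len(errors)


# Hypothetical trials: the second pair straddles the 0 / 2*pi wrap-around point.
targets = [1.00, 0.10, 3.00]
responses = [1.20, 2 * math.pi - 0.10, 2.70]
print(mean_absolute_error(targets, responses))
```

      The second trial shows why the wrap is needed: the raw difference is close to 2π, whereas the shortest distance on the wheel is 0.2 radians.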

      We have also now integrated Bayesian repeated measures ANOVA throughout the manuscript. The Results section for the young healthy adults now reads (p. 4):

      “A repeated measures ANOVA revealed a significant main effect of saccades on location memory (F(3, 20) = 51.52, p < 0.001, partial η²=0.72), with Bayesian analysis providing decisive evidence for the inclusion of the saccade factor (BF<sub>incl</sub> = 3.52 x 10^13, P(incl|data) = 1.00). In contrast, colour memory remained remarkably stable across all saccade conditions (F(3, 20) = 0.57, p = 0.64, partial η² =0.03). This null effect was supported by Bayesian analysis, which provided moderate evidence in favour of the null hypothesis (BF<sub>01</sub> = 8.46, P(excl|data) = 0.89), indicating that the data were more than eight times more likely under the null model than a model including saccade-related impairment.”

      For elderly healthy adults:

      “In contrast, colour memory remained unaffected by saccade demands (F(3, 20) = 0.57, p = 0.65, partial η² =0.03), again supported by the Bayesian analysis: BF<sub>01</sub> = 8.68, P(excl|data) = 0.90.”

      For patient cohorts:

      “Bayesian repeated measures ANOVAs further supported this dissociation, providing moderate evidence for the null hypothesis in the AD group (BF<sub>01</sub> = 3.35, P(excl|data) = 0.77) and weak evidence in the PD group (BF<sub>01</sub> = 2.23, P(excl|data) = 0.69). This indicates that even in populations with established neurodegeneration, the detrimental impact of eye movements is specific to the spatial domain.”
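
      The relationship between the Bayes factors and the posterior exclusion probabilities quoted above can be illustrated with a short sketch. It assumes a comparison between two models (null versus saccade effect) with equal prior probabilities, which is an assumption about the analysis settings rather than something stated in the text; the function name is hypothetical.

```python
def posterior_prob_null(bf01: float, prior_odds: float = 1.0) -> float:
    """Posterior probability of the null model from BF01 and prior odds (null : alternative)."""
    posterior_odds = bf01 * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


# BF01 values quoted in the revised Results text, keyed by cohort.
reported = {"YC": 8.46, "EC": 8.68, "AD": 3.35, "PD": 2.23}
for group, bf01 in reported.items():
    print(group, round(posterior_prob_null(bf01), 2))
```

      Under these assumptions the sketch reproduces the reported P(excl|data) values of 0.89, 0.90, 0.77 and 0.69.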

      The related description has also been updated in Methods – Statistical Analysis.

      Minor:

      (1) The modelling is described as computational but is arguably better characterised as a heuristic generative model at Marr's algorithmic level. It does not derive from normative computational principles or describe an implementation in neural circuits.

      We appreciate your perspective on the classification of our model within Marr’s hierarchy. We agree that our framework is best characterised as an algorithmic-level generative model. Our objective was to identify the mechanistic principles governing transsaccadic updating rather than to provide a normative derivation or a specific circuit-level implementation.

      To ensure readers do not over-interpret the term ‘computational’, we have added a clarifying statement in the Discussion acknowledging the algorithmic nature of the model. Interestingly, we note that a model predicated on this form of spatial diffusion implies a neural field representation with a spatial connectivity kernel whose limit approximates the second derivative of a Dirac delta function. While a formal neural field implementation is beyond the scope of the present work, our algorithmic results provide the necessary constraints for such future biophysical models.
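
      The link between spatial diffusion and the connectivity kernel can be sketched as follows, assuming a one-dimensional linear neural field u(x, t) with a translation-invariant kernel w and a diffusion coefficient D. These symbols are illustrative assumptions and do not appear in the manuscript.

```latex
\begin{align}
\frac{\partial u}{\partial t}(x,t) &= \int w(x - x')\, u(x',t)\, \mathrm{d}x', \\
w(s) = D\,\delta''(s) \quad&\Longrightarrow\quad
\frac{\partial u}{\partial t}(x,t) = D\,\frac{\partial^2 u}{\partial x^2}(x,t),
\end{align}
```

      where the implication uses the identity $\int \delta''(x - x')\, u(x',t)\, \mathrm{d}x' = \partial^2 u / \partial x^2$. On a grid with spacing $h$, such a kernel is approximated by the finite-difference stencil $(1, -2, 1)/h^2$.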

      p.20: “While we describe the present framework as 'computational', it is more precisely characterised as an algorithmic-level generative model within Marr’s hierarchy. Our focus was on defining the rules of spatial integration and the sources of eye-movement-induced noise, rather than deriving these processes from normative principles or defining their specific neural implementation.”

      (2) I did not find a description of the recruitment and characterization of the AD and PD patients.

      Apologies for this omission. We have now included a detailed description of participant recruitment and clinical characterisation in the Methods section and also updated Table 2:

      “A total of 87 participants completed the study: 21 young healthy adults (YC), 21 older healthy adults (EC), 23 patients with Parkinson’s disease (PD), and 22 patients with Alzheimer’s disease (AD). Their demographic and clinical details are summarised in Table 2. Initially, 90 participants were recruited (22 YC, 21 EC, 25 PD, 22 AD); however, three individuals (1 YC and 2 PD) were excluded from all analyses due to technical issues during data acquisition.

      All participants were recruited locally in Oxford, UK. None were professional artists, had a history of psychiatric illness, or were taking psychoactive medications (excluding standard dopamine replacement therapy for PD patients). Young participants were recruited via the University of Oxford Department of Experimental Psychology recruitment system. Older healthy volunteers (all >50 years of age) were recruited from the Oxford Dementia and Ageing Research (OxDARE) database.

      Patients with PD were recruited from specialist clinics in the Oxfordshire area. All had a clinical diagnosis of idiopathic Parkinson’s disease and no history of other major neurological or psychiatric conditions. While all patients were tested on their regular medication regimen (‘ON’ state), the specific pharmacological profiles, including the exact types of medication (e.g., levodopa, dopamine agonists, or combinations) and dosages (e.g., levodopa equivalent doses), were not systematically recorded. Disease duration and PD severity were also not recorded for this study.

      Patients with AD were recruited from the Cognitive Disorders Clinic at the John Radcliffe Hospital, Oxford, UK. All AD participants presented with a progressive, multidomain, predominantly amnestic cognitive impairment. Clinical diagnoses were supported by structural MRI and FDG-PET imaging consistent with a clinical diagnosis of AD dementia (e.g., temporo-parietal atrophy and hypometabolism).[69] All neuroimaging was reviewed independently by two senior neurologists (S.T. and M.H.).

      Global cognitive function was assessed using the Addenbrooke’s Cognitive Examination-III (ACE-III).[70] All healthy participants scored above the standard cut-off of 88, with the exception of one elderly participant who scored 85. In the PD group, two participants scored below the cut-off (85 and 79). In the AD group, six participants scored above 88; these individuals were included based on robust clinical and radiological evidence of AD pathology rather than their ACE-III score alone.”

      (3) YA and OA patients appear to differ in gender distribution.

      We acknowledge the difference in gender distribution between the young (71.4% female) and older adult (57.1% female) cohorts. However, we do not anticipate that gender influences the fundamental computational mechanisms of retinotopic maintenance or transsaccadic remapping. These processes represent low-level visuospatial functions for which there is no established evidence of gender-specific differences in precision or coordinate transformation. We have ensured that the gender distribution for each cohort is clearly listed in the demographics table (Table 2) for full transparency.

      Thank you very much for your very insightful feedback!

      Reviewer #3 (Public review):

      Thank you for the positive feedback regarding our inclusion of clinical groups and the identification of computational phenotypes that differentiate these cohorts.

      To address your concerns about the model, we have clarified our use of Bayesian Model Selection, which inherently penalises model complexity to ensure that our results are not driven solely by the number of parameters. We will also provide further evidence regarding model generalisability to address the concern of overfitting.

      Regarding the link with the ROCF, we have revised the manuscript to better highlight the specific relationship between our transsaccadic parameters and the ROCF data and better motivate the inclusion of these results in the main text.

      Below is our response to your suggestions point-by-point:

      (1) The models tested differ in terms of the number of parameters. In general, a larger number of parameters leads to a better goodness of fit. It is not clear how the difference in the number of parameters between the models was taken into account. It is not clear whether the modelling results could be influenced by overfitting (it is not clear how well the model can generalize to new observations).

      To ensure our results were not driven by the number of parameters, we utilised random-effects Bayesian Model Selection (BMS) to adjudicate between our candidate models. Unlike maximum likelihood methods, BMS relies on the marginal likelihood (model evidence), which inherently balances model fit against parsimony, a principle known as Occam’s razor (Rasmussen and Ghahramani, 2000). In this framework, a model is only preferred if the improvement in fit justifies the additional parameter space; redundant parameters actually lower model evidence by diluting the probability mass. We would be happy to point toward literature that discusses how these marginal likelihood approximations provide a more robust guard against overfitting than standard metrics like BIC or AIC (MacKay, 2003; Murray and Ghahramani, 2005; Penny, 2012).
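
      How the marginal likelihood penalises an unnecessary parameter can be seen in a toy example unrelated to the models in the manuscript. The sketch below assumes a sequence of coin flips and compares a zero-parameter model (P(heads) fixed at 0.5) with a one-parameter model (uniform prior on P(heads)); both evidences have closed forms, and all function names are hypothetical.

```python
import math


def log_evidence_fixed(heads: int, flips: int) -> float:
    """Log marginal likelihood of the zero-parameter model: P(heads) fixed at 0.5.

    Every specific sequence of flips has probability 0.5**flips, whatever `heads` is.
    """
    return flips * math.log(0.5)


def log_evidence_flexible(heads: int, flips: int) -> float:
    """Log marginal likelihood with one free parameter and a uniform prior on P(heads).

    Integrating p**heads * (1 - p)**(flips - heads) over p in [0, 1] gives the
    Beta function B(heads + 1, flips - heads + 1).
    """
    return (math.lgamma(heads + 1) + math.lgamma(flips - heads + 1)
            - math.lgamma(flips + 2))


def log_bayes_factor(heads: int, flips: int) -> float:
    """Log Bayes factor for the flexible model over the fixed model."""
    return log_evidence_flexible(heads, flips) - log_evidence_fixed(heads, flips)


# Balanced data: the extra parameter is not needed, so the simpler model wins.
print(log_bayes_factor(10, 20))
# Strongly unbalanced data: the improvement in fit justifies the extra parameter.
print(log_bayes_factor(18, 20))
```

      The flexible model can always match or exceed the fixed model's maximum likelihood, yet its evidence is lower for balanced data because its prior mass is spread over values of P(heads) that the data do not support.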

      The fact that the "Dual (Saccade) + Interference" model (Model 7) emerged as the winner—with a Bayes Factor of 6.11 against the next best alternative—demonstrates that its complexity was statistically justified by its superior account of the trial-by-trial data.

      Furthermore, to address the risk of overfitting, we established the generalisability of these parameters by using them to predict performance on an independent clinical task. These parameters successfully explained ~62% of the variance in ROCF copy scores (a very distinct, real-world task), confirming that they represent robust computational phenotypes rather than idiosyncratic fits to the initial dataset.

      In the Results (p10):

      “We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[25–27]”

      In the Discussion (p17):

      “Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[25–27,42] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      (2) Results specificity: it is not clear how specific the modelling results are with respect to constructional ability (measured via the Rey-Osterrieth Complex Figure test). As with any cognitive test, performance can also be influenced by general, non-specific abilities that contribute broadly to test success.

      We agree that constructional performance is influenced by both specific mechanistic constraints and general cognitive abilities. To isolate the unique contribution of transsaccadic updating, we therefore performed a partial correlation analysis across the entire sample. We examined the relationship between location error in the two-saccades condition (our primary behavioural measure of transsaccadic memory) and ROCF copy scores. Even after partialling out the effects of global cognitive status (ACE-III total score), age, and years of education, the correlation remained highly significant (rho = -0.39, p < 0.001).
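
      A partial rank correlation of this kind can be sketched as below. This is a minimal illustration, assuming a Spearman-type partial correlation computed by rank-transforming all variables and applying the recursive partial-correlation formula; it is not the authors' code, it does not handle tied ranks, and the synthetic data and variable names are hypothetical.

```python
import math
import random


def ranks(values):
    """Rank-transform a list (1 = smallest). Ties are not averaged in this sketch."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, covariates):
    """Correlation of x and y after partialling out each covariate, one at a time."""
    if not covariates:
        return pearson(x, y)
    *rest, z = covariates
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates):
    """Partial correlation computed on ranks of all variables."""
    return partial_corr(ranks(x), ranks(y), [ranks(c) for c in covariates])


# Synthetic data: both measures depend on age and cognition, plus a shared
# component that links higher location error to lower copy scores.
rng = random.Random(0)
n = 300
age = [rng.gauss(0, 1) for _ in range(n)]
cognition = [rng.gauss(0, 1) for _ in range(n)]
shared = [rng.gauss(0, 1) for _ in range(n)]
location_error = [0.5 * a - 0.5 * c + s + rng.gauss(0, 0.5)
                  for a, c, s in zip(age, cognition, shared)]
copy_score = [-0.5 * a + 0.5 * c - s + rng.gauss(0, 0.5)
              for a, c, s in zip(age, cognition, shared)]

rho = partial_spearman(location_error, copy_score, [age, cognition])
print(round(rho, 2))
```

      In the synthetic data the negative association survives the removal of age and cognition because it is carried by the shared component, which mirrors the logic of the analysis described above.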

      This suggests that our model captures a specific computational phenotype—the precision of spatial updating during active visual sampling—rather than acting as a proxy for non-specific cognitive decline. This mechanistic link explains why traditional working memory measures (e.g., digit span or Corsi blocks) frequently fail to predict drawing performance; unlike those tasks, figure copying requires thousands of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified by our modelling framework.

      We added the following text in the Discussion (p19):

      “We also found that the relationship between transsaccadic working memory and ROCF performance remains highly significant (rho = -0.39, p < 0.001), even after controlling for age, education, and global cognitive status (ACE-III total score). Consequently, transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.[57]”

      Reviewer #3 (Recommendations for the authors):

      (1) The authors mention in the introduction the following: "One key hypothesis is that we use working memory across visual fixations to update perception dynamically", citing the following manuscript:

      Harrison, W. J., Stead, I., Wallis, T. S. A., Bex, P. J. & Mattingley, J. B. A computational account of transsaccadic attentional allocation based on visual gain fields. Proc. Natl. Acad. Sci. U.S.A. 121, e2316608121 (2024).

      However, the manuscript above does not refer explicitly to the involvement of working memory in transsaccadic integration of object location in space. Rather, it takes advantage of recent evidence showing how the true location of a visual object is represented in the activity of neurons in primary visual cortex (A. P. Morris, B. Krekelberg, A stable visual world in primate primary visual cortex. Curr. Biol. 29, 1471-1480.e6 (2019)). The model hypothesizes that true locations of objects are readily available, and then allocates attention in real-world coordinates, allowing efficient coordination of attention and saccadic eye movements.

      Thank you for the clarification. As suggested, we have now included the citation of Morris & Krekelberg (2019) to acknowledge the evidence for stable object locations within the primary visual cortex.

      (2) The authors in the introduction and the title use the terms 'transsaccadic memory' and 'spatial working memory'. However, it is not clear whether these can be used interchangeably or are reflecting different constructs.

      Classical measures of visuo-spatial working memory are derived from the Corsi task (or similar), where the location of multiple objects is displayed and subsequently remembered. In such tasks, eye movements and saccades are not generally considered, only memory performance, representing the visuo-spatial span.

      Transsaccadic memory tasks are instead explicitly measuring the performance on remembered object locations of features across explicit eye movements, usually using a very limited number of objects (1 or 2, as is the case for the current manuscript).

      While the two constructs share some features, it is not clear whether they represent the same underlying ability or not, especially because in transsaccadic tasks, participants are required to perform one or more saccades, thus representing a dual-task case.

      I think the relationship between 'transsaccadic memory' and 'spatial working memory' should be clarified in the manuscript.

      Thank you. Yes, we have added this within the Methods - Measurement of saccade cost to clarify that spatial working memory is the broad cognitive construct responsible for short-term maintenance, whereas transsaccadic memory is the specific, dynamic process of remapping representations to maintain stability across eye movements.

      In Methods (p.22):

      “Within this framework, it is important to distinguish between the broad construct of spatial working memory and the specific process of transsaccadic memory. While spatial working memory refers to the general ability to maintain spatial information over short intervals, transsaccadic memory describes the dynamic updating of these representations—termed remapping—to ensure stability across eye movements. Unlike classical 'static' measures of spatial working memory, such as the Corsi block task which focuses on memory span, transsaccadic memory tasks explicitly require the integration of stored visual information with motor signals from intervening saccades. Our paradigm treats transsaccadic updating as a core computational process within spatial working memory, where eye-centred representations are actively reconstructed based on noisy memories of the intervening saccade vectors.”

      (3) In Figure 1, the second row indicates the presentation of item 2. Indeed, in the condition 'saccade-after-item-1', the target in the second row of Figure 1 is displaced, as expected. This clarifies the direction and amplitude of the first saccade requested. However, from Figure 1, it is hard to understand the amplitude and direction of the second requested saccade. I think the figure should be updated, giving a full description of the direction and amplitude of the second saccade as well ('saccade-after-item-2' and 'two-saccades' conditions).

      We agree that making the figure legend more self-contained is beneficial for the reader. While the specific physical parameters and the trial sequence for each condition are detailed in the Results and Methods sections, we have now updated the legend for Figure 1 to explicitly define these details. Specifically, we have clarified that the colour wheel itself served as the target for the second instructed saccade (i.e., the movement from the second fixation cross to the colour wheel location). We have also included the quantitative constraint that all saccade vectors were at least 8.5 degrees of visual angle in amplitude. Given the limited space within a figure legend, we hope these concise additions provide the transparency requested without interrupting the conceptual flow of the diagram.

      Updated Figure 1 legend:

      “Participants were asked to fixate a white cross, wherever it appeared. They had to remember the colour and location of a sequence of two briefly presented coloured squares (Item 1 and 2), each appearing within a white square frame. They then fixated a colour wheel wherever it appeared on the screen, which served as the target for the second instructed saccade (i.e., a movement from the second fixation cross to the colour wheel location). This cued recall of a specific square (Item 1 or Item 2 labelled within the colour wheel). Participants selected the remembered colour on the colour wheel which led to a square of that colour appearing on the screen. They then dragged this square to its remembered location on the screen. Saccadic demands were manipulated by varying the locations of the second frame and the colour wheel, resulting in four conditions that differed in their reliance on retinotopic versus transsaccadic memory: (1) No-Saccade condition, providing a baseline measure of within-fixation precision as no eye movements were required; (2) Saccade After Item 1; (3) Saccade After Item 2; (4) Saccades after both items (Two Saccades condition). In all conditions requiring eye movements, saccade vectors were constrained to a minimum amplitude of 8.5° (degrees of visual angle). While the No-Saccade condition isolates retinotopic working memory, conditions (2) to (4) collectively quantify the impact of varying saccadic demands and timings on the maintenance of spatial information, thereby assessing the efficacy of the transsaccadic updating process.”

      (4) The authors write: "Eye tracking analysis confirmed high compliance: participants correctly maintained fixation or executed saccades as instructed on the vast majority of trials (83% ± 14%). Non-compliant trials were excluded from further analysis." 14% of excluded trials are a substantial fraction of trials, given the task requirements. Is this proportion of excluded trials different between experimental groups, and are experimental groups contributing equally to this proportion?

      We thank the reviewer for pointing this out, and we apologise for the confusion. The 83% figure was averaged across all four cohorts and all conditions. Compliance was above 90% in the YC, EC and AD groups, but dropped to approximately 60% in the PD group.

      We have now conducted a full analysis of compliant trial counts using a mixed ANOVA (4 saccade conditions x 4 cohorts). This analysis revealed a main effect of group (F(3, 80) = 8.06, p < 0.001), which was driven by lower compliance in the PD cohort (mean approx. 25.4 trials per condition) compared to the AD, EC, and YC cohorts (means ranging from 35.8 to 38.9 trials per condition). Crucially, however, the interaction between group and condition was not statistically significant (p = 0.151). This indicates that the relative impact of saccade demands on trial retention was consistent across all four groups.

      Because our primary behavioural measure—the saccade cost—is a within-subject comparison of impairment across conditions, these differences in absolute trial numbers do not introduce a systematic bias into our findings. Furthermore, even with the higher attrition in the PD group, we retained a sufficient number of high-quality trials (minimum mean of ~23 trials in the most demanding condition) to support robust trial-by-trial parameter estimation and valid statistical inference. We have updated the Results and Methods to reflect these details.

      In Results (p4):

      “To mitigate potential confounds, we monitored eye position throughout the experiment. Eye-tracking analysis confirmed high compliance in healthy adults, who followed instructions on the vast majority of trials (Younger Adults: 97.2 ± 5.2 %; Older Adults: 91.3 ± 20.4 %). The mean difference between these groups was negligible, representing just 1.25 trials per condition, and was not statistically significant (t(80) = 0.16, p = 1.000; see more in Methods – Eyetracking data analysis). Non-compliant trials were excluded from all further analyses.”

      In Methods (p27):

      “Eye-tracking analysis confirmed high compliance overall, with participants correctly maintaining fixation or executing saccades on the vast majority of trials (83% across all participants). A mixed ANOVA revealed a main effect of group on trial retention (F(3, 80) = 8.06, p < 0.001, partial η² = 0.23), primarily due to lower compliance in the PD cohort (YC: 97±4%; EC: 91±10%; AD: 95±5%; PD: 63±38%). Importantly, there was no significant interaction between group and saccade condition (F(3.36, 80) = 1.78, p = 0.15, partial η² = 0.008), suggesting that trial attrition was not disproportionately affected by specific task demands in any group.

      We acknowledge that this reduced trial count in the PD group represents a limitation for across-cohort comparison. However, the absolute number of compliant trials in the PD group (mean approx. 25 per condition) remained sufficient for robust trial-by-trial parameter estimation. Furthermore, the lack of a significant group-by-condition interaction confirms that the results reported for this cohort remain valid and that our primary finding of a selective spatial memory deficit is robust to these differences in data retention.”
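
      As a quick consistency check on the effect size quoted for the main effect of group, partial η² can be recovered from an F statistic and its degrees of freedom. The sketch below is illustrative only: it assumes the reported values are the effect and error degrees of freedom for that test, and the function name is hypothetical rather than taken from the authors' analysis code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic.

    Uses the standard identity
        eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    Assumes df_effect and df_error belong to the same F test.
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# Main effect of group as reported in the response: F(3, 80) = 8.06
group_effect = partial_eta_squared(8.06, 3, 80)
print(round(group_effect, 2))
```

      For F(3, 80) = 8.06 this evaluates to about 0.23, in line with the value given in the revised Methods.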

      (5) Modelling

      (a) Degrees of freedom, cross-validation, number of parameters.

      I appreciate the effort in introducing and testing different models. The models increase in complexity and are based on different assumptions about the main drivers and mechanisms underlying the dependent variable. The models differ in the number of parameters. How are the differences in the number of parameters between models taken into account in the modelling analysis? Is there a cost associated with the extra parameters included in the more complex models?

      (b) Cross-validation and overfitting.

      Overfitting can occur when a model learns the training data but cannot generalize to novel datasets. Cross-validation is one approach that can be used to avoid overfitting. Was cross-validation (or other approaches) implemented in the fitting procedure against overfitting? Otherwise, the inference that can be derived from the modelled parameters can be limited.

      To address your concerns regarding model complexity and overfitting, we would like to clarify our use of Bayesian Model Selection (BMS). Unlike frequentist methods that often rely on cross-validation to assess generalisability, we used random-effects BMS based on the marginal likelihood (model evidence). This approach inherently implements Bayesian Occam’s Razor by integrating out the parameters. Under this framework, the use of the marginal likelihood for model selection provides a mathematically equivalent safeguard to frequentist cross-validation, as it evaluates the model's ability to generalise across the entire parameter space rather than just finding a maximum likelihood fit for the training data. Thus, models are penalised not just for the absolute number of parameters, but for their overall functional flexibility. A more complex model is only preferred if the improvement in model fit is substantial enough to outweigh this inherent penalty. The emergence of Model 7 as the winner (Bayes Factor = 6.11 against the next best alternative) confirms that its additional complexity is statistically justified.
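
      The complexity penalty described above can be written out explicitly. The block below is a generic sketch, not the authors' derivation: the symbols y (data), θ (parameters), m (model) and q (an approximate posterior) are notation introduced here for illustration, and the free-energy bound in the last line is one common way of approximating the evidence, which the response does not state was used.

```latex
% Model evidence: the likelihood averaged over the prior on the parameters
p(y \mid m) = \int p(y \mid \theta, m)\, p(\theta \mid m)\, d\theta

% Bayes factor comparing two candidate models m_1 and m_2
\mathrm{BF}_{12} = \frac{p(y \mid m_1)}{p(y \mid m_2)}

% Variational lower bound on the log evidence: accuracy minus complexity
\ln p(y \mid m) \;\ge\;
  \underbrace{\mathbb{E}_{q(\theta)}\!\left[\ln p(y \mid \theta, m)\right]}_{\text{accuracy}}
  \;-\;
  \underbrace{\mathrm{KL}\!\left[\,q(\theta)\,\|\,p(\theta \mid m)\,\right]}_{\text{complexity}}
```

      Because the likelihood is averaged over the whole prior, a model with extra parameters gains evidence only if those parameters improve the fit by more than the posterior has to move away from the prior.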

      Furthermore, in this study we provided an external validation of these recovered parameters by demonstrating that they explain 62% of the variance in an independent, real-world, clinical task (ROCF copy). This empirical evidence confirms that our model captures robust mechanistic phenotypes rather than idiosyncratic noise. We have updated the Results and Discussion to explicitly state these.

      In Results: (p10)

      “We used random-effects Bayesian model selection to identify the most plausible generative model. This process relies on the marginal likelihood (model evidence), which inherently balances model fit against complexity—a principle often referred to as Occam’s razor.[26–28]”

      In Discussion: (p17)

      “Importantly, the risk of overfitting is mitigated by the Bayesian Model Selection framework; by utilising the marginal likelihood for model comparison, the procedure inherently penalises excessive model complexity and promotes generalisability.[26–28,43] This generalisability was further evidenced by the model's ability to predict performance on the independent ROCF task, confirming that these parameters represent robust mechanistic phenotypes rather than idiosyncratic fits to the initial dataset.”

      (6) n. of participants.

      (a) The authors write the following: "A total of healthy volunteers (21 young adults, mean age = 24.1 years; 21 older adults, mean age = 72.4 years) participated in this study. Their demographics are shown in Table 1. All participants were recruited locally in Oxford." However, Table 1 reports the data from more than 80 participants, divided into 4 groups. Details about the PD and AD groups are missing. Please clarify.

      We apologise for this lack of clarity in the text. We have rewritten and expanded the “Participants” section and corrected Table 2 in the Methods section to reflect the correct number of participants.

      In Methods (p20):

      “A total of 87 participants completed the study: 21 young healthy adults (YC), 21 older healthy adults (EC), 23 patients with Parkinson’s disease (PD), and 22 patients with Alzheimer’s disease (AD). Their demographic and clinical details are summarised in Table 2. Initially, 90 participants were recruited (22 YC, 21 EC, 25 PD, 22 AD); however, three individuals (1 YC and 2 PD) were excluded from all analyses due to technical issues during data acquisition.

      All participants were recruited locally in Oxford, UK. None were professional artists, had a history of psychiatric illness, or were taking psychoactive medications (excluding standard dopamine replacement therapy for PD patients). Young participants were recruited via the University of Oxford Department of Experimental Psychology recruitment system. Older healthy volunteers (all >50 years of age) were recruited from the Oxford Dementia and Ageing Research (OxDARE) database.

      Patients with PD were recruited from specialist clinics in Oxfordshire. All had a clinical diagnosis of idiopathic Parkinson’s disease and no history of other major neurological or psychiatric conditions. All patients were tested while on their regular medication regimen (‘ON’ state); however, the specific pharmacological profiles, including the types of medication (e.g., levodopa, dopamine agonists, or combinations) and dosages (e.g., levodopa equivalent doses), were not systematically recorded. Disease duration and PD severity were also not recorded for this study.

      Patients with AD were recruited from the Cognitive Disorders Clinic at the John Radcliffe Hospital, Oxford, UK. All AD participants presented with a progressive, multidomain, predominantly amnestic cognitive impairment. Clinical diagnoses were supported by structural MRI and FDG-PET imaging consistent with a clinical diagnosis of AD dementia (e.g., temporo-parietal atrophy and hypometabolism).[70] All neuroimaging was reviewed independently by two senior neurologists (S.T. and M.H.).

      Global cognitive function was assessed using the Addenbrooke’s Cognitive Examination-III (ACE-III).[71] All healthy participants scored above the standard cut-off of 88, with the exception of one elderly participant who scored 85. In the PD group, two participants scored below the cut-off (85 and 79). In the AD group, six participants scored above 88; these individuals were included based on robust clinical and radiological evidence of AD pathology rather than their ACE-III score alone.”

      (b) As modelling results rely heavily on the quality of eye movements and eye traces, I believe it is necessary to report details about eye movement calibration quality and eye traces quality for the 4 experimental groups, as noisier data could be expected from naïve and possibly older participants, especially in case of clinical conditions. Potential differences in quality between groups should be discussed in light of the results obtained and whether these could contribute to the observed patterns.

      Thank you for pointing this out. We have revised the Methods to describe the calibration procedure:

      (p27) “Prior to the experiment, a standard nine-point calibration and validation procedure was performed. Participants were instructed to fixate a small black circle with a white centre (0.5 degrees) as it appeared sequentially at nine points forming a 3 x 3 grid across the screen. Calibration was accepted only if the mean validation error was below 0.5 degrees and the maximum error at any single point was below 1.0 degree. If these criteria were not met, or if the experimenter noticed significant gaze drift between blocks, the calibration procedure was repeated. This calibration ensured high spatial accuracy across the entire display area, facilitating the precise monitoring of fixations on item frames and saccadic movements to the response colour wheel.”
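
      The acceptance rule in the revised Methods can be expressed as a short check. This is a minimal sketch under stated assumptions: the function and argument names are hypothetical, the input is taken to be the nine per-point validation errors in degrees of visual angle, and only the two thresholds (mean below 0.5 degrees, maximum below 1.0 degree) come from the text.

```python
def calibration_accepted(point_errors_deg, mean_limit=0.5, max_limit=1.0):
    """Return True if a validation run meets both acceptance criteria.

    point_errors_deg: validation error at each calibration point, in degrees
                      of visual angle (hypothetical input format).
    mean_limit:       the mean error must be strictly below this value.
    max_limit:        the error at every single point must be strictly below
                      this value.
    """
    mean_error = sum(point_errors_deg) / len(point_errors_deg)
    return mean_error < mean_limit and max(point_errors_deg) < max_limit


# Toy values for a nine-point grid (illustrative, not recorded data)
good_run = [0.3] * 9
one_bad_point = [0.3] * 8 + [1.2]   # mean is fine, one point is too far off
print(calibration_accepted(good_run))
print(calibration_accepted(one_bad_point))
```

      Checking the maximum as well as the mean matters because a single badly calibrated corner of the grid can hide behind an acceptable average.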

      Moreover, as detailed in our response to Point 4, while the PD group exhibited lower compliance, there was no interaction between group and saccade condition for compliance (p = 0.151). This confirms that any noise or trial attrition was distributed evenly across experimental conditions. Consequently, the observed "saccade cost" (the difference in error between conditions) is not an artefact of unequal noise but represents a genuine mechanistic impairment in spatial updating. We have updated the Methods to clarify this distinction.

      Furthermore, our Bayesian framework explicitly estimates precision (random noise) as a distinct parameter from updating cost (saccade cost). This allows the model to partition the variance: even if a clinical group is "noisier" overall, this is captured by the precision parameter, ensuring it does not inflate the specific estimate of saccade-driven memory impairment.

      (7) Figure 5. I suggest reporting these results using boxplots instead of barplots, as the former gives a better overview of the distributions.

      We appreciate the suggestion to use boxplots to better illustrate data distributions. However, we have chosen to retain the current bar plot format due to the visual and statistical complexity of our 4 x 4 x 2 experimental design. Figure 5 represents 16 distinct distributions across four groups and four conditions for both location and colour measures; employing boxplots/violins for this density of data would significantly increase visual clutter and make the figure difficult to parse.

      Furthermore, the primary objective of this figure is to reflect the statistical analysis: to illustrate group differences in overall performance and to highlight the specific finding that patients with AD were significantly more impaired across all conditions than the YC, EC, and PD groups. Our statistical focus remains on the mean effects, specifically the significant main effect of group (F(3, 318) = 59.71, p < 0.001) and the critical null interaction between group and condition (p = 0.90). The error measure most relevant to these comparisons is the standard error of the mean (SEM), rather than the interquartile range (IQR). We think that bar plots provide the most straightforward and scannable representation of these mean differences and the consistent pattern of decay across cohorts for the final manuscript layout.

      To address the reviewer’s request for distributional transparency, we have provided a version of Figure 5 using grouped boxplots in the supplementary material (Supplementary figure 2). We note, however, that the spread of raw data points in these plots does not directly reflect the variance associated with our within-subject statistical comparisons.

      (8) Results specificity, trans-saccadic integration and ROCF. The authors demonstrate that the derived model parameters account for a significant amount of variability in ROCF performance across the experimental groups tested (Figure 8A). However, it remains unclear how specific the modelling results are with respect to the ROCF.

      The ROCF is generally interpreted as a measure of constructional ability. Nevertheless, as with any cognitive test, performance can also be influenced by more general, non-specific abilities that contribute broadly to test success. To more clearly link the specificity between modelling results and constructional ability, it would be helpful to include a test measure for which the model parameters would not be expected to explain performance, for example, a verbal working memory task.

      I am not necessarily suggesting that new data should be collected. However, I believe that the issue of specificity should be acknowledged and discussed as a potential limitation in the current context.

      We appreciate this important point regarding the discriminant validity of our findings. We agree that cognitive performance in clinical populations is often influenced by a general "g-factor" or non-specific executive decline. However, we chose the ROCF Copy task specifically because it is a hallmark clinical measure of constructional ability that effectively serves as a real-world transsaccadic task, requiring participants to integrate spatial information across hundreds of saccades between the model figure and the drawing surface.

      To address the reviewer’s concern regarding specificity, we leveraged the fact that all participants completed the ACE-III, which includes a dedicated verbal memory component (the ACE Memory subscale). We conducted a partial correlation analysis and found that the relationship between transsaccadic working memory and ROCF copy performance remains highly significant (rho = -0.46, p < 0.001), even after controlling for age, education, and the ACE-III Memory subscale score. This suggests that the link between transsaccadic updating and constructional ability is mechanistically specific rather than a byproduct of global cognitive impairment. We have substantially revised the Discussion to highlight this link and the supporting statistical evidence.
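
      For readers unfamiliar with the analysis, a partial Spearman correlation can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the response does not say which software or procedure was used, so the sketch rank-transforms every variable and applies the standard recursive partial-correlation formula, using only the Python standard library. All function names and the toy data are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(x, y, covariates):
    """Correlation of x and y with the covariates partialled out (recursive)."""
    if not covariates:
        return pearson(x, y)
    z, rest = covariates[-1], covariates[:-1]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def partial_spearman(x, y, covariates=()):
    """Partial Spearman correlation: rank everything, then partial out."""
    ranked_covariates = [average_ranks(z) for z in covariates]
    return partial_corr(average_ranks(x), average_ranks(y), ranked_covariates)


# Toy data only: four observations and a single covariate
x = [1, 2, 3, 4]
y = [2, 1, 4, 3]
z = [1, 3, 2, 4]
print(partial_spearman(x, y))        # plain Spearman correlation
print(partial_spearman(x, y, [z]))   # after controlling for z
```

      In the analysis described above, x would be the transsaccadic working memory measure, y the ROCF copy score, and the covariates age, education and the ACE-III Memory subscale score.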

      We first updated the last paragraph of Introduction:

      “Finally, by linking these mechanistic parameters to a standard clinical measure of constructional ability (the Rey-Osterrieth Complex Figure task), we demonstrate that transsaccadic updating represents a core computational phenotype underpinning real-world visuospatial construction in both health and neurodegeneration.”

      The new section in Discussion highlighting the ROCF copy link:

      “Importantly, our computational framework establishes a direct mechanistic link between transsaccadic updating and real-world constructional ability. Specifically, higher saccade and angular encoding errors contribute to poorer ROCF copy scores. By mapping these mechanistic estimates onto clinical scores, we found that the parameters derived from our winning model explain approximately 62% of the variance in constructional performance across groups. These findings suggest that the computational parameters identified in the LOCUS task represent core phenotypes of visuospatial ability, providing a mechanistic bridge between basic cognitive theory and clinical presentation.

      This relationship provides novel insights into the cognitive processes underlying drawing, specifically highlighting the role of transsaccadic working memory. Previous research has primarily focused on the roles of fine motor control and eye-hand coordination in this skill.[4,50–55] This is partly because of a consistent failure to find a strong relation between traditional memory measures and copying ability.[4,31] For instance, common measures of working memory, such as digit span and Corsi block tasks, do not directly predict ROCF copying performance.[31,56] Furthermore, in patients with constructional apraxia, performance on these memory measures often remains relatively preserved despite significant drawing impairments.[56–58] In the literature, this lack of association has often been attributed to “deictic” visual-sampling strategies, characterised by frequent eye movements that treat the environment as an external memory buffer, thereby minimising the need to maintain a detailed internal representation.[4,59] In a real-world copying task, the ROCF requires a high volume of saccades, making it uniquely sensitive to the precision of the dynamic remapping signals identified here. Recent eye-tracking evidence confirms that patients with AD exhibit significantly more saccades and longer fixations during figure copying compared to controls, potentially as a compensatory response to transsaccadic working memory constraints.[56] This high-frequency sampling (averaging between 150 and 260 saccades for AD patients compared to approximately 100 for healthy controls) renders the task highly dependent on the precision of dynamic remapping signals.[56] We also found that the relationship between transsaccadic working memory and ROCF performance remains highly significant (rho = -0.46, p < 0.001), even after controlling for age, education, and ACE-III Memory subscore. Consequently, transsaccadic updating may represent a discrete computational phenotype required for visuomotor control, rather than a non-specific proxy for global cognitive decline.[58]

      In other words, even when visual information is readily available in the world, drawing performance depends critically on working memory across saccades. This reveals a fundamental computational trade-off: while active sampling strategies (characterised by frequent eye-hand movements) effectively reduce the load on capacity-limited working memory, they simultaneously increase the demand for precise spatial updating across eye movements. By treating the external world as an "outside" memory buffer, the brain minimises the volume of information it must hold internally, but it becomes entirely dependent on the reliability with which that information is remapped after each eye movement. This perspective aligns with, rather than contradicts, the traditional view of active sampling, which posits that individuals adapt their gaze and memory strategies based on specific task demands.[3,60] Furthermore, this perspective provides a mechanistic framework for understanding constructional apraxia; in these clinical populations, the impairment may not lie in a reduced memory "span," but rather in the cumulative noise introduced by the constant spatial remapping required during the copying process.[58,61]

      Beyond constructional ability, these findings suggest that the primary evolutionary utility of high-resolution spatial remapping lies in the service of action rather than perception. While spatial remapping is often invoked to explain perceptual stability,[11–13,15] the necessity of high-resolution transsaccadic memory for basic visual perception is debated.[13,62–64] A prevailing view suggests that detailed internal models are unnecessary for perception, given the continuous availability of visual information in the external world.[13,44] Our findings support an alternative perspective, aligning with the proposal that high-resolution transsaccadic memory primarily serves action rather than perception.[13] This is consistent with the need for precise localisation in eye-hand coordination tasks such as pointing or grasping.[65] Even when unaware of intrasaccadic target displacements, individuals rapidly adjust their reaching movements, suggesting direct access of the motor system to remapping signals.[66] Further support comes from evidence that pointing to remembered locations is biased by changes in eye position,[67] and that remapping neurons reside within the dorsal “action” visual pathway, rather than the ventral “perception” visual pathway.[13,68,69] By demonstrating a strong link between transsaccadic working memory and drawing (a complex fine motor skill), our findings suggest that precise visual working memory across eye movements plays an important role in complex fine motor control.”

      We are deeply grateful to the reviewers for their meticulous reading of our manuscript and for the constructive feedback provided throughout this process. Your insights have significantly enhanced the clarity and rigour of our work.

      In addition to the changes requested by the reviewers, we wish to acknowledge a reporting error identified during the revision process. In the original Results section, the repeated measures ANOVA statistics for YC included Greenhouse-Geisser corrections, and the between-subjects degrees of freedom were incorrectly reported as within-subjects residuals. Upon re-evaluation of the data, we confirmed that the assumption of sphericity was not violated; therefore, we have removed the unnecessary Greenhouse-Geisser corrections and corrected the degrees of freedom throughout the Results and Methods sections. We have ensured that these statistical updates are reflected accurately in the revised manuscript and that they do not alter the significance or interpretation of any of our primary findings.

      We hope that these revisions address all the concerns raised and provide a more robust account of our findings. We look forward to your further assessment of our work.