10,000 Matching Annotations
  1. Oct 2025
    1. “Nosotros al menos tuvimos un asteroide, ¿cuál es vuestra excusa?”

      Responde a las siguientes preguntas de comprensión sobre el artículo y luego comprueba si lo has hecho bien contrastándolas con las claves que encontrarás más abajo.

      Preguntas de comprensión:

      1. ¿Cuál es el propósito del cortometraje mencionado en el artículo?

      2. ¿Qué mensaje quiere transmitir el dinosaurio en el plenario de Nueva York?

      3. ¿Por qué la ONU eligió a un dinosaurio para su campaña?

      4. ¿Qué revela el informe de la ONU sobre los subsidios a los combustibles fósiles?

      5. ¿Qué se podría hacer con el dinero que se gasta en subsidios a los combustibles fósiles, según el artículo?

    2. La publicación de este informe se presenta en la misma semana en la que la ONU ha enmendado los planes de los Gobiernos sobre cambio climático y ha pedido que deben duplicar sus promesas de recorte de emisión de gases invernadero para esta década si se quieren mitigar los efectos del calentamiento global. Desde el pasado nueve de agosto, cuando el grupo de expertos sobre el Cambio Climático (IPCC) publicase su informe de evaluación, la conclusión es clara: el cambio climático es una realidad y es la crisis que define nuestra era.
      1. Busca en este párrafo sinónimos de: corregir, reducción, disminuir, época
    3. Se calcula que la contaminación atmosférica por sí sola provoca cada año siete millones de muertes prematuras en el mundo, el 90% de las cuales se producen en los países en desarrollo, según un informe de la OMS. Las reformas de los subsidios a los combustibles fósiles beneficiarían, por tanto, a la salud y el bienestar humano, además de contribuir a reducir las emisiones de CO₂. En un estudio realizado en 26 países en desarrollo por el PNUD, se constató que la eliminación de las ayudas al carbón, petróleo y gas podría reducir las emisiones en un promedio del 6,4% para 2025, respecto de un escenario sin cambios.

      ¿A qué hacen referencia las siguientes cifras: 90% y 6,4%?

    4. “Esperamos que esta investigación catalice la conversación sobre el papel fundamental que puede tener la reforma para impulsar transiciones ecológicas y justas en todos los países”.

      Ésta sería la conclusión de las ideas presentadas en el artículo.

    5. el mundo gasta 423.000 millones de dólares (307.862 millones de euros) al año en subsidiar los combustibles fósiles. Este dinero, según este mismo estudio, podría cubrir el coste de las vacunas covid-19 para cada persona en el planeta, o financiar tres veces la cantidad anual necesaria para erradicar la pobreza extrema mundial.

      Otro ejemplo de alternativas en las que se podría invertir el dinero destinado a subsidiar los combustibles fósiles.

    6. una campaña que quiere visibilizar cómo los subsidios a los combustibles fósiles están retrasando el progreso contra el cambio climático

      Otro ejemplo de cómo la inactividad de los gobiernos en torno a los combustibles fósiles está contribuyendo al cambio climático.

    7. “El corto es divertido y atractivo, pero los temas que trata no podrían ser más serios”, ha señalado Ulrika Modeer, directora de la Oficina de Relaciones Externas y Promoción del PNUD. “Queremos que el corto entretenga, pero también queremos concienciar sobre lo crítica que es la situación. El mundo debe profundizar la acción climática si queremos tener éxito en mantener nuestro planeta seguro para las futuras generaciones”.

      En este párrafo hay muchos ejemplos de cómo expresar y justificar opinión en torno al propósito del anuncio y su impacto en el tipo de audiencia al que va dirigido.

    8. “Nosotros al menos tuvimos un asteroide”, dice desde la tribuna de oradores el dinosaurio, para incidir en que el calentamiento del planeta es una amenaza bien conocida que no pilla por sorpresa a nadie. “¿Cuál es vuestra excusa?”, incide.

      Aquí se explica el eslogan de la campaña.

    9. este corto comienza con la irrupción de uno de estos animales en el icónico plenario de la sede de Nueva York de esta organización, provocando el pánico entre los delegados del mundo.

      Aquí se describe la escena del anuncio de forma concisa.

    1. Qué es lo que más se valora al elegir una empresa para trabajar

      Preguntas de comprensión y de opinión:

      1. Remuneración salarial: ¿Puedes dar un sinónimo?

      2. ¿Qué motivaba más a los empleados, la remuneración o la seguridad laboral?

      3. Sin mirar, intenta escuchar y escribir este párrafo del texto https://voca.ro/1hNONgkHoioM

      4. La conciliación entre vida laboral y familiar: ¿puedes explicar en qué consiste dicha conciliación?

      5. Los trabajadores sitúan la honestidad en primer puesto, seguida de la fiabilidad, la sinceridad, la inteligencia y la seguridad. ¿Opinas lo mismo?

      6. En su opinión, la edad de jubilación ideal serían los 60 años. ¿Ocurre lo mismo hoy en día en tu país?

    2. La remuneración salarial es el aspecto que más valoran los trabajadores a la hora de elegir una empresa para trabajar, por delante de la seguridad laboral a largo plazo y de las perspectivas de futuro, aspectos que en 2013 se situaron a la cabeza de las motivaciones de los empleados, según el informe Employer Branding presentado esta semana por Randstad. De esta forma, los salarios escalan dos posiciones respecto a la edición anterior de este informe y vuelven a ser lo que más valoran los trabajadores cuando buscan una empresa donde desarrollar su actividad profesional.

      Sin mirar, intenta escuchar y escribir este mismo párrafo https://voca.ro/1hNONgkHoioM

    1. Esta tradición sueca puede parecer polémica, pero tiene las mejores intenciones. Primero, se le pide al novio que abandone el salón para que todos los invitados hombres, específicamente solteros, le den un beso en la mejilla a la novia. Después, la novia debe retirarse y las invitadas solteras llenan de besos al novio. Tradicionalmente, esta costumbre se realiza para cederle suerte a los solteros o solteras que buscan matrimonio.

      Sin mirar, intenta escuchar y escribir este párrafo para ayudarte a memorizar estructuras https://voca.ro/1aaMDvao2yVg

    2. en la cual es tradicional que las novias usen el color blanco de la cabeza a los pies,

      es tradicional que = es costumbre que. Estas dos expresiones son sinónimas. Es importante que aprendas a encontrar sinónimos, a resumir, reformular y explicar ideas y conceptos con tus propias palabras.

    3. Primero, se le pide al novio que abandone el salón para que todos los invitados hombres, específicamente solteros, le den un beso en la mejilla a la novia.

      He seleccionado esto por la expresión "se le pide al novio que + subjuntivo (abandone)" y porque me parece muy gracioso que se le pida al novio que salga para que todos puedan dar un beso (en la mejilla) a la novia. ¿Es para que no se ponga celoso?

    1. Author Response

      Reviewer #1 (Public Review):

      [...] Genes expressed in the same direction in lowland individuals facing hypoxia (the plastic state) as what is found in the colonised state are defined as adaptive, while genes with the opposite expression pattern were labelled as maladaptive, using the assumption that the colonised state must represent the result of natural selection. Furthermore, genes could be classified as representing reversion plasticity when the expression pattern differed between the plasticity and colonised states and as reinforcement when they were in the same direction (for example more expressed in the plastic state and the colonised state than in the ancestral state). They found that more genes had a plastic expression pattern that was labelled as maladaptive than adaptive. Therefore, some of the genes have an expression pattern in accordance with what would be predicted based on the plasticity-first hypothesis, while others do not.

      Thank you for a precise summary of our work. We appreciate the very encouraging comments recognizing the value of our work. We have addressed concerns from the reviewer in greater detail below.

      Q1. As pointed out by the authors themselves, the fact that temperature was not included as a variable, which would make the experimental design much more complex, misses the opportunity to more accurately reflect the environmental conditions that the colonizer individuals face at high altitude. Also pointed out by the authors, the acclimation experiment in hypoxia lasted 4 weeks. It is possible that longer term effects would be identifiable in gene expression in the lowland individuals facing hypoxia on a longer time scale. Furthermore, a sample size of 3 or 4 individuals per group depending on the tissue for wild individuals may miss some of the natural variation present in these populations. Stating that they have a n=7 for the plastic stage and n= 14 for the ancestral and colonized stages refers to the total number of tissue samples and not the number of individuals, according to supplementary table 1.

      We share the reviewer's concerns. They arise partly because it is quite challenging to bring wild birds into captivity for hypoxia acclimation experiments; keeping lowland sparrows under hypoxic conditions for a month took considerable effort. We recognize the same limitations that the reviewer pointed out and have discussed them in the study, i.e., considering the hypoxic condition alone, the short acclimation period, etc. Regarding sample sizes, we collected cardiac muscle from nine individuals (three individuals for each stage) and flight muscle from 12 individuals (four individuals for each stage). We have clarified this in Supplementary Table 1.

      Q2. Finally, I could not find a statement indicating that the lowland individuals placed in hypoxia (plastic stage) were from the same population as the lowland individuals for which transcriptomic data was already available, used as the "ancestral state" group (which themselves seem to come from 3 populations, Qinhuangdao, Beijing, and Tianjin, according to supplementary table 2), nor whether they were sampled at the same time of year (pre-reproduction, during breeding, after, or if they were juveniles, proportion of males or females, etc.). These two aspects could affect gene expression (through neutral or adaptive genetic variation among lowland populations, through environmental effects other than hypoxia that differ between these populations' environments, or because of sex or age). This could potentially also affect the FST analysis done by the authors, which they use to claim that strong selective pressure acted on the expression level of some of the genes in the colonised group.

      The reviewer asked how the individual tree sparrows used in the transcriptomic analyses were collected. The individuals used for the hypoxia acclimation experiment and those representing the ancestral lowland population were collected from the same locality (Beijing) and in the same season (i.e., pre-breeding). They are all adults and weigh approximately 18 g. We have clarified this in Supplementary Table S1 and the Methods. We did not distinguish males from females (both sexes look similar) under the assumption that both sexes respond similarly to hypoxia acclimation in their cardiac and flight muscle gene expression.

      Supplementary Table 2 lists the individuals that were used for sequence analyses. These individuals were used only for sequence comparisons, not for the transcriptomic analyses. The population genetic structure analyzed in a previously published study showed that there is no clear genetic divergence within the lowland population (i.e., individuals collected from Beijing, Tianjin and Qinhuangdao) or the highland population (i.e., Gangcha and Qinghai Lake). In addition, there was no clear genetic divergence between the highland and lowland populations (Qu et al. 2020).

      Author response image 1.

      Population genetic structure of the Eurasian Tree Sparrow (Passer montanus). The genetic structure generated using FRAPPE. The colors in each column represent the contribution from each subcluster (Qu et al. 2020). Yellow, highland population; blue, lowland population.

      Q4. Impact of the work. There has been work showing that populations adapted to high-altitude environments show changes in their hypoxia response that differ from the short-term acclimation response of lowland populations of the same species. For example, in humans, see Erzurum et al. 2007 and Peng et al. 2017, where they show that the hypoxia response cascade, which starts with the gene HIF (Hypoxia-Inducible Factor) and includes the EPO gene, which codes for erythropoietin, which in turn activates the production of red blood cells, is LESS activated in high-altitude individuals compared to the activation level in lowland individuals (which gives it its name). The present work adds to this body of knowledge, showing that the short-term response to hypoxia and the long-term one can affect different pathways, and that acclimation/plasticity does not always predict what physiological traits will evolve in populations that colonize these environments over many generations and under additional selection pressures (UV exposure, temperature, nutrient availability). Altogether, this work provides new information on the evolution of reaction norms of genes associated with the physiological response to one of the main environmental variables that affects almost all animals, oxygen availability. It also provides an interesting model system to study this type of question further in a natural population of homeotherms.

      Erzurum, S. C., S. Ghosh, A. J. Janocha, W. Xu, S. Bauer, N. S. Bryan, J. Tejero, et al. 2007. Higher blood flow and circulating NO products offset high-altitude hypoxia among Tibetans. Proceedings of the National Academy of Sciences 104:17593-17598.

      Peng, Y., C. Cui, Y. He, Ouzhuluobu, H. Zhang, D. Yang, Q. Zhang, Bianbazhuoma, L. Yang, Y. He, et al. 2017. Down-regulation of EPAS1 transcription and genetic adaptation of Tibetans to high-altitude hypoxia. Molecular Biology and Evolution 34:818-830.

      Thank you for highlighting the potential novelty of our work in the context of the broader field. We found it very interesting to discuss our results (from a bird species) together with similar findings from humans. In the revised version of the manuscript, we have discussed the short-term acclimation response and long-term adaptive evolution to a high-elevation environment, as well as how our work improves understanding of the relative roles of short-term plasticity and long-term adaptation. We appreciate the two important studies pointed out by the reviewer and have cited them in the revised version of the manuscript.

      Reviewer #2 (Public Review):

      This is a well-written paper using gene expression in tree sparrow as model traits to distinguish between genetic effects that either reinforce or reverse initial plastic response to environmental changes. Tree sparrow tissues (cardiac and flight muscle) collected in lowland populations subject to hypoxia treatment were profiled for gene expression and compared with previously collected data in 1) highland birds; 2) lowland birds under normal condition to test for differences in directions of changes between initial plastic response and subsequent colonized response. The question is an important and interesting one but I have several major concerns on experimental design and interpretations.

      Thank you for a precise summary of our work and constructive comments to improve this study. We have addressed your concerns in greater detail below.

      Q1. The datasets consist of two sources of data. The hypoxia treated birds collected from the current study and highland and lowland birds in their respective native environment from a previous study. This creates a complete confounding between the hypoxia treatment and experimental batches that it is impossible to draw any conclusions. The sample size is relatively small. Basically correlation among tens of thousands of genes was computed based on merely 12 or 9 samples.

      We appreciate the critical comments from the reviewer. The reviewer raised the concerns about the batch effect from birds collected from the previous study and this study. There is an important detail we didn’t describe in the previous version. All tissues from hypoxia acclimated birds and highland and lowland birds have been collected at the same time (i.e., Qu et al. 2020). RNA library construction and sequencing of these samples were also conducted at the same time, although only the transcriptomic data of lowland and highland tree sparrows were included in Qu et al. (2020). The data from acclimated birds have not been published before.

      In the revised version of the manuscript, we also compared log-transformed transcripts per million (TPM) across all genes and determined the most conserved genes (i.e., coefficient of variation ≤ 0.3 and average TPM ≥ 1 for each sample) for the flight and cardiac muscles, respectively (Hao et al. 2023). We compared the median expression levels of these conserved genes and found no difference among the lowland, hypoxia-exposed lowland, and highland tree sparrows (Wilcoxon signed-rank test, P<0.05). As these results suggested little batch effect on the transcriptomic data, we used TPM values to calculate gene expression level and intensity. This methodological detail has been further clarified in the Methods, and we have also provided a new supplementary figure (Figure S5) to show the comparative results.
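The conserved-gene filter described above can be illustrated with a minimal sketch. It assumes TPM values are held in a plain dictionary and that the coefficient of variation is the population standard deviation divided by the mean; the function names, the data layout and the toy values are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import mean, median, pstdev


def conserved_genes(tpm, max_cv=0.3, min_mean_tpm=1.0):
    """Return genes whose expression is stable across samples.

    tpm maps a gene name to a list of TPM values, one per sample
    (hypothetical layout chosen for this illustration).
    """
    kept = []
    for gene, values in tpm.items():
        avg = mean(values)
        if avg < min_mean_tpm:
            continue  # too weakly expressed to be informative
        cv = pstdev(values) / avg  # coefficient of variation
        if cv <= max_cv:
            kept.append(gene)
    return kept


def group_median(tpm, genes, sample_idx):
    """Median TPM of the given genes within one group of samples."""
    return median(tpm[g][i] for g in genes for i in sample_idx)


# Toy data: four samples, three made-up genes.
toy_tpm = {
    "geneA": [10, 11, 9, 10],       # stable, kept
    "geneB": [1, 50, 3, 80],        # highly variable, dropped
    "geneC": [0.1, 0.2, 0.1, 0.2],  # mean TPM below 1, dropped
}
stable = conserved_genes(toy_tpm)
first_group_median = group_median(toy_tpm, stable, [0, 1])
```

Comparing such group medians across the three groups of birds is then a matter of applying whichever statistical test the analysis calls for.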

      Author response image 2.

      The median expression levels of the conserved genes (i.e., coefficient of variation ≤ 0.3 and average TPM ≥ 1 for each sample) did not differ among the lowland, hypoxia-exposed lowland, and highland tree sparrows (Wilcoxon signed-rank test, P<0.05).

      The reviewer also raised the issue of sample size. We certainly would have liked to include more individuals in the study, but this was not possible because of the logistical difficulty of keeping wild birds in a common garden experiment for a long time. We have acknowledged this in the manuscript. To mitigate this, we tested the hypothesis of plasticity followed by genetic change using two different tissues (cardiac and flight muscles) and two different datasets (co-expressed gene set and muscle-associated gene set). As all these analyses show similar results, they indicate that the main conclusion drawn from this study is robust.

      Q2. Genes are classified into two classes (reversion and reinforcement) based on arbitrarily chosen thresholds. More "reversion" genes are found and this was taken as evidence reversal is more prominent. However, a trivial explanation is that genes must be expressed within a certain range and those plastic changes simply have more space to reverse direction rather than having any biological reason to do so.

      Thank you for the critical comments. The reviewer raised two questions, which we address separately. The first concern centered on the issue of arbitrarily chosen thresholds. In our manuscript, we used a range of thresholds, i.e., 50%, 100%, 150% and 200% of change in the gene expression levels of the ancestral lowland tree sparrow, to detect genes with reinforcement and reversion plasticity. With this design we wanted to explore the magnitudes of gene expression plasticity (i.e., Ho & Zhang 2018), and whether the strength of selection (i.e., genetic variation) changes with the magnitude of gene expression plasticity (i.e., Campbell-Staton et al. 2021).
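One possible reading of this threshold scheme is sketched below. It assumes that plastic change is the acclimated level minus the ancestral level, that genetic change is the colonized level minus the acclimated level, and that both must exceed the chosen fraction of the ancestral level before a gene is classified. These definitions, the function name and the example values are assumptions made for illustration, not the authors' code.

```python
def classify_plasticity(ancestral, plastic, colonized, threshold=0.5):
    """Classify one gene as reinforcement, reversion or unclassified.

    ancestral: expression in lowland birds under normal conditions
    plastic:   expression in lowland birds after hypoxia acclimation
    colonized: expression in highland birds
    threshold: minimum change, as a fraction of the ancestral level
    """
    plastic_change = plastic - ancestral
    genetic_change = colonized - plastic
    cutoff = threshold * ancestral
    # Both changes must be large enough to count (assumed rule).
    if abs(plastic_change) <= cutoff or abs(genetic_change) <= cutoff:
        return "unclassified"
    # Same sign: evolution pushes further in the plastic direction.
    if plastic_change * genetic_change > 0:
        return "reinforcement"
    # Opposite sign: evolution moves expression back toward the ancestral level.
    return "reversion"


# Made-up expression levels, evaluated at each threshold named in the text.
thresholds = [0.5, 1.0, 1.5, 2.0]
labels = [classify_plasticity(10.0, 40.0, 12.0, t) for t in thresholds]
```

Counting the labels over all genes at each threshold gives the reinforcement and reversion frequencies that the response compares.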

      As the reviewer pointed out, we now realize that this threshold selection is arbitrary. We have thus implemented two other categorization schemes to test the robustness of the observation of unequal proportions of genes with reinforcement and reversion plasticity. Specifically, we used a parametric bootstrap procedure as described in Ho & Zhang (2019), which aims to identify genes resulting from genuine differences rather than random sampling errors. Bootstrap results showed that genes exhibiting reversing plasticity significantly outnumber those exhibiting reinforcing plasticity, suggesting that our inference of an excess of genes with reversion plasticity is robust to random sampling errors. We have added these analyses to the revised version of the manuscript and provided the results in Figure 2d and Figure 3d.

      Author response image 3.

      Figure 2a (left) and Figure 2b (right). Frequencies of genes with reinforcement and reversion plasticity (>50%) and their subsets that acquire strong support in the parametric bootstrap analyses (≥ 950/1000).

      In addition, we adopted a bin scheme (i.e., 20%, 40% and 60% bin settings along the spectrum of reinforcement/reversion plasticity). These analyses, based on different categorization schemes, revealed similar results and suggested that our inference of an excess of genes with reversion plasticity is robust. We have provided these results in Supplementary Figures S2 and S4.

      Author response image 4.

      Figure S2 (A) and Figure S4 (B). Frequencies of genes with reinforcement and reversion plasticity in the flight and cardiac muscle. (A) For genes identified by WGCNA, all comparisons show that there are more genes showing reversion plasticity than those showing reinforcement plasticity for both the flight and cardiac muscles. (B) For genes associated with muscle phenotypes, all comparisons show that there are more genes showing reversion plasticity than those showing reinforcement plasticity for the flight muscle, while more than 50% of comparisons support an excess of genes with reversion plasticity for the cardiac muscle. Two-tailed binomial test: NS, non-significant; *, P < 0.05; **, P < 0.01; ***, P < 0.001.
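For readers unfamiliar with the test named in this caption, a two-tailed binomial test can be sketched as follows. It assumes a null hypothesis in which reinforcement and reversion are equally likely (p = 0.5) and uses the common convention of summing the probabilities of all outcomes no more likely than the observed one. The gene counts are invented for illustration and are not results from the study.

```python
from math import exp, lgamma, log


def binomial_two_tailed(k, n, p=0.5):
    """Two-tailed binomial P value for k successes in n trials (0 < p < 1).

    Sums the probabilities of every outcome that is no more likely
    than the observed one. Works in log space so large n does not underflow.
    """
    def pmf(i):
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))

    observed = pmf(k)
    # Small relative tolerance guards against floating-point ties.
    total = sum(pmf(i) for i in range(n + 1) if pmf(i) <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts: 70 reversion genes out of 100 classified genes.
p_value = binomial_two_tailed(70, 100)
```

A small P value here would indicate that the split between the two categories departs from the equal frequencies expected under the null hypothesis.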

      The second issue that the reviewer raised is that plastic changes may simply have more room to reverse direction, rather than any biological reason to do so. While the causal reason why more genes have their expression levels reversed than reinforced at the late stages is still contentious, a growing number of studies show that gene expression plasticity at the early stage may be functionally maladaptive in the novel environments that species have recently colonized (e.g., lizards, Campbell-Staton et al. 2021; Escherichia coli, yeast, guppies, chickens and babblers, Ho and Zhang 2018; Ho et al. 2020; Kuo et al. 2023). Our comparisons based on the two gene sets that are associated with muscle phenotypes corroborate these previous studies and show that initial gene expression plasticity may be nonadaptive in novel environments (e.g., Ghalambor et al. 2015; Ho & Zhang 2018; Ho et al. 2020; Kuo et al. 2023; Campbell-Staton et al. 2021).

      Q3. The correlation between plastic change and evolved divergence is an artifact due to the definitions of adaptive versus maladaptive changes. For example, the definition of adaptive changes requires that plastic change and evolved divergence are in the same direction (Figure 3a), so the positive correlation was a result of this selection (Figure 3d).

      The reviewer raised the issue that the correlation between plastic change and evolved divergence (for example, Figure 3d) is an artifact of the definition of adaptive versus maladaptive changes. We agree that the correlation analysis is circular, because the definition of adaptive and maladaptive plasticity depends on whether the direction of plastic change matches or opposes that of the colonized tree sparrows. We have thus removed the previous Figure 3d-e and related text from the revised version of the manuscript. We have also changed Figure 3a to further clarify the schematic framework.

    1. Esta tradición tiene origen en la costumbre romana de dar una «scarsella», o una bolsa de cuero, llena de lentejas que tradicionalmente estaban atadas al cinturón. En estos tiempos antiguos, se esperaba que las lentejas se transformaran en monedas de oro, trayendo riqueza al portador de la scarsella. En el pasado, las lentejas eran entregadas como regalo el 31 de diciembre,

      ¿Sabrías explicar el origen y el significado de alguna tradición de tu país?

    2. los italianos suelen servir lentejas ya que se considera que esta comida trae suerte y propiedades en el nuevo año

      ¿Sabes si se come o se bebe algo en particular en alguna fiesta o celebración de tu país? ¿Cómo se llama? ¿Qué ingredientes tiene?

    1. Briefing: Mental Health First Aid and the Civil Service: Acting Together!

      Summary

      This briefing analyses the discussions held at the "Rencontres PSSM France #2", which focused on rolling out mental health first aid within the French civil service.

      The discussions highlighted the exponential growth of the Premiers Secours en Santé Mentale (PSSM, Mental Health First Aid) programme in France, backed by ambitious government targets of 300,000 trained first aiders by 2027 and 750,000 by 2030.

      The initiative, led by the association PSSM France, stands out for its citizen-led approach, its solid scientific basis (the Australian Mental Health First Aid method) and its measurable impact on reducing the stigma attached to mental disorders.

      The rollout across the three branches of the civil service, governed by the circular of 23 February 2022, is a major strategic lever for reaching a broad professional and civic ecosystem.

      Feedback from local authorities, universities and bodies such as UROPS shows successful uptake and a tangible impact on the ground, strengthening the "power to act" of staff and students.

      Building on this success, PSSM France is anchoring its future strategy, "En Route vers 2030", on two pillars: a high standard of quality and robust impact measurement.

      This ambition takes shape in the launch of large-scale research projects (SÉSAME, peer support among medical students) intended to produce French evidence and refine the programme.

      The development of the PSSM Ado (teen) module, currently in its pilot phase, addresses the crucial issue of young people's mental health and confirms the intention to adapt the tool to the most vulnerable groups.

      The overall approach is part of a vision of societal change that aims to build a culture of solidarity and of "living together" in the face of psychological suffering.

      --------------------------------------------------------------------------------

      1. Growth and Ambition of the PSSM Programme in France

      The Mental Health First Aid programme, run in France by the association PSSM France since 2018, is growing exceptionally fast, backed by sustained commitment from the public authorities and by international recognition.

      1.1. Exponential Growth

      The targets set by the Ministry of Health have been consistently met and exceeded, reflecting strong mobilisation across the country.

      | Initial target | Date achieved | New target | Date achieved / target date |
      |---|---|---|---|
      | 60,000 first aiders by end of 2023 | June 2023 | 150,000 first aiders by end of 2025 | November 2024 (one year early) |
      | 150,000 first aiders by end of 2025 | November 2024 | 300,000 first aiders by 2027 | Announced in June 2024 |

      As of the date of the event, the figures confirm this trend:

      • More than 230,000 first aiders trained nationwide.

      • Nearly 2,000 accredited instructors.

      • The association's strategic target, set out in its "En route vers 2030" plan, is to reach 750,000 first aiders by 2030.

      This growth can be seen in every sector: prisons, youth protection, national education and health, as well as the business world, local authorities and the social sector.

      1.2. Continuous Institutional Support

      Since its creation in 2018 by INFIP, Santé Mentale France and UNAFAM, the project has received "constant" support from the Ministry of Health (DGS, ministerial delegate), Santé Publique France and the Regional Health Agencies (ARS).

      This backing has led to the programme being included in several ministerial roadmaps on mental health since minister Agnès Buzyn's term.

      The circular of 23 February 2022, co-signed by the ministers for Health and for the Civil Service, marked a decisive step by making official the rollout of awareness-raising and training across the three branches of the civil service.

      1.3. An International and Scientific Framework

      France is part of an international network of 47 countries running the Mental Health First Aid (MHFA) programme, which originated in Australia 25 years ago.

      This global community brings together more than 10 million first aiders.

      The programme is built on a "method based on solid scientific evidence, regularly evaluated and improved", notably through the Delphi consensus model.

      More than 100 studies, including numerous randomised controlled trials and four meta-analyses, attest to its effectiveness worldwide.

      2. Philosophy and Scope of the PSSM Programme

      Beyond the figures, PSSM is presented as a citizen-led approach aimed at profoundly changing how mental health is addressed.

      2.1. A Citizen-Led, Democratising Approach

      The PSSM programme is described as a "citizen project" that carries "values of solidarity, citizenship and peer support". Its fundamental objectives are to:

      • Change perceptions of psychological suffering.

      • Combat stigma and break taboos.

      • Strengthen solidarity and create caring communities.

      • Learn to help, that is, "how to turn towards others" while taking care of one's own mental health.

      As Muriel Vidalin, president of PSSM France, points out, the aim is to be part of "a society in which living together has meaning".

      2.2. Quality, Evaluation and Impact Measurement

      PSSM France's 2025-2030 strategic plan rests on two inseparable principles:

      1. "No rollout of mental health first aid without a high standard of quality."

      2. "No quality without evaluation and robust impact measurement."

      This requirement is part of the DNA of the international programme and is reflected in a commitment to help produce French evidence through studies and publications, which are the subject of the second round table.

      The recognition of the training in the Répertoire spécifique of France Compétences and its upcoming eligibility for the CPF (from 2026) firmly anchor this approach in the vocational training landscape.

      3. Rollout in the Civil Service

      The circular of 23 February 2022 structured the rollout of PSSM in the civil service, an "essential ecosystem" for spreading a culture of prevention and care.

      3.1. Framework and Objectives of the Circular

      The circular has a twofold aim:

      • Train staff from various sectors to strengthen their ability to respond to the people they serve.

      • Raise awareness among colleagues and co-workers to combat isolation and stigma in the workplace.

      The scheme is organised in several tiers:

      1. Awareness session (half a day): for all staff across the three branches of the civil service.

      2. Online modules on Mentor:

      ◦ "Prenez soin de votre santé mentale" ("Take care of your mental health", 1h30), to understand the issues and good practices.

      ◦ "Agissons pour la santé mentale" ("Let's act for mental health", 2h45), to get people involved as citizens.

      3. First aider training (14 hours): for staff who volunteer.

      4. Instructor training: to make in-house training self-sufficient.

      The text stresses the pooling of resources, consultation through social dialogue and reliance on occupational health services.

      3.2. Initial Figures and Feedback

      Although the data are not yet exhaustive, particularly for the local-government civil service, the initial figures show significant engagement:

      | Civil service branch / Ministry | Number of first aiders trained |
      | --- | --- |
      | Justice - Prison service | 1,657 agents |
      | Justice - Protection Judiciaire de la Jeunesse (PJJ, youth judicial protection) | 1,853 first aiders |
      | Éducation Nationale (national education) | 4,643 first aiders (>40% in the Jeune (Youth) module) |
      | Local-government civil service | 5,600 agents (via the CNFPT or local authorities) |
      | Ministries (State) | 4,600 agents (Economy, Social Affairs, Justice, etc.) enrolled in the Mentor modules in April 2025 |

      Qualitative feedback is very positive: 92% of training participants believe they can reuse all or part of the training in their professional work.

      4. Lessons from the Field and Practical Applications

      The first round table illustrated how different civil service actors have made the PSSM program their own.

      Conseils Locaux de Santé Mentale (CLSM, local mental health councils): The experience of the Val-d'Oise shows how inter-CLSM coordination enabled a structured rollout at the département level.

      The key points are: targeting priority groups (perinatal care), adapting formats (for users of the Groupes d'Entraide Mutuelle, mutual-aid groups), and creating mixed groups that foster "social connection and openness".

      Local authorities (City of Lille): Building on its long-established CLSM, the city trained a municipal manager to become an in-house trainer.

      Training sessions deliberately mix audiences (social workers, service users, teachers, associations) to help participants get to know one another.

      The project led to the development of "health ambassadors", women from priority neighborhoods who in turn become trainers, strengthening their capacity to act.

      UROPS (Union Régime Obligatoire Prévention Santé): This body offers prevention programs to State civil service administrations.

      It has integrated PSSM to meet the specific needs of groups such as prison staff or Ministry of the Interior agents. 1,700 agents have been trained since 2023, with around sixty sessions still to come.

      UROPS places particular emphasis on the need for a delayed follow-up evaluation ("évaluation à froid") to measure how the skills acquired are actually used.

      University setting (University of Bordeaux): The university, already strongly committed to mental health, saw PSSM as an "amplifier of student engagement".

      Nearly 4,000 first aiders have been trained there, with a steady offer of two training sessions per week, always fully booked.

      The decision was made to train the caregivers of the student health center to deliver the training, and to create mixed groups (students, teaching staff, other staff) to "break down barriers".

      The most striking effect is the strengthening of students' capacity to act: more than half have helped at least one person, and 25% have helped between 5 and 10 people.

      5. Research and Evaluation: Measuring the Program's Impact

      The second part of the meeting highlighted PSSM France's strategy of committing to rigorous scientific evaluation.

      5.1. Context and Ambitions

      The role of PSSM France's scientific and educational council (conseil scientifique et pédagogique, CSP) is to guarantee the quality and reproducibility of the model, an essential condition for research.

      The aim is to produce French data to complement the international body of evidence and to work toward the ultimate demonstration: proving that a person helped by a PSSM first aider sees "their trajectory and prognosis changed".

      5.2. Research Projects Under Way

      Two major initiatives are about to start:

      SÉSAME project: Led by Professor Arnaud Carré, this large-scale study aims to evaluate the effectiveness of the standard program in France.

      Its originality lies in measuring the underlying psychological mechanisms in first aiders and trainers (empathy, emotion regulation, compassion).

      The study, both retrospective and prospective (with a 6-month follow-up), will make it possible to better characterize the program's effects across participant profiles and to contribute to its continuous improvement.

      Mental health peer-support project for students (Pair-aidance Étudiants en Santé Mentale): Led by ISNI (Intersyndicale Nationale des Internes) and Prof. Édouard Lone (CH Le Vinatier), this project starts from the observed vulnerability of medical students. The protocol consists of training 10% of a cohort of Lyon medical residents as PSSM first aiders.

      The study will assess the impact of the presence of these peer supporters on the well-being (burnout, depression) of the whole cohort over one year. The ultimate aim is to make this training a permanent part of the medical curriculum.

      5.3. The PSSM Ado (Teen) Program

      Given the major challenge of young people's mental health, PSSM France has developed a dedicated module for adolescents, currently in its trial phase.

      Objective: Break the isolation young people feel when facing their own difficulties or those of their classmates, and give them tools to act.

      Adapted format: 3 sessions of 50 minutes in collège (lower secondary school) and 3 sessions of 90 minutes in lycée (upper secondary school).

      Prerequisite: 10% of the school's staff (teachers, administrative and service staff) must first be trained in the PSSM Jeune (Youth) module.

      Initial feedback: A pilot phase with 500 pupils showed a very positive reception and strong participation, with young people feeling directly concerned by the subject.

      A wider trial is planned with the support of the Éducation Nationale.

      6. Conclusion and Outlook

      In conclusion, Claire Compagnon, member of the board of the Haute Autorité de Santé (HAS), praised the work of PSSM France as an essential contribution to the major public health challenge that mental health represents.

      She recalled the HAS's commitment through its own work program, which aims to build coherent care pathways, promote peer support and strengthen people's rights.

      PSSM France's commitment to research and evaluation was singled out as an "essential" approach.

      At a time when the development of prevention must rely on evidence to guide public decision-making, PSSM France's initiatives are seen as a model for accelerating the "shift to prevention" in France.

      The shared ambition is to move "from words to action" to make mental health a concrete, collective priority.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      The manuscript by Chiu et al describes the modification of the Zwitch strategy to efficiently generate conditional knockouts of zebrafish betapix. They leverage this system to identify a surprising glia-exclusive function of betapix in mediating vascular integrity and angiogenesis. Betapix has been previously associated with vascular integrity and angiogenesis in zebrafish, and betapix function in glia has also been proposed. However, this study identifies glial betapix in vascular stability and angiogenesis for the first time.

      The study derives its strength from the modified CRISPR-based Zwitch approach to identify the specific role of glial betapix (and not neuronal, mural, or endothelial). Using RNA-in situ hybridization and analysis of scRNA-Seq data, they also identify delayed maturation of neurons and glia and implicate a reduction in stathmin levels in the glial knockouts in mediating vascular homeostasis and angiogenesis. The study also implicates a betapix-zfhx3/4-vegfa axis in mediating cerebral angiogenesis.

      There is both technical (the generation of conditional KOs) and knowledge-related (the exclusive role of glial betapix in vascular stability/angiogenesis) novelty in this work that is going to benefit the community significantly.

      While the text is well written, it often elides details of experiments and relies on implicit understanding on the part of the reader. Similarly, the figure legends are laconic and often fail to provide all the relevant details.

      We thank the reviewer for the overall support of our manuscript. We have now revised the manuscript text and figure legends to include as many of the relevant details as we can.

      Specific comments:

      (1) While the evidence from cKO's implicating glial betapix in vascular stability/angiogenesis is exciting, glia-specific rescue of betapix in the global KOs/mutants (like those performed for stathmin) would be necessary to make a water-tight case for glial betapix.

      We fully agree with the reviewer that it would be ideal to examine glia-specific rescue of betaPix in its global KOs. However, it is difficult to achieve optimal transient expression of betaPix by injecting a gfap:betaPix plasmid, and it takes a long time to establish a stable gfap:betaPix transgenic line for rescuing the mutant phenotypes. We would like to pursue this line of research in the future.

      (2) Splice variants of betapix have been shown to have differential roles in haemorrhaging (Liu, 2007). What are the major glial isoforms, and are there specific splice variants in the glial that contribute to the phenotypes described?

      We agree that it would be important to address whether any specific splice variants in glia contribute to the betaPix mutant phenotypes. Previous studies have shown that isoform a of betaPix is ubiquitously expressed across various tissues, while isoforms b, c, and d are predominantly expressed in the nervous system. In mice, the expression level of the betaPix-d isoform is essential for neurite outgrowth and migration. We have not directly assessed glia-specific betaPix isoforms in the nervous system, so our current data cannot determine whether a specific isoform is involved in the glial function of betaPix. The Zwitch cassette of betaPix resides in intron 5, thus disrupting all transcripts when Cre is activated. However, we are fully aware of the potential value of identifying glial betaPix isoforms and their direct downstream targets. Further studies to dissect their roles in cerebral vascular development and diseases are part of our future plans.

      (3) Liu et al, 2012 demonstrated reduced proliferation of endothelial cells in bbh fish and linked it to deficits in angiogenesis. Are there proliferation/survival defects in endothelial cells in the glial KOs?

      We thank the reviewer for highlighting endothelial cell phenotypes in betaPix mutants. We are aware that endothelial cells might be directly linked to the angiogenesis defects in the mutants. We assessed and quantified endothelial migration by measuring the length of developing central arteries, but we did not examine endothelial cell proliferation/survival defects in glial KOs. In our scRNA-seq analysis, the proportion of endothelial cells was reduced in betaPix-deficient samples, suggesting that endothelial cell proliferation/survival might be decreased in mutants. In this endothelial cell cluster, we found a disrupted transcriptional landscape in a set of angiogenesis-associated genes (Figure 6M). While these analyses highlight an altered angiogenic transcriptome profile in endothelial cells of betaPix knockouts, we acknowledge that our study does not directly address proliferation/survival phenotypes in endothelial cells, which warrants future investigation of the role of betaPix in regulating glia-endothelial cell interactions.

      Reviewer #2 (Public review):

      Summary:

      Using a genetic model of beta-pix conditional trap, the authors are able to regulate the spatio-temporal depletion of beta-pix, a gene with an established role in maintaining vascular integrity (shown elsewhere). This study provides strong in vivo evidence that glial beta-pix is essential to the development of the blood-brain barrier and maintaining vascular integrity. Using genetic and biochemical approaches, the authors show that PAK1 and Stathmins are in the same signaling axis as beta-pix, and act downstream to it, potentially regulating cytoskeletal remodeling and controlling glial migration. How exactly the glial-specific (beta-pix driven-) signaling influences angiogenesis or vascular integrity is not clear.

      Strengths:

      (1) Developing a conditional gene-trap genetic model which allows for tracking knockin reporter driven by endogenous promoter, plus allowing for knocking down genes. This genetic model enabled the authors to address the relevant scientific questions they were interested in, i.e., a) track expression of beta-pix gene, b) deletion of beta-pix gene in a cell-specific manner.

      (2) The study reveals the glial-specific role of beta-pix, which was unknown earlier. This opens up avenues for further research. (For instance, how do such (multiple) cell-specific signaling converge onto endothelial cells which build the central artery and maintain the blood-brain barriers?)

      We thank the reviewer for the overall support of our work.

      Weaknesses:

      Major:

      (1) The study clearly establishes a role of beta-pix in glial cells, which regulates the length of the central artery and keeps the hemorrhages under control. Nevertheless, it is not clear how this is accomplished.

      (a) Is this phenotype (hemorrhage) a result of the direct interaction of glial cells and the adjacent endothelial cells? If direct, is the communication established through junctions or through secreted molecules?

      Thanks for this critical question. We attempted to address this issue by performing live imaging using light-sheet confocal microscopy, but failed to achieve sub-cellular resolution. We therefore do not have data to address this critical issue, which warrants future investigation.

      (b) The authors do not exclude the possibility that the effects observed on endothelial cells (quantified as length of central artery) could be secondary to the phenotype observed with deletion of glial beta-pix. For instance, can glial beta-pix regulate angiogenic factors secreted by peri-vascular cells, which consequently regulate the length of the central artery or vascular integrity?

      We thank the reviewer for this critical point. While we found major defects in endothelial cell migration, quantified by central artery length, we could not rule out the participation of signals from other peri-vascular cells. We fully agree that it will be important to address the cell-type-specific relationships mediated by angiogenic factors. Of note, degradation of extracellular matrix and focal adhesion is critical for the hemorrhagic phenotypes of bbh mutants. In a previously published study from our group, we found that suppressing the globally induced MEK/ERK/MMP9 signaling in bbh mutants significantly decreases hemorrhages. Accordingly, we edited a paragraph in the Discussion section on pages 24-25. We plan to continue investigating whether the complex interactions in the perivascular space contribute to the disruption of vascular integrity, as well as the cross-talk among different cell types during vascular development in these mutants. We believe that our model of glia-specific betaPix function will guide further study of cellular interactions in follow-up work.

      (c) The pictorial summary of the findings (Figure 7) does not include Zfhx or Vegfa. The data do not provide clarity on how these molecules contribute (directly or indirectly) to endothelial cell integrity. Vegfaa is expressed in the central artery, but the expression of the receptor in these endothelial cells is not shown. Similarly, all other experimental analyses for Zfhx and Vegfa expression were performed in glial cells. More experimental evidence is necessary to show the regulation of angiogenesis (of endothelial cells) by glial beta-pix. Is the Vegfaa receptor present on central arteries, and how does glial depletion of beta-pix affect its expression or response of central artery endothelial cells (both pertaining to angiogenesis and vascular integrity).

      We thank the reviewer for pointing out this critical issue. We have now revised the pictorial summary in Figure 7 to include Zfhx and Vegfa information. The key receptors of the VEGF-A ligand are VEGFR-1 and VEGFR-2. In zebrafish, expression of Vegfr-2, also known as kdrl, is well documented in endothelial cells, including those of the hindbrain central arteries. We fully agree that it would be of great value to assess changes in the kdrl expression pattern after betaPix deficiency in vivo. How VEGFA-VEGFR2 signaling in endothelial cells is altered in betaPix mutants warrants future investigation.

      (2) Microtubule stabilization via glial beta-pix, claimed in Figure 5M, is unclear. Magnified images for h-betapix OE and h-stmn-1 glial cells are absent. Is this migration regulated by beta-pix through its GEF activity for Cdc42/Rac?

      We have now revised Figure 5M to include magnified images for h-betaPIX and h-STMN1 overexpression groups. It has been shown that there is a positive feedback loop of microtubule regulation consisting of Rac1-Pak1-Stathmin at the cell edge (Zeitz and Kierfeld, 2014 Biophys J.). Previous studies have shown betaPix activates Rac1 through its GEF activity and also regulates the activity of Pak1 via direct binding. As reported by Kwon et al., betaPix-d isoform promotes neurite outgrowth via the PAK-dependent inactivation of Stathmin1. In this work, we did not assess binding activity of betaPix to Rac1 or Pak1. Nevertheless, our data on the rescue experiments via IPA-3 suggest that betaPix deficiency impaired migration through Pak1 signaling. 

      (3) Hemorrhages are caused by compromised vascular integrity, which was not measured (either qualitatively or quantitatively) throughout the manuscript. The authors do measure the length of the central artery in several gene deletion models (2I, 3C. 5F/J, 6G/K), which is indicative of artery growth/ angiogenesis. How (if at all) defects in angiogenesis are an indication of hemorrhage should be explained or established. Do these angiogenic growth defects translate into junctional defects at later developmental time points? Formation and maintenance of endothelial cell junctions within the hemorrhaging arteries should be assessed in fish with deleted beta-pix from astrocytes.

      We appreciate the reviewer’s point and agree that this is a key aspect we need to clarify. To address junctional defects in our model, we re-examined the scRNA-seq data and found mild downregulation of the junction protein claudin-5a (cldn5a) in the transcriptome analysis of the endothelial cluster (Author response image 1). We agree in principle that single-cell RNA sequencing findings should be validated by immunostaining. While we did not measure junctional defects directly in this work, we have previously reported comparable expression of the tight junction protein zonula occludens-1 (ZO1) between siblings and bbh mutants (Yang et al., 2017 Dis Model Mech). In zebrafish, a functionally characterized blood-brain barrier (BBB) is identified only after 3 dpf; the lack of a mature BBB might be due to the immature status of the barrier signature at this developmental stage. The hemorrhage phenotype occurs around 40 hpf, and hematomas are almost completely absorbed at later stages, since most mutants recover and survive to adulthood. Thus, future studies are needed to address the junctional characteristics at the cellular and molecular levels in later developmental stages of betaPix mutants.

      Author response image 1.

      Violin plots showing cdh5, cldn5a, cldn5b and oclna expression levels in endothelial sub-cluster. ctrl, control siblings; ko, betaPix knockouts (CRISPR mutants); 1d or 2d, 1 or 2 days post fertilization.

      (4) More information is required about the quality control steps for 10X sequencing (Figure 4, number of cells, reads, etc.). What steps were taken to validate the data quality? The EC groups, 1 and 2-days post-KO are not visible in 4C. One appreciates that the progenitor group is affected the most 2 days post-KO. But since the effects are expected to be on the endothelial cell group as well (which is shown in in vivo data), an extensive analysis should be done on the EC group (like markers for junctional integrity, angiogenesis, mesenchymal interaction, etc.). Are Stathmins limited to glial cells? Are there indicators for angiogenic responses in endothelial cells?

      We thank the reviewer for these critical suggestions. Detailed statements about the quality control steps for 10X sequencing are now provided in the Materials and Methods section. We validated data quality through multiple steps, including verification of the number of viable cells used in the experiment, assessment of peak shapes and fragment sizes of the scRNA-seq libraries, confirmation of sufficient cell counts and sequencing reads for data analyses, and implementation of stringent filtering steps to exclude low-quality cells. Stathmin expression is shown in the violin plots in Figure 4E, and stmn1a, stmn1b and stmn4l expression is shown in the UMAP plots in Figure S6C. Their expression is not limited to glial cells but is distributed more widely among zebrafish tissues. We would like to point out that, although small, the endothelial cell clusters are shown in brown in Figure 4C. The proportions of the EC groups across the four samples are visualized in Figure S6B and show a significant reduction in betaPix knockouts at 2 dpf, a trend similar to that of glial progenitors. In addition, gene ontology analysis identified a set of down-regulated angiogenic genes in the endothelial cluster (Figure 6M). We realize that our interpretation of endothelial cell phenotypes was not sufficiently clear in this work and have now added sentences to the manuscript text on pages 16-17. As noted above, future studies are needed to address how glial betaPix regulates endothelial cell and BBB function.
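      For readers unfamiliar with this kind of analysis, the filtering and per-sample cluster proportions mentioned in the response can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: the thresholds (`MIN_GENES`, `MAX_MITO_FRACTION`), the field names and the toy cells are all hypothetical, and the actual cut-offs are those stated in the manuscript's Materials and Methods.

```python
from collections import Counter

# Hypothetical QC thresholds, chosen only for illustration.
MIN_GENES = 200            # minimum number of detected genes per cell
MAX_MITO_FRACTION = 0.10   # maximum fraction of mitochondrial reads per cell


def passes_qc(cell):
    """Return True if a cell meets both illustrative QC criteria."""
    return (cell["n_genes"] >= MIN_GENES
            and cell["mito_fraction"] <= MAX_MITO_FRACTION)


def cluster_proportions(cells):
    """Return {sample: {cluster: fraction}} over the cells that pass QC."""
    kept = [c for c in cells if passes_qc(c)]
    totals = Counter(c["sample"] for c in kept)
    counts = Counter((c["sample"], c["cluster"]) for c in kept)
    props = {}
    for (sample, cluster), n in counts.items():
        props.setdefault(sample, {})[cluster] = n / totals[sample]
    return props


# Toy input (invented values, not data from the study).
cells = [
    {"sample": "ctrl_2d", "cluster": "endothelial", "n_genes": 1500, "mito_fraction": 0.03},
    {"sample": "ctrl_2d", "cluster": "progenitor", "n_genes": 1800, "mito_fraction": 0.02},
    {"sample": "ctrl_2d", "cluster": "progenitor", "n_genes": 90, "mito_fraction": 0.01},    # too few genes
    {"sample": "ko_2d", "cluster": "progenitor", "n_genes": 1200, "mito_fraction": 0.04},
    {"sample": "ko_2d", "cluster": "endothelial", "n_genes": 1400, "mito_fraction": 0.35},   # high mito fraction
]

props = cluster_proportions(cells)
print(props)
```

      Computing proportions only after filtering matters here: a cluster that is already small, such as the endothelial cells, can shift noticeably in relative size if low-quality cells are left in one sample but not another.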

      Reviewing Editor Comments:

      comments on your manuscript. Addressing comments 1-3 from Reviewer 1 and comment 1 and its subparts from Reviewer 2 (major weaknesses) will significantly improve the manuscript by reinforcing the cell autonomous requirement of betaPix and also gain mechanistic insights. In addition, extensive proofreading and editing of the text, as well as changes to the figure, figure legends, and the discussion as indicated by both reviewers, will improve the readability and clarity of this manuscript.

      We thank the Reviewing Editor for the support of this manuscript. As noted above, we have tried to address the reviewers’ comments using the data obtained in this work, together with our plans for future investigations. We have now extensively proofread and edited the manuscript text and figure legends to improve the readability and clarity of this manuscript.

      Reviewer #1 (Recommendations for the authors):

      (1) The Discussion is written like an introduction with very little engagement with the data generated in the manuscript. The role of betapix-Pak-stathmin and betapix-zfhx3/4-vegfaa is barely discussed and contextualised vis-à-vis the current knowledge in the field.

      We appreciate the reviewer’s critical comments regarding the Discussion section. We have now revised the manuscript text on pages 20-23 to address the role of betapix-Pak-stathmin and betapix-zfhx3/4-vegfaa axis with contributions from this work.

      (2) Line 145: "light sheet microscopy" - explain that this was only for experiments involving fluorescence. Currently, it reads as if the data presented in Figures 1D and E are also obtained via light sheet microscopy. E.g., the paragraph starting on line 139 does not say what line was imaged (and what it labels) to reach the conclusions reached. This detail is not there even in the associated figure legend. Similarly, line 153 discusses radial glia, but there is no indication that these were labelled using Tg (GFAP:GFP) except in the figure annotation. There are various instances of such omissions throughout the text, and they should be remedied to indicate what each line is and what it labels, at least in the first instance.

      We thank the reviewer for these thoughtful points. In this revised version, we have incorporated more statements of the objectives and methodologies in the text on pages 8-9. We hope that the revised manuscript presents the data better by clarifying the methodologies and materials used in this work.

      (3) Figure 1E legend: What is the haemorrhage percentage? Is it the number of embryos per experiment showing hemorrhage? Indicate in the text. In the right panel, what is the number of embryos used? Please ensure all numbers (number of embryos, experiments, etc) used to plot any data in the set of figures in the entire manuscript are clearly indicated.

      We thank the reviewer for the suggestion. In this revised version, we have incorporated more detailed statements in the figures and figure legends to show the numbers of embryos used.

      (4) The Discussion section suddenly introduces the blood-brain barrier and extensively discusses it. However, while cerebral haemorrhage can disrupt the BBB and exacerbate the effects of the haemorrhage, this manuscript does not suggest that a weakened BBB is the cause of haemorrhages in betapix mutants. More likely, betapix stabilises and maintains vascular integrity, and loss of this function causes haemorrhaging and subsequent disruption of the BBB. The glial function noted in this study is likely to be distinct from the glial function in BBB development and maintenance. The authors do not show any direct evidence for the latter. These should be shortened, and only relevant aspects facilitating contextualisation of data generated in this manuscript should be retained.

      We have now revised the Discussion section to shorten the material on the blood-brain barrier and to add statements according to the suggestions from both reviewers. We hope that the revisions provide a more relevant and balanced discussion.

      (5) Is the scratch assay in Figure 5 controlled for differences in cell proliferation among the different manipulations?

      We plated the same number of cells and cultured them under the same conditions. Before conducting the scratch assay, we replaced the medium with serum-free culture medium to reduce the effect of cell proliferation among the different manipulation groups.

      (6) In the glioblastoma experiments involving betapix KD, does stathmin RNA/protein decrease? What about Ser 16 phosphorylation (as shown for neurons in Kwon et al, 2020)?

      STMN1 RNA was down-regulated by betaPIX deficiency, and this was rescued by betaPIX overexpression in glial cells (Author response image 2). These results are similar to those from the in vivo analysis (Figure 5A, 5B and S7A). We agree with the reviewer that it would have been ideal to examine Ser 16 phosphorylation of Stathmin in our models. However, we believe that our data have established that Stathmins function downstream of betaPix.

      Author response image 2.

      qRT-PCR analysis showing that betaPIX over-expression (betaPix OE) rescued STMN1 expression after betaPIX siRNA knockdown (betaPix KD) in U251 cells. Data are presented as mean ± SEM; one-way ANOVA with Dunnett's test; individual P values are given in the figure.

      (7) How was the rescue of betapix in glioblastoma cells with siRNA-mediated betapix knockdown performed? Is this by betapix-resistant cDNA? Further, no information about isoforms of betapix (both for siRNA-mediated KD and rescue) or stathmin is provided.

      As with our Zwitch method, which disrupts all betaPix transcripts in vivo, the knockdown of human betaPIX was designed to target a region conserved across all transcripts in glioblastoma cell lines. The human betaPIX used for rescue was obtained from a U251 cDNA library, so ideally all isoforms enriched in this glioblastoma cell line would be isolated. The missing details are now provided in the Materials and Methods section, page 26.

      (8) It is unclear what the authors' thoughts are on the decrease in stathmin observed and the functional outcome of this decrease. The Discussion could benefit from this.

      Thanks. We have now incorporated a new paragraph in the Discussion section on pages 21-22 discussing the down-regulated expression of Stathmins and the functional outcome of this decrease.

      (9) Zfhx4 mRNA injection is performed on bbh and betapixKO (is this a global or glial KO?) and found to rescue haemorrhaging. While vegfaa mRNA increases, it is formally possible that the rescue is not due to the increase in vegfaa (or that vegfaa is sufficient). Injection of vegfaa mRNA could address this issue.

      Zfhx4 mRNA injection was performed on bbh mutants and global betaPix knockouts (CRISPR mutants). To avoid confusion, we have now included a sentence stating that global knockout mutants were used for this rescue experiment. Regarding the second point, we acknowledge that this study cannot definitively prove that increased vegfaa levels are necessary for the rescue. However, our data establish Zfhx3/4 as novel downstream effectors of betaPix in cerebral vessel development, and these effects might partly be linked to angiogenic responses regulated by Zfhx3/4. In this revised version, we carefully propose that Vegfaa signals act downstream of the betaPix-Zfhx3/4 axis and highlight, in the Discussion section on page 24, the limitation that we did not fully investigate the sufficiency of Vegfaa. We intend to pursue more extensive analysis in our follow-up studies.

      (10) A significant part of the manuscript looks at angiogenesis/vascularisation, however, the title of the paper only reflects vessel integrity (which can be distinct from angiogenesis).

      Thanks. We have now changed the title to: Glial betaPix is essential for blood vessel development in the zebrafish brain

      (11) Line 366: The BBB abbreviation is used without indicating the full form. Perhaps this can be introduced in the preceding sentence.

      We have now edited the following sentence: “The maturation hallmark of central nervous system (CNS) vasculature is acquisition of blood brain barrier (BBB) properties, establishing a stable environment ...” in lines 386-387, Discussion section.

      (12) Line 371: "rupture" and not "rapture".

      We thank the reviewer for pointing out the spelling error, and have now made this correction. 

      (13) Line 416: "is enriched" instead of "enriches"?

      We have now edited as: “...end feet that is enriched with aquaporin-4 ...” in line 411, page 19. 

      (14) The sentence in lines 121-123 should be simplified.

      We have now revised this sentence as the following: “A previous work has shown that bubblehead (bbh<sup>fn40a</sup>) mutant has a global reduction in betaPix transcripts, and bbh<sup>m292</sup> mutant has a hypomorphic mutation in betaPix, thus establishing that betaPix is responsible for bubblehead mutant phenotypes [10]”. 

      (15) No mention in the text of what o-dianisine labels.

      We have now edited the following sentence: “By using o-dianisidine staining to label hemoglobins, we found severe brain hemorrhages ...” in lines 131-133.

      (16) Line 165: Sentence requires improvement. Perhaps "Vascularisation of the central arteries in the zebrafish hindbrain ...".

      We have now edited this sentence as: “Vascularisation of the central arteries in the zebrafish hindbrain starts at 29 hpf.” in this revised version (line 176). 

      (17) Line 184: Why is "hematopoiesis" mentioned? The genesis of blood cells is not tested anywhere in the manuscript.

      Thanks. We have now edited this statement as: “IPA-3 treatment had no effect on haemorrhage induction in betaPix<sup>ct/ct</sup> control siblings.”

      (18) Line 222-223: Improve "increasing trends". Perhaps "increased relative proportions". Clarify "progenitors" means neuronal and glial progenitors.

      We have now edited this statement: “we found that most neuronal clusters increased relative proportions ...” in this revised version.

      (19) Line 232-233: "arrow indicates" - perhaps "indicated by the arrow"? Also, the arrow indicating gfap needs to be mentioned in the Figure S6A legend. Cannot understand what is meant by "as of its enriched gfap".

      We have now edited in the text as: “Figure S6A, indicated by the arrow”, and added “Box area and arrow highlighting gfap expressions.” in Figure S6 legend. To avoid confusion, we have revised "as of its enriched gfap" sentence as the following: “We next focused on the progenitor cluster owing to the enriched gfap expression and the significantly reduced numbers of cells in this cluster by betaPix deficiency.”

      (20) Line 239 - 240: While the sentence says "... revealed three major categories:", well, more than 3 are mentioned subsequently.

      To avoid possible confusion in the text, we have now removed the sub-category examples and presented the data as: “three major categories: epigenetic remodeling, microtubule organizations and neurotransmitter secretion/transportation (Figure 4D).” 

      (21) Line 252: Stathmins negatively regulate microtubule stability. Why are they referred to as "microtubule polymerization genes stathmins"?

      We are thankful to the reviewer for pointing out this error, and we have now corrected the text to “microtubule-destabilizing protein Stathmins”.

      (22) Line 262-265: The citation used to indicate concurrence with mouse data is disingenuous. That study did not show a reduction in stathmin levels upon betapix loss. Rather, it showed an increase in Ser16 phosphorylation on stathmin, which reduces stathmin's microtubule destabilising function. Please elaborate on the difference between the two studies.

      We completely agree with the reviewer’s statement that in the cited article, increased Ser16 phosphorylation on stathmin reduces its microtubule destabilising function. While that study did not show a reduction in Stathmin levels, others have shown that transcriptionally downregulated Stathmins are associated with the impaired neuronal and glial development. We have now revised the Discussion section by adding a new paragraph to address the disrupted homeostasis of Stathmins in these previous studies and their possible association with our data. We hope that these changes we made can clarify this issue. 

      (23) Line 310: While ZFHX3 levels are reduced in betapix mutants and KD in glioblastomas, were ZFHX3 and 4 up- or downregulated in the scRNA-Seq data?

      Thanks for this critical point. Indeed, our results showed that ZFHX3 and 4 were down-regulated in the glial progenitor cluster in the scRNA-Seq data (Figure S8A) in betaPix knockouts and in the FACS-sorted glial cells (Figure S8B).

      (24) Line 317: "... betaPix acts upstream to Zfhx3/4-VEGFA signaling in regulating angiogenesis ...". While this is established later, the data at the time of this sentence does not warrant this claim.

      We agree with the reviewer’s statement and restated this sentence in the following way: “Zfhx3/4 might act as downstream effector of betaPix.”

      Reviewer #2 (Recommendations for the authors):

      (1) The images shown in 2E/H, 3B, 6F/J can use a schematic that helps readers to understand what to expect or look for. Splitting up the channels may also help in visualizing the vasculature clearly.

      We thank the reviewer for these suggestions. In this revised version, we have included schematic diagrams in the figures and incorporated more detailed statements in the legends.

      (2) Many times, arrows are pointing to structures (2E/H, 3B), but are not explained clearly (neither in the text nor in the legends). In 3B, the arrow is pointing to a negative space.

      (3) Legends are minimalistic and do not provide much information. The reader is left to interpret the data on their own.

      We apologize for not explaining the figures in enough detail. In this revised version, we have incorporated more detailed statements in the figure legends and have adjusted the arrows in all figures.

      (4) The text needs heavy proofreading. For example:

      (a) Line 208- the title does not seem appropriate since the following text does not discuss Stathmins at all, which comes later.

      We agree with the reviewer’s statement and restated the title in the following way: “Single-cell transcriptome profiling reveals that gfap-positive progenitors were affected in betaPix knockouts.”

      (b) There is no mention of Figure 7 throughout the text.

      (c) Figure 7 does not include Zfhx or Vegfaa.

      We thank the reviewer for pointing out these errors. We have now revised Figure 7 and incorporated it into the corresponding paragraphs in the Discussion section.

      (5) The discussion seems incoherent in its current state.

      We have now revised the Discussion section according to the suggestions from both reviewers. We hope these revisions adequately address your concerns.

      (6) Please include some of the following points, if possible, in the discussion.

      (a) How is GEF activity of Rac/Cdc42 expected to be affected in beta-pix KO fishes?

      (b) What are the possible different ways the angiogenic pathways merge onto endothelial cells? Or do the authors imagine this process to be entirely driven by glial cells (directly)?

      We would like to thank the reviewer for his/her invaluable suggestions. We have now revised the Discussion section and hope that these changes can provide better and more balanced discussion. Since we have no data directly related to GEF activity of Rac/Cdc42 that might be affected in betaPix mutants, as well as have very limited data showing how glial betaPix regulates cerebral endothelial cells and BBB function, we would like to have the Discussion focused on the CRISPR-induced KI and cKO technologies, glial betaPix function and brain hemorrhage, and the putative role of betaPix-Zfhx3/4-VEGF function in central artery development. 

      References:

      Daub, H., Gevaert, K., Vandekerckhove, J., Sobel, A., and Hall, A. (2001). Rac/Cdc42 and p65PAK regulate the microtubule-destabilizing protein stathmin through phosphorylation at serine 16. J Biol Chem 276, 1677-1680. 10.1074/jbc.C000635200.

      Kim S, Park H, Kang J, Choi S, Sadra A, Huh SO. β-PIX-d, a Member of the ARHGEF7 Guanine Nucleotide Exchange Factor Family, Activates Rac1 and Induces Neuritogenesis in Primary Cortical Neurons. Exp Neurobiol. 2024;33(5):215-224. doi:10.5607/en24026

      Kwon Y, Jeon YW, Kwon M, Cho Y, Park D, Shin JE. βPix-d promotes tubulin acetylation and neurite outgrowth through a PAK/Stathmin1 signaling pathway [published correction appears in PLoS One. 2020 May 13;15(5):e0233327. doi: 10.1371/journal.pone.0233327.]. PLoS One. 2020;15(4):e0230814. Published 2020 Apr 6. doi:10.1371/journal.pone.0230814

      Kwon Y, Lee SJ, Shin YK, Choi JS, Park D, Shin JE. Loss of neuronal βPix isoforms impairs neuronal morphology in the hippocampus and causes behavioral defects. Anim Cells Syst (Seoul). 2025;29(1):57-71. Published 2025 Jan 8. doi:10.1080/19768354.2024.2448999

      Wittmann, T., Bokoch, G.M., and Waterman-Storer, C.M. (2004). Regulation of microtubule destabilizing activity of Op18/stathmin downstream of Rac1. J Biol Chem 279, 6196-6203. 10.1074/jbc.M307261200.

      Zeitz, M., and Kierfeld, J. (2014). Feedback mechanism for microtubule length regulation by stathmin gradients. Biophys J 107, 2860-2871. 10.1016/j.bpj.2014.10.056.

    1. Cardumem is proposed in the context of the Grafoscopio community, where we have experimented with digital metatools and the notion of interpersonal wikis as a way to collect and care for personal and community knowledge and memory. Because of the connections of members of the Grafoscopio community with other communities and academia, our practices and infrastructures have been tested in different contexts: linguistic revitalization for indigenous communities in the Colombian Amazonas, role-playing games, diagnosis of community learning needs in information and communication technologies, and examples (1, 2) of personal blikis (blogs + wikis), among others.

      How can Cardumem's approach, based on metatools and programming in Lua/YueScript, influence the way we manage and preserve knowledge within academic and social communities, especially in Global South contexts? I wonder how we could use this kind of metatool to strengthen the memory of indigenous, rural, or urban communities without imposing external structures on them. Perhaps the key is that Cardumem does not seek to impose a model, but rather to open up the possibility of building our own ways of recording and sharing knowledge, from the South, with our own voices and in our own digital languages.

    1. "surveys must incorporate variables that oblige respondents to give a differentiated opinion about male immigrants and female immigrants" p. 6

      This paragraph is on point because most of the time we take for granted things or circumstances that we should not. Everyday language contains many masculine words, so when we design a survey our wording ends up being constructed from a masculine standpoint. That is not right, however, since women are an important part of immigration. Women should therefore not be excluded but, on the contrary, included, because they are part of society and of the world's populations.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary: 

      The idea is appealing, but the authors have not sufficiently demonstrated the utility of this approach.

      Strengths: 

      Novelty of the approach, potential implications for discovering novel interactions

      Weaknesses:

      The Duong lab had introduced their highly elegant peptidisc approach several years ago. In this present work, they combine it with thermal proteome profiling (TPP) and attempt to demonstrate the utility of this combination for identifying novel membrane protein-ligand interactions.

      While I find this idea intriguing, and the approach potentially useful, I do not feel that the authors had sufficiently demonstrated the utility of this approach. My main concern is that no novel interactions are identified and validated. For the presentation of any new methodology, I think this is quite necessary. In addition, except for MsbA, no orthogonal methods are used to support the conclusions, and the authors rely entirely on quantifying rather small differences in abundances using either iBAQ or LFQ.

      We thank the reviewer for their thoughtful comments. In this revision, we have experimentally addressed the reviewer’s concerns in three ways:

      (1) To demonstrate the utility of our MM-TPP method over the detergent-based TPP workflow (termed DB-TPP), we performed a side-by-side comparison using ATP–VO₄ at 51 °C (Figure 3B and Figure 4A). In the DB-TPP dataset, 7.4% of all identified proteins were annotated as ATP-binding, while 6.4% of differentially stabilized proteins were annotated as ATP-binding. In contrast, in the MM-TPP dataset, 9.3% of all identified proteins were annotated as ATP-binding, while 17% of differentially stabilized proteins were annotated as ATP-binding. The lack of enrichment in the detergent-based approach indicates that the observed differences are likely stochastic, rather than a result of specific ATP–VO₄-mediated stabilization as found with MM-TPP. For instance, several key proteins (BCS1, P2RY6, SLC27A2, ABCB1, ABCC2, and ABCC9) found differentially stabilized using the MM-TPP method showed no such pattern in the DB-TPP dataset. This divergence strongly supports the specificity and utility of our Peptidisc approach.

      (2) To demonstrate that MM-TPP can resolve not only the broader effects of ATP–VO₄ but also specific ligand–protein interactions, we employed 2-methylthio-ADP (2-MeS-ADP), a selective agonist of the P2RY12 receptor [PMID: 24784220]. In that case, we observed clear thermal stabilization of P2RY12, with a more than 6-fold increase in stability at both 51 °C and 57 °C (–log₁₀ p > 5.97; Figure 4B and Figure S4). Notably, no other proteins, including the structurally related but non-responsive P2RY6 receptor, showed a comparable stabilization fold change at these temperatures.

      (3) To further probe the reproducibility of the method, we performed an independent MM-TPP evaluation with ATP–VO₄ at 51 °C using data-independent acquisition (DIA), in contrast to the data-dependent acquisition (DDA) approach used in the initial study (Figure S5). Overall, 7.8% of all identified proteins were annotated as ATP-binding, and as before, this proportion increased to 17% among proteins with log₂ fold changes greater than 0.5. Specifically, BCS1 and SLC27A2 exhibited strong stabilization (log₂ fold change > 1), while P2RY6, ABCB11, ABCC2, and ABCG2 showed moderate stabilization (log₂ fold changes between 0.5 and 1). Consistent with previous results, P2RX4 was destabilized, with a log₂ fold change below –1. These findings support the consistency and reproducibility of the method across distinct data acquisition methods.
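
      The enrichment argument in points (1) and (3) can be summarized as a ratio between the share of ATP-binding proteins among the stabilized hits and their share among all identified proteins. The sketch below is only an illustration built from the percentages quoted above; the function and variable names are hypothetical, and because the response reports percentages rather than protein counts, no significance test is attempted.

```python
# Illustrative only: fold enrichment of ATP-binding annotations among
# stabilized proteins, using the percentages quoted in the response.
# Names are hypothetical; underlying protein counts are not reported.

def fold_enrichment(hit_pct: float, background_pct: float) -> float:
    """Ratio of the ATP-binding share among hits to the share in the background."""
    if background_pct <= 0:
        raise ValueError("background percentage must be positive")
    return hit_pct / background_pct

# (share among stabilized hits, share among all identified proteins), in percent
datasets = {
    "DB-TPP": (6.4, 7.4),
    "MM-TPP (DDA)": (17.0, 9.3),
    "MM-TPP (DIA)": (17.0, 7.8),
}

enrichment = {
    name: fold_enrichment(hit, background)
    for name, (hit, background) in datasets.items()
}

for name, value in enrichment.items():
    print(f"{name}: {value:.2f}-fold")
```

      A ratio below 1 for the detergent workflow and roughly 1.8 to 2.2 for the two Peptidisc runs is what the response describes as a lack of enrichment versus specific stabilization.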

      My main concern is that no novel interactions are identified and validated. For the presentation of any new methodology, I think this is quite necessary.  

      The primary objective of our study is to establish and benchmark the MM-TPP workflow using known targets, rather than to discover novel ligand–protein interactions. Identifying new binders requires extensive screening and downstream validations, which we believe is beyond the scope of this methodological report. Instead, our study highlights the sensitivity and reliability of the MM-TPP approach by demonstrating consistent and reproducible results with well-characterized interactions.

      We respectfully disagree with the notion that introducing a new methodology must necessarily include the discovery of novel interactions. For instance, Martinez Molina et al. [PMID: 23828940] introduced the cellular thermal shift assay (CETSA) by validating established targets such as MetAP2 with TNP-470 and CDK2 with AZD-5438, without identifying novel protein–ligand pairs. Similarly, Kalxdorf et al. [PMID: 33398190] published their cell-surface thermal proteome profiling (CS-TPP) using Ouabain to stabilize the Na⁺/K⁺-ATPase pump in K562 cells, and SB431542 to stabilize its canonical target JAG1. In fact, when these methods revealed additional stabilizations, these were not validated but instead interpreted through reasoning grounded in the literature. For instance, they attributed the SB431542-induced stabilization of MCT1 to its reported role in cell migration and tumor invasiveness, and explained that SLC1A2 stabilization is related to the disruption of Na⁺/K⁺-ATPase activity by Ouabain. In the same way, our interpretation of ATP-VO₄–mediated stabilization of Mao-B is justified by predictive AlphaFold-3 rather than direct orthogonal assays, which are beyond the scope of our methodological presentation. 

      Collectively, the influential studies cited above have set methodological precedents by prioritizing validation and proof-of-concept over merely finding uncharacterized binders. In the same spirit, our work is centred on establishing MM-TPP as a robust platform for probing membrane protein–ligand interactions in a water-soluble format. The discovery of novel binders remains an exciting future direction—one that will build upon the methodological foundation laid by the present study.

      In addition, except for MsbA, no orthogonal methods are used to support the conclusions, and the authors rely entirely on quantifying rather small differences in abundances using either iBAQ or LFQ.

      We deliberately began this study with our model protein, MsbA, examined under both native and overexpressed conditions, to establish agreement between MM-TPP (Figure 2D) and biochemical stability assays (Figure 2A). This validation has provided us with the foundation to confidently extend MM-TPP to the mouse organ proteome. To demonstrate the validity of our workflow, we have used ATP-VO₄ because it has expected targets.

      We note that orthogonal validation often requires overproduction and purification of the candidate proteins, including suitable antibodies, which is a true challenge for membrane proteins. Here, we demonstrate that MM-TPP can detect ligand-induced thermal shifts directly in native membrane preparations, without requiring protein overproduction or purification. We also emphasize several influential studies in TPP, including Martinez Molina et al. (PMID: 23828940) and Fang et al. (PMID: 34188175), which focused primarily on establishing and benchmarking the methodology, rather than on extensive orthogonal validation. In the same spirit, our study prioritizes methodological development, and accordingly, several orthogonal validations are now included in this revision.

      [...] and the authors rely entirely on quantifying rather small differences in abundances using either iBAQ or LFQ.

      To clarify, all analyses of ligand-induced stabilization or destabilization were carried out using LFQ values. The sole exception is Figure 2B, where we used iBAQ values to depict the relative abundance of proteins within a single sample, in order to show MsbA's relative level within the E. coli peptidisc library.

      Respectfully, we disagree with the assertion that we are “quantifying rather small differences in abundances using either iBAQ or LFQ.” We were able to clearly distinguish between stabilizations driven by specific ligands binding to their targets versus those caused by non-specific ligands with broader activity. This is further confirmed by comparing 2-MeS-ADP, a selective ligand for P2RY12, with ATP-VO₄, a highly promiscuous ligand, and AMP-PNP, which exhibits intermediate breadth. When tested in triplicate at 51 °C, 2-MeS-ADP significantly altered the thermal stability of 27 proteins, AMP-PNP 44 proteins, and ATP-VO₄ 230 proteins, consistent with the expectation that broader ligands stabilize more proteins nonspecifically. Importantly, 2-MeS-ADP produced markedly stronger stabilization of its intended target, P2RY12 (–log₁₀p = 9.32), than the top stabilized proteins for ATP–VO₄ (DNAJB3, –log₁₀p = 5.87) or AMP-PNP (FTH1, –log₁₀p = 5.34). Moreover, 2-MeS-ADP did not significantly stabilize proteins that were consistently stabilized by the broad ligands, such as SLC27A2, which was strongly stabilized by both ATP-VO₄ and AMP-PNP (–log₁₀p > 2.5). Together, these findings demonstrate that MM-TPP can robustly distinguish between broad-spectrum and target-specific ligands, with selective ligands inducing stronger and more physiologically meaningful stabilization at their intended targets compared to promiscuous ligands.
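
      For readers less used to the –log₁₀p scale, the short sketch below converts two of the values quoted above back to p-values. It is an illustration only, with hypothetical function and variable names; it uses the P2RY12 and DNAJB3 figures as stated and makes no claim about how the authors computed them.

```python
# Illustrative only: convert the quoted -log10(p) values back to p-values.
# Function and variable names are hypothetical.

def p_from_neg_log10(neg_log10_p: float) -> float:
    """Invert the -log10 transform used on volcano-plot axes."""
    return 10.0 ** (-neg_log10_p)

quoted = {
    "P2RY12 (2-MeS-ADP)": 9.32,
    "DNAJB3 (ATP-VO4)": 5.87,
}

p_values = {name: p_from_neg_log10(value) for name, value in quoted.items()}

# A gap of 3.45 units on the -log10 scale is a ratio of 10**3.45 in p-value.
ratio = p_values["DNAJB3 (ATP-VO4)"] / p_values["P2RY12 (2-MeS-ADP)"]

for name, p in p_values.items():
    print(f"{name}: p = {p:.2e}")
print(f"ratio of p-values: {ratio:.0f}")
```

      On this scale each additional unit is a ten-fold smaller p-value, so the difference between the selective and promiscuous ligands is more than three orders of magnitude.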

      Finally, we emphasize that our findings are not marginal, but meet quantitative and statistical rigor consistent with best practices in proteomics. We apply dual thresholds combining effect size (|log₂FC| ≥ 1, i.e., at least a two-fold change) with statistical significance (FDR-adjusted p ≤ 0.05)—criteria commonly used in proteomics methodology studies (e.g., PMID: 24942700, 38724498). Moreover, the stabilization and destabilization events we report are reproducible across biological replicates (n = 3), consistent across adjacent temperatures for most targets, and technically robust across acquisition modes (DDA vs. DIA). Taken together, these results reflect statistically valid and biologically meaningful effects, fully aligned with standards set by prior published proteomics studies.
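
      The dual-threshold rule described above (|log₂FC| ≥ 1 together with an FDR-adjusted p ≤ 0.05) can be sketched as follows. This is a minimal illustration, assuming a Benjamini-Hochberg adjustment, which the response does not specify; the protein names and numbers are invented placeholders, not data from the study.

```python
# Minimal sketch of a dual-threshold hit call. Assumes Benjamini-Hochberg
# FDR adjustment (not stated in the response). All inputs are invented.
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return BH-adjusted p-values in the original input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def call_hits(
    results: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,
    fdr_cutoff: float = 0.05,
) -> List[str]:
    """Keep proteins passing both the effect-size and the FDR threshold."""
    names = list(results)
    q_values = benjamini_hochberg([results[name][1] for name in names])
    return [
        name
        for name, q in zip(names, q_values)
        if abs(results[name][0]) >= lfc_cutoff and q <= fdr_cutoff
    ]


# Hypothetical (log2 fold change, raw p-value) pairs for illustration.
example = {
    "protein_A": (2.7, 1e-6),    # large, significant stabilization
    "protein_B": (-1.3, 0.004),  # significant destabilization
    "protein_C": (0.4, 0.0005),  # significant but below the effect-size cutoff
    "protein_D": (1.8, 0.20),    # large but not significant
}

hits = call_hits(example)
print(hits)
```

      Requiring both conditions excludes small but statistically significant shifts (protein_C) as well as large but noisy ones (protein_D), which is the point of combining effect size with significance.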

      Furthermore, the reported changes in abundances are solely based on iBAQ or LFQ analysis. This must be supported by a more quantitative approach such as SILAC or labeled peptides. In summary, I think this story requires a stronger and broader demonstration of the ability of peptidisc-TPP to identify novel physiologically/pharmacologically relevant interactions.

      With respect to labeling strategies, we deliberately avoided using TMT due to concerns about both cost and potential data quality issues. Some recent studies have documented the drawbacks of TMT in contexts directly relevant to our work. For example, a benchmarking study of LiP-MS workflows showed that although TMT increased proteome depth and reduced technical variance, it was less accurate in identifying true drug–protein interactions and produced weaker dose–response correlations compared with label-free DIA approaches [PMID: 40089063]. More broadly, technical reviews have highlighted that isobaric tagging is intrinsically prone to ratio compression and reporter-ion interference due to co-isolation and co-fragmentation of peptides, which flatten measured fold-changes and obscure biologically meaningful differences [PMID: 22580419, 22036744]. SILAC, in turn, requires metabolic incorporation of heavy amino acids, which is feasible in cultured cells but not in physiologically relevant tissues such as the liver used here. SILAC mouse models exist, but they are expensive and time-consuming [PMID: 18662549, 21909926]. We are not a mouse lab, and introducing SILAC labeling of liver tissue into our workflow is beyond the scope of these revisions. We also note that several hallmark TPP studies have been successfully carried out using label-free quantification [PMID: 25278616, 26379230, 33398190, 23828940], establishing this as an accepted and widely applied approach in the field.

      To further support our conclusions, we added controls showing that detergent solubilization of mouse liver membranes followed by SP4 cleanup fails to detect ATP-VO₄-mediated stabilization of ATP-binding proteins, underscoring the necessity of Peptidisc reconstitution for capturing ligand-induced thermal stabilization. We also present new data demonstrating selective stabilization of the P2Y12 receptor by its agonist 2-MeS-ADP, providing orthogonal, receptor-specific validation within the MM-TPP framework. Finally, an orthogonal DIA acquisition on separate replicates confirmed robust ATP-vanadate stabilization of ATP-binding proteins, including BCS1l and SLC27A2. Together, these additions reinforce that the observed stabilizations are genuine, physiologically relevant ligand–protein interactions and highlight the unique advantage of the Peptidisc-based workflow in capturing such events.

      Cited Reference:

      24784220: Zhang J, Zhang K, Gao ZG, et al. Agonist-bound structure of the human P2Y₁₂ receptor. Nature.  2014;509(7498):119-122. doi:10.1038/nature13288. 

      23828940: Martinez Molina D, Jafari R, Ignatushchenko M, et al. Monitoring drug target engagement in cells and tissues using the cellular thermal shift assay. Science. 2013;341(6141):84-87. doi:10.1126/science.1233606.

      33398190: Kalxdorf M, Günthner I, Becher I, et al. Cell surface thermal proteome profiling tracks perturbations and drug targets on the plasma membrane. Nat Methods. 2021;18(1):84-91. doi:10.1038/s41592-020-01022-1.

      34188175: Fang S, Kirk PDW, Bantscheff M, Lilley KS, Crook OM. A Bayesian semi-parametric model for thermal proteome profiling. Commun Biol. 2021;4(1):810. doi:10.1038/s42003-021-02306-8.

      24942700: Cox J, Hein MY, Luber CA, Paron I, Nagaraj N, Mann M. Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol Cell Proteomics. 2014;13(9):2513-2526. doi:10.1074/mcp.M113.031591.

      38724498: Peng H, Wang H, Kong W, Li J, Goh WWB. Optimizing differential expression analysis for proteomics data via high-performing rules and ensemble inference. Nat Commun. 2024;15(1):3922. doi:10.1038/s41467-024-47899-w.

      40089063: Koudelka T, Bassot C, Piazza I. Benchmarking of quantitative proteomics workflows for limited proteolysis mass spectrometry. Mol Cell Proteomics. 2025;24(4):100945. doi:10.1016/j.mcpro.2025.100945.

      22580419: Christoforou AL, Lilley KS. Isobaric tagging approaches in quantitative proteomics: the ups and downs. Anal Bioanal Chem. 2012;404(4):1029-1037. doi:10.1007/s00216-012-6012-9. 

      22036744: Christoforou AL, Lilley KS. Isobaric tagging approaches in quantitative proteomics: the ups and downs. Anal Bioanal Chem. 2012;404(4):1029-1037. doi:10.1007/s00216-012-6012-9. 

      18662549: Krüger M, Moser M, Ussar S, et al. SILAC mouse for quantitative proteomics uncovers kindlin-3 as an essential factor for red blood cell function. Cell. 2008;134(2):353-364. doi:10.1016/j.cell.2008.05.033.

      21909926: Zanivan S, Krueger M, Mann M. In vivo quantitative proteomics: the SILAC mouse. Methods Mol Biol. 2012;757:435-450. doi:10.1007/978-1-61779-166-6_25. 

      25278616: Kalxdorf M, Becher I, Savitski MM, et al. Temperature-dependent cellular protein stability enables high-precision proteomics profiling. Nat Methods. 2015;12(12):1147-1150. doi:10.1038/nmeth.3651.

      26379230: Savitski MM, Reinhard FBM, Franken H, et al. Tracking cancer drugs in living cells by thermal profiling of the proteome. Science. 2015;346(6205):1255784. doi:10.1126/science.1255784. 

      33452728: Leuenberger P, Ganscha S, Kahraman A, et al. Cell-wide analysis of protein thermal unfolding reveals determinants of thermostability. Science. 2020;355(6327):eaai7825. doi:10.1126/science.aai7825. 

      23066101: Savitski MM, Zinn N, Faelth-Savitski M, et al. Quantitative thermal proteome profiling reveals ligand interactions and thermal stability changes in cells. Nat Methods. 2013;10(12):1094-1096. doi:10.1038/nmeth.2766.  

      30858367: Piazza I, Kochanowski K, Cappelletti V, et al. A machine learning-based chemoproteomic approach to identify drug targets and binding sites in complex proteomes. Nat Commun. 2019;10(1):1216. doi:10.1038/s41467-019-09199-0.

      Reviewer #2 (Public Review):

      Summary:

      The membrane mimetic thermal proteome profiling (MM-TPP) presented by Jandu et al. seems to be a useful way to minimize the interference of detergents in efficient mass spectrometry analysis of membrane proteins. Thermal proteome profiling is a mass spectrometric method that measures binding of a drug to different proteins in a cell lysate by monitoring thermal stabilization of the proteins because of the interaction with the ligands that are being studied. This method has been underexplored for membrane proteome because of the inefficient mass spectrometric detection of membrane proteins and because of the interference from detergents that are used often for membrane protein solubilization.

      Strengths:

      In this report the binding of ligands to membrane protein targets has been monitored in crude membrane lysates or tissue homogenates exalting the efficacy of the method to detect both intended and off-target binding events in a complex physiologically relevant sample setting.

      The manuscript is lucidly written and the data presented seems clear. The only insignificant grammatical error I found was that the 'P' in the word peptidisc is not capitalized in the beginning of the methods section "MM-TPP profiling on membrane proteomes". The clear writing made it easy to understand and evaluate what has been presented. Kudos to the authors.

      Weaknesses:

      While this is a solid report and a promising tool for analyzing membrane protein drug interactions, addressing some of the minor caveats listed below could make it much more impactful.

      The authors claim that MM-TPP is done by "completely circumventing structural perturbations invoked by detergents[1] ". This may not be entirely accurate, because before reconstitution of the membrane proteins in peptidisc, the membrane fractions are solubilized by 1% DDM. The solubilization and following centrifugation step lasts at least for 45 min. It is less likely that all the structural perturbations caused by DDM to various membrane proteins and their transient interactions become completely reversed or rescued by peptidisc reconstitution.

      We thank the reviewer for this insightful comment. In response, we have revised the sentence and expanded the discussion to clarify that the Peptidisc provides a complementary approach to detergent-based preparations for studying membrane proteins, preserving native lipid–protein interactions and stabilization effects that may be diminished in detergent.

      To further address the structural perturbations invoked by detergents, and as already detailed in our response to Reviewer 1, we have compared the thermal profile of the Peptidisc library to that of mouse liver membranes solubilized with 1% DDM, after incubation with ATP–VO₄ at 51 °C (Figure 4A). The results with the detergent extract revealed random patterns of stabilization and destabilization, with only 6.4% of differentially stabilized proteins being ATP-binding, comparable to the 7.4% observed in the background. In contrast, in the Peptidisc library, 17% of differentially stabilized proteins were ATP-binding, compared to 9.3% in the background. Thus, while Peptidisc reconstitution does not fully avoid initial detergent exposure, these findings underscore the importance of implementing Peptidisc in the TPP workflow when dealing with membrane proteins.

      In the introduction, the authors make statements such as "..it is widely acknowledged that even mild detergents can disrupt protein structures and activities, leading to challenges in accurately identifying drug targets.." and "[peptidisc] libraries are instrumental in capturing and stabilizing IMPs in their functional states while preserving their interactomes and lipid allosteric modulators...'. These need to be rephrased, as it has been shown by countless studies that even with membrane protein suspended in micelles robust ligand binding assays and binding kinetics have been performed leading to physiologically relevant conclusions and identification of protein-protein and protein-ligand interactions.

      We thank the reviewer for this valuable feedback and fully agree with the point raised. In response, we have revised the Introduction and conclusion to moderate the language concerning the limitations of detergent use. We now explicitly acknowledge that numerous studies have successfully used detergent micelles for ligand-binding assays and kinetic analyses, yielding physiologically relevant insights into both protein–protein and protein–ligand interactions [e.g., PMID: 22004748, 26440106, 31776188].

      At the same time, we clarify that the Peptidisc method offers a complementary advantage, particularly in the context of thermal proteome profiling (TPP), which involves mass spectrometry workflows that are incompatible with detergents. In this setting, Peptidiscs facilitate the detection of ligand-binding events that may be more difficult to observe in detergent micelles.

      We have reframed our discussion accordingly to present Peptidiscs not as a replacement for detergent-based methods, but rather as a complementary tool that broadens the available methodological landscape for studying membrane protein interactions.

      If the method involves detergent solubilization, for example using 1% DDM, it is a bit disingenuous to argue that 'interactomes and lipid allosteric modulators' characterized by low-affinity interactions will remain intact or can be rescued upon detergent removal. Authors should discuss this or at least highlight the primary caveat of the peptidisc method of membrane protein reconstitution - which is that it begins with detergent solubilization of the proteome and does not completely circumvent structural perturbations invoked by detergents.

      We would like to clarify that, in our current workflow, ligand incubation occurs after reconstitution into Peptidiscs. As such, the method is designed to circumvent the negative effects of detergent during the critical steps involving low-affinity interactions.

      That said, we fully acknowledge that Peptidisc reconstitution begins with detergent solubilization (e.g., 1% DDM), and we have revised the conclusion to explicitly state this important caveat. As the reviewer correctly points out, this initial step may introduce some structural perturbations or result in the loss of weakly associated lipid modulators.

      However, reconstitution into Peptidiscs rapidly restores a detergent-free environment for membrane proteins, which has been shown in our previous studies [PMID: 38577106, 38232390, 31736482, 31364989] to mitigate these effects. Specifically, we have demonstrated that time-limited DDM exposure, followed by Peptidisc reconstitution, minimizes membrane protein delipidation, enhances thermal stability, retains functionality, and preserves multi-protein assemblies.

      It would also be important to test detergents that are even milder than 1% DDM and ones which are harsher than 1% DDM to show that this method of reconstitution can indeed rescue the perturbations to the structure and interactions of the membrane protein done by detergents during solubilization step. 

      We selected 1% DDM based on our previous work [PMID: 37295717, 39313981, 38232390], where it consistently enabled robust and reproducible solubilization for Peptidisc reconstitution. We agree that comparing milder detergents (e.g., LMNG) and harsher ones (e.g., SDC) would provide valuable insights into how detergent strength influences structural perturbations, and how effectively these can be mitigated by Peptidisc reconstitution. Preliminary data (not shown) from mouse liver membranes indicate broadly similar proteomic profiles following solubilization with DDM, LMNG, and SDC, although potential differences in functional activity or ligand binding remain to be investigated.

      Based on the methods provided, it appears that the final amount of detergent in peptidisc membrane protein library was 0.008%, which is ~150 uM. The CMC of DDM depending on the amount of NaCl could be between 120-170 uM.

      While we cannot entirely rule out the presence of residual DDM (0.008%) in the raw library, its free concentration may be lower than initially estimated. This is related to the formation of mixed micelles with the amphipathic peptide scaffold, which is supplied in excess during reconstitution. These mixed micelles are subsequently removed during the ultrafiltration step. Furthermore, in related work using His-tagged Peptidiscs [PMID: 32364744], we purified the library by nickel-affinity chromatography following a 5× dilution into a detergent-free buffer. Although this purification step reduced the number of soluble proteins, the same membrane proteins were retained, suggesting that any residual detergent does not significantly interfere with Peptidisc reconstitution. Supporting this, our MM-TPP assays on purified libraries (data not shown) consistently demonstrated stabilization of ATP-binding proteins (e.g., SLC27A2, DNAJB3), indicating that the observed ligand–protein interactions result from successful incorporation into Peptidiscs.
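
      For reference, the reviewer's conversion of 0.008% DDM to a molar concentration can be reproduced as below. This is a minimal sketch that assumes the percentage is weight per volume (g per 100 mL) and uses the molar mass of n-dodecyl-β-D-maltoside (C24H46O11, about 510.62 g/mol); the function name is hypothetical.

```python
DDM_MOLAR_MASS_G_PER_MOL = 510.62  # n-dodecyl-beta-D-maltoside, C24H46O11

def percent_wv_to_micromolar(percent_wv: float, molar_mass: float) -> float:
    """Convert a % (w/v) concentration, i.e. g per 100 mL, to micromolar."""
    grams_per_litre = percent_wv * 10.0
    return grams_per_litre / molar_mass * 1e6

# Assumption: the 0.008% figure is w/v.
residual_ddm_um = percent_wv_to_micromolar(0.008, DDM_MOLAR_MASS_G_PER_MOL)
print(f"0.008% (w/v) DDM = {residual_ddm_um:.0f} uM")
```

      Under that assumption the total residual DDM is about 157 µM, in line with the reviewer's estimate of ~150 µM and within the quoted CMC range. This is the total concentration; the free concentration discussed above would be lower if part of the detergent is sequestered in mixed micelles.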

      Perhaps, to completely circumvent the perturbations from detergents other methods of detergent-free solubilization such as using SMA polymers and SMALP reconstitution could be explored for a comparison. Moreover, a comparison of the peptidisc reconstitution with detergent-free extraction strategies, such as SMA copolymers, could lend more strength to the presented method.

      We agree that detergent-free methods such as SMA polymers hold promise for membrane protein solubilization. However, in preliminary single-replicate experiments using SMA2000 at 51 °C in the presence of ATP–VO₄ (data not shown), we observed broad, non-specific stabilization effects. Of the 2,287 quantified proteins, 9.3% were annotated as ATP-binding, yet 9.9% of the 101 proteins showing a log₂ fold change >1 or <−1 were ATP-binding, indicating no meaningful enrichment. Given this lack of specificity and the limited dataset, we chose not to pursue further SMA experiments and have not included them here. However, in a recent study (https://doi.org/10.1101/2025.08.25.672181), we directly compared Peptidisc, SMA, and nanodiscs for liver membrane proteome profiling. In that work, Peptidisc outperformed both SMA and nanodiscs in detecting membrane protein dysregulation between healthy and diseased liver. By extension, we expect Peptidisc to offer superior sensitivity and specificity for detecting ligand-induced stabilization events, such as those observed here with ATP–vanadate.
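
      One way to put a number on "no meaningful enrichment" is a hypergeometric upper-tail probability. The sketch below is an illustration, not the analysis used in this work: the test is a technique swapped in for the example, and the whole-protein counts `K` and `k` are assumptions obtained by rounding the percentages quoted above.

```python
from math import comb

def hypergeom_upper_tail(N: int, K: int, n: int, k: int) -> float:
    """P(X >= k) when drawing n proteins without replacement from N, K of which are annotated."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

N = 2287              # quantified proteins (from the text)
n = 101               # proteins with log2 fold change > 1 or < -1 (from the text)
K = round(0.093 * N)  # assumption: 9.3% of N, rounded to a whole protein count
k = round(0.099 * n)  # assumption: 9.9% of n, rounded to a whole protein count

p = hypergeom_upper_tail(N, K, n, k)
print(f"K={K}, k={k}, P(X >= k) = {p:.2f}")
```

      With these rounded counts the tail probability is well above 0.05, which is consistent with the statement that ATP-binding proteins were not enriched among the SMA hits.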

      Cross-verification of the identified interactions, and subsequent stabilization or destabilizations, should be demonstrated by other in vitro methods of thermal stability and ligand binding analysis using purified protein to support the efficacy of the MM-TPP method. An example cross-verification using SDS-PAGE, of the well-studied MsbA, is shown in Figure 2. In a similar fashion, other discussed targets such as, BCS1L, P2RX4, DgkA, Mao-B, and some un-annotated IMPs shown in supplementary figure 3 that display substantial stabilization or destabilization should be cross-verified.

      We appreciate this suggestion and note that a similar point was raised in R1’s comment “In addition, except for MsbA, no orthogonal methods are used to support the conclusions, and the authors rely entirely on quantifying rather small differences in abundances using either iBAQ or LFQ.” We have developed a detailed response to R1 on this matter, which equally applies here. 

      Cited Reference:

      35616533: Young JW, Wason IS, Zhao Z, et al. Development of a Method Combining Peptidiscs and Proteomics to Identify, Stabilize, and Purify a Detergent-Sensitive Membrane Protein Assembly. J Proteome Res. 2022;21(7):1748-1758. doi:10.1021/acs.jproteome.2c00129. PMID: 35616533.

      31364989: Carlson ML, Stacey RG, Young JW, et al. Profiling the Escherichia coli membrane protein interactome captured in Peptidisc libraries. Elife. 2019;8:e46615. doi:10.7554/eLife.46615. 

      22004748: O'Malley MA, Helgeson ME, Wagner NJ, Robinson AS. Toward rational design of protein detergent complexes: determinants of mixed micelles that are critical for the in vitro stabilization of a G-protein coupled receptor. Biophys J. 2011;101(8):1938-1948. doi:10.1016/j.bpj.2011.09.018.

      26440106: Allison TM, Reading E, Liko I, Baldwin AJ, Laganowsky A, Robinson CV. Quantifying the stabilizing effects of protein-ligand interactions in the gas phase. Nat Commun. 2015;6:8551. doi:10.1038/ncomms9551.

      31776188: Beckner RL, Zoubak L, Hines KG, Gawrisch K, Yeliseev AA. Probing thermostability of detergent-solubilized CB2 receptor by parallel G protein-activation and ligand-binding assays. J Biol Chem. 2020;295(1):181-190. doi:10.1074/jbc.RA119.010696.

      38577106: Jandu RS, Yu H, Zhao Z, Le HT, Kim S, Huan T, Duong van Hoa F. Capture of endogenous lipids in peptidiscs and effect on protein stability and activity. iScience. 2024;27(4):109382. doi:10.1016/j.isci.2024.109382.

      38232390: Antony F, Brough Z, Zhao Z, Duong van Hoa F. Capture of the Mouse Organ Membrane Proteome Specificity in Peptidisc Libraries. J Proteome Res. 2024;23(2):857-867. doi:10.1021/acs.jproteome.3c00825.

      31736482: Saville JW, Troman LA, Duong Van Hoa F. PeptiQuick, a one-step incorporation of membrane proteins into biotinylated peptidiscs for streamlined protein binding assays. J Vis Exp. 2019;(153). doi:10.3791/60661. 

      37295717: Zhao Z, Khurana A, Antony F, et al. A Peptidisc-Based Survey of the Plasma Membrane Proteome of a Mammalian Cell. Mol Cell Proteomics. 2023;22(8):100588. doi:10.1016/j.mcpro.2023.100588. 

      39313981: Antony F, Brough Z, Orangi M, Al-Seragi M, Aoki H, Babu M, Duong van Hoa F. Sensitive Profiling of Mouse Liver Membrane Proteome Dysregulation Following a High-Fat and Alcohol Diet Treatment. Proteomics. 2024;24(23-24):e202300599. doi:10.1002/pmic.202300599. 

      32364744: Young JW, Wason IS, Zhao Z, Rattray DG, Foster LJ, Duong Van Hoa F. His-Tagged Peptidiscs Enable Affinity Purification of the Membrane Proteome for Downstream Mass Spectrometry Analysis. J Proteome Res. 2020;19(7):2553-2562. doi:10.1021/acs.jproteome.0c00022.

      32591519: The M, Käll L. Focus on the spectra that matter by clustering of quantification data in shotgun proteomics. Nat Commun. 2020;11(1):3234. doi:10.1038/s41467-020-17037-3. 

      33188197: Kurzawa N, Becher I, Sridharan S, et al. A computational method for detection of ligand-binding proteins from dose range thermal proteome profiles. Nat Commun. 2020;11(1):5783. doi:10.1038/s41467-020-19529-8. 

      26524241: Reinhard FBM, Eberhard D, Werner T, et al. Thermal proteome profiling monitors ligand interactions with cellular membrane proteins. Nat Methods. 2015;12(12):1129-1131. doi:10.1038/nmeth.3652. 

      23828940: Martinez Molina D, Jafari R, Ignatushchenko M, et al. Monitoring drug target engagement in cells and tissues using the cellular thermal shift assay. Science. 2013;341(6141):84-87. doi:10.1126/science.1233606. 

      32133759: Mateus A, Kurzawa N, Becher I, et al. Thermal proteome profiling for interrogating protein interactions. Mol Syst Biol. 2020;16(3):e9232. doi:10.15252/msb.20199232. 

      14755328: Dorsam RT, Kunapuli SP. Central role of the P2Y12 receptor in platelet activation. J Clin Invest. 2004;113(3):340-345. doi:10.1172/JCI20986. 

      Reviewer #1 (Recommendations for the authors):

      “The authors use iBAQ or LFQ to compare across samples. This inconsistency is puzzling. As far as I know, LFQ should always be used when comparing across samples”

      As mentioned above, we use iBAQ only in Fig. 2B to illustrate within-sample relative abundance; all comparative analyses elsewhere use LFQ. We have updated the Fig. 2B legend to state this explicitly.

      We used iBAQ in Fig. 2B because it provides a measure of protein abundance within a sample, normalizing the summed peptide intensities by the number of theoretically observable peptides. This normalization facilitates comparisons between proteins within the same sample, offering a clearer understanding of their relative molar proportions [PMID: 33452728]. LFQ, by contrast, is optimized for comparing the same protein across different samples. It achieves this by performing delayed normalization to reduce run-to-run variability and by applying maximal peptide ratio extraction, which integrates pairwise peptide intensity ratios across all samples to build a consistent protein-level quantification matrix [PMID: 24942700]. These features make LFQ more robust to missing values and technical variation, thereby enabling accurate detection of relative abundance changes in the same protein under different experimental conditions. This distinction is well supported by the proteomics literature: Smits et al. [PMID: 23066101] used iBAQ specifically to determine the relative abundance of proteins within one sample, whereas LFQ was applied for comparative analyses between conditions.
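
      The iBAQ normalization described above (summed peptide intensity divided by the number of theoretically observable peptides) can be sketched as follows. The intensities and peptide counts are hypothetical values chosen for illustration, and the function is not taken from any quantification software.

```python
def ibaq(peptide_intensities: list[float], n_theoretical_peptides: int) -> float:
    """Summed peptide intensity divided by the number of theoretically observable peptides."""
    if n_theoretical_peptides <= 0:
        raise ValueError("need at least one theoretically observable peptide")
    return sum(peptide_intensities) / n_theoretical_peptides

# Hypothetical example: two proteins with the same summed signal but different sizes.
small_protein = ibaq([4.0e6, 6.0e6], n_theoretical_peptides=5)
large_protein = ibaq([1.0e6, 2.0e6, 3.0e6, 4.0e6], n_theoretical_peptides=20)

print(f"small: {small_protein:.1e}, large: {large_protein:.1e}")
```

      Both hypothetical proteins have the same summed intensity, but the smaller one receives the higher iBAQ value because fewer peptides were available to generate that signal. This is why iBAQ suits within-sample comparisons between proteins, whereas LFQ is the appropriate choice for following one protein across conditions.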

      “[Regarding Figure 2A] Why does the control also contain ATP-vanadate? Also, I am not aware of a commercially available chemical "ATP-VO4". I assume this is a mistake”

      The control condition in Figure 2A was mislabeled, and the figure has been corrected to remove this discrepancy. In our experiments, ATP and orthovanadate (VO₄) were added together, and for simplicity this was annotated as “ATP-VO₄.” 

      “[Regarding Figure 2B] What is the fold change in MsbA iBAQ values? It seems that the differences are quite small, and as such require a more quantitative approach than iBAQ (e.g SILAC or some other internal standard). In addition, what information does this panel add relative to 2C”

      The figure has been updated to clarify that the values shown are log₂-transformed iBAQ intensities. Figures 2B and 2C are complementary: Figure 2B shows that in the control sample, MsbA’s peptide abundance decreases with increasing temperature (51, 56, and 61 °C) relative to the remaining bulk proteins. Figure 2C shows the specific thermal profiles of MsbA in control and ATP–vanadate conditions. To make this clearer, we have added a sentence to the Results section explaining the specific role of Figure 2B.

      Together, these panels indicate that the method can identify ligand-induced stabilization even for proteins whose abundance decreases faster than the bulk during the TPP assay. We have provided the rationale for not using SILAC or TMT labeling in our public response.

      “[Regarding Figure 2C] Although not mentioned in the legend, I assume this is iBAQ quantification, which as mentioned above isn't accurate enough for such small differences. In addition, I find this data confusing: why is MsbA more stable at the lower temperatures in the absence of ATP-vanadate? The smoothed-line representation is misleading, certainly given the low number of data points”

      The data presented represent LFQ values for MsbA, and we have updated the figure legend to clearly indicate this. Additionally, as suggested, we have removed the smoothing line to more accurately reflect the data. Regarding the reviewer’s concern about stability at lower temperatures, we note that MsbA exhibits comparable abundance at 38 °C and 46 °C under both conditions, with overlapping error bars. We therefore interpret these data as indicating no significant difference in stability at the lower temperatures, with ligand-dependent stabilization becoming apparent only at elevated temperatures. We do not exclude the possibility that MsbA stability at these temperatures is affected by the conformational dynamics of this ABC transporter upon ATP binding and hydrolysis.

      “[Regarding Figure 3A] is this raw LFQ data? Why did the authors suddenly change from iBAQ to LFQ? I find this inconsistency puzzling”

      To clarify, all analyses of protein stabilization or destabilization presented in the manuscript are based on LFQ values. The only instance where iBAQ was used is Figure 2B, where it served to illustrate the relative peptide abundance of MsbA within the same sample. We have revised the figure legends and text to make this distinction explicit and ensure consistency in presentation.

      “[Regarding Figure 3B] The non-specific ATP-dependent stabilization increases the likelihood of false positive hits. This limitation is not mentioned by the authors. I think it is important to show other small molecules, in addition to ATP. The authors suggest that their approach is highly relevant for drug screening. Therefore, a good choice is to test an effect of a known stabilizing drug (eg VX-809 and CFTR)”

      We thank the reviewer for this suggestion. As noted in the manuscript (results and discussion sections), ATP is a natural hydrotrope and is therefore expected to induce broad, non-specific stabilization effects, a phenomenon also observed in previous proteome-wide studies, which demonstrated ATP’s widespread influence on cytosolic protein solubility and thermal stability (PMID: 30858367). To demonstrate that MM-TPP can resolve specific ligand–protein interactions beyond these global ATP effects, we tested 2-methylthio-ADP (2-MeS-ADP), a selective agonist of P2RY12 (PMID: 14755328). In these experiments, we observed robust and reproducible stabilization of P2RY12 at both 51 °C and 57 °C, with no consistent stabilization of unrelated proteins across temperatures. This provides direct evidence that our workflow can distinguish specific from non-specific ligand-induced effects. We selected 2-MeS-ADP because of its structural stability and its higher receptor affinity relative to ADP, allowing us to extend our existing workflow while testing a receptor-specific interaction. We agree that extending this approach to clinically relevant small-molecule drugs, such as VX-809 with CFTR, would further underscore the pharmacological potential of MM-TPP, and we have now noted this as an important avenue for future studies.

      “X axis of Figure 3B: Log 2 fold difference of what? iBAQ? LFQ? Similar ambiguity regarding the Y axis of 3E. What peptide? And why the constant changes in estimating abundances?”

      We thank the reviewer for pointing out these inaccuracies in the figure annotations. As mentioned above, all analyses (except Figure 2B) are based on LFQ values. We have revised the figure legends and text to make this clear.

      In Figure 3E, “peptide intensity” refers to log₂ LFQ peptide intensities derived from the BCS1L protein, as indicated in the figure caption. 

      “The authors suggest that P2RY6 and P2RY12 are stabilized by ADP, the hydrolysis product of ATP. Currently, the support for this suggestion is highly indirect. To support this claim, the authors need to directly show the effect of ADP. In reference to the alpha fold results shown in Figure 4D, the authors state that "Collectively, these data highlight the ability of MM-TPP to detect the side effects of parent compounds, an important consideration for drug development". To support this claim, it is necessary to show that Mao-B is indeed best stabilized with ADP or AMP, rather than ATP.”

      In this revision, we chose not to test ADP directly, as it is a broadly binding, relatively weak ligand that would likely stabilize many proteins without revealing clear target-specific effects. Since we had already evaluated ATP-VO₄, a similarly broad, non-specific ligand, additional testing with ADP would provide limited additional insight. Instead, we prioritized 2-methylthio-ADP, a selective agonist of P2RY12, to more effectively demonstrate the specificity of MM-TPP. With this ligand, we observed clear and reproducible stabilization of P2RY12, underscoring the ability of MM-TPP to resolve receptor–ligand interactions beyond ATP’s broad hydrotropic effects. Importantly, and as expected, we did not observe stabilization of the related purinergic receptor P2RY6, further supporting the specificity of the observed effect.

      We have also revised the AlphaFold-related statement in Figure 4D to adopt a more cautious tone: “Collectively, these data suggest that MM-TPP may detect potential side effects of parent compounds, an important consideration for drug development.” In this context, we use AlphaFold not as a validation tool, but rather as a structural aid to help rationalize why certain off-target proteins (e.g., ATP with Mao-B) exhibit stabilization.

      Reviewer #2 (Recommendations for the authors):

      “In the main text, it will be useful to include the unique peptides table of at least the targets discussed in the manuscript. For example, in presence of AMP-PNP at 51 °C P2RY6 shows 4-6 peptides in all n=3 positive & negative ionization modes. But, for P2RY12 only 1-3 peptides were observed. Depending on the sequence length and the relative abundance in the cell of a protein of interest, the number of peptides observed could vary a lot per protein. Given the unique peptide abundance reported in the supplementary file, for various proteins in different conditions, it appears the threshold of observation of two unique peptides for a protein to be analyzed seems less stringent.”

      By applying a filter requiring at least two unique peptides in at least one replicate, we exclude, on average, 15–20% of the total identified proteins. We consider this a reasonable level of stringency that balances confidence in protein identification with the retention of relevant data. This threshold was selected because it aligns with established LC-MS/MS data analysis practices (PMID: 32591519, 33188197, 26524241), and we have included these references in the Methods section to justify our approach. We have included in this revision a Supplemental Table 2 showing the unique peptide counts for proteins highlighted in this study.  
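
      The filtering rule described above (at least two unique peptides in at least one replicate) can be expressed as a short predicate. This is a minimal sketch with hypothetical protein labels and peptide counts; it is not the code used to process the dataset.

```python
def passes_peptide_filter(unique_peptides_per_replicate: list[int], min_peptides: int = 2) -> bool:
    """True if at least one replicate has min_peptides or more unique peptides."""
    return any(count >= min_peptides for count in unique_peptides_per_replicate)

# Hypothetical unique-peptide counts across three replicates.
counts = {
    "protein_A": [4, 6, 5],
    "protein_B": [1, 3, 0],
    "protein_C": [1, 1, 0],
}
kept = [name for name, reps in counts.items() if passes_peptide_filter(reps)]
print(kept)
```

      In this toy example `protein_C` is excluded because no replicate reaches two unique peptides, while `protein_B` is retained on the strength of a single replicate. Raising `min_peptides` or requiring the threshold in every replicate would make the filter more stringent at the cost of discarding low-abundance proteins.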

      “It appears that the time of heat treatment for peptidisc library subjected to MM-TPP profiling was chosen as 3 min based on the results presented in Supplementary Figure 1A, especially the loss of MsbA observed in 1% DDM after 3 min heat perturbation. However, when reconstituted in peptidisc there seems to be no loss in MsbA even after 12 mins at 45 °C. So, perhaps a longer heat treatment would be a more efficient perturbation.”

      Previous studies indicate that heat exposure of 3–5 minutes is optimal for visualizing protein denaturation (PMID: 23828940, 32133759). We have added a statement to the Results section to justify our choice of heat exposure. Although MsbA remains stable at 45 °C for extended periods, higher temperatures allow for more effective perturbation to reveal destabilization. Supplementary Figure 1A specifically illustrates MsbA instability in detergent environments.

      “Some of the stabilized temperatures listed in Table 1 are a bit confusing. For example, ABCC3 and ABCG2. In the case of ABCC3 stabilization was observed at 51 °C and 60 °C, but 56 °C is not mentioned. In the same way, 51 °C is not mentioned for ABCG2. You would expect protein to be stabilized at 56 °C if it is stabilized at both 51 °C and 60 °C. So, it is unclear if the stabilizations were not monitored for these proteins at the missing temperatures in the table or if no peptides could be recorded at these temperatures as in the case of P2RX4 at 60 °C in Figure 4C.”

      Both scenarios are represented in our data. For some proteins, like ABCG2, sufficient peptide coverage was achieved, but no stabilization was observed at intermediate temperatures (e.g., 56 °C), likely because the perturbation was not strong enough to reveal an effect. In other cases, such as ABCC3 at 56 °C or P2RX4 at 60 °C, the proteins were not detected due to insufficient peptide identifications at those temperatures, which explains their omission from the table. 

      “In Figure 4C, it is perplexing to note that despite n = 3 there were no peptide fragments detected for P2RX4 at 60 °C in presence of ATP-VO4, but they were detected in presence of AMP-PNP. It will be useful to learn the authors' explanation for this, especially because both of these ligands destabilize P2RX4. In Figure 4B, it would have been great to see the effect of ADP too, to corroborate the theory that ATP metabolites could impact the thermal stability.”

      In Figure 4C, the absence of P2RX4 peptide detection at 60 °C with ATP–VO₄ mirrors variability observed in the corresponding control (n = 6). Specifically, neither the control nor ATP–VO₄ produced unique peptides for P2RX4 at 60 °C in that replicate, whereas peptides were detected at 60 °C in other replicates for both the control and AMP-PNP, and at 64 °C for ATP–VO₄, the controls, and AMP-PNP. Such missing values are a natural feature of MS-based proteomics and can arise from multiple technical factors, including inconsistent heating, incomplete digestion, stochastic MS injection, or interference from Peptidisc peptides. We therefore interpret the absence of peptides in this replicate as a technical artifact rather than evidence against protein destabilization. Importantly, the overall dataset consistently shows that both ATP–VO₄ and AMP-PNP destabilize P2RX4, supporting their characterization as broad, non-specific ligands with off-target effects.

      Because ATP and ADP belong to the same class of broadly binding, non-specific ligands, additional testing with ADP would not provide meaningful mechanistic insight. Instead, we chose to test 2-methylthio-ADP, a selective P2RY12 agonist. This experiment revealed robust, reproducible stabilization of P2RY12, without consistent effects on unrelated proteins at 51 °C and 57 °C, thereby demonstrating the ability of MM-TPP to detect specific receptor–ligand interactions.

      Finally, we note that P2RX4 is not a primary target of ATP–VO₄ or AMP-PNP. Consequently, the observed destabilization of P2RX4 is expected to be less pronounced than the strong, physiologically consistent stabilization of ABC transporters by ATP–VO₄, as shown in Figure 3D, where the majority of ABC transporters are thermally stabilized across all tested temperatures.

      “As per Figure 4, P2Y receptors P2RY6 and P2RY12 both showed great thermal stability in presence of ATP-VO4 despite their preference for ADP. The authors argue this could be because of ATP metabolism, and binding of the resultant ADP to the P2RY6. If P2RX4 prefers ATP and not the metabolized product ADP that apparently is available, ideally you should not see a change in stability. A stark destabilization would indicate interaction of some sorts. P2X receptors are activated by ATP and are not naturally activated by AMP-PNP. So, destabilization of P2RX4 upon binding to ATP that can activate P2X receptors is conceivable. However, destabilization both in presence of ATP-VO4 and AMP-PNP is unclear. It is perhaps useful to test effect of ADP using this method, and maybe even compare some antagonists such as TNP-ATP.”

      In this study, we did not directly test ADP, as we had already demonstrated that MM-TPP detects stabilization by broad-binding ligands such as ATP–VO₄. Instead, we focused on a more selective ligand, 2-MeS-ADP, a specific agonist of P2RY12 [PMID: 14755328]. Here, we observed robust and reproducible stabilization of P2RY12 at 51 °C and 57 °C, while P2RY6 showed no significant changes, and no other proteins were consistently stabilized (Figure 4B, S4). This confirms that MM-TPP can distinguish specific ligand–receptor interactions from broader ATP-induced effects. To further explore the assay’s nuance and sensitivity, testing additional nucleotide ligands—including antagonists like TNP-ATP or ATPγS—would provide valuable insights, and we have identified this as an important future direction.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary: This work by Matsui et al. examined the function of a gene Stand Still (stil) in Drosophila in regulation of germ cell death in the female germline. They show that stil mutants contain many apoptotic cells, leading to germ cell loss and infertility. Gene expression analysis showed upregulation of pro-apoptotic genes such as rpr in stil mutant. DamID experiment further showed that stil binds to rpr promoter region to repress its expression. Additionally, they also show that undifferentiated germ cells are resistant to cell death in stil mutant (but stil mutant still eventually loses all germ cells).

      Major comments: Overall, experiments adhere to a general standard of rigor, and each result is fairly convincing. In that sense, this paper warrants publication, as a paper that revealed a new gene important for preventing germ cell death. With that said, I feel that this paper does not reveal a new biological insight. In a nutshell, this paper is about a transcriptional repressor for pro-apoptotic gene, hence its depletion leads to cell death. Data is solid and the conclusion is well supported. But the readers will be left wondering why nature implemented such control? Unless one can show what kind of defects stil rpr double mutant (which rescues germ cell loss phenotype) exhibits, there is no insight why the balance of pro-apoptotic gene and its repressor is important. The paper discusses the 'molecular' mechanisms that explain the phenomenon, but it does not provide insights. The lack of conceptual advancement is the limitation of this work.

      Response:

      We thank the reviewer for raising the question of the evolutionary rationale for such a regulatory mechanism. To address this point, we assessed the evolutionary conservation of rpr and stil through BLAST searches and comparative analyses. Our results showed that both genes are Diptera-restricted, whereas their key domains (the rpr IAP-binding motif and the Stil BED finger) are widely conserved across metazoans. In this phylogenetic context, we propose that Stil acts as a dedicated repressor of rpr in the Drosophila female germline, thereby establishing an apoptotic control architecture in which hid predominates and rpr is repressed by Stil. This explains why the balance between a potent effector (Rpr) and its repressor (Stil) is critical in oogenesis: it prevents catastrophic germline loss while preserving hid-mediated responsiveness.

      We have incorporated these phylogenetic analyses and the perspective into the revised Discussion section as follows.

      Revised Page 22, Line 475; rpr is conserved only within Diptera, although its IAP-binding motif, essential for apoptosis induction, is broadly conserved across metazoans (Du et al., 2000; Gottfried et al., 2004; Hegde et al., 2002; Shi, 2002; Verhagen et al., 2000; Vucic et al., 1998; Wing et al., 2001; L. Zhou, 2005) (Fig. S7). Similarly, stil is also restricted to Diptera, predominantly within Drosophila, whereas its BED-type zinc finger domain is widely conserved among diverse organisms (Aravind, 2000; Hayward et al., 2013; Tue et al., 2017b; H. Zhou et al., 2016). Phylogenetic patterns across Diptera are consistent with a model in which stil acts as a dedicated repressor of rpr in the Drosophila germline cells (Fig. S7). Due to its potent pro-apoptotic activity, rpr must be stringently repressed in a spatiotemporal manner through mechanisms that are specific to both cell type and developmental stage. During embryogenesis, repression of rpr is mediated by the Dpp-signaling factor Shn, which binds to the rpr regulatory region, whereas in intestinal stem cells (ISCs), its expression is suppressed through chromatin conformation. In Drosophila female germline cells, hid serves as the primary regulator of apoptosis, while rpr activity is generally suppressed (Park et al., 2019; Xing et al., 2015). However, rpr mutants exhibit reduced fertility despite producing viable eggs (Fig. 3H), suggesting that rpr-mediated apoptosis may be required for proper egg development. Accordingly, we propose that stil restrains rpr in the Drosophila female germline, allowing hid to predominate in apoptotic regulation.

      New Fig. S7;

      The legend of new Fig. S7;

      Figure S7 Conservation of Rpr and Stil within Diptera

      Homologs of Drosophila melanogaster Rpr and Stil were identified by BLASTp, aligned, and analyzed phylogenetically. Homologs are present across Dipteran lineages, with the genus Drosophila highlighted in blue. Branch lengths indicate the expected number of substitutions per site, as shown by the scale bar.

      Minor comments: Although this is a minor point, and this is not specifically pointing a finger at the author of this paper, I really don't like the term 'safeguard'. This term is now overutilized to add hype to papers, when 'is necessary' is sufficient. In this case, unless the answer is provided as to 'against what stil is safeguarding germ cells', this term is not meaningful. For example, if one can show that stil specifically senses germline-specific threat and tweaks the regular apoptotic pathway based on germline-specific needs, then the term 'safeguard' may be warranted.

      Response:

      In light of the reviewer's comment, we have revised the title of the manuscript to replace 'safeguard' with 'ensure,' which better reflects the demonstrated function of Stil without overstating its role. The new title of the manuscript is: 'Transcriptional Repression of reaper by Stand Still Ensures Female Germline Development in Drosophila'

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      In this well-executed study, Matsui et al. investigate how the female Drosophila germline prevents inappropriate apoptosis during development. They identify stand still (stil) as a key germline-specific repressor of apoptosis. Stil mutant flies are homozygous viable but female sterile due to widespread germ cell loss at the time of eclosion, which is driven by activation of the pro-apoptotic gene reaper (rpr) and caspase-dependent cell death. Germline-specific expression of anti-apoptotic factors such as p35 can rescue this phenotype, confirming that the defect lies in apoptotic regulation. The authors show that Stil directly represses rpr transcription through its BED-type zinc finger domain. Notably, undifferentiated germline cells remain resistant to apoptosis in the absence of stil, which the authors attribute to a silenced chromatin state at the rpr locus, marked by H3K9me3. These findings support a dual mechanism of protection: transcriptional repression of rpr by Stil, and a potential parallel chromatin-based silencing mechanism operating specifically in undifferentiated cells.

      Major Issues:

      1. Clarify cell identity in Figure 2E: It is unclear whether the apoptotic cells shown are somatic or germline in origin. Including a somatic marker such as 1B1 would allow the reader to clearly distinguish the apoptotic population and better interpret the figure.

      Response:

      We thank the reviewer for this helpful suggestion. Occasionally, the signal of the germline marker Vasa can be attenuated in dying germline cells. As suggested by the reviewer, we also tested α-Spectrin (a plasma membrane and fusome marker) instead of 1B1 together with TUNEL labeling, but this approach did not clearly distinguish somatic from germline apoptotic cells. To directly clarify cell identity, we now provide an improved co-stained image in which TUNEL-positive nuclei are surrounded by Vasa-positive cytoplasm, indicating a germline origin. Figure 2E has been updated accordingly.

      New Fig. 2E;

      Quantification of undifferentiated cells in mutants: There appears to be inconsistency in the representation of undifferentiated germ cells across figures. Early panels show near-complete germline loss, while later analyses focus on undifferentiated cells that are reportedly apoptosis-resistant. The authors should quantify the proportion of ovarioles retaining undifferentiated cells and present this data in Figure 1 or the supplements to resolve this discrepancy.

      Response:

      Thank you for raising this important point regarding the apparent inconsistency in the representation of undifferentiated germ cell populations. In the early panels (Fig. 1C, D), we analyzed adult ovaries of stil loss-of-function mutants, in which all germline cells, including undifferentiated germline stem cells (GSCs), are almost completely lost (Fig. 1C), yielding nearly 100% agametic ovarioles. In later analyses, such as those in Fig. 5A, B, we examined third-instar larval ovaries of stil loss-of-function mutants, which contain a few surviving germline cells near the future cap cells, the niche that provides the stem cell ligand Decapentaplegic (Dpp) (Xie & Spradling, 1998). This suggests that Dpp-responsive undifferentiated germline cells may be relatively resistant to apoptosis caused by stil loss.

      Indeed, the GSC-like cells generated by the overexpression of a constitutively active form of Dpp receptor, Thickveins (Tkv.CA) or loss of the differentiation factor bam, were resistant to apoptosis caused by stil loss (Fig. 5C, D). These GSC-like cells may possess enhanced stemness, owing to either excessively elevated Dpp signaling or complete loss of bam, which could lead to stronger repression of rpr expression through tighter chromatin compaction.

      We added this argument in the Results section of the revised manuscript as follows.

      Revised Page 16, Line 361; Compared to GSCs, which were almost completely lost in stil mutants, GSC-like cells may retain a more robust stemness owing to the extremely elevated Dpp signaling pathway, potentially resulting in stronger repression of rpr expression.

      Interpretation of chromatin state at the rpr locus: The claim that H3K9me3, but not H3K27me3, marks the rpr locus is not fully convincing given the low ChIP-seq signal shown. Including a comparison to a known positive control locus would strengthen the argument. Alternatively, the authors could broaden the discussion to include global chromatin reorganization during germ cell to maternal transition, as reported in Kotb et al., 2024 and how such changes may impact rpr accessibility. Also stl mutant rescued with P53 have a "string of pearls" phenotype that are associated with germ cell to maternal transition defects (Figure S3, p53 OE)

      Response:

      We thank the reviewer for the thoughtful and constructive comment regarding the interpretation of chromatin state at the rpr locus. To strengthen the inference that the rpr locus shows H3K9me3 enrichment, whereas clear H3K27me3 enrichment is not evident, we have now included ChIP-seq signal profiles for known positive control loci, using light (lt) as an H3K9me3-enriched locus (Akkouche et al., 2017; Greil et al., 2003) and Ultrabithorax (Ubx) as a canonical H3K27me3 target (Torres-Campana et al., 2022). These comparisons support our interpretation that H3K9me3, rather than H3K27me3, characterizes chromatin around the rpr locus in GSCs. Accordingly, while we do not exclude a minor H3K27me3 contribution, our analyses point to H3K9me3 as the predominant signature at rpr in GSCs.

      New Fig.6B and 6C;

      The legend of new Fig. 6B and Fig. 6C;

      (B) H3K9me3 ChIP-seq signal at the rpr locus and the lt locus (H3K9me3-positive control) in GSCs and 4C NCs. (C) H3K27me3 ChIP-seq signal at the rpr locus and the Ubx locus (H3K27me3-positive control) in GSCs and 32C NCs.

      A sentence in the Results section was revised as follows.

      Revised Page 17, Line 396; As internal controls, we confirmed H3K9me3 enrichment at the light (lt) locus and H3K27me3 enrichment at the Ultrabithorax (Ubx) locus, consistent with their established chromatin states (Akkouche et al., 2017; Greil et al., 2003; Torres-Campana et al., 2022); relative to these controls, the rpr locus shows H3K9me3 but no clear H3K27me3 enrichment in GSCs.

      Regarding the suggestion to broaden the discussion to include global chromatin reorganization during the germline-to-maternal transition, as reported in Kotb et al., 2024, we agree that this is an important avenue for understanding rpr accessibility. The "string of pearls" phenotype observed in stil mutants rescued with P35 overexpression (Figure S3) is consistent with perturbations during this transition. However, a detailed analysis of such chromatin reorganization and its potential impact on rpr regulation lies beyond the scope of the present study and represents a valuable direction for future work.

      Broader analysis of rpr regulation in somatic cells: It would be informative to examine publicly available chromatin or transcriptional data for the rpr locus in somatic ovarian cells. This could help clarify whether rpr regulation by Stil is truly germline-specific or reflects broader developmental trends. This will also clarify why the flies are homozygous viable but female sterile.

      Response:

      We thank the reviewer for this insightful suggestion. We agree that exploring chromatin accessibility and transcriptional regulation at the rpr locus in somatic ovarian cells would provide valuable insights into tissue- or cell-type-specific chromatin environments that influence rpr expression.

      However, to our knowledge, there are currently no publicly available ATAC-seq or comparable chromatin datasets for purified ovarian somatic cells, including follicle cells or ovarian somatic cells (OSCs). As such, we are unable to incorporate this analysis in the current study. Nevertheless, we fully recognize the importance of this line of inquiry and consider it a valuable direction for future research.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript describes the characterization of stand still (stil), a previously identified gene needed for germ cell survival in Drosophila. The molecular function of Stil has until now remained poorly understood. This new work shows that loss of stil results in reaper (rpr)-dependent apoptosis within female germ cells. Loss of rpr suppresses many of the phenotypes observed in stil mutants. Experiments performed using Drosophila cell culture suggest that Stil binds to elements within the rpr promoter. DamID and structure/function experiments indicate that Stil likely directly represses the transcription of rpr within germ cells.

      In general, the experiments are well executed, and the data largely support the basic claims of the authors. Replicates are included and appropriate statistical analyses have been provided. The text and figures are clear and accurate. Appropriate references were cited. There are a few things the authors should address or rephrase before publication.

      On page 9 line 190-192. The authors state "Altogether, these findings indicate that the loss of stil function not only triggers apoptosis that can be suppressed by apoptosis inhibitors but also causes defects in oogenesis progression that are not rescued by blocking cell death." Failure to rescue defects during mid-oogenesis could be due to insufficient transgene expression. Indeed, loss of rpr appears to rescue the fertility of stil mutants. The conclusions of this section should be restated.

      Response:

      We agree that the failure to rescue mid-oogenesis defects by P35 overexpression may, at least in part, be due to insufficient transgene expression. This explanation is particularly plausible given that loss of rpr more effectively restored fertility in stil mutants. As suggested by the reviewer, we have revised the relevant sentences as follows to avoid misinterpretation.

      Revised Page 9, Line 191; Altogether, these findings indicate that the loss of stil function triggers apoptosis that can be suppressed by apoptosis inhibitors.

      Revised Page 12, Line 253; The complete rescue of germline survival in stil rpr double mutants also suggests that the failure of P35 overexpression to restore mid-oogenesis defects may partly reflect insufficient transgene expression (Fig. S3).

      The authors should present the overlap between genes that change expression in a stil mutant and those in which the DamID experiments indicate are directly bound by Stil protein. DamID can sometimes give spurious results depending on expression levels. Further discussion along this point is necessary.

      Response:

      We thank the reviewer for raising this issue. As suggested, we have now analyzed the overlap between genes that are differentially expressed in stil mutant ovaries (identified by RNA-seq of stil mutants expressing P35) and genes that are potentially bound by Stil based on DamID-seq data (promoter-proximal peaks ≤1 kb), and we provide the result as Supplementary Table 4. The list includes genes that have DamID peaks within promoter regions and that also exhibit significant differential expression (|log2FC| > 1, adjusted p < …). The overlap between DamID-seq and RNA-seq comprises 682 genes, including rpr, supporting the idea that Stil regulates rpr expression through interaction with its upstream promoter region. However, the detected peak signal at rpr was 3.41, which is moderate rather than strong, suggesting that Stil may also bind to and regulate other genes in female germline cells. Investigating the potential role of Stil in regulating other genes represents an important future direction of our study.
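      The overlap described above amounts to intersecting two filtered gene sets. The following is a minimal sketch under stated assumptions: the gene names, distances, fold changes, and p-values are invented toy data, and the `padj_cutoff` default is an assumed placeholder rather than the cutoff used in the study.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    gene: str          # gene assigned to the DamID peak
    dist_to_tss: int   # signed distance from the peak to the promoter/TSS, in bp


class Expr(NamedTuple):
    gene: str
    log2fc: float      # log2 fold change, mutant vs. control
    padj: float        # adjusted p-value


def bound_genes(peaks, max_dist=1000):
    """Genes with at least one peak within max_dist bp of the promoter."""
    return {p.gene for p in peaks if abs(p.dist_to_tss) <= max_dist}


def de_genes(rows, min_abs_log2fc=1.0, padj_cutoff=0.05):
    """Genes passing |log2FC| > min_abs_log2fc and padj < padj_cutoff (assumed)."""
    return {r.gene for r in rows
            if abs(r.log2fc) > min_abs_log2fc and r.padj < padj_cutoff}


# Toy data, invented for illustration only (not from the study).
peaks = [Peak("gene1", -350), Peak("gene2", 5200), Peak("gene3", 120)]
expr = [Expr("gene1", 3.2, 1e-6), Expr("gene2", 2.0, 1e-4), Expr("gene3", 0.4, 0.3)]

# A gene counts as a candidate direct target only if it passes both filters.
overlap = bound_genes(peaks) & de_genes(expr)
print(sorted(overlap))
```

      In this toy case only `gene1` is both promoter-bound and differentially expressed; `gene2` fails the distance filter and `gene3` fails the expression filter.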

      We have included this analysis and argument in the revised manuscript as follows.

      Revised Page 13, Line 280; A total of 682 genes with Stil-enriched peaks detected at promoter regions (≤1 kb) showed significantly altered expression in RNA-seq analysis of stil mutants expressing P35, including rpr (Supplementary Table 4).

      Revised Page 20, Line 440; Notably, the DamID peak intensity at the rpr locus reached 3.41, which is moderate rather than strong (Supplementary Table 4). This suggests that, in addition to repressing rpr, Stil may bind to and regulate other genomic loci in the female germline. Investigating the repertoire of Stil target genes and elucidating their roles in germline cells will be an important future direction of this study.

      For structure function experiments, a western blot showing expression levels of the different transgenes in ovaries should be included.

      Response:

      We thank the reviewer for this helpful comment. To address this point, we examined the expression levels of the four Stil variants (FL, NT, CT, and AAYA) in ovaries driven by a germline driver under a wild-type background using Western blotting. The representative blot and quantification from three biological replicates showed comparable expression levels among the variants, with the CT variant displaying a slightly reduced signal. Importantly, AAYA showed expression comparable to FL yet, like CT, failed to rescue, indicating that the rescue failure is not explained by expression-level differences. These data instead support a requirement for the BED-type zinc finger for Stil function in the germline. While we cannot fully exclude a minor contribution from the slightly lower expression of the CT variant to the lack of rescue, the AAYA result argues that loss of BED-type zinc-finger function is the primary cause; we note this caveat in the revised text. The corresponding data are now presented in Figure S6A of the revised manuscript.

      New Fig. S6A;

      The legend of new Fig. S6A;

      (A) Western blot analysis of 6×Myc-tagged Stil variants (FL, NT, CT, and AAYA) driven by NGT40-Gal4; NosGal4-VP16, with y w as a control. Stil variants were detected with anti-Myc, and α-Tubulin (αTub) served as a loading control. Arrowheads indicate Stil variant proteins. The lower panel shows quantification of the Myc/αTub signal ratio normalized to FL. Error bars indicate standard deviation (s.d.) (n = 3).
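      The quantification in the lower panel (Myc/αTub ratio normalized to FL, with s.d. over n = 3) can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the band intensities are invented, normalization is assumed to be done per replicate, and the sample standard deviation is assumed.

```python
from statistics import mean, stdev


def normalized_ratios(myc, tub, ref="FL"):
    """Per-replicate Myc/aTub ratio of each variant, divided by the
    reference variant's ratio from the same replicate (assumed scheme)."""
    n = len(myc[ref])
    return {
        variant: [
            (myc[variant][i] / tub[variant][i]) / (myc[ref][i] / tub[ref][i])
            for i in range(n)
        ]
        for variant in myc
    }


# Invented band intensities for three replicates (arbitrary units).
myc = {"FL": [100, 120, 110], "CT": [80, 90, 85]}
tub = {"FL": [50, 60, 55], "CT": [50, 60, 50]}

ratios = normalized_ratios(myc, tub)
for variant, values in ratios.items():
    # Report mean and sample standard deviation across replicates.
    print(variant, round(mean(values), 3), round(stdev(values), 3))
```

      With per-replicate normalization the reference (FL) is 1.0 in every replicate by construction, so its error bar is zero; normalizing to the mean FL ratio instead would give FL its own spread.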

      A sentence in the Results section was revised as follows.

      Revised Page 13, Line 291; The expression of all four Stil variant proteins from the transgenes was confirmed, although Stil-CT showed a slightly reduced expression level (Fig. S6A).

      Revised Page 14, Line 305; Although CT shows slightly lower expression, AAYA fails to rescue despite FL-like expression, indicating that expression level is not limiting and that loss of the BED-type zinc finger underlies the phenotype.

      With the addition of the new Fig. S6A, the following figure labels have been updated:

      Fig. S6A → S6B

      Fig. S6B → S6C

      Fig. S6C → S6D

      Fig. S6D → S6E

      Individual data points should be shown in each graph in place of simple bar graphs. This type of presentation was inconsistent throughout the paper.

      Response:

      We thank the reviewer for this constructive comment. In line with the reviewer's suggestion, we have revised the relevant graphs to include individual data points overlaid on bar plots with error bars. This modification enables readers to better assess data variability. We also ensured consistency in data presentation among the revised figures while maintaining clarity throughout the manuscript.

      Reference "G & D., 1997" should be properly formatted.

      Page 6 line 117 and 121- a couple of instances where "cell" should be "cells"

      Page 14 line 304- typo "Still"

      Response:

      As suggested, we have revised all figures to display individual data points in each graph instead of using simple bar graphs. This change has been applied consistently throughout the manuscript to improve data transparency and readability. The revised figures include Figure 1A, 2B, S1A, and S2A.

      We have also corrected the following textual issues:

      ・The reference "G & D., 1997" has been properly formatted as "Pennetta & Pauli, 1997".

      ・On page 6, lines 119 and 123, "cell" has been corrected to "cells" to ensure grammatical accuracy.

      ・On page 14, line 315, the typo "Still" has been corrected to "Stil".

      Reviewer #3 (Significance (Required)):

      The significance of the work lies in characterizing a previously unknown function of Stil. By showing that Stil acts to repress transcription of the cell death gene rpr, the authors provide new insights into how programmed cell death is regulated in the Drosophila female germline. Readers interested in reproductive biology, cell death, chromatin, and general developmental biology will find value in these new findings.

      One thing to consider is the possibility that Stil represses rpr in the context of the double strand breaks that form during meiosis. Experiments in the paper indicate that stil knockdown results in TUNEL labeling in region 2A/2B of the germarium. The authors should consider co-labeling for a meiosis marker (C(3)G or gammaH2Av) to see if this PCD correlates with this expression. In addition, they could test whether loss of Spo11 (mei-W68) suppresses stil phenotypes during early germ cell development. Relating the function of Stil to repression of cell death during this critical time of germ cell development would elevate the impact and significance of the paper. However, this may be considered beyond the scope of the current study.

      Response:

      We deeply thank the reviewer for this insightful and thought-provoking suggestion.

      As suggested, we conducted co-staining with γH2Av (a DSB marker) as well as genetic interaction experiments with Spo11 (mei-W68) mutants to address this question; the results are shown below. In region 2 of all genotypes, including the y w control and stil heterozygous and homozygous ovaries expressing P35, γH2Av signals were discernible and were subsequently lost in region 3 through the meiotic recombination-specific DNA repair program (Additional Figure A). In stil mutants, however, an additional strong γH2Av signal was specifically observed in the oocyte, beyond the expected meiotic pattern. Furthermore, loss of meiotic recombination factors, including mei-W68, in stil mutants partially rescued the germline loss phenotype, although not to the same extent as in rpr mutants (Additional Figure B, C: 43.5 % in mei-W68-GLKD, 23.9 % in mei-P22P22 and 12.8 % in vilya826 versus 100 % with loss of rpr in Fig. 3E, F of the revised manuscript). These findings suggest that accumulation of meiotic DSBs is not the main cause of rpr upregulation in stil mutants. We feel that these analyses are beyond the scope of the current study, which focuses on identifying Stil as a transcriptional repressor of rpr and characterizing its role in germline apoptosis. Elucidating other mechanisms that elevate rpr expression in stil mutants will be the focus of future work. Hence, we are providing these data here for the reviewer's reference, but if the reviewer prefers, we would be happy to incorporate them into the manuscript.

      Additional Figure (A) Immunostaining of ovarioles from y w, stilEY16156/CyO; P35 OE (NGT40; NosGal4-VP16> P35), and stilEY16156; P35 OE flies with antibodies against the DNA double-strand break marker γH2Av (green) and Vasa (red), and with DAPI (blue). Insets show enlarged views of egg chambers. White dots indicate oocyte nuclei. Scale bars: 50 μm (ovariole) and 20 μm (egg chamber). (B) Immunofluorescence of Vasa (red) and DAPI (blue) in ovaries from stilEY16156, stilEY16156; mei-W68-GLKD (driven by NGT40; NosGal4-VP16), stilEY16156; meiP22P22, and stilEY16156; vilya826. Scale bar: 50 μm. (C) Quantification of the percentage of ovarioles containing germline cells in 2-3-day-old females. The genotypes of females are indicated below the x-axis, and the number of germaria analyzed is shown above each bar. Error bars represent the standard deviation (s.d.).

      Akkouche, A., Mugat, B., Barckmann, B., Varela-Chavez, C., Li, B., Raffel, R., Pélisson, A. & Chambeyron, S. (2017). Piwi Is Required during Drosophila Embryogenesis to License Dual-Strand piRNA Clusters for Transposon Repression in Adult Ovaries. Molecular Cell, 66(3), 411-419.e4. https://doi.org/10.1016/j.molcel.2017.03.017

      Greil, F., Kraan, I. van der, Delrow, J., Smothers, J. F., Wit, E. de, Bussemaker, H. J., Driel, R. van, Henikoff, S. & Steensel, B. van. (2003). Distinct HP1 and Su(var)3-9 complexes bind to sets of developmentally coexpressed genes depending on chromosomal location. Genes & Development, 17(22), 2825-2838. https://doi.org/10.1101/gad.281503

      Röper, K. & Brown, N. H. (2004). A Spectraplakin Is Enriched on the Fusome and Organizes Microtubules during Oocyte Specification in Drosophila. Current Biology, 14(2), 99-110. https://doi.org/10.1016/j.cub.2003.12.056

      Torres-Campana, D., Horard, B., Denaud, S., Benoit, G., Loppin, B. & Orsi, G. A. (2022). Three classes of epigenomic regulators converge to hyperactivate the essential maternal gene deadhead within a heterochromatin mini-domain. PLoS Genetics, 18(1), e1009615. https://doi.org/10.1371/journal.pgen.1009615

      Xie, T. & Spradling, A. C. (1998). decapentaplegic Is Essential for the Maintenance and Division of Germline Stem Cells in the Drosophila Ovary. Cell, 94(2), 251-260. https://doi.org/10.1016/s0092-8674(00)81424-5

    1. One of Synthesizer's most complex tasks is tracking overlapping memory writes:

      This is the second important part. In which cases is this aliasing resolution required? "Overlapping" is just one example, not the only one.

      Example 1: Suppose that MSTORE stores a DataPt "X" (32 bytes) in memory at offset 0x03. Some time later, MLOAD loads a 32-byte memory value at offset 0x00 onto the stack; call it "Y". Suppose no "overlapping" write has occurred in the meantime. Is the returned stack value "Y" the same as "X", even though there was no overlapping?
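      The answer to Example 1 can be checked with a small byte-level model of EVM memory. This is a minimal sketch, not Synthesizer code: `mstore` and `mload` are hypothetical helpers, and memory is grown byte by byte for simplicity, whereas the real EVM expands memory in 32-byte words (this does not affect the values read).

```python
WORD = 32  # EVM word size in bytes


def _grow(memory: bytearray, end: int) -> None:
    """Zero-extend memory so that indices below `end` are valid."""
    if len(memory) < end:
        memory.extend(b"\x00" * (end - len(memory)))


def mstore(memory: bytearray, offset: int, value: bytes) -> None:
    """Write one 32-byte word at an arbitrary byte offset."""
    assert len(value) == WORD
    _grow(memory, offset + WORD)
    memory[offset:offset + WORD] = value


def mload(memory: bytearray, offset: int) -> bytes:
    """Read one 32-byte word at an arbitrary byte offset."""
    _grow(memory, offset + WORD)
    return bytes(memory[offset:offset + WORD])


memory = bytearray()
x = bytes(range(1, 33))      # X = 0x0102...20, an arbitrary 32-byte value
mstore(memory, 0x03, x)      # X occupies bytes 3..34
y = mload(memory, 0x00)      # Y covers bytes 0..31

# Y is three untouched zero bytes followed by the first 29 bytes of X.
print(y == x)                                   # False
print(y == b"\x00" * 3 + x[:29])                # True
```

      Read as big-endian integers, Y equals X shifted right by 24 bits, so the loaded value is a transformed slice of the stored DataPt even though no second write ever overlapped it.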

      Example 2: In general, Calldata can be much longer than 32 bytes. So whenever the EVM loads a specific function input argument "Y" onto the stack, it chunks the Calldata.
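      The chunking in Example 2 can be illustrated with a toy model of CALLDATALOAD, which reads 32 bytes starting at a byte offset and pads with zeros past the end of the calldata. The helper name `calldataload`, the selector bytes, and the argument values below are illustrative assumptions, not taken from the Synthesizer.

```python
def calldataload(calldata: bytes, offset: int) -> bytes:
    """Return the 32-byte chunk at `offset`, right-padded with zeros."""
    chunk = calldata[offset:offset + 32]
    return chunk.ljust(32, b"\x00")


selector = bytes.fromhex("11223344")      # hypothetical 4-byte function selector
arg0 = (0x1234).to_bytes(32, "big")       # first 32-byte argument
arg1 = (1000).to_bytes(32, "big")         # second 32-byte argument
calldata = selector + arg0 + arg1         # 68 bytes in total

first_word = calldataload(calldata, 0)    # selector plus the first 28 bytes of arg0
loaded_arg0 = calldataload(calldata, 4)   # exactly arg0
loaded_arg1 = calldataload(calldata, 36)  # exactly arg1
tail = calldataload(calldata, 60)         # last 8 bytes of arg1, then zero padding

print(loaded_arg0 == arg0, loaded_arg1 == arg1)
```

      Because the 4-byte selector shifts every argument off the 32-byte grid, a chunk taken at offset 0 mixes the selector with part of the first argument, which is the same kind of slicing that aliasing resolution has to track for memory.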

      It's quite tricky for a Synthesizer to shadow this, since DataPts cannot deal with words greater than 32 bytes! The current version of the Synthesizer avoids solving this problem: it simply takes the resulting chunk made by the EVM as an oracle. The next version, currently in development, will fundamentally solve this: it will create another virtual MemoryPt dedicated to CallData and store DataPts for the function selector and function arguments there. This process is the reverse of resolving aliasing.

      Please see this code for dealing with "CALLDATALOAD".

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Reply to the reviewers

      Point-by-Point Response to Reviewers for Manuscript #RC-2024-02720

      Manuscript Title: Molecular and Neural Circuit Mechanisms Underlying Sexual Experience-dependent Long-Term Memory in Drosophila.

      Corresponding Author: Woo Jae Kim

      We extend our sincere gratitude to the Managing Editor and both reviewers for their diligent and insightful evaluation of our manuscript. The comprehensive feedback provided has been invaluable, guiding us to significantly strengthen the manuscript's scientific rigor, logical cohesion, and overall impact. We have undertaken a substantial revision, incorporating new experimental evidence, reframing the central narrative, and improving data presentation to address all concerns raised.

      The major revisions include:

      1. New Experimental Evidence: We have performed three new sets of experiments to address key questions raised by the reviewers. First, we used the protein synthesis inhibitor cycloheximide to pharmacologically validate that the observed memory is indeed a form of long-term memory (LTM). Then, we performed genetic intersectional analyses to determine if the identified Yuelao (YL) neurons express the canonical sex-determination transcription factors doublesex (dsx) and fruitless (fru).
      2. Narrative Reframing and Logical Restructuring: We fully agree with the reviewers that the logic of the original manuscript was confusing, particularly regarding the distinction between the broad Mushroom Body (MB) Kenyon Cell (KC) population and the specific YL neurons. The manuscript has been extensively rewritten to present a clear, hypothesis-driven narrative. We now frame the initial KC-related findings as part of a broader screening effort that logically led to the identification and focused investigation of the YL neuron circuit.
      3. Refined Central Claim: Guided by the reviewers' feedback and our new data, we have sharpened our central claim. We now propose that YL neurons constitute a critical circuit for forming attractive taste- and pheromone-based memories derived from Gr5a neuronal inputs. This form of appetitive memory is distinct from the previously characterized internal reward state associated with ejaculation, adding a new layer to our understanding of how male flies remember and evaluate reproductive experiences.
      4. Improved Data Quality and Analysis: In response to valid critiques, all imaging figures have been replaced with high-resolution versions. Furthermore, our methods for fluorescence quantification, particularly for the TRIC calcium imaging experiments, have been corrected to include normalization against an internal reference channel, adhering to established best practices. All requested genetic control experiments have been performed.

      We are confident that these comprehensive revisions have fully addressed all concerns and have transformed our manuscript into a much stronger, more focused, and logically sound contribution. We thank you again for the opportunity to improve our work and look forward to your evaluation of the revised manuscript.

      Responses to Reviewer #1

      General Comments: This study explores the molecular and neural circuitry mechanisms underlying sexual experience-dependent long-term memory (SELTM) in male Drosophila. The authors use behavioral, imaging, and bioinformatics approaches to identify YL neurons, a subset of mushroom body (MB) projecting neurons, as crucial for SELTM formation. They propose that YL neurons receive inputs from WG neurons via the sNPF-sNPFR pathway and implicate molecular players such as orb2, fmr1, MDAR2-CaMK, and synaptic plasticity in their function.

      However, the evidence presented does not adequately support the authors' claims. The data fail to cohesively tell a logical story, and key conclusions appear to be based on assumptions and correlations rather than robust evidence.

      • Answer: We are deeply grateful to both reviewers for their thorough and constructive evaluation of our manuscript. Their collective feedback has been instrumental in helping us to clarify the study's rationale, strengthen our interpretations, and significantly improve the overall quality and impact of the work. We appreciate the recognition of our study's potential to advance the understanding of how sexual experience modifies future mating behaviors and to elucidate the neuronal and molecular mechanisms of how memory regulates a key sexual behavior in male Drosophila.

      • In response to the general comments, we have undertaken a major revision of the manuscript to improve the clarity, logic, and presentation. We have rewritten the Abstract and Introduction to more clearly define "sexual experience-dependent long-term memory" (SELTM) and articulate its significance in the context of adaptive decision-making and interval timing. The entire manuscript has been restructured to present a more logical, hypothesis-driven narrative that clearly distinguishes our initial broad screening from the focused investigation of the YL neuron circuit. We have also incorporated alternative interpretations of our data, particularly regarding the role of the YL circuit in regulating baseline mating duration in naive males, which has added more depth to the study. Finally, all figures have been remade in high resolution, and all requested genetic controls and methodological clarifications have been added to ensure rigor and reproducibility. We are confident that these revisions have addressed the reviewers' concerns and have resulted in a much stronger manuscript.

      Comment 1: The study identifies the knowledge gap (lines 103-104) but fails to integrate relevant literature, particularly Shohat-Ophir et al., Science (2012), and Zer-Krispil et al., Curr Biol (2018). These studies established that ejaculation induces appetitive memory in male Drosophila via corazonin and NPF neurons. The current study does not provide direct evidence that the "act of mating itself" drives SELTM, as it includes both courtship and copulation.

      Response: Thank you for highlighting these two landmark studies. We fully agree that Shohat-Ophir et al., Science (2012) and Zer-Krispil et al., Curr Biol (2018) were pivotal in demonstrating that ejaculation—and the accompanying corazonin/NPF signalling—can establish an appetitive memory in males.

      In the revised manuscript we have now integrated both papers on lines 111-118:

      “Previous work has shown that successful copulation is intrinsically rewarding to male Drosophila: a single mating encounter elevates brain neuropeptide F (NPF) levels and suppresses subsequent ethanol preference19. Importantly, Zer-Krispil et al. further demonstrated that ejaculation itself—artificially induced by optogenetic activation of corazonin (Crz) neurons—is sufficient to mimic this reward state, driving appetitive memory formation and up-regulation of NPF. These findings indicate that the act of ejaculation, rather than the entire courtship sequence, is the critical sensory event that gates post-mating reward.”

      Comment 2: The nature of the observed long-lasting reduced mating duration requires clearer characterization: Is this an associative memory or experience-dependent behavioral plasticity? Can the formation of this long-term memory be blocked by protein synthesis inhibitors, such as cycloheximide?

      Response: We thank the reviewer for this excellent suggestion to pharmacologically characterize the nature of the memory. To definitively test whether the observed SMD is a form of protein synthesis-dependent long-term memory (LTM), we performed a new experiment as suggested.

      We have now included data in new Figure supplement 1I showing that feeding males the protein synthesis inhibitor cycloheximide (CXM) for 24 hours immediately following the sexual experience completely blocks the formation of the long-lasting SMD phenotype. Control flies fed a vehicle solution exhibited robust SMD. This result provides strong evidence that SELTM is not merely a form of transient behavioral plasticity but is a genuine form of LTM that requires de novo protein synthesis for its consolidation, a hallmark of LTM across species.[1]

      The revised text appears on lines 173-176:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de-novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I)."

      Comment 3: While schematics illustrate the working hypotheses, the text lacks detailed explanations, leaving the reader unclear about the rationale behind certain conclusions.

      Response: Thank you very much for this insightful comment. We fully agree that the original manuscript did not provide sufficient textual justification for the conclusions derived from the schematics. In the revised version we have therefore added comprehensive explanations immediately following each figure (or schematic) that explicitly state the underlying rationale, the key observations supporting our hypotheses, and the logical steps leading to each conclusion. We believe these additions now make the reasoning transparent and easy to follow. We appreciate your feedback, which has substantially improved the clarity of our work.

      Comment 4: The logic to draw certain conclusions was confusing and misleading.

      - For instance, the role of orb2 in SELTM is examined via knockdown in MB Kenyon cells (KCs) (using ok107>orb2-RNAi), which is irrelevant to the claim that orb2 functions in YL neurons. Additionally, RNAseq analyses (Fig. 1N-S) focusing on orb2 expression in a/b KCs are irrelevant to and cannot support the claim that Orb2 functions in YL neurons.

      - Similarly, the claim (lines 302-303) that sNPF-R expression is exclusive to MB KCs conflicts with data showing effects when sNPF-R is knocked down in YL neurons. How can knocking-down a gene, which is exclusively expressed in neural population A, in neural population B affect a phenotype? This inconsistency undermines the interpretation of the results.

      - Other examples include lines 223-227 and lines 246-249. It is very confusing how the authors came to the indications.

      - The authors also kept confusing the readers and themselves by mistakenly referring to MB KC a-lobe and YL a-lobe projection. They may know the difference between the two neural populations but they did not always refer to the right one in the text.

      Response: We agree completely with the reviewer that the logic in the original manuscript was confusing and failed to clearly distinguish between the general MB Kenyon Cell (KC) population and the specific YL projection neurons. This was a major flaw, and we are grateful for the opportunity to correct it. We have undertaken a major revision of the manuscript's narrative and structure to present a clear, logical progression of discovery.

      The new logical flow of the manuscript is as follows:

      1. We first establish that sexual experience induces a robust, long-lasting SMD behavior that is dependent on protein synthesis.
      2. We then perform initial experiments to implicate the MB as a key brain region. We show that broad inhibition of MB KCs (using the ok107-GAL4 driver) disrupts SMD behavior. This result establishes the general involvement of the MB but lacks cellular specificity.
      3. The remainder of the manuscript then focuses specifically on dissecting the molecular and cellular properties of the YL neurons.

      Finally, we have meticulously edited the entire manuscript to ensure that we always use precise terminology, clearly distinguishing between "YL neuron projections to the MB α-lobe" and the "MB KC α-lobe."

      Comment 5: The imaging figures provided are unfocused and poorly resolved, making it difficult to assess data quality.

      - Colocalization analyses of orb2 and YL are unconvincing... Maximum intensity projection images are insufficient... complete image stacks with staining of orb2, YL, and KCs (MB-dsRed) are needed for validation.

      - Quantification of imaging data appears flawed. For example, claims of orb2 and CaMKII upregulation in MB a-lobe projections (e.g., Fig. S2F-J, Fig. 3M,N) are confounded by widespread increases in intensity across the brain, lacking specificity.

      - The TRIC experiment analysis should normalize GFP signals to internal reference channel (RFP in the TRIC construct)...

      - In Fig. 6H-J, methods for counting synapse numbers are not described. How are synapse numbers counted in these low-resolution images?

      Response: We sincerely apologize for the poor quality of the imaging data presented in the original manuscript. We agree with the reviewer's critiques and have taken comprehensive steps to rectify these issues.

      • Image Quality: We apologize for not including the full image data in the original submission. The complete figure is now presented in revised Fig. 2J.
      • Fluorescence Quantification: The fluorescence quantification has been re-analyzed. The Methods section now includes a detailed description of our protocol.
      • TRIC Normalization: We apologize for not stating this explicitly in the previous version. As now described in the revised Methods subsection “Quantitative Analysis of Fluorescence Intensity”, all TRIC images were acquired with identical laser power and exposure settings. The GFP signal was background-corrected and then normalized to the RFP fluorescence encoded by the TRIC construct itself (UAS-mCD8RFP), which serves as an internal reference for construct expression and mounting thickness.
      • Synapse Counting: We agree with the reviewer that the resolution of our images was insufficient for accurate synapse particle counting. We have therefore removed the problematic analysis from the former Fig 6H-J. Our conclusions regarding synaptic plasticity now rest on the more robust and quantifiable data showing a significant increase in the total area of dendritic (DenMark) and presynaptic (syt.eGFP) markers.

      Comment 6: The study presents data from unrelated learning paradigms (e.g., olfactory associative learning, courtship conditioning; Fig. 7) without justifying how these paradigms relate to SELTM. Particularly, the authors claimed that SELTM is related to Gr5a, which leads to appetitive memories, which involve PAM dopaminergic neurons and MB horizontal lobes. However, the olfactory associative learning with electric shock and courtship conditioning lead to aversive memories, that involve PPL1 dopaminergic neurons and the vertical lobes.

      Response: We thank the reviewer for requesting clarification on the rationale for including these experiments. The purpose of these assays was to test the specificity of the YL neuron circuit. A key question is whether YL neurons represent a general-purpose LTM circuit or one specialized for a particular memory modality.

      The data show that knockdown of Orb2 or Nmdar2 specifically in YL neurons has no effect on the formation of LTM for aversive olfactory conditioning or aversive courtship conditioning. These negative results are critically important, as they demonstrate that the YL circuit is not required for all forms of LTM. This finding strongly supports our revised central claim that YL neurons are specialized for processing appetitive memories derived from the specific sensory context of mating (i.e., taste and pheromonal cues from Gr5a neurons).

      To improve the narrative flow of the main text, we have reordered the relevant passages. The relevant description is in lines 398-401:

      “To determine whether YL neurons constitute a general LTM circuit or are dedicated to the appetitive context of mating, we tested two canonical aversive paradigms: electric-shock olfactory conditioning and courtship conditioning. If YL neurons serve as a universal LTM module, their genetic impairment should also impair aversive memory.”

      and in lines 469-472:

      “The inability of YL perturbation to impair aversive memories (Fig. 7) corroborates that this micro-circuit is dedicated to Gr5a-dependent SELTM rather than acting as a generic LTM hub.”

      Minor Issues

      Comment 1: Fig 2F. YL projections are labeled as MBONs. Clarify whether YL neurons are the upstream or downstream (MBON) of KCs.

      Response: Thank you for this helpful comment. As noted by Huang et al., 2018[2] (Nat. Commun. 9:872), the MB093C-GAL4 driver labels the MBON-α3 mushroom body output neuron. Consequently, YL neurons are positioned downstream of the MBON-α3.

      We have now clarified this point in lines 217-222 of the revised manuscript:

      “Each of these neurons extends a vertical fiber to the dorsal brain region, where they form dense arbors within the α-lobes of the mushroom body. Because the MB093C-GAL4 driver labels MBON-α3 output neuron[51], these YL arbors are positioned postsynaptically within the α-lobe and relay mushroom-body output to the anterior, middle, and posterior superior-medial protocerebrum.”

      Comment 2: Extensive language polishing is required, as several sentences are unclear (e.g., lines 169-172).

      Response: We apologize for the lack of clarity in the original text. The entire manuscript has undergone extensive revision and professional language editing to improve readability, precision, and grammatical accuracy.

      Responses to Reviewer #2


      Major Comments

      Comment 1: Clearer articulation of the rationale, motivation, and significance of the overall study design and individual experiments can strengthen the manuscript and promote readership. For example, the beginnings of the abstract and introduction should define what authors mean by sexual experience-dependent long-term memory and its significance (including why it is "significant for reproductive success" (lines 46 and 92)). Similarly, employing more concrete language throughout the text will help anchor and contextualize the study. Interpretation is occasionally insufficient or does not follow directly from the data provided.

      Response: We thank the reviewer for this valuable advice. We agree that the motivation and significance of our study were not articulated clearly enough. We have rewritten the Abstract and the beginning of the Introduction to address this. The revised text now explicitly defines SELTM as a protein synthesis-dependent, appetitive memory formed in response to gustatory and pheromonal cues. We explain its significance in the context of adaptive behavior, linking it to interval timing, a process by which male flies strategically adjust their mating investment (i.e., mating duration) based on prior experience to optimize reproductive success and energy expenditure. This framing provides a clearer context for our investigation into its underlying neural and molecular mechanisms.

      Comment 2: Long term memory: I do not work on Drosophila memory, but a cursory search suggests that the field generally considers long term memory in Drosophila to last for 24 hr to days (courtship memory lasts for >24 hr). SMD decays between 12-24 hr after copulation. Could SMD be considered a short-term effect?

      Response: This is an important point of clarification. As described in our response to Reviewer #1 (Major Comment 2), we have performed a new experiment demonstrating that the formation of SMD is blocked by the protein synthesis inhibitor cycloheximide (Figure 1I). This dependence on de novo protein synthesis is a defining characteristic of LTM, distinguishing it from short- and intermediate-term memory forms.[1] Moreover, memories lasting 12-24 hours are well established as forms of LTM.[3] Therefore, based on both its duration and its molecular requirements, SMD represents a bona fide form of LTM.

      The relevant statement is in lines 174-178:

      "To determine whether the persistent reduction in mating duration (SMD) depends on de-novo protein synthesis, we fed males the translational inhibitor cycloheximide (CXM). Under this regimen, CXM completely abolished the SMD phenotype (Fig. 1I). This finding suggests that the reduction in mating investment is contingent upon the formation of LTM."

      Comment 3: Fig 1B-E share the same control (naive) group. If these experiments were performed in the same replicate(s), they should be plotted in the same figure. If not, please provide more details on how experimental blocks were set up and how controls compared between replicates.

      Response: Thank you for this helpful suggestion. We understand that sharing the same naive control across multiple panels (Fig. 1B–E) may raise concerns about data independence. However, we chose to present these panels separately for the following reasons:

      1. Clarity and Readability: Each panel (B–E) represents a distinct temporal condition (0 h, 6 h, 12 h, 24 h post-isolation). Separating them avoids visual clutter and allows readers to focus on one time point at a time, improving interpretability.

      2. Consistency with Internal Controls: Although the naive group is identical across panels, each experimental block (i.e., each isolation time point) was run independently on the same days, with internal controls (naive vs. experienced) included in every block. This ensures that statistical comparisons remain valid within each panel, even if the naive data overlap.

      We have now added a clear statement in the figure legend explaining that the naive group is shared across panels and that each time point was tested independently with internal controls. This maintains transparency while preserving the visual clarity of the current layout.

      Comment 4: Serial mating (Fig 1F-H): please provide details on the methods. How much time elapsed between successive matings? Is a paired statistical test used? Sperm depletion also affects mating duration, and without this information the authors' conclusion (lines 155-156) does not automatically follow from the data.

      Response:

      1. Interval between successive matings: We have rewritten the Methods to state explicitly that “as soon as one copulation ended the male was transferred immediately to a fresh virgin female, so the next mating began immediately.”

      We have added a new Methods subsection:

      "Serial mating duration assay

      Serial mating duration assay was identical to the standard procedure except that each male was presented with four DF virgin females in immediate succession: upon termination of the first copulation the male was immediately put into a fresh chamber containing the next virgin, the timer was restarted at first contact, and this step was repeated until four complete matings were recorded or 5 min elapsed without initiation, whichever came first."

      2. Statistical test: We apologize for omitting this detail. An unpaired t-test was used. For each male, the mating duration before (naive) and after sexual experience was recorded, yielding paired observations; Prism’s unpaired t-test module was applied to evaluate the mean difference.
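
      To illustrate why the choice of test matters for before/after measurements on the same males, the following minimal sketch computes both a paired and an unpaired (Student's, pooled-variance) t statistic. The durations are made-up numbers and the function names are hypothetical; nothing here is taken from the study's data or from Prism itself.

```python
import math
import statistics


def unpaired_t(a, b):
    """Student's t statistic for two independent samples (pooled variance)."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se


def paired_t(a, b):
    """Paired t statistic: a one-sample t on the per-subject differences."""
    if len(a) != len(b):
        raise ValueError("paired samples must have equal length")
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


# Hypothetical mating durations (min) for the same five males,
# before (naive) and after sexual experience. Illustrative values only.
naive = [20.0, 18.0, 22.0, 19.0, 21.0]
experienced = [17.0, 16.0, 19.0, 16.0, 18.0]

print("unpaired t:", round(unpaired_t(naive, experienced), 3))
print("paired t:  ", round(paired_t(naive, experienced), 3))
```

      With these illustrative values the paired statistic is much larger than the unpaired one, because pairing removes between-male variability from the error term.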

      The figure legend now states “with error bars representing SEM. Asterisks represent significant differences, as revealed by the Unpaired t test and ns represents non-significant difference (**p

      3. Mating duration versus sperm depletion: We apologize for not having made it clear that these two observations are complementary, not contradictory. Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred, and thus the number of progeny sired, declines progressively [4].

      The revised text is as follows (lines 235-241):

      "Previous work has shown that when male Drosophila copulate repeatedly, mating duration remains stable even though the number of sperm transferred—and thus the number of progeny sired—declines progressively. This dissociation confirms that the constant mating duration we observe in our serial-mating experiment (Fig. 1F–H) is consistent with normal sperm depletion and does not compromise the conclusion that the experience-dependent reduction in mating duration reflects long-term memory."

      Thank you for helping us improve the clarity of our study.

      Comment 5: Mating duration assay: Which isolation interval was chosen for the rest of the SMD experiments? The 12 hr en masse mating setup is relatively uncommon among studies on courtship/copulation/post-copulatory phenotypes, and introduces uncertainty and variability in the number and timing of matings that occurred during the 12 hr-window. This source of variability and its implication in interpreting the data should be acknowledged. Moreover, the 3 studies referenced in the methods all house males in groups of 4, whereas this study uses groups of 40. Could density confound the manifestation of SMD?

      Response: We thank the reviewer for these important methodological questions.

      • Isolation Interval: We have clarified in the Methods that virgin females were introduced into the vials for the last 1 day before the assay.
      • Housing Density: This is an excellent point. To control for any potential effects of housing density itself, we have clarified that our "naive" control males are also housed in groups of 40 for the same duration as the "experienced" males. Therefore, the only difference between the two groups is the presence of females, isolating the effect of sexual experience from the effect of social density.

      Comment 6: SMD behavior: comparing orb2 mutants and controls (Fig 1M and Fig S1K-L), loss of orb2 actually reduces the mating duration in naive males (mean ~15 min) relative to controls (~20 min), and have possibly no effect on experienced males (~15 min). This is inconsistent with the SMD behavior demonstrated in Fig 1B-E. The same pattern is found for mushroom body silencing (Fig 1P, Fig S1M-N), orb2 knockdown in YL neurons (Fig 2D, Fig S2A-B), Fmr1 knockdown in YL neurons (Fig 3D, Fig S2B, S3D) and most other experiments where mating duration is not significantly different between naive and experienced males. This might demonstrate a separate role of YL neurons and its related circuit in regulating mating duration in naive males. Could the authors discuss this interpretation? As an aside, plotting genetic controls next to experimental groups is customary and facilitates comparisons between relevant groups.

      Response: Thank you very much for this insightful observation.

      1. Baseline differences among genotypes: We agree that absolute mating duration differs slightly between genotypes (e.g. naive orb2∆/+ about 15 min vs. wild-type CS about 20 min). Such differences are common when mutations or transgenes are introduced into distinct genetic backgrounds, and they do not affect the within-genotype comparison that is the essence of SMD (sexual-experience-dependent shortening of mating duration). Therefore, for every experiment we compared naive vs. experienced males of the identical genotype, keeping all other variables constant.

      2. Consistency of SMD across figures: In every manipulation that disrupts SMD memory (orb2∆, MB silencing, orb2-RNAi in YL neurons, Fmr1-RNAi in YL neurons, etc.) the naive–experienced difference disappears, whereas the genetic controls retain a significant ΔMD. This is fully consistent with Fig. 1B–E and demonstrates that the memory trace, not the basal duration, is abolished.

      3. Figure layout: Following your suggestion, we have re-ordered all bar graphs so that the relevant genetic controls are placed immediately adjacent to the experimental groups, making within-panel comparisons easier.

      We hope these clarifications and adjustments address your concerns.

      Comment 7: Bitmap figures: unfortunately the bitmap figures are compressed and their resolution makes it difficult to evaluate the visual evidence.

      Response: We apologize for the poor quality of the figures. All figures in the revised manuscript, including the scRNA-seq plots, have been remade as high-resolution vector graphics to ensure clarity and detail. To aid interpretation, colour-coded illustrations are also placed next to the scRNA-seq plots.

      Comment 8: Sexual dimorphism of YL neurons: many neurons involved in sexual behaviors express dsx and/or fru. Do YL neurons express them?

      Response: This is an excellent question. To address it, we performed a new set of experiments using genetic intersectional tools to test for the expression of doublesex (dsx) and fruitless (fru) in YL neurons. Our analysis, presented in figure supplement 2B, reveals that YL neurons are indeed fru-negative and dsx-negative. We therefore conclude that YL neurons do not belong to the canonical fru- or dsx-expressing neuronal classes and are unlikely to be intrinsically sex-specific.

      We have added an explanation in lines 223-229:

      "Our further analysis confirmed the presence of only three pairs of nuclei near the SOG in male brains, whereas female brains exhibit a greater number of nuclei near the AL (Fig. 2I), suggesting subtle sexual dimorphisms in GAL4MB093C-expressing neurons. Importantly, these neurons do not overlap with either fru- or dsx-expressing cells: co-immunostaining for GFP and Fru or Dsx revealed almost no colocalization in any brain region examined (Fig. S2B), indicating that YL neurons are distinct from the canonical sex-specific fru/dsx circuits."

      Comment 9: Genetic controls for some crucial experiments are not provided, e.g. Fig 2J, Fig S3C, Fig S3E-F Fig 5B-C, F, Q-R, Fig S5A-E.

      Response: We thank the reviewer for their careful attention to detail. We have now performed all the missing genetic control experiments.

      Comment 10: Colocalization experiments: please provide more detail on how fluorescence is normalized for each channel across images, especially when the overall expression of an effector is up- or down-regulated after mating.

      Response: We have updated the Methods section under "Quantitative Analysis of Fluorescence Intensity" and "Colocalization Analysis" to provide a detailed description of our normalization procedure.

      Comment 11: Please resolve this apparent contradiction on the expression of Nmdar1 and 2 in YL neurons. On line 261: "both receptors co-expressing in Orb2-positive MB Kenyon cells"; on line 279-281 "Nmdar1 is not expressed with YL neurons [...] whereas Nmdar2 is expressed in a single pair of YL neurons..."

      Response: We apologize for this contradiction, which arose from the confusing narrative structure of the original manuscript. As detailed in our response to Reviewer #1 (Major Comment 4), we have reframed the manuscript.

      Comment 12: Particle analysis (Fig 6H-J): experienced males seem to have more synapses but trend towards smaller average size. It would be helpful to show number of synapses and average size as paired data, or show that the total particle area is larger in experienced males.

      Response: We agree with the reviewer that this analysis was inconclusive and potentially misleading due to the limitations of image resolution. As noted in our response to Reviewer #1, we have removed this particle analysis (former Fig 6H-J) from the revised manuscript. Our claim for increased synaptic plasticity is now supported by the more robust measurement of the total fluorescence area of the pre- and postsynaptic markers, which shows a significant increase in experienced males.

      Minor Comments

      We thank the reviewer for their meticulous attention to detail. We have addressed all minor comments as follows:

      Comment 1: Some figures (e.g. Fig 3M-R) and experiments (e.g. oenocyte scRNA-seq) are not referenced in the text. dnc data is shown alongside amn and rut but the rationale for its inclusion is not provided.

      Response: The original Fig. 3M-R (now Fig. 3M-O) is referenced on line 283. The rationale for including dnc data (as a canonical memory mutant) is now clarified in the text on lines 187-189:

      "To ask whether the same molecular machinery underlies the SMD that follows sexual experience, we tested three classical memory mutants: dunce (dnc), amnesiac (amn), and rutabaga (rut)."

      Comment 2: Some references might not point to the intended article (e.g. ref 123).

      Response: The reference list has been checked and corrected.

      Comment 3. Please plot genetic controls next to experimental genotypes as they are a crucial part of the experiment.

      Response: All relevant figures now include plots of genetic controls next to experimental genotypes.

      Comment 4. The "estimation statistics" plots are not necessary since the authors show individual data points. To further enhance data transparency, the authors may consider reducing the alpha and/or dot size so the individual data points are more readily visible.

      Response: Thank you for this helpful suggestion. We fully agree that data transparency is essential. After carefully testing lower alpha values and smaller dot sizes, we found that either change markedly obscured the dense regions of the distributions. We have therefore left the plotting parameters unchanged.

      The estimation-statistics overlays are retained for two reasons: (i) they provide an immediate visual estimate of the mean difference and its 95% confidence interval, which is the key statistic on which we base our conclusions, and (ii) they spare readers from having to cross-reference separate tables.
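
      As a rough sketch of what such an overlay summarises, the mean difference and a percentile-bootstrap 95% confidence interval can be computed as below. The data are made up, the helper name is hypothetical, and the percentile bootstrap is an assumption: the response does not state which resampling scheme the estimation plots use.

```python
import random
import statistics


def bootstrap_mean_diff_ci(control, treated, n_boot=5000, alpha=0.05, seed=0):
    """Return (observed mean difference, (lower, upper)) using a
    percentile bootstrap. Illustrative helper, not the authors' code."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.mean(treated) - statistics.mean(control)
    diffs = []
    for _ in range(n_boot):
        # Resample each group independently, with replacement
        c = rng.choices(control, k=len(control))
        t = rng.choices(treated, k=len(treated))
        diffs.append(statistics.mean(t) - statistics.mean(c))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, (lower, upper)


# Hypothetical mating durations (min); illustrative values only.
naive = [20.0, 18.0, 22.0, 19.0, 21.0]
experienced = [17.0, 16.0, 19.0, 16.0, 18.0]

diff, (lo, hi) = bootstrap_mean_diff_ci(naive, experienced)
print(f"mean difference = {diff:.2f} min, 95% CI [{lo:.2f}, {hi:.2f}]")
```

      An interval that excludes zero corresponds to the kind of naive vs. experienced difference the overlays are meant to make visible at a glance.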


      Comment 5. For accessibility, please avoid using green and red in the same plot.

      Response: We fully agree that red–green combinations can be problematic for colour-vision-impaired readers. In the present manuscript, however, the only panel that juxtaposes pure red and pure green is the Fly-SCOPE co-expression data. These scRNA-seq plots are provided only as supportive reference; the actual quantitative conclusions are based on independent genetic and imaging experiments that use magenta, cyan, yellow, and greyscale palettes. Moreover, the SCope images are accompanied by detailed text descriptions of the overlapping cell clusters, so no essential information is lost even if the colours are indistinguishable.

      Comment 6. Fly Cell Atlas: please show color scales used for each gene as the color thresholds are gene-specific by default. The 3-color overlap on SCope also makes it very difficult to see the expression pattern for each gene. One possibility is outlining the Kenyon cells on the tSNE plots and showing the expression for each gene of interest.

      Response: Thank you for this helpful suggestion. To avoid the ambiguity that arises from RGB blending in the three-colour overlay, we have added a small colour-mixing diagram next to the t-SNE plots (revised Fig. 1). This key shows the exact hues produced by pairwise and three-way overlaps:

      • Red + Green = Yellow

      • Red + Blue = Magenta

      • Green + Blue = Cyan

      • Red + Green + Blue = White

      Thus, yellow, magenta or cyan dots indicate co-expression of two genes, while white dots mark cells where all three genes are detected. This diagram allows readers to interpret overlap colours at a glance without re-entering SCope.
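
      The colour key above follows ordinary additive RGB mixing, which can be written as a small lookup. This is only an illustrative sketch: the function names are hypothetical, and which gene is assigned to which channel is not specified here.

```python
# Additive RGB mixing for a three-gene overlay: each gene is drawn in one
# channel, and the displayed colour depends on which channels are "on".
COLOUR_NAMES = {
    (0, 0, 0): "Black",
    (1, 0, 0): "Red",
    (0, 1, 0): "Green",
    (0, 0, 1): "Blue",
    (1, 1, 0): "Yellow",
    (1, 0, 1): "Magenta",
    (0, 1, 1): "Cyan",
    (1, 1, 1): "White",
}


def overlap_colour(red: bool, green: bool, blue: bool) -> str:
    """Name of the colour shown when the given channels are detected."""
    return COLOUR_NAMES[(int(red), int(green), int(blue))]


def genes_detected(colour: str) -> int:
    """Number of channels (genes) that must be on to produce `colour`."""
    key = next(k for k, name in COLOUR_NAMES.items() if name == colour)
    return sum(key)


for colour in ("Yellow", "Magenta", "Cyan", "White"):
    print(colour, "->", genes_detected(colour), "genes co-detected")
```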

      Comment 7. Please also refer to Fly Cell Atlas as such. SCope is a visualization platform that houses multiple datasets.

      Response: A reference to the Fly Cell Atlas has been added.

      Comment 8. Please introduce acronyms and genetic reagents the first time they are mentioned.

      Response: All acronyms and genetic reagents are now defined upon their first use.

      Comment 9. Line 184: please specify "split-GAL4 reagents" instead of "advanced genetic tools".

      Response: We have replaced "advanced genetic tools" with the more specific term "Split-GAL4 reagents."


      Comment 10. Line 187: there are a few other lines with p>0.05 or p>0.01, so "uniquely" is inaccurate. Are the p-values in Table 1 corrected for multiple testing?

      Response: The term "uniquely" has been revised for accuracy. No correction for multiple testing was applied because each entry in Table 1 represents a single pairwise comparison (naive vs. experienced); thus, only one p-value was generated per experiment.

      Comment 11. Some immunofluorescence panels lack scale bars.

      Response: Scale bars have been added to all immunofluorescence panels.

      Comment 12. Fig S2G-I: do authors mean "naive" instead of "group"?

      Response: The term "group" in Fig S2G-I has been corrected to "naive."

      Comment 13. Movie 1 should be referenced when YL neurons are first introduced.

      Response: Movie 1 is now referenced when YL neurons are first introduced in the text.

      Comment 14. Is Fig 4L similar to Fig 6L-N?

      Response: This error was corrected when the manuscript was reformatted.

      Comment 15. Fig 7: please plot olfactory conditioning experiment results as either percentages, preference index, or paired numbers. "Number of flies/tube" is not as informative.

      Response: Thank you for pointing this out. The bars in Fig. 7 indeed represent paired numbers, but we realise this was not stated explicitly. We apologize for the lack of clarity. In the revised manuscript, we explain this in detail in the figure legend and the Methods. In the figure, we have also marked the percentage of flies that chose to avoid the side with the gas stimulus, as explained in the figure legend.




      References

      1. Lagasse F, Devaud J-M, Mery F. A Switch from Cycloheximide-Resistant Consolidated Memory to Cycloheximide-Sensitive Reconsolidation and Extinction in Drosophila. J Neurosci. 2009;29: 2225–2230. doi:10.1523/jneurosci.3789-08.2009
      2. Huang C, Maxey JR, Sinha S, Savall J, Gong Y, Schnitzer MJ. Long-term optical brain imaging in live adult fruit flies. Nat Commun. 2018;9: 872. doi:10.1038/s41467-018-02873-1
      3. Tonoki A, Davis RL. Aging Impairs Protein-Synthesis-Dependent Long-Term Memory in Drosophila. J Neurosci. 2015;35: 1173–1180. doi:10.1523/jneurosci.0978-14.2015
      4. Macartney EL, Zeender V, Meena A, De Nardo AN, Bonduriansky R, Lüpold S. Sperm depletion in relation to developmental nutrition and genotype in Drosophila melanogaster. Evol Int J Org Evol. 2021;75: 2830–2841. doi:10.1111/evo.14373
    1. Ever wondered where to start? You're not alone. Even with the best tools at your fingertips, setting everything up correctly can feel complex. That's why we don't just hand you the keys and wave goodbye… We personally set up your entire Go Plus system. Here's what that means: Professional setup of your core systems ✅ Your products/services added, ready to sell immediately. ✅ Stripe or PayPal connected so you can take payments immediately. ✅ A lead capture form set up to start generating leads. ✅ An email automation set up for your lead capture form, ready for you to add as many engagement emails as you like. Branding ✅ A custom domain set up for your website, your sales funnel and/or your emails. ✅ Logo and colours applied to your business settings. No more handling everything on your own. No more long sleepless nights trying to work through tutorials! Just give us your business details and we take care of the most complex technical tasks. Think about it: how much would you normally pay someone to set all this up? A technical consultant typically charges between $100 and $150 an hour, and setting up these systems can take up to 10 hours. That comes to $1,000 to $1,500 in setup fees you won't have to pay, because we include them in the exclusive Prez'effect offer. We do this because we know that the faster you are up and running, the faster your business will succeed.

      You already have everything in hand… but you don't know where to start?

      Rest assured, you are not alone. Even with the best tools, the technical setup can quickly become a real headache.

      🎯 That is why we don't just hand you access: we personally configure your entire Go Plus system, ready to run.

      Here is what that includes:

      ⚙️ Professional technical setup ✅ Your products and services added, ready for sale ✅ Stripe or PayPal connected to collect payments immediately ✅ Lead capture form configured to start collecting contacts ✅ Associated email automation, ready for your engagement sequences

      🎨 Customization in your image ✅ Custom domain configured for your site, sales funnel and emails ✅ Logo and colors integrated into your brand identity

      No more spending whole nights searching for "how to do it".

      🧠 We handle all the technical steps so you can focus on what matters: your content and your customers.

      💰 By way of comparison, a technical contractor charges between $100 and $150 per hour, and this type of setup often takes 10 hours.

      That comes to roughly $1,000 to $1,500 in setup fees, fully included in the exclusive Prez'effect offer.

      Because the sooner you are up and running, the sooner your business takes off.
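
      The setup-fee figure above is simple arithmetic: the quoted hourly rate (100 to 150 $/h) multiplied by the quoted duration (about 10 hours). A minimal sketch of that calculation follows; the function and constant names are illustrative assumptions, and the numbers are only those quoted in the text.

```python
# Illustrative arithmetic only: the rates and hours are the figures quoted
# in the annotation above; the names below are hypothetical.
HOURLY_RATE_USD = (100, 150)  # low and high contractor rates, in $ per hour
SETUP_HOURS = 10              # quoted duration of a typical setup


def setup_cost_range(rates, hours):
    """Return the (low, high) setup cost in dollars for the given rates."""
    low_rate, high_rate = rates
    return low_rate * hours, high_rate * hours


low, high = setup_cost_range(HOURLY_RATE_USD, SETUP_HOURS)
print(f"Estimated setup cost: ${low:,} to ${high:,}")
```

      Changing either constant shows how sensitive the claimed saving is to the assumed rate and duration.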

    2. Xperiencify's "Growth Plan" costs $99 per month, purchased separately… But until the countdown at the top of the page reaches zero, you can get a full year of the Gro XP through this exclusive Prez'effect offer. This plan includes: ✅ Unlimited courses, students, instructors and communities ✅ 10 published/active courses that you can sell to anyone in the world ✅ Up to 1,000 active students each month ✅ All the premium features of our award-winning platform, including advanced gamification ✅ Custom domain and full automation suite. Most other platforms start at $149/month, which is obviously a significant investment at this stage of your business. If you are serious about launching your course but want to keep your spending down right now, investing in the exclusive Prez'effect offer makes perfect sense. While others fret over paying $99 per month for our "Growth Plan", you will have a full year to build and grow your business worry-free.

      Xperiencify's Growth Plan usually costs $99 per month, paid separately.

      But as part of this exclusive Prez'effect offer, you get a full year free to create, test and sell your courses with complete freedom.

      🎁 This plan includes:

      ✅ Unlimited courses, students, instructors and communities ✅ Up to 10 active courses ready to sell worldwide ✅ Up to 1,000 active learners each month ✅ All the platform's premium features (including advanced gamification 🎮) ✅ Your custom domain plus a full automation suite

      💡 Where most similar platforms start at $149/month, this offer lets you work for a full year without a subscription, giving you time to build your ecosystem on solid foundations.

      👉 A real springboard for those who want to get started seriously, without breaking the bank or limiting themselves.

      You get 12 months to create, refine and roll out your courses, with all the tools you need to go from idea… to success!

    1. Analysis of the "Good Country": Globalization, Cooperation and National Interest

      https://hyp.is/go?url=https%3A%2F%2Findex.goodcountry.org%2F&group=world

      Summary

      This briefing document analyzes the central arguments presented by Simon Anholt on the challenges of globalization and the need for a new approach to global governance.

      The fundamental problem identified is a critical mismatch: while humanity's most urgent problems (climate change, pandemics, economic crises) are globalized, systems of governance remain anchored in selfish national frameworks.

      Three major obstacles to international cooperation are identified: voters' demand for nationalist policies, a form of "cultural psychopathy" that limits empathy toward foreigners, and leaders' false belief that national and international agendas are incompatible.

      The proposed solution rests on a finding from a large-scale analysis of data on how countries are perceived (the Nation Brands Index). This research reveals that the most admired countries are not the richest or the most powerful, but those perceived as "good", that is, those that contribute significantly to the common good of humanity.

      This finding directly links a country's "goodness" to its "self-interest", because a positive reputation attracts investment, tourism and talent, making international collaboration a lever of national competitiveness.

      To give the concept concrete form, Anholt created the "Good Country Index", which measures each nation's contribution to humanity.

      Ireland ranks first, demonstrating that a country can honor its international duties while managing its own economic challenges.

      The final call to action is to bring the word "good" (defined as the opposite of selfish) into public and political discourse, in order to create citizen pressure for governments to adopt more collaborative, outward-looking policies.

      1. The Paradox of Globalization: Global Problems, National Solutions

      Globalization has deeply interconnected the world, creating a system in which local events can have almost instantaneous global repercussions. Striking examples illustrate this reality:

      Health: "Twenty or thirty years ago, if a chicken caught a cold, sneezed and died in a small village in the Far East, it was tragic for the chicken [...] but it was unlikely that we would fear a global pandemic".

      Economic: "if an American bank lent too much money to customers who were not creditworthy and the bank went bankrupt, it was harmful [...] but we did not think it would bring about a collapse of the economic system for nearly ten years."

      This interconnection has brought benefits, such as the success of the Millennium Development Goals, proving that "the human species can achieve extraordinary progress by showing itself to be united and persevering".

      However, globalization has also amplified problems: global warming, terrorism, epidemics, drug trafficking, and many others.

      The central problem is that humanity has not adapted its governance structures to this new reality.

      The world is still organized into roughly 200 nation-states whose governments are programmed to focus almost exclusively on their national interests.

      Key quote: "We need to pull ourselves together and work out how to improve the globalization of solutions so that we avoid becoming a species that is a victim of the globalization of problems."

      2. Obstacles to International Cooperation

      Simon Anholt identifies three main reasons for the slow progress on global issues and the persistence of the nationalist approach.

      2.1 Voter Demand

      The first reason is that citizens themselves demand an inward focus from their governments.

      By electing or tolerating governments, they send a clear message: the priority is prosperity, growth, competitiveness and justice within national borders.

      Politicians, by looking "into a microscope" rather than "into a telescope", are simply responding to this demand.

      2.2 "Cultural Psychopathy"

      The second obstacle is a collective psychological bias that Anholt calls "cultural psychopathy".

      It is a lack of ability to feel genuine empathy for people who are culturally different.

      • Empathy works well with those who "look like us, walk, talk, eat, pray and dress like us".

      • By contrast, others, those who are different, are often perceived as "cardboard cut-outs", two-dimensional figures rather than complex human beings.

      This large-scale lack of empathy prevents genuine global solidarity.

      2.3 The False Dichotomy of Agendas

      The third obstacle is the belief, particularly entrenched among leaders, that national and international agendas are mutually exclusive. Anholt calls this idea "utter nonsense".

      Drawing on his experience as a policy adviser to many governments, he says he has never seen "a single national problem that could not be solved more inventively, more effectively and more quickly than by treating it as an international problem".

      3. Self-Interest as a Lever for Change

      To overcome these obstacles and people's natural resistance to change, it must be shown that more collaborative behavior serves nations' self-interest. This is the heart of Anholt's finding.

      3.1 Research on the Reputation of Nations

      In 2005, Anholt launched the Nation Brands Index, a very large-scale study gathering the global public's perceptions of different countries.

      This database of 200 billion data points revealed a crucial economic fact:

      • Countries depend "enormously on their reputations in order to survive and prosper in the world".

      • A good image (e.g. Germany, Sweden, Switzerland) makes everything easier: tourism, investment, exports.

      • A bad image makes everything "difficult and [...] expensive".

      3.2 The Key Finding: Admiration and "Goodness"

      When this database was queried to understand why some countries are more admired than others, the answer was surprising.

      The main factor is neither wealth, nor power, nor modernity.

      Key quote: "the countries we prefer are the good countries. [...] we admire a country above all because it is good."

      A "good country" is defined as one that "contributes to the world we live in", making it "safer, better, richer or fairer".

      This finding creates a direct and powerful link between altruism and selfishness: to succeed economically (to serve its national interest), a country must "do good" and contribute to humanity.

      "The more you collaborate, the more competitive you become."

      4. The Good Country Index: A New Measure of Success

      To make this idea concrete, Anholt and his team developed the Good Country Index.

      Objective: To measure exactly what each country contributes, not to its own inhabitants, but to the rest of humanity.

      Definition of "Good": The term does not carry a moral connotation ("good" vs "bad"), but is used as the opposite of "selfish".

      A "good" country is one that cares about everyone's interests.

      4.1 Ranking and Lessons

      The index results offer important insights:

      | Rank | Country | Key observations |
      |---|---|---|
      | 1 | Ireland | The country that, per capita or per dollar of GDP, contributes the most to the world. Praised for its ability to uphold its international duties while recovering from a severe recession. |
      | 2 | Finland | Very close to Ireland, with high scores overall. |
      | 13 | Germany | |
      | 21 | United States | |
      | 66 | Mexico | |
      | 95 | Russia | Developing country focused on its internal construction. |
      | 107 | China | Developing country focused on its internal construction. |

      European Dominance: The top 10 is made up mostly of rich Western European countries (with the exception of New Zealand).

      The Importance of Attitude: Kenya's presence in the top 30 is crucial.

      It shows that contributing to the world is not just a matter of money, but of "attitude", "culture" and the political will to look outward.

      The full index data are available at goodcountry.org.

      5. Call to Action: Redefining Political Discourse

      The aim of this project is not only to rank countries, but to radically change public and political dialogue.

      5.1 Changing the Vocabulary of Success

      Anholt expresses his weariness with a vocabulary centered on national selfishness: "I'm tired of hearing about competitiveness. I'm tired of hearing about prosperity, wealth, rapid growth. I'm tired of hearing about happy countries, because that is still selfish."

      He proposes reinjecting the word "good" (in the sense of "unselfish") into the conversation.

      5.2 A Tool for Citizens

      This word must become a "stick with which to beat our politicians".

      Citizens are invited to use this criterion to judge policies and leaders by asking themselves:

      Key question: "Would a good country do that?"

      The ultimate goal is to shift mindsets, so that citizens' main desire is no longer to live in a rich or competitive country, but in a "good country".

      A country one can be proud of internationally, because it is recognized for its positive contribution to the whole world.

    1. However, Costa Rican authorities have certainly made efforts to confront a number of challenges in critical areas, as set forth in the PNCTI. First, to re-orient investment efforts in R&D financed with public resources toward innovation activities that will have real-world applications, the following initiatives are underway (Footnote 3: According to Informe Estado de la Ciencia, la Tecnología y la Innovación, 2014.)

      Opportunity base for Costa Rica if done right

    1. Reviewer #1 (Public review):

      Summary:

      Bansal et al. present a study on the fundamental blood and nectar feeding behaviors of the critical disease vector, Anopheles stephensi. The study not only characterizes the fundamental changes in blood feeding behaviors of this crucially understudied vector, but also uses a transcriptomic approach to identify candidate neuromodulation pathways which influence blood feeding behavior in this mosquito species. The authors then provide evidence through RNAi knockdown of candidate pathways that the neuromodulators sNPF and RYa modulate feeding either via their physiological activity in the brain alone or through joint physiological activity along the brain-gut axis (but critically not the gut alone). Overall, I found this study to be built on tractable, well-designed behavioral experiments.

      Their study begins with a well-structured experiment to assess how the feeding behaviors of A. stephensi change over the course of its life history and in response to its age, mating, and oviposition status. The authors are careful and validate their experimental paradigm in the more well-studied Ae. aegypti, and are able to recapitulate the results of prior studies, which show that mating is a prerequisite for blood feeding behaviors in Ae. aegypti. Here they find A. stephensi, like other Anopheline mosquitoes, has a more nuanced regulation of its blood and nectar feeding behaviors.

      The authors then go on to show in a Y-maze olfactometer that, to some degree, changes in blood feeding status depend on behavioral modulation to host cues, and this is not likely to be a simple change to the biting behaviors alone. I was especially struck by the swap in valence of the host cues for the blood-fed and mated individuals, which had not yet oviposited. This indicates that there is a change in behavior that is not simply desensitization to host cues while navigating in flight, but something much more exciting is happening.

      The authors then use a transcriptomic approach across the blood-feeding stages of the mosquito's life cycle to identify a list of 9 candidate genes that have a role in regulating the host-seeking status of A. stephensi. Then, through investigations of gene knockdown of candidates, they identify the dual action of RYa and sNPF as candidate neuromodulators of host-seeking in this species. Overall, I found the experiments to be well-designed. I found the molecular approach to be sound. While I do not think the molecular approach is necessarily an all-encompassing mechanism identification (owing mostly to the fact that genetic resources are not yet available in A. stephensi as they are in other dipteran models), I think it sets up a rich line of research questions for the neurobiology of mosquito behavioral plasticity and comparative evolution of neuromodulator action.

      Strengths:

      I am especially impressed by the authors' attention to small details in the course of this article. As I read and evaluated this article, I continued to think about how many crucial details could potentially have been missed if this had not been the approach. The attention to detail paid off in spades and allowed the authors to carefully tease apart molecular candidates of blood-seeking stages. The authors' top-down approach to identifying RYamide and sNPF starting from first principles behavioral experiments is especially comprehensive. The results from both the behavioral and molecular target studies will have broad implications for the vectorial capacity of this species and comparative evolution of neural circuit modulation.

      Weaknesses:

      There are a few elements of data visualization and methodological reporting that I found confusing on the first few read-throughs. Figure 1F, for example, was initially confusing as it made it seem as though there were multiple 2-choice assays for each of the conditions. I would recommend removing the "X" marker from the x-axis to indicate the mosquitoes did not feed from either nectar, blood, or neither in order to make it clear that there was one assay in which mosquitoes had access to both food sources, and the data quantify if they took both meals, one meal, or no meals.

      I would also like to know more about how the authors achieved tissue-specific knockdown for RNAi experiments. I think this is an intriguing methodology, but I could not figure out from the methods why injections either had whole-body or abdomen-specific knockdown.

      I also found some interpretations of the transcriptomic data to be overly broad for what transcriptomes can actually tell us about the organism's state. For example, the authors mention, "Interestingly, we found that after a blood meal, glucose is neither spent nor stored, and that the female brain goes into a state of metabolic 'sugar rest', while actively processing proteins (Figure S2B, S3)".

      This would require a physiological measurement to actually know. It certainly suggests that there are changes in carbohydrate metabolism, but there are too many alternative interpretations to make this broad claim from transcriptomic data alone.

    2. Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bansal et al examine and characterize feeding behaviour in Anopheles stephensi mosquitoes. While sharing some similarities to the well-studied Aedes aegypti mosquito, the authors demonstrate that mated females, but not unmated (virgin) females, exhibit suppression in their blood-feeding behaviour. Using brain transcriptomic analysis comparing sugar-fed, blood-fed, and starved mosquitoes, several candidate genes potentially responsible for influencing blood-feeding behaviour were identified, including two neuropeptides (short NPF and RYamide) that are known to modulate feeding behaviour in other mosquito species. Using molecular tools, including in situ hybridization, the authors map the distribution of cells producing these neuropeptides in the nervous system and in the gut. Further, by implementing systemic RNA interference (RNAi), the study suggests that both neuropeptides appear to promote blood-feeding (but do not impact sugar feeding), although the impact was observed only after both neuropeptide genes underwent knockdown.

      Strengths and/or weaknesses:

      Overall, the manuscript was well-written; however, the authors should review carefully, as some sections would benefit from restructuring to improve clarity. Some statements need to be rectified as they are factually inaccurate.

      Below are specific concerns and clarifications needed in the opinion of this reviewer:

      (1) What does "central brains" refer to in the abstract and in other sections of the manuscript (including methods and results)? This term is ambiguous, and the authors should more clearly define which specific components of the central nervous system were used in their study.

      (2) The abstract states that two neuropeptides, sNPF and RYamide, are working together, but no evidence is summarized for the latter in this section.

      (3) Figure 1
      Panel A: This should include mating events in the reproductive cycle to demonstrate differences in the feeding behavior of Ae. aegypti.
      Panel F: In treatments where insects were not provided either blood or sugar, how is it that some females and males had fed? Also, it is unclear why the y-axis label is % fed when the caption indicates this is a choice assay. Also, it is interesting that sugar-starved females did not increase sugar intake. Is there any explanation for this (was it expected)?

      (4) Figure 3
      In the neurotranscriptome analysis of the (central) brain involving the two types of comparisons, can the authors clarify what "excluded in males" refers to? Does this imply that only genes not expressed in males were considered in the analysis? If so, what about co-expressed genes that have a specific function in female feeding behaviour?

      (5) Figure 4
      The authors state that there is more efficient knockdown in the head of unfed females; however, this is not accurate since they only get knockdown in unfed animals, and no evidence of any knockdown in fed animals (panel D). This point should be revised in the results text as well. Relatedly, blood-feeding is decreased when both neuropeptide transcripts are targeted compared to uninjected (panel C) but not compared to dsGFP injected (panel E). Why is this the case if the authors showed earlier in this figure (panel B) that dsGFP does not impact blood feeding? In addition, do the uninjected and dsGFP-injected relative mRNA expression data reflect combined RYa and sNPF levels? Why is there no variation in these data, and how do transcript levels of RYa and sNPF compare in the brain versus the abdomen (the presentation of data doesn't make this relationship clear)?

      (6) As an overall comment, the figure captions are far too long and include redundant text presented in the methods and results sections.

      (7) Criteria used for identifying neuropeptides promoting blood-feeding: the statement that reads "all neuropeptides, since these are known to regulate feeding behaviours" is not accurate, since not all neuropeptides govern feeding behaviors, although a subset certainly do play a role.

      (8) In the section beginning with "Two neuropeptides - sNPF and RYa - showed about 25% and 40% reduced mRNA levels...", the authors state that there was no change in blood-feeding and later state the opposite. The wording should be clarified as it is unclear.

      (9) Just before the conclusions section, the statement that "neuropeptide receptors are often ligand-promiscuous" is unjustified. Indeed, many studies have shown in heterologous systems that high concentrations of structurally related peptides, which are not physiologically relevant, might cross-react and activate a receptor belonging to a different peptide family; however, the natural ligand is often many times more potent (in most cases, orders of magnitude) than structurally related peptides. This is certainly the case for various RYamide and sNPF receptors characterized in various insect species.

      (10) Methods
      In the dsRNA-mediated gene knockdown section, the authors could more clearly describe how much dsRNA was injected per target. At the moment, the reader must carry out calculations based on the concentrations provided and the injected volume range provided later in this section.

      It is also unclear how tissue-specific knockdown was achieved by performing injection on different days/times. The authors need to explain, support, and justify how temporal differences in injection lead to changes in tissue-specific expression. Does the blood-brain barrier limit knockdown in the brain instead, while leaving expression in the peripheral organs susceptible? For example, in Figure 4, the data support that knockdown in the head/brain is only effective in unfed animals compared to uninjected animals, while there is no evidence of knockdown in the brain relative to dsGFP-injected animals. By comparison, the data appear to show stronger abdominal knockdown, mostly for the RYa transcript (>90%) but still significant for the sNPF transcript (>60%).
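
      The calculation point (10) leaves to the reader is mass = concentration × volume. A minimal sketch under stated assumptions: the function name is hypothetical, and the example concentration and volumes are placeholders chosen for illustration, not values from the manuscript.

```python
def dsrna_mass_ng(concentration_ng_per_ul: float, volume_nl: float) -> float:
    """Mass of dsRNA injected, in ng.

    concentration is in ng/µL and volume in nL; dividing by 1000 converts
    nL to µL so that the units cancel to ng.
    """
    return concentration_ng_per_ul * volume_nl / 1000.0


# Hypothetical placeholder values (NOT taken from the manuscript).
EXAMPLE_CONCENTRATION_NG_PER_UL = 2000.0
EXAMPLE_VOLUME_RANGE_NL = (100.0, 200.0)

for volume in EXAMPLE_VOLUME_RANGE_NL:
    mass = dsrna_mass_ng(EXAMPLE_CONCENTRATION_NG_PER_UL, volume)
    print(f"{volume:.0f} nL at {EXAMPLE_CONCENTRATION_NG_PER_UL:.0f} ng/µL -> {mass:.0f} ng")
```

      Reporting the resulting mass per target directly in the methods would spare readers this step.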

    3. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      Bansal et al. present a study on the fundamental blood and nectar feeding behaviors of the critical disease vector, Anopheles stephensi. The study not only characterizes the fundamental changes in blood feeding behaviors of this crucially understudied vector, but also uses a transcriptomic approach to identify candidate neuromodulation pathways which influence blood feeding behavior in this mosquito species. The authors then provide evidence through RNAi knockdown of candidate pathways that the neuromodulators sNPF and RYa modulate feeding either via their physiological activity in the brain alone or through joint physiological activity along the brain-gut axis (but critically not the gut alone). Overall, I found this study to be built on tractable, well-designed behavioral experiments.

      Their study begins with a well-structured experiment to assess how the feeding behaviors of A. stephensi change over the course of its life history and in response to its age, mating, and oviposition status. The authors are careful and validate their experimental paradigm in the more well-studied Ae. aegypti, and are able to recapitulate the results of prior studies, which show that mating is a prerequisite for blood feeding behaviors in Ae. aegypti. Here they find A. stephensi, like other Anopheline mosquitoes, has a more nuanced regulation of its blood and nectar feeding behaviors.

      The authors then go on to show in a Y-maze olfactometer that, to some degree, changes in blood feeding status depend on behavioral modulation to host cues, and this is not likely to be a simple change to the biting behaviors alone. I was especially struck by the swap in valence of the host cues for the blood-fed and mated individuals, which had not yet oviposited. This indicates that there is a change in behavior that is not simply desensitization to host cues while navigating in flight, but something much more exciting is happening.

      The authors then use a transcriptomic approach across the blood-feeding stages of the mosquito's life cycle to identify a list of 9 candidate genes that have a role in regulating the host-seeking status of A. stephensi. Then, through investigations of gene knockdown of candidates, they identify the dual action of RYa and sNPF as candidate neuromodulators of host-seeking in this species. Overall, I found the experiments to be well-designed. I found the molecular approach to be sound. While I do not think the molecular approach is necessarily an all-encompassing mechanism identification (owing mostly to the fact that genetic resources are not yet available in A. stephensi as they are in other dipteran models), I think it sets up a rich line of research questions for the neurobiology of mosquito behavioral plasticity and comparative evolution of neuromodulator action.

      We appreciate the reviewer’s detailed summary of our work. We thank them for their positive comments and agree with them on the shortcomings of our approach.

      Strengths:

      I am especially impressed by the authors' attention to small details in the course of this article. As I read and evaluated this article, I continued to think about how many crucial details could potentially have been missed if this had not been the approach. The attention to detail paid off in spades and allowed the authors to carefully tease apart molecular candidates of blood-seeking stages. The authors' top-down approach to identifying RYamide and sNPF starting from first principles behavioral experiments is especially comprehensive. The results from both the behavioral and molecular target studies will have broad implications for the vectorial capacity of this species and comparative evolution of neural circuit modulation.

      We really appreciate that the reviewer has recognised the attention to detail we have tried to apply; thank you!

      Weaknesses:

      There are a few elements of data visualization and methodological reporting that I found confusing on the first few read-throughs. Figure 1F, for example, was initially confusing as it made it seem as though there were multiple 2-choice assays for each of the conditions. I would recommend removing the "X" marker from the x-axis to indicate the mosquitoes did not feed from either nectar, blood, or neither in order to make it clear that there was one assay in which mosquitoes had access to both food sources, and the data quantify if they took both meals, one meal, or no meals.

      We thank the reviewer for flagging the schematic in Figure 1F. As suggested, we have removed the “X” markers from the x-axis and revised the axis label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose in the assay. For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly show the proportion of mosquitoes fed on each particular choice. We avoid the stacked graph as the sole representation of this data, as it does not capture the variability in the data.

      I would also like to know more about how the authors achieved tissue-specific knockdown for RNAi experiments. I think this is an intriguing methodology, but I could not figure out from the methods why injections either had whole-body or abdomen-specific knockdown.

      The tissue-specific knockdown (abdomen only or abdomen+head) emerged from initial standardisations where we were unable to achieve knockdown in the head unless we used higher concentrations of dsRNA and did the injections in older females. We realised that this gave us the opportunity to isolate the neuronal contribution of these neuropeptides in the phenotype produced. Further optimisations revealed that injecting dsRNA into 0-10h old females produced abdomen-specific knockdowns without affecting head expression, whereas injections into 4 days old females resulted in knockdowns in both tissues. Moreover, head knockdowns in older females required higher dsRNA concentrations, with knockdown efficiency correlating with the amount injected. In contrast, abdominal knockdowns in younger females could be achieved even with lower dsRNA amounts.

      We had mentioned the knockdown conditions (the time of injection and the amount of dsRNA injected) for tissue-specific knockdowns in the methods, but realise now that the description did not explain this well enough. We have now edited it to state our methodology more clearly (see lines 932-948).

      I also found some interpretations of the transcriptomic data to be overly broad for what transcriptomes can actually tell us about the organism's state. For example, the authors mention, "Interestingly, we found that after a blood meal, glucose is neither spent nor stored, and that the female brain goes into a state of metabolic 'sugar rest', while actively processing proteins (Figure S2B, S3)".

      This would require a physiological measurement to actually know. It certainly suggests that there are changes in carbohydrate metabolism, but there are too many alternative interpretations to make this broad claim from transcriptomic data alone.

      We thank the reviewer for pointing this out and agree with them. We have now edited our statement to read:

      “Instead, our data suggests altered carbohydrate metabolism after a blood meal, with the female brain potentially entering a state of metabolic 'sugar rest' while actively processing proteins (Figure S2B, S3). However, physiological measurements of carbohydrate and protein metabolism will be required to confirm whether glucose is indeed neither spent nor stored during this period.” See lines 271-277.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Bansal et al examine and characterize feeding behaviour in Anopheles stephensi mosquitoes. While sharing some similarities to the well-studied Aedes aegypti mosquito, the authors demonstrate that mated females, but not unmated (virgin) females, exhibit suppression in their blood-feeding behaviour. Using brain transcriptomic analysis comparing sugar-fed, blood-fed, and starved mosquitoes, several candidate genes potentially responsible for influencing blood-feeding behaviour were identified, including two neuropeptides (short NPF and RYamide) that are known to modulate feeding behaviour in other mosquito species. Using molecular tools, including in situ hybridization, the authors map the distribution of cells producing these neuropeptides in the nervous system and in the gut. Further, by implementing systemic RNA interference (RNAi), the study suggests that both neuropeptides appear to promote blood-feeding (but do not impact sugar feeding), although the impact was observed only after both neuropeptide genes underwent knockdown.

      Strengths and/or weaknesses:

      Overall, the manuscript was well-written; however, the authors should review carefully, as some sections would benefit from restructuring to improve clarity. Some statements need to be rectified as they are factually inaccurate.

      Below are specific concerns and clarifications needed in the opinion of this reviewer:

      (1) What does "central brains" refer to in abstract and in other sections of the manuscript (including methods and results)? This term is ambiguous, and the authors should more clearly define what specific components of the central nervous system was/were used in their study.

      Central brain, or mid brain, is a commonly used term to refer to brain structures/neuropils without the optic lobes (For example: https://www.nature.com/articles/s41586-024-07686-5). In this study we have focused our analysis on the central brain circuits involved in modulating blood-feeding behaviour and have therefore excluded the optic lobes. As optic lobes account for nearly half of all the neurons in the mosquito brain (https://pmc.ncbi.nlm.nih.gov/articles/PMC8121336/), including them would have disproportionately skewed our transcriptomic data toward visual processing pathways.

      We have indicated this in figure 3A and in the methods (see lines 800-801, 812). We have now also clarified it in the results section for neuro-transcriptomics to avoid confusion (see lines 236-237).

      (2) The abstract states that two neuropeptides, sNPF and RYamide are working together, but no evidence is summarized for the latter in this section.

      We thank the reviewer for pointing this out. We have now added the statement “This occurs in the context of the action of RYa in the brain” to the end of the abstract, for a complete summary of our proposed model.

      (3) Figure 1

      Panel A: This should include mating events in the reproductive cycle to demonstrate differences in the feeding behavior of Ae. aegypti.

      Our data suggest that mating can occur at any time between eclosion and oviposition in An. stephensi and between eclosion and blood feeding in Ae. aegypti. Adding these into (already busy) 1A, would cloud the purpose of the schematic, which is to indicate the time points used in the behavioural assays and transcriptomics.

      Panel F: In treatments where insects were not provided either blood or sugar, how is it that some females and males had fed? Also, it is unclear why the y-axis label is % fed when the caption indicates this is a choice assay. Also, it is interesting that sugar-starved females did not increase sugar intake. Is there any explanation for this (was it expected)?

      We apologise for the confusion. The experiment is indeed a choice assay in which sugar-starved or sugar-sated females, co-housed with males, were provided simultaneous access to both blood and sugar, and were assessed for the choice made (indicated on the x-axis): both blood and sugar, blood only, sugar only, or neither. The x-axis indicates the choice made by the mosquitoes, not the choice provided in the assay, and the y-axis indicates the percentage of males or females that made each particular choice. We have now removed the “X” markers from the x-axis and revised the axis label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose to take.

      In this assay, we scored females only for the presence or absence of each meal type (blood or sugar) and are therefore unable to comment on whether sugar-starved females consumed more sugar than sugar-sated females. However, when sugar-starved, a higher proportion of females consumed both blood and sugar, while fewer fed on blood alone.

      For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly shows the proportion of mosquitoes fed on each particular choice. We avoid the stacked graph as the sole representation of this data as it does not capture the variability in the data.
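      The scoring scheme described in this response (presence or absence of each meal type, reported as the percentage of animals making each choice) can be illustrated with a short sketch. This is an illustration only, not the authors' analysis code; the function names and the example records are hypothetical.

```python
from collections import Counter

# The four mutually exclusive outcomes of the two-choice assay.
CHOICES = ("both", "blood only", "sugar only", "neither")


def classify(took_blood: bool, took_sugar: bool) -> str:
    """Map the presence/absence of each meal type to one assay outcome."""
    if took_blood and took_sugar:
        return "both"
    if took_blood:
        return "blood only"
    if took_sugar:
        return "sugar only"
    return "neither"


def percent_by_choice(records):
    """Return the percentage of animals that made each choice.

    `records` is a list of (took_blood, took_sugar) tuples, one per animal.
    """
    if not records:
        raise ValueError("at least one scored animal is required")
    counts = Counter(classify(blood, sugar) for blood, sugar in records)
    return {choice: 100.0 * counts[choice] / len(records) for choice in CHOICES}


# Hypothetical replicate of four females (not real data).
example = [(True, True), (True, False), (True, False), (False, False)]
result = percent_by_choice(example)
```

      Because every animal falls into exactly one category, the four percentages always sum to 100. This is why the same data can be drawn either as separate bars per choice or as a single stacked bar per condition.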

      (4) Figure 3

      In the neurotranscriptome analysis of the (central) brain involving the two types of comparisons, can the authors clarify what "excluded in males" refers to? Does this imply that only genes not expressed in males were considered in the analysis? If so, what about co-expressed genes that have a specific function in female feeding behaviour?

      This is indeed correct. We reasoned that since blood feeding is exclusive to females, we should focus our analysis on genes that were specifically upregulated in them. As the reviewer points out, it is very likely that genes commonly upregulated in males and females may also promote blood feeding and we will miss out on any such candidates based on our selection criteria.
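      The selection criterion described in this response amounts to a set difference: genes upregulated in females minus genes also upregulated in males. The sketch below is an illustration only and not the authors' pipeline; the gene identifiers are hypothetical placeholders.

```python
def female_specific(up_in_females, up_in_males):
    """Keep genes upregulated in females but not in males."""
    return set(up_in_females) - set(up_in_males)


def excluded_shared(up_in_females, up_in_males):
    """Genes upregulated in both sexes, which this filter discards."""
    return set(up_in_females) & set(up_in_males)


# Hypothetical gene identifiers (placeholders, not real data).
females = {"geneA", "geneB", "geneC"}
males = {"geneB", "geneD"}

kept = female_specific(females, males)
dropped = excluded_shared(females, males)
```

      The `dropped` set makes the trade-off the reviewer raises explicit: any gene upregulated in both sexes is removed, even if it also contributes to blood feeding in females.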

      (5) Figure 4

      The authors state that there is more efficient knockdown in the head of unfed females; however, this is not accurate since they only get knockdown in unfed animals, and no evidence of any knockdown in fed animals (panel D). This point should be revised in the results text as well.

      Perhaps we do not understand the reviewer’s point or there has been a misunderstanding. In figure 4D, we show that while there is more robust gene knockdown in unfed females, blood-fed females also showed modest but measurable knockdowns ranging from 5-40% for RYamide and 2-21% for sNPF.

      Relatedly, blood-feeding is decreased when both neuropeptide transcripts are targeted compared to uninjected (panel C) but not compared to dsGFP injected (panel E). Why is this the case if authors showed earlier in this figure (panel B) that dsGFP does not impact blood feeding?

      We realise this concern stems from our representation of the data. Since we had earlier determined that dsGFP-injected females fed similarly to uninjected females (fig 4B), we used these controls interchangeably in subsequent experiments. To avoid confusion, we have now only used the label ‘control’ in figure 4 (and supplementary figure S9) and specified which control was used for each experiment in the legend.

      In addition to this, we wanted to clarify that fig 4C and 4E are independent experiments. 4C is the behaviour corresponding to when the neuropeptides were knocked down in both heads and abdomens.

      4E is the behaviour corresponding to when the neuropeptides were knocked down in only the abdomens. We have now added a schematic in the plots to make this clearer.

      In addition, do the uninjected and dsGFP-injected relative mRNA expression data reflect combined RYa and sNPF levels? Why is there no variation in these data,…

      In these qPCRs, we calculated relative mRNA expression using the delta-delta Ct method (see line 975). For each neuropeptide its respective control was used. For simplicity, we combined the RYa and sNPF control data into a single representation. The value of this control is invariant because this method sets the control baseline to a value of 1.
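      The reason the control value is fixed at 1 follows directly from the arithmetic of the delta-delta Ct method. The sketch below is a minimal illustration, not the authors' analysis code; the Ct values and the choice of reference gene are hypothetical, and the calculation assumes roughly 100% amplification efficiency (a doubling per cycle).

```python
def relative_expression(target_ct, reference_ct,
                        control_target_ct, control_reference_ct):
    """Fold change of a target transcript by the 2^-(ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(control)
    """
    delta_ct_sample = target_ct - reference_ct
    delta_ct_control = control_target_ct - control_reference_ct
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


# Hypothetical Ct values (not real data).
control = {"target": 24.0, "reference": 18.0}
knockdown = {"target": 25.0, "reference": 18.0}

# The control compared with itself: ddCt = 0, so the result is always 1.
control_level = relative_expression(control["target"], control["reference"],
                                    control["target"], control["reference"])

# One extra cycle for the target in the knockdown sample halves the estimate.
knockdown_level = relative_expression(knockdown["target"], knockdown["reference"],
                                      control["target"], control["reference"])
```

      In this example the knockdown sample needs one more cycle to reach threshold, so its relative expression is half that of the control. The control baseline is 1 by construction, whichever neuropeptide is being measured.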

      …and how do transcript levels of RYa and sNPF compare in the brain versus the abdomen (the presentation of data doesn't make this relationship clear).

      The reviewer is correct in pointing out that we have not clarified this relationship in our current presentation. While we have not performed absolute mRNA quantifications, we extracted relative mRNA levels from qPCR data of 96h old unmanipulated control females. We observed that both sNPF and RYa transcripts are expressed at much lower levels in the abdomens, as compared to those in the heads, as shown in the graphs inserted below.

      Author response image 1.

      (6) As an overall comment, the figure captions are far too long and include redundant text presented in the methods and results sections.

      We thank the reviewer for flagging this and have now edited the legends to remove redundancy.

      (7) Criteria used for identifying neuropeptides promoting blood-feeding: statement that reads "all neuropeptides, since these are known to regulate feeding behaviours". This is not accurate since not all neuropeptides govern feeding behaviors, while certainly a subset do play a role.

      We agree with the reviewer that not all neuropeptides regulate feeding behaviours. Our statement refers to the screening approach we used: in our shortlist of candidates, we chose to validate all neuropeptides.

      (8) In the section beginning with "Two neuropeptides - sNPF and RYa - showed about 25% and 40% reduced mRNA levels...", the authors state that there was no change in blood-feeding and later state the opposite. The wording should be clarified as it is unclear.

      Thank you for pointing this out. We were referring to an unchanged proportion of the blood fed females. We have now edited the text to the following:

      “Two neuropeptides - sNPF and RYa - showed about 25% and 40% reduced mRNA levels in the heads but the proportion of females that took blood meals remained unchanged”. See lines 338-340.

      (9) Just before the conclusions section, the statement that "neuropeptide receptors are often ligand promiscuous" is unjustified. Indeed, many studies have shown in heterologous systems that high concentrations of structurally related peptides, which are not physiologically relevant, might cross-react and activate a receptor belonging to a different peptide family; however, the natural ligand is often many times more potent (in most cases, orders of magnitude) than structurally related peptides. This is certainly the case for various RYamide and sNPF receptors characterized in various insect species.

      We agree with the reviewer and apologise for the mistake. We have now removed the statement.

      (10) Methods

      In the dsRNA-mediated gene knockdown section, the authors could more clearly describe how much dsRNA was injected per target. At the moment, the reader must carry out calculations based on the concentrations provided and the injected volume range provided later in this section.

      We have now edited the section to reflect the amount of dsRNA injected per target. Please see lines 921-931.

      It is also unclear how tissue-specific knockdown was achieved by performing injection on different days/times. The authors need to explain/support, and justify how temporal differences in injection lead to changes in tissue-specific expression. Does the blood-brain barrier limit knockdown in the brain instead, while leaving expression in the peripheral organs susceptible?

      To achieve tissue-specific knockdowns of sNPF and RYa, we optimised both the time of injection as well as the dsRNA concentration to be injected. Injecting dsRNA into 0-10h females produced abdomen specific knockdowns without affecting head expression, whereas injections into 96h old females resulted in knockdowns in both tissues. Head knockdowns in older females required higher dsRNA concentrations, with knockdown efficiency correlating with the amount injected. In contrast, abdominal knockdowns in younger females could be achieved even with lower dsRNA amounts, reflecting the lower baseline expression of sNPF in abdomens compared to heads and the age-dependent increase in head expression (as confirmed by qPCR). It is possible that the blood-brain barrier also limits the dsRNA entering the brain, thereby requiring higher amounts to be injected for head knockdowns.

      We have now edited this section to state our methodology more clearly (see lines 932-948).

      For example, in Figure 4, the data support that knockdown in the head/brain is only effective in unfed animals compared to uninjected animals, while there is no evidence of knockdown in the brain relative to dsGFP-injected animals. Comparatively, evidence appears to show stronger evidence of abdominal knockdown mostly for the RYa transcript (>90%) while still significantly for the sNPF transcript (>60%).

      As we explained earlier, this concern likely stems from our representation of the data. Since we had earlier determined that dsGFP-injected females fed similarly to uninjected females (fig 4B), we used these controls interchangeably in subsequent experiments. To avoid confusion, we have now only used the label ‘control’ in figure 4 (and supplementary figure S9) and specified which control was used for each experiment in the legend.

      In addition to this, we wanted to clarify that fig 4C and 4E are independent experiments. 4C is the behaviour corresponding to when the neuropeptides were knocked down in both heads and abdomens. 4E is the behaviour corresponding to when the neuropeptides were knocked down in only the abdomen. We have now added a schematic in the plots to make this clearer.

      Reviewer #3 (Public review):

      Summary:

      This manuscript investigates the regulation of host-seeking behavior in Anopheles stephensi females across different life stages and mating states. Through transcriptomic profiling, the authors identify differential gene expression between "blood-hungry" and "blood-sated" states. Two neuropeptides, sNPF and RYamide, are highlighted as potential mediators of host-seeking behavior. RNAi knockdown of these peptides alters host-seeking activity, and their expression is anatomically mapped in the mosquito brain (sNPF and RYamide) and midgut (sNPF only).

      Strengths:

      (1) The study addresses an important question in mosquito biology, with relevance to vector control and disease transmission.

      (2) Transcriptomic profiling is used to uncover gene expression changes linked to behavioral states.

      (3) The identification of sNPF and RYamide as candidate regulators provides a clear focus for downstream mechanistic work.

      (4) RNAi experiments demonstrate that these neuropeptides are necessary for normal host-seeking behavior.

      (5) Anatomical localization of neuropeptide expression adds depth to the functional findings.

      Weaknesses:

      (1) The title implies that the neuropeptides promote host-seeking, but sufficiency is not demonstrated (for example, with peptide injection or overexpression experiments).

      Demonstrating sufficiency would require injecting sNPF peptide or its agonist. To date, no small-molecule agonists (or antagonists) that selectively mimic sNPF or RYa neuropeptides have been identified in insects. An NPY analogue, TM30335, has been reported to activate the Aedes aegypti NPY-like receptor 7 (NPYLR7; Duvall et al., 2019), which is also activated by sNPF peptides at higher doses (Liesch et al., 2013). Unfortunately, the compound is no longer available because its manufacturer, 7TM Pharma, has ceased operations. Synthesising the peptides is a possibility that we will explore in the future.

      (2) The proposed model regarding central versus peripheral (gut) peptide action is inconsistently presented and lacks strong experimental support.

      The best way to address this would be to conduct tissue-specific manipulations, the tools for which are not available in this species. Our approach to achieve head+abdomen and abdomen only knockdown was the closest we could get to achieving tissue specificity and allowed us to confirm that knockdown in the head was necessary for the phenotype. However, as the reviewer points out, this did not allow us to rule out any involvement of the abdomen. This point has been addressed in lines 364-371.

      (3) Some conclusions appear premature based on the current data and would benefit from additional functional validation.

      The most definitive way of demonstrating necessity of sNPF and RYa in blood feeding would be to generate mutant lines. While we are pursuing this line of experiments, they lie beyond the scope of a revision. In the absence of mutants, we relied on the knockdown of the genes using dsRNA. We would like to posit that despite only partial knockdown, mosquitoes do display defects in blood-feeding behaviour, while sugar feeding is unaffected. We think this reflects the importance of sNPF in promoting blood feeding.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Overall, I found this manuscript to be well-prepared, visually the figures are great and clearly were carefully thought out and curated, and the research is impactful. It was a wonderful read from start to finish. I have the following recommendations:

      Thank you very much, we are very pleased to hear that you enjoyed reading our manuscript!

      (1) For future manuscripts, it would make things significantly easier on the reviewer side to submit a format that uses line numbers.

      We sincerely apologise for the oversight. We have now incorporated line numbers in the revised manuscript.

      (2) There are a few statements in the text that I think may need clarification or might be outside the bounds of what was actually studied here. For example, in the introduction "However, mating is dispensable in Anophelines even under conditions of nutritional satiety". I am uncertain what is meant by this statement - please clarify.

      We apologise for the lack of clarity in the statement and have now deleted it since we felt it was not necessary.

      (3) Typo/Grammatical minutiae:

      a) A small idiosyncrasy of using hyphens in compound words should also be fixed throughout. Typically, you don't hyphenate if the words are being used as a noun, as in the case: e.g. "Age affects blood feeding.". However, you would hyphenate if the two words are used as a compound adjective "Age affects blood-feeding behavior". This may not be an all-inclusive list, but here are some examples where hyphens need to either be removed or added. Some examples:

      "Nutritional state also influences other internal state outputs on blood-feeding": blood-feeding -> blood feeding

      "... the modulation of blood-feeding": blood-feeding -> blood feeding

      "For example, whether virgin females take blood-meals...": blood-meals -> blood meals

      ".... how internal and external cues shape meal-choice"-> meal choice

      "blood-meal" is often used throughout the text, but is correctly "blood meal" in the figures.

      There are many more examples throughout.

      We apologise for these errors and appreciate the reviewer’s keen eye. We have now fixed them throughout the manuscript.

      b) Figure 1 Caption has a typo: "co-housed males were accessed for sugar-feeding" should be "co-housed males were assessed for sugar feeding"

      We apologise for the typo and thank the reviewer for spotting it. We have now corrected this.

      c) It would be helpful in some other figure captions to more clearly label which statement is relevant to which part of the text. For example, in Figure 4's caption.

      "C,D. Blood-feeding and sugar-feeding behaviour of females when both RYa and sNPF are knocked down in the head (C). Relative mRNA expressions of RYa and sNPF in the heads of dsRYa+dssNPF - injected blood-fed and unfed females, as compared to that in uninjected females, analysed via qPCR (D)."

      I found re-referencing C and D at the end of their statements makes it look as though C precedes the "Relative mRNA expression" and on a first read through, I thought the figure captions were backwards. I'd recommend reformatting here and throughout consistently to only have the figure letter precede its relevant caption information, e.g.:

      "C. Blood-feeding and sugar-feeding behaviour of females when both RYa and sNPF are knocked down in the head. D. Relative mRNA expressions of RYa and sNPF in the heads of dsRYa+dssNPF - injected blood-fed and unfed females, as compared to that in uninjected females, analysed via qPCR."

      We have now edited the legends as suggested.

      Reviewer #2 (Recommendations for the authors):

      Separately from the clarifications and limitations listed above, the authors could strengthen their study and the conclusions drawn if they could rescue the behavioural phenotype observed following knockdown of sNPF and RYamide. This could be achieved by injection of either sNPF or RYa peptide independently or combined following knockdown to validate the role of these peptides in promoting blood-feeding in An. stephensi. Additionally, the apparent (but unclear) regionalized (or tissue-specific) knockdown of sNPF and RYamide transcripts could be visualized and verified by implementing HCR in situ hyb in knockdown animals (or immunohistochemistry using antibodies specific for these two neuropeptides).

      In a follow up of this work, we are generating mutants and peptides for these candidates and are planning to conduct exactly the experiments the reviewer suggests.

      Reviewer #3 (Recommendations for the authors):

      The loss-of-function data suggest necessity but not sufficiency. Synthetic peptide injection in non-host seeking (blood-fed mated or juvenile) mosquitoes would provide direct evidence for peptide-induced behavioral activation. The lack of these experiments weakens the central claim of the paper that these neuropeptides directly promote blood feeding.

      As noted above, we plan to synthesise the peptide to test rescue in a mutant background and sufficiency.

      Some of the claims about knockdown efficiency and interpretation are conflicting; the authors dismiss Hairy and Prp as candidates due to 30-35% knockdown, yet base major conclusions on sNPF and RYamide knockdowns with comparable efficiencies (25-40%). This inconsistency should be addressed, or the justification for different thresholds should be clearly stated.

      We have not defined any specific knockdown efficacy thresholds in the manuscript, as these can vary considerably between genes, and in some cases, even modest reductions can be sufficient to produce detectable phenotypes. For example, knockdown efficiencies of even as low as about 25% - 40% gave us observable phenotypes for sNPF and RYa RNAi (Figure S9B-G).

      No such phenotypes were observed for Hairy (30%) or Prp (35%) knockdowns. Either these genes are not involved in blood feeding, or the knockdown was not sufficient for these specific genes to induce phenotypes. We cannot distinguish between these scenarios.

      The observation that knockdown animals take smaller blood meals is interesting and could reflect a downstream effect of altered host-seeking or an independent physiological change. The relationship between meal size and host-seeking behavior should be clarified.

      We agree with the reviewer that the reduced meal size observed in sNPF and RYa knockdown animals could result either from their inability to seek a host or from an independent effect on blood meal intake. Unfortunately, we did not measure host-seeking in these animals. We plan to distinguish between these possibilities using mutants in future work.

      Several figures are difficult to interpret due to cluttered labeling and poorly distinguishable color schemes. Simplifying these and improving contrast (especially for co-housed vs. virgin conditions) would enhance readability.

      We regret that the reviewer found the figures difficult to follow. We have now revised our annotations throughout the manuscript for enhanced readability. For example, “D1<sup>B</sup>” is now “D1<sup>PBM</sup>” (post-bloodmeal) and “D1<sup>O</sup>” is now “D1<sup>PO</sup>” (post-oviposition). Wherever mated females were used, we have now appended “(m)” to the annotations and consistently depicted these females with striped abdomens in all the schematics. We believe these changes will improve clarity and readability.

      The manuscript does not clearly justify the use of whole-brain RNA sequencing to identify peptides involved in metabolic or peripheral processes. Given that anticipatory feeding signals are often peripheral, the logic for brain transcriptomics should be explained.

      The reviewer is correct in pointing out that feeding signals could also emerge from peripheral tissues. Signals from these tissues, in response to both changing nutritional and reproductive states, are then integrated by the central brain to modulate feeding choices. For example, in Drosophila, increased protein intake is mediated by central brain circuitry including those in the SEZ and central complex (Munch et al., 2022; Liu et al., 2017; Goldschmidt et al., 2023). In the context of mating, male-derived sex peptide further increases protein feeding by acting on a dedicated central brain circuitry (Walker et al., 2015). We therefore focused on the central brain for our studies.

      The proposed model suggests brain-derived peptides initiate feeding, while gut peptides provide feedback. However, gut-specific knockdowns had no effect, undermining this hypothesis. Conversely, the authors also suggest abdominal involvement based on RNAi results. These contradictions need to be resolved into a consistent model.

      We thank the reviewer for raising this point and recognise their concern. Our reasons for invoking an involvement of the gut were two-fold:

      (1) We find increased sNPF transcript expression in the entero-endocrine cells of the midgut in blood-hungry females, which returns to baseline after a blood-meal (Fig. 4L, M).

      (2) While the abdomen-only knockdowns did not affect blood feeding, every effective head knockdown that affected blood feeding also abolished abdominal transcript levels (Fig. S9C, F). (Achieving a head-only reduction proved impossible because (i) systemic dsRNA delivery inevitably reaches the abdomen and (ii) abdominal expression of both peptides is low, leaving little dynamic range for selective manipulation.) Consequently, we can only conclude the following: 1) that brain expression is required for the behaviour, 2) that we cannot exclude a contributory role for gut-derived sNPF. We have discussed this in lines 364-371.

      The identification of candidate receptors is promising, but the manuscript would be significantly strengthened by testing whether receptor knockdowns phenocopy peptide knockdowns. Without this, it is difficult to conclude that the identified receptors mediate the behavioral effects.

      We agree that functional validation of the receptors would strengthen the evidence for sNPF- and RYa-mediated control of blood feeding in An. stephensi. We selected these receptors based on sequence homology. A possibility remains that sNPF neuropeptides activate more than one receptor, each modulating a distinct circuit, as shown in the case of Drosophila Tachykinin (https://pmc.ncbi.nlm.nih.gov/articles/PMC10184743/). This will mean a systematic characterisation and knockdown of each of them to confirm their role. We are planning these experiments in the future.

      The authors compared the percentage changes in sugar-fed and blood-fed animals under sugar-sated or sugar-starved conditions. Figure 1F should reflect what was discussed in the results.

      Perhaps this concern stems from our representation of the data in figure 1F? We have now edited the x-axis and revised its label from “choice of food” to “choice made” to better reflect what food the mosquitoes chose to take.

      For clarity, we have now also plotted the same data as stacked graphs at the bottom of Fig. 1F, which clearly shows the proportion of mosquitoes fed on each particular choice. We avoid the stacked graph as the sole representation of this data because it does not capture the variability in the data.

      Minor issues:

      (1) The authors used mosquitoes with belly stripes to indicate mated females. To be consistent, the post-oviposition females should also have belly stripes.

      We thank the reviewer for pointing this out. We have now edited all the figures as suggested.

      (2) In the first paragraph on the right column of the second page, the authors state, "Since females took blood-meals regardless of their prior sugar-feeding status and only sugar-feeding was selectively suppressed by prior sugar access." Just because the well-fed animals ate less than the starved animals does not mean their feeding behavior was suppressed.

      Perhaps there has been a misunderstanding in the experimental setup of figure 1F, probably stemming from our data representation. The experiment is a choice assay in which sugar-starved or sugar-sated females, co-housed with males, were provided simultaneous access to both blood and sugar, and were assessed for the choice made (indicated on the x-axis): both blood and sugar, blood only, sugar only, or neither. We scored females only for the presence or absence of each meal type (blood or sugar) and did not quantify the amount consumed.

      (3) The figure legend for Figure 1A and the naming convention for different experimental groups are difficult to follow. A simplified or consistently abbreviated scheme would help readers navigate the figures and text.

      We regret that the reviewer found the figure difficult to follow. We have now revised our annotations throughout the manuscript for enhanced readability. For example, “D1<sup>B</sup>” is now “D1<sup>PBM</sup>” (post-bloodmeal) and “D1<sup>O</sup>” is now “D1<sup>PO</sup>” (post-oviposition).

      (4) In the last paragraph of the Y-maze olfactory assay for host-seeking behaviour in An. stephensi in Methods, the authors state, "When testing blood-fed females, aged-matched sugar-fed females (blood-hungry) were included as positive controls where ever possible, with satisfactory results." The authors should explicitly describe what the criteria are for "satisfactory results".

      We apologise for the lack of clarity. We have now edited the statement to read:

      “When testing blood-fed females, age-matched sugar-fed females (blood-hungry) were included wherever possible as positive controls. These females consistently showed attraction to host cues, as expected.” See lines 786-790.

      (5) In the first paragraph of the dsRNA-mediated gene knockdown section in Methods, dsRNA against GFP is used as a negative control for the injection itself, but not for the potential off-target effect.

      We agree with the reviewer that dsGFP injections act as controls only for injection-related behavioural changes, and not for off-target effects of RNAi. We have now corrected the statement. See lines 919-920.

      To control for off-target effects, we could have designed multiple dsRNAs targeting different parts of a given gene. We regret not including these controls for potential off-target effects of dsRNAs injected.

      (6) References numbers 48, 89, and 90 are not complete citations.

      We thank the reviewer for spotting these. We have now corrected these citations.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Referee #1

      Evidence, reproducibility and clarity

      Miles et al. used a combination of AlphaFold modeling, biochemical assays of mutant constructs and NMR spectroscopy to model the ternary complex of Aurora A, Bora and Plk1, and elucidate how Bora can act as a molecular bridge that facilitates the phosphorylation of the activation loop Thr210 within Plk1 by Aurora A. Their studies identified an interaction between residues 52-73 within Bora and the 'FW' pocket on the N-terminal lobe of Plk1, which binds Phe56 and Trp58 of Bora. Additionally, Ser59 of Bora was identified as a good Aurora A substrate using a Bora peptide array, and pSer59 was predicted to form bridging interactions with Aurora Arg205 and Plk1 Arg59. This was supported by NMR and biochemical assays. In addition, the authors validate that phosphorylation of Ser-112 on Bora enhances stabilization of the Aurora A-Bora complex. Overall, the model revealed novel details of the interactions within the Aurora A-Bora-Plk1 ternary complex that are supported by the biochemical and NMR data. The work will be of significant interest to basic scientists whose work involves protein kinase signaling, cell division/mitosis, signal transduction, and cancer biology. We recommend publication of this manuscript with the following minor changes and additions.

      1. In the introduction, on page 2, the authors seem a little confused about the Plk1 Polo-box domain - text as written: "...kinase domain linked to tandem Polo-box domains (PBD)", and cite a review paper. Actually, there is only a single Polo-box domain in these kinases, which contains both Polo-boxes and a bit of the upstream linker region. The "PBD" terminology denotes this 2-Polo-box + linker structure. Perhaps it would be better to cite the PBD structure (Elia et al., Cell, 2002) as a primary citation here.
      2. Similarly, the line "...during the G2/M transition following successful DNA damage repair" cites the Seki et al paper, but those findings are shown in the Macurek et al paper, not the Seki et al paper.
      3. Using the model of the ternary complex as shown in Figure 1B, deletion constructs of Bora missing regions within the disordered loops, but still retaining the residues that bind the PBD, FW pocket and Aurora A, can be modeled and tested to see if such deletions can improve the ipTM scores and binding affinity.
      4. On page 5, "S112A" within the sentence "Unexpectedly, the F56A/W58A Bora was less efficiently phosphorylated on S112A (Supplementary Figure S11, F compared to H and Supplementary Table S4)." This should be "S112".
      5. In the assays shown in Figure 2D, the presence of excess F56AW58A Bora that remained unphosphorylated on S112 may complicate the interpretation of the results. Can the authors show that the S112-phosphorylated F56AW58A Bora is predominantly bound to Aurora A in such a mixture, perhaps by NMR using labelled pS112 F56AW58A Bora and unlabeled S112 F56AW58A Bora?
      6. Please expand Figure 3A to better show the FW pocket-forming residues on Plk1.
      7. It would be helpful to label the peaks in the mass spectra in Fig. S11 with the phospho-species that they correspond to.
      8. In the last paragraph on page 7, "see we" in the sentence "As well as a decrease in intensity around pSer112 in Bora, see we an overall effect with decreased intensity across most of the Bora sequence." Should be corrected to "we see".
      9. While not required, it would be helpful if binding of Bora to Aurora A after Erk2 phosphorylation could be shown using fluorescence polarization or ITC to lend additional support to the NMR data for S112 and S59 phosphorylation and for CEP192 and TPX2 competition.
      10. The Aurora A phosphorylation motif has been further defined beyond that reported by the Pinna lab in 2005. Notably, the Ser-59 sequence on Bora (F-R-W-S-I) has, in addition to dominant selection for R in the -2 position, both favorable -1 (W) and +1 (I) positions based on peptide library measurements (Alexander et al., Science Signaling 2011), further arguing that it may be an excellent Aurora A phosphorylation site.
      11. Have the authors tried to model the Drosophila melanogaster Aurora A-Bora-Polo complex to see if the Asn substitution of Bora Ser59, and the expected loss of the interactions between Bora pSer59 and Plk1 Arg59 and Aurora A Arg205 are compensated by other features?
      12. Given the relevance of the recent publication from Zhu et al. in https://doi.org/10.1038/s41467-025-63352-y to this study, the authors may want to comment on, or test, the relative importance of PKA and Aurora A as a potential kinase for Bora S59. While those authors argue that PKA phosphorylates Bora on Ser-59, one could easily imagine a model in which either PKA or Aurora A could initially phosphorylate that site followed by a propagation step after initial Aurora A activation, in which Aurora A phosphorylation of Bora Ser-59 is the dominant process.

      -Dan Lim and Michael Yaffe

      Significance

      The work is well done and clearly presented.

  2. milenio-nudos.github.io milenio-nudos.github.io
    1. (ulfert-blank_assessing_2022?) suggests to work with a unified construct denominated Digital Self-efficacy (hereinafter DSE) to reach a high-level research on this issue. Considering the gaps and inconsistencies in previous measurements, (ulfert-blank_assessing_2022?) points out that DSE construct have to

      I think we need to draw a clearer distinction here so that this is understood (to begin with, we are already using the abbreviation DSE before this part). Because we refer to the earlier scales as measurements of DSE, and up to now those scales did not, strictly speaking, measure DSE, I think this leads to confusion. I would say "technology-related self-efficacy" up to this point in the paper; that would make things somewhat clearer, since the earlier scales do not measure the same thing.

    1. It is necessary to understand that the text itself does not try to offer a solution to a problem; rather, as the text itself proposes, it is more important to differentiate among the proposals and solutions: which of them is the most apt and meets the least rejection?

    2. It is important to stress that as one learns more about a topic and gathers more information, it becomes harder to keep to a single line of editing, writing, and order.

    3. The author constantly mentions the importance of consciously questioning and analyzing the information we find on the topic, and emphasizes how repetitive or cyclical this can be. However, I would like the author to look more closely at the process of formulating the questions that will help the researcher focus on the topic.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews

      Reviewer #1 (Public review):

      Summary:

      The authors performed genome assemblies for two Fagaceae species and collected transcriptome data from four natural tree species every month over two years. They identified seasonal gene expression patterns and further analyzed species-specific differences.

      Strengths:

      The study of gene expression patterns in natural environments, as opposed to controlled chambers, is gaining increasing attention. The authors collected RNA-seq data monthly for two years from four tree species and analyzed seasonal expression patterns. The data are novel. The authors could revise the manuscript to emphasize seasonal expression patterns in three species (with one additional species having more limited data). Furthermore, the chromosome-scale genome assemblies for the two Fagaceae species represent valuable resources, although the authors did not cite existing assemblies from closely related species.

      Thank you for your careful assessment of our manuscript.

      Weaknesses:

      Comment: The study design has a fundamental flaw regarding the evaluation of genetic or evolutionary effects. As a basic principle in biology, phenotypes, including gene expression levels, are influenced by genetics, environmental factors, and their interaction. This principle is well-established in quantitative genetics.

      In this study, the four species were sampled from three different sites (see Materials and Methods, lines 543-546), and additionally, two species were sampled from 2019-2021, while the other two were sampled from 2021-2023 (see Figure S2). This critical detail should be clearly described in the Results and Materials and Methods. Due to these variations in sampling sites and periods, environmental conditions are not uniform across species.

      Even in studies conducted in natural environments, there are ways to design experiments that allow genetic effects to be evaluated. For example, by studying co-occurring species, or through transplant experiments, or in common gardens. To illustrate the issue, imagine an experiment where clones of a single species were sampled from three sites and two time periods, similar to the current design. RNA-seq analysis would likely detect differences that could qualitatively resemble those reported in this manuscript.

      One example is in line 197, where genus-specific expression patterns are mentioned. While it may be true that the authors' conclusions (e.g., winter synchronization, phylogenetic constraints) reflect real biological trends, these conclusions are also predictable even without empirical data, and the current dataset does not provide quantitative support.

      If the authors can present a valid method to disentangle genetic and environmental effects from their dataset, that would significantly strengthen the manuscript. However, I do not believe the current study design is suitable for this purpose.

      Unless these issues are addressed, the use of the term "evolution" is inappropriate in this context. The title should be revised, and the result sections starting from "Peak months distribution..." should be either removed or fundamentally revised. The entire Discussion section, which is based on evolutionary interpretation, should be deleted in its current form.

      If the authors still wish to explore genetic or evolutionary analyses, the pair of L. edulis and L. glaber, which were sampled at the same site and over the same period, might be used to analyze "seasonal gene expression divergence in relation to sequence divergence." Nevertheless, the manuscript would benefit from focusing on seasonal expression patterns without framing the study in evolutionary terms.

      We sincerely thank the reviewer for the detailed and thoughtful comments. We fully recognize the importance of carefully distinguishing genetic and environmental contributions in transcriptomic studies, particularly when addressing evolutionary questions. The reviewer identified two major concerns regarding our study design: (1) the use of different monitoring periods across species, and (2) the use of samples collected from different study sites. We addressed both concerns with additional analyses using 112 new samples and now present new evidence that supports the robustness of our conclusions.

      (1) Monitoring period variation does not bias our conclusions

      To address concerns about the differing monitoring periods, we added new RNA-seq data (42 samples each for bud and leaf samples for L. glaber and 14 samples each for bud and leaf samples for L. edulis) collected from November 2021 to November 2022, enabling direct comparison across species within a consistent timeframe. Hierarchical clustering of this expanded dataset (Fig. S6) yielded results consistent with our original findings: winter-collected samples cluster together regardless of species identity. This strongly supports our conclusion that the seasonal synchrony observed in winter is not an artifact of the monitoring period and demonstrates the robustness of our conclusions across datasets.

      (2) Site variation is limited and does not confound our findings

      Although the study included three sites, two of them (Imajuku and Ito Campus) are only 7.3 km apart, share nearly identical temperature profiles (see Fig. S2), and are located at the edge of similar evergreen broadleaf forests. Only Q. acuta was sampled from a higher-altitude, cooler site. To assess whether the higher elevation site of Q. acuta introduced confounding environmental effects, we reanalyzed the data after excluding this species. Hierarchical clustering still revealed that winter bud samples formed a distinct cluster regardless of species identity (Fig. S7), consistent with our original finding.

      Furthermore, we recalculated the molecular phenology divergence index D (Fig. 4C) and the interspecific Pearson’s correlation coefficients (Fig. 5A) without including Q. acuta. These analyses produced results that were similar to those obtained from the full dataset (Fig. S12; Fig. S14), indicating that the observed patterns are not driven by environmental differences associated with elevation.

      (3) Justification for our approach in natural systems

      We agree with the reviewer that experimental approaches such as common gardens, reciprocal transplants, and the use of co-occurring species are valuable for disentangling genetic and environmental effects. In fact, we have previously implemented such designs in studies using the perennial herb Arabidopsis halleri (Komoto et al., 2022, https://doi.org/10.1111/pce.14716) and clonal Someiyoshino cherry trees (Miyawaki-Kuwakado et al., 2024, https://doi.org/10.1002/ppp3.10548) to examine environmental effects on gene expression. However, extending these approaches to long-lived tree species in diverse natural ecosystems poses significant logistical and biological challenges. In this study, we addressed this limitation by including three co-occurring species at the same site, which allowed us to evaluate interspecific differences under comparable environmental conditions. Importantly, even when we limited our analyses to these co-occurring species, the results remained consistent, indicating that the observed variation in transcriptomic profiles cannot be attributed to environmental factors alone and likely reflects underlying genetic influences.

      Accordingly, we added four new figures (Fig. S6, Fig. S7, Fig. S12 and Fig. S14) and revised the manuscript to clarify the limitations and strengths of our design, to tone down the evolutionary claims where appropriate, and to more explicitly define the scope of our conclusions in light of the data. We hope that these efforts sufficiently address the reviewer’s concerns and strengthen the manuscript.

      To better support the seasonal expression analysis, the early RNA-seq analysis sections should be strengthened. There is little discussion of biological replicate variation or variation among branches of the same individual. These could be important factors to analyze. In line 137, the mapping rate for two species is mentioned, but the rates for each species should be clearly reported. One RNA-seq dataset is based on a species different from the reference genome, so a lower mapping rate is expected. While this likely does not hinder downstream analysis, quantification is important.

      We thank Reviewer 1 for the helpful comment. To evaluate the variation among biological replicates, we compared the expression level of each gene across different individuals. We observed high correlation between each pair of individuals (Q. glauca (n=3): an average correlation coefficient r = 0.947; Q. acuta (n=3): r = 0.948; L. glaber (n=3): r = 0.948). This result suggests that the seasonal gene expression pattern is highly synchronized across individuals within the same species. We mentioned this point in the Results section of the revised manuscript. We also calculated the mean mapping rates for each species. As the reviewer expected, the mapping rate was slightly lower in Q. acuta (88.6 ± 2.3%) and L. glaber (84.3 ± 5.4%), whose RNA-Seq data were mapped to reference genomes of related but different species, compared to that in Q. glauca (92.6 ± 2.2%) and L. edulis (89.3 ± 2.7%). However, we minimized the impact of these differences on downstream analysis. These details have been included in the revised main text.

      In Figures 2A and 2B, clustering is used to support several points discussed in the Results section (e.g., lines 175-177). However, clustering is primarily a visualization method or a hypothesis-generating tool; it cannot serve as a statistical test. Stronger conclusions would require further statistical testing.

      We thank the reviewer for the helpful comment. As noted, we acknowledge that hierarchical clustering (Fig. 2A) is primarily a visualization and hypothesis-generating method. To assess the biological relevance of the clusters identified, we used the Mann-Whitney U test or the Steel-Dwass test to evaluate whether the environmental temperatures at the time of sample collection differed significantly among the clusters. This analysis (Fig. 2B) revealed statistically significant differences in temperature in cluster B3 (p < 0.01), indicating that the gene expression clusters are associated with seasonal thermal variation. These results support the interpretation that the clusters reflect coordinated transcriptional responses to environmental temperature. We revised the Results section to clarify this point.
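
      To make the two-group comparison concrete, here is a small sketch of the Mann-Whitney U statistic computed directly from its pairwise definition. The cluster names and temperatures are hypothetical, and the sketch stops at the statistic; a p-value would normally come from a statistics library such as `scipy.stats.mannwhitneyu`, and the Steel-Dwass procedure for multiple groups is not shown.

```python
def mann_whitney_u(x, y):
    """U statistic for x: count of (xi, yj) pairs with xi > yj, ties counting 0.5."""
    return sum(
        1.0 if xi > yj else 0.5 if xi == yj else 0.0
        for xi in x
        for yj in y
    )

# Hypothetical sampling-day temperatures (deg C) for samples in two clusters.
# These values are invented for illustration and are not from the study.
cluster_cold = [4.0, 5.5, 6.1, 7.2]
cluster_other = [12.3, 15.8, 18.4, 21.0, 9.7]

u_cold = mann_whitney_u(cluster_cold, cluster_other)
u_other = mann_whitney_u(cluster_other, cluster_cold)

# The two U values always sum to the number of pairs, n1 * n2.
print(u_cold, u_other, len(cluster_cold) * len(cluster_other))
```

      In this toy case every temperature in the cold cluster is below every temperature in the other cluster, so `u_cold` is 0.0 and `u_other` is 20.0, the most extreme separation possible for these sample sizes.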

      The quality of the genome assemblies appears adequate, but related assemblies should be cited and discussed. Several assemblies of Fagaceae species already exist, including Quercus mongolica (Ai et al., Mol Ecol Res, 2022), Q. gilva (Front Plant Sci, 2022), and Fagus sylvatica (GigaScience, 2018), among others. Is there any novelty here? Can you compare your results with these existing assemblies?

      We agree that genome assemblies of Fagaceae species are becoming increasingly available. However, our study does not aim to emphasize the novelty of the genome assemblies per se. Rather, with the increasing availability of chromosome-level genomes, we regard genome assembly as a necessary foundation for more advanced analyses. The main objective of our study is to investigate how each gene is expressed in response to seasonal environmental changes, and to link genome information with seasonal transcriptomic dynamics. To address the reviewer’s comment in line with this objective, we added a discussion on the syntenic structure of eight genome assemblies spanning four genera within the Fagaceae, including a species from the genus Fagus (Ikezaki et al. 2025, https://doi.org/10.1101/2025.07.31.667835). This addition helps to position our work more clearly within the context of existing genomic resources.

      Most importantly, Figure 1B-D shows synteny between the two genera but also indicates homology between different chromosomes. Does this suggest paleopolyploidy or another novel feature? These chromosome connections should be interpreted in the main text, even if they could be methodological artifacts.

      A previous study on genome size variation in Fagaceae suggested that, given the consistent ploidy level across the family, genome expansion likely occurred through relatively small segmental duplications rather than whole-genome duplications. Because Figure 1B-D supports this view, we cited the following reference in the revised version of the manuscript. Chen et al. (2014) https://doi.org/10.1007/s11295-014-0736-y

      In both the Results and Materials and Methods sections, descriptions of genome and RNA-seq data are unclear. In line 128, a paragraph on genome assembly suddenly introduces expression levels. RNA-seq data should be described before this. Similarly, in line 238, the sentence "we assembled high-quality reference genomes" seems disconnected from the surrounding discussion of expression studies. In line 632, Illumina short-read DNA sequencing is mentioned, but it's unclear how these data were used.

      We relocated the explanation regarding the expression levels of single-copy and multi-copy genes to the section titled “Seasonal gene expression dynamics.” Additionally, we clarified in the Materials and Methods section that short-read sequencing data were used for both genome size estimation and phylogenetic reconstruction.

      Reviewer #2 (Public review):

      Summary:

      This study explores how gene expression evolves in response to seasonal environments, using four evergreen Fagaceae species growing in similar habitats in Japan. By combining chromosome-scale genome assemblies with a two-year RNA-seq time series in leaves and buds, the authors identify seasonal rhythms in gene expression and examine both conserved and divergent patterns. A central result is that winter bud expression is highly conserved across species, likely due to shared physiological demands under cold conditions. One of the intriguing implications of this study is that seasonal cycles might play a role similar to ontogenetic stages in animals. The authors touch on this by comparing their findings to the developmental hourglass model, and indeed, the recurrence of phenological states such as winter dormancy may act as a cyclic form of developmental canalization, shaping expression evolution in a way analogous to embryogenesis in animals.

      Strengths:

      (1) The evolutionary effects of seasonal environments on gene expression are rarely studied at this scale. This paper fills that gap.

      (2) The dataset is extensive, covering two years, two tissues, and four tree species, and is well suited to the questions being asked.

      (3) Transcriptome clustering across species (Figure 2) shows strong grouping by season and tissue rather than species, suggesting that the authors effectively controlled for technical confounders such as batch effects and mapping bias.

      (4) The idea that winter imposes a shared constraint on gene expression, especially in buds, is well argued and supported by the data.

      (5) The discussion links the findings to known concepts like phenological synchrony and the developmental hourglass model, which helps frame the results.

      We are grateful to the reviewer for the detailed and thoughtful review of our manuscript.

      Weaknesses:

      (1) While the hierarchical clustering shown in Figure 2A largely supports separation by tissue type and season, one issue worth noting is that some leaf samples appear to cluster closely with bud samples. The authors do not comment on this pattern, which raises questions about possible biological overlap between tissues during certain seasonal transitions or technical artifacts such as sample contamination. Clarifying this point would improve confidence in the interpretation of tissue-specific seasonal expression patterns.

      The leaf samples that clustered with the bud samples are newly flushed leaves collected in April for Q. glauca, May for Q. acuta, May and June for L. edulis, and August and September for L. glaber. To clarify this point, we marked these newly flushed leaf samples with asterisks in the revised figure (Fig. 2A).

      (2) While the study provides compelling evidence of conserved and divergent seasonal gene expression, it does not directly examine the role of cis-regulatory elements or chromatin-level regulatory architecture. Including regulatory genomic or epigenomic data would considerably strengthen the mechanistic understanding of expression divergence.

      We thank the reviewer for this insightful comment. As noted in the Discussion section, we hypothesize that such genome-wide seasonal expression patterns, and their divergence across species, are likely mediated by cis-regulatory elements and chromatin-level mechanisms. While a direct investigation of regulatory architecture was beyond the scope of the present study, we fully agree that incorporating regulatory genomic and epigenomic data would significantly deepen the mechanistic understanding of expression divergence. In this regard, we are currently working to identify putative cis-regulatory elements in non-coding regions and are collecting epigenetic data from the same tree species using ChIP-seq. We believe the current study provides a foundation for these future investigations into the regulatory basis of seasonal transcriptome variation. We made a minor revision to the Discussion to note that an important future direction is to investigate the evolution of non-coding sequences that regulate gene expression in response to seasonal environmental changes.

      (3) The manuscript includes a thoughtful analysis of flowering-related genes and seasonal GO enrichment (e.g., Figure 3C-D), providing an initial link between gene expression timing and phenological functions. However, the analysis remains largely gene-centric, and the study does not incorporate direct measurements of phenological traits (e.g., flowering or bud break dates). As a result, the connection between molecular divergence and phenotypic variation, while suggestive, remains indirect.

      We would like to note that phenological traits have been observed in the field on a monthly basis throughout the sampling period and the phenological data were plotted together with molecular phenology (e.g. Fig. 2A, C; Fig. 3C, D). Although the temporal resolution is limited, these observations captured species-specific differences in key phenological events such as leaf flushing and flowering times. We revised the manuscript to clarify this point.

      (4) Although species were sampled from similar habitats, one species (Q. acuta) was collected at a higher elevation, and factors such as microclimate or local photoperiod conditions could influence expression patterns. These potential confounding variables are not fully accounted for, and their effects should be more thoroughly discussed or controlled in future analyses.

      We fully agree with the reviewer that local environmental conditions, including microclimate and photoperiod differences, could potentially influence gene expression patterns. To assess whether the higher elevation site of Q. acuta introduced confounding environmental effects, we reanalyzed the data after excluding this species. Hierarchical clustering still revealed that winter bud samples formed a distinct cluster regardless of species identity (Fig. S7), consistent with our original finding.

      Furthermore, we recalculated the molecular phenology divergence index D (Fig. 4C) and the interspecific Pearson’s correlation coefficients (Fig. 5A) without including Q. acuta. These analyses produced results that were qualitatively similar to those obtained from the full dataset (Fig. S12; Fig. S14), indicating that the observed patterns are not driven by environmental differences associated with elevation.

      We believe these additional analyses help to decouple the effects of environment and genetics, and support our conclusion that both seasonal synchrony and phylogenetic constraints play key roles in shaping transcriptome dynamics. We added four new figures (Fig. S6, Fig. S7, Fig. S12 and Fig. S14) and revised the text accordingly to clarify this point and to acknowledge the potential impact of site-specific environmental variation.

      (5) Statistical and Interpretive Concerns Regarding Δφ and dN/dS Correlation (Figures 5E and 5F):

      a) Statistical Inappropriateness: Δφ is a discrete ordinal variable (likely 1-11), making it unsuitable for Pearson correlation, which assumes continuous, normally distributed variables. This undermines the statistical validity of the analysis.

      We thank the reviewer for the insightful comment. We would like to clarify that the analysis presented in Figures 5E and 5F was based on linear regression, not Pearson’s correlation. Although Δφ is a discrete variable, it takes values from 0 to 6 in 0.5 increments, resulting in 13 levels. We treated it as a quasi-continuous variable for the purposes of linear regression analysis. This approach is commonly adopted in practice when a discrete variable has sufficient resolution and ordering to approximate continuity. To enhance clarity, we revised the manuscript to explicitly state that linear regression was used, and we now report the regression coefficient and associated p-value to support the interpretation of the observed trend.

      b) Biological Interpretability: Even with the substantial statistical power afforded by genome-wide analysis, the observed correlations are extremely weak. This suggests that the relationship, if any, between temporal divergence in expression and protein-coding evolution is negligible.

      Taken together, these issues weaken the case for any biologically meaningful association between Δφ and dN/dS. I recommend either omitting these panels or clearly reframing them as exploratory and statistically limited observations.

      We agree with the reviewer’s comment. While we retained the original panels, we reframed our interpretation to emphasize that, despite statistical significance, the observed correlation is very weak—suggesting that coding region variation is unlikely to be the primary driver of seasonal gene expression patterns. Accordingly, we revised the “Relating seasonal gene expression divergence to sequence divergence” section in the Results, as well as the relevant part of the Discussion.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Sentences around lines 250-251 are incomplete and need revision.

      We thank the reviewer for pointing this out. We revised the sentences in the subsection “Peak month distribution of rhythmic genes and intra-genus and inter-genera comparison” in the Results section to ensure clarity and completeness. In addition, to improve the interpretability of the peak month distribution, we added arrows to indicate the major peaks in the circular histograms shown in Fig. 3C and 3D.

      Reviewer #2 (Recommendations for the authors):

      (1) In Figure 1E-G, the term Copy number or Copy number variation could be misleading, as it is commonly associated with inter-individual gene copy number variation in a population. Since the analysis here refers to orthology relationships rather than population-level variation, a more precise term, such as orthogroup classification, may be preferable.

      We thank the reviewer for this helpful suggestion. We agree that the term “copy number” could be misleading in this context. Accordingly, we updated the labeling in Fig. 1 to reflect the more precise term “orthogroup classification.”

      (2) In Figure 3A, the x-axis label Period (month) may be misleading, as it could be mistaken for calendar months rather than referring to the periodicity of gene expression cycles. A more explicit label, such as Expression periodicity (months), might improve clarity for the reader.

      We thank the reviewer for this valuable suggestion. In the original version of Fig. 3A, we used the label “Period (month),” which could indeed be misinterpreted as referring to calendar months. To clarify that this axis represents the length of gene expression cycles, we revised the label to “Period length (months).” This change also aligns with the terminology used throughout the manuscript, where “Period” refers specifically to cycle length, and “Periodicity” denotes the presence or absence of rhythmic expression.

      Other minor revisions

      We also made minor revisions to the reference list and the grant number details, and included the accession numbers for all DNA and RNA sequence data deposited in the DNA Data Bank of Japan (DDBJ) in the Data deposition and code availability section, in addition to the BioProject ID.

    1. formulation

      utilizing

      Clue/Trail/Plex Mark Atomic Terms used for naming

      info-morphic units of information that are high-resolution, addressable, high-fidelity, meaning/intentfully deeply intertwingled, named info-morphic collaborative interpersonal interplanetary structures amenable to massive multiplayer interplays that play nicely with other structures

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Reviewer #1

      Major comments:

      (comment #1)- It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSC lacking TRF2 even able to form H2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2 silenced cells (and TRF1 silenced cells as a positive control).

      We acknowledge a slight reduction; however, this difference is not statistically significant (Fig S2c,e). We will quantify the levels of DDR markers upon TRF2 loss and exogenous DSBs and include it in the subsequent revision.

      (comment #2)-A TRF2 ChIP-seq should be performed in NSC as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoter of the genes associated with neuronal differentiation. For example, an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment goes true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally a RNA-seq would be performed instead).

      We have performed NSC-specific TRF2 ChIP-seq for an upcoming manuscript, which confirms TRF2 occupancy at multiple promoters of differentiation-associated genes. These data are provided solely for confidential evaluation by the designated reviewers.

      Regarding the ChIP-qPCR control experiments: we thank the reviewer for pointing this out. We did include controls in our PCR assays: telomeric loci as positive controls and TRF2-nonbinding loci (GAPDH, RPS18, and ACTB, based on HT1080 TRF2 ChIP-seq data) as negative controls. These results were not included earlier for clarity, given that we were presenting several ChIP-PCR figures; in response to the comment, we have now included them in the revised version (Fig. S3d,e). Gene expression analyses show selective upregulation of the TAN genes upon TRF2 loss (data normalised to GAPDH), whereas negative control genes lacking TRF2 binding (RPS18, ACTB) remain unchanged, ruling out non-specific effects (Fig. S3f,g,j,k).

      -(comment #3) A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to an increased differentiation (Fig. S5a). Could the author repeat the measure of gene expression and co-IP with REST upon the overexpression of this mutant too?

      We confirm that DNase/RNase is routinely included in our pull-down experiments to exclude nucleic-acid bridging; the detailed methodology is now elaborated in the Methods section. Omitting this from the manuscript Methods was an oversight on our part. Our data demonstrate that only REST directly interacts with TRF2, while TRF2 engages PRC2 indirectly via REST, as also previously shown by us and others (page 6; ref. [62]; Sharma et al., ref. [15]).

      We thank the reviewer for noting the apparent differentiation in Fig. S5a. However, this observation represents a rare spontaneous differentiation event and is not statistically significant (as shown in Fig. S5b). Consistently, gene expression analysis of the TRF2-T188N mutant shows no significant change in TRF2-associated neuronal differentiation (TAN) genes. Therefore, co-IP of TRF2-T188N with REST was not performed.

      (comment #4) - The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding, it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      We clarify that Bis-indole carboxamide acts as a G4 stabilizer, while SMH14.6 is also a noted G4-binding ligand that stabilizes G4s (ref. [15]). The exclusion of TRF2 from G4 motifs in gene promoters by G4-binding ligands has also been documented previously (ref. [18]). In line with these findings, ChIP experiments performed following ligand treatment revealed a decreased occupancy of TRF2 at TAN gene promoters, supporting the proposed mechanism (added Fig. 6h).

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).

      Corrected

      • Fig S1h: The red box mentioned in the legend is not visible

      Corrected

      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l

      Corrected

      • The symbol γ of γH2AX is missing in the text

      Corrected

      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.

      Added SH-SY5Y in the legend of Fig. 3d.

      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.

      Corrected

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      1. For most of the data graphs in the manuscript, there is no indication of the number of independent biological replicates carried out (which should ideally be plotted as individual dots overlaying the column graphs), or what the error bars represent, or what statistical test was used.

      All the figure legends and methods have now been updated with the corresponding biological replicates per experiment, with error bars as SD/SEM and the corresponding statistical test along with p values.

      Figure S1.1a: needs a marker to show that the tissue is dentate gyrus.

      We acknowledge the reviewer's concern that high-magnification images alone make it difficult to verify whether the fields are taken from the correct anatomical location. The dentate gyrus (DG) of the hippocampus is a well-defined structure. In the revised figure (Fig. S1.1a), we now include a low-magnification image showing the entire hippocampus, including the CA fields, along with two high-magnification fields specifically from the DG region. Consistent with our claim, the co-immunostaining demonstrates that Sox2-positive neural stem cells in the DG are also positive for TRF2.

      Figure 1c (and all other flow cytometry panels throughout the manuscript): it is not clear if the expression of any of these proteins, except maybe MAP2, are significantly different in the presence or absence of TRF2. These differences need to be presented more quantitatively, with the results compiled from multiple biological replicates and analysed statistically. I am not sure that flow cytometry is the best way to determine differences in protein expression levels for non-surface proteins, because many of the reported differences are not at all convincing.

      To detect intracellular/nuclear proteins by flow cytometry, cells were permeabilized using pre-chilled 0.2% Triton X-100 for 10 minutes, as described in the Methods section.

      We have revised the figures (Fig. 1c,e) and now include statistical analysis from three independent biological replicates for these experiments (Fig. S1.4h-j, S2e, S6d).

      Fig 1d: has TRF2 been effectively silenced in this experiment? There appears to be just as many TRF2+ nuclei in the "TRF2 silenced" panel vs the control, including in the cells with neurite outgrowths.

      Quantification showing a decrease in nuclear TRF2 levels has been included in Supplementary Fig. S1g.

      Fig 2a-c: these experiments need a positive control, showing increased expression of these proteins in mNSC and SH-SY5Y cells in response to a DNA damaging agent. Again, flow cytometry may not be the best method for this; immunofluorescence combined with telomere FISH would be more convincing.

      We confirm that doxorubicin induces 53BP1 foci (IF-FISH, Sup Fig. S2b) and that TRF1 silencing elevates γH2AX (Sup Fig. S2c), validating DDR sensitivity. Unlike TRF2 loss (Fig. 2a-c), no TIFs appear with IF and telomere probes (Fig. 2d, Sup Fig. 2a), and without TIFs, there is no telomeric fusion. Flow cytometry was performed with Triton X-100 to target nuclear proteins. These findings adequately address the concern; therefore, further IF-FISH experiments were not included in the present study.

      To conclude that telomere damage is not occurring, an independent marker of such damage, such as telomere fusions, should also be measured.

      In response to uncapped telomeres, ATM kinase activates the DNA damage response (DDR), recruiting γH2AX and 53BP1 to telomeres, which precedes end-to-end fusions (Takai et al., 2003; Maciejowski & de Lange, 2015; d'Adda di Fagagna et al., 2003; Cesare & Reddel, 2010; Hayashi et al., 2012; Sarek et al., 2015). We observe no DDR activation or foci (Fig. 2; Sup. Fig. 2). This absence of a DDR response and TIFs indicates no telomere uncapping, negating the need for direct telomere fusion analysis.

      Figure S2b is lacking a no-doxorubicin control.

      An untreated control has been included in Fig. S2b.

      Figures 3a and 3b need a positive control (e.g. TRF2 binding to telomeric DNA) and a negative control (e.g. a promoter that did not show any TRF2 binding in the HT1080 ChIP-seq experiment in Fig S3).

      We have included positive (telomere) and negative (GAPDH) controls (based on HT1080 TRF2 ChIP-seq data) for the TRF2 ChIP assay in Supplementary Fig. S3d,e. Additionally, positive and negative controls for all ChIP experiments conducted in this study are presented in Supplementary Figs. S3d, S3e, S3h, S3i, S4c-h, and S5c-e.

      The data in Figure 3 would be more compelling if all experiments were also performed in fibroblasts to confirm the cell-type specificity of the effect.

      Our HT1080 fibrosarcoma ChIP-seq data (ref. [18]; Sup. Fig. 3a,b) show TRF2 binding to TAN gene promoters in a fibroblast-derived model, with enrichment in neurogenesis-related genes (refs. [19,20]). In fibroblasts, TRF2 depletion, as expected, induces telomere dysfunction and DDR (Fig. 2d; Sup. Fig. 2a), and eventually cell-cycle arrest and cell death, as reported earlier (van Steensel et al., 1998; Smogorzewska & de Lange, 2002). Therefore, the suggested experiments, which would require sustained TRF2 depletion, cannot be performed in fibroblasts. TRF2 occupancy at the promoters of the genes in question in cells other than NSCs was observed in HT1080 cells (ref. [18]; Sup. Fig. 3a,b).

      No references are provided for the TRF2 posttranslational modifications on R17, K176, K190 and T188. What is the evidence for these modifications, and is it known if they participate in the telomeric role of TRF2?

      These lines with references have been included in the manuscript (highlighted in blue).

      R17 methylation enhances telomere stability (66). K176/K190 acetylation stabilizes telomeres and is deacetylated by SIRT6 (67). T188 phosphorylation facilitates telomere repair after DSBs (68). These PTMs primarily support telomeric roles.

      The experiments in Fig 5 should also be performed with WT TRF2, to confirm that effects are not due to the overexpression of TRF2.

      WT TRF2 shows no differentiation phenotype or change in TAN gene expression (Fig. 1f,g; 3h, Sup Fig. 5a), confirming that the effects are not due to TRF2 overexpression.

      Fig 5c has not been described in the text, and there are multiple technical problems with the TRF2 WT experiment: i) There appears to be significant background binding of REST to the IgG beads, though this blot has such high background it is hard to tell (the REST blot in Fig S4b is also of poor quality), ii) TRF2 is migrating at two different positions in the Input and IP lanes, and the TRF2 band in the K176R blot is at a different position to either, and iii) the relative loading of the Input and IP lanes is not indicated, so it's not clear why K176R appears to be so enriched in the IP.

      We acknowledge the oversight in not citing Fig 5c in the manuscript. This has been corrected and highlighted in blue in the revised manuscript.

      i) Multiple optimization attempts were made for the co-IP experiments, and the presented figure reflects the best achievable result despite REST blot smearing, a pattern also reported previously (Ref. 65). The TRF2-REST interaction is well established, and a similar background was also observed in the cited study.

      ii) Variable migration patterns of TRF2 were also noted in the cited study (Ref. 65), consistent with our observations. Our primary emphasis, however, is on the TRF2 K176R mutant, which clearly disrupts the interaction with REST.

      iii) The input loading corresponds to 10% of the total lysate. As the experiments were conducted independently, variations in transfection and pull-down efficiencies may account for the observed differences.

      To rule out indirect effects of the G4 ligands on the results in Fig 6g, the binding of BG4 and TRF2 at the promoters of these genes should be measured by ChIP.

      To confirm that G4 ligand effects on TAN gene promoters are direct, TRF2 occupancy was assessed using ChIP. Significantly decreased occupancy of TRF2 was noted at TAN gene promoters (added Fig. 6h). This implies that ligand-induced changes in TRF2 binding are directly linked to promoter-level G4 stabilization.

      Minor comments:

      1. The size of all the size markers in western blots should be added to the figures.

      Sizes have been included in all the western blots.

      2. There are several figure panels that are incorrectly referenced in the text, e.g. Fig S1.1 (e-f) should be Fig S1.1 (e-h); Fig. 1m should be Fig. 1f; Figs 5e and 5f have been swapped.

      Corrected.

      1. Fig S1.4 is not referred to in the text. It is not clear what the purpose of Fig S1.4a is.

      The following line has been included in the manuscript (highlighted in blue):

      Neurospheres were characterized using PAX6, a NSC marker (Fig S1.4a).

      Are the experiments in Figs 3e, 4a, 4c and 4e using 4-OHT treatment, or siRNA? If the latter, I don't think a control for the effectiveness of the knockdown in this cell type has been included anywhere in the manuscript.

      These experiments use siRNA; a western blot showing the effectiveness of the knockdown is presented in Supplementary Fig. S4c (now S4a).

      The lanes of the western blots in Fig S4c are not labelled.

      Corrected.

      1. Given that the experiments in Fig 5 were carried out on a background of endogenous WT TRF2 expression, presumably the K176R mutant is having a dominant negative effect. To understand the mechanism of this effect (e.g, is it simply due to replacement of endogenous WT TRF2 at its genomic binding sites by a large excess of exogenous K176R, or is dimerisation with WT TRF2 needed?) it would be helpful to know the relative expression levels of endogenous and K176R TRF2.

      To address the query, qRT-PCR with 3′ UTR-specific primers showed no change in endogenous TRF2 mRNA upon K176R expression in SH-SY5Y cells, while primers detecting total TRF2 revealed ~10-fold higher expression of K176R compared to control (Figure below). This indicates the absence of suppression of endogenous TRF2 mRNA. Given that the mutant's DNA binding is intact (Fig. 5f), the dominant-negative effect of K176R likely arises from overexpression of the exogenous mutant.

      For the sentence "...and critical for transcription factor binding including epigenetic functions that are G4 dependent" (bottom of page 3 of the PDF), the authors cite only their own prior papers, but there are examples from others that could be cited.

      We have incorporated citations from other research groups, now included as references 23-26.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #1

      Evidence, reproducibility and clarity

      In this study, the authors show that TRF2 binds non-telomeric G-quadruplexes in promoters of a set of genes ("TAN" genes for TRF2-associated neuronal differentiation) and recruits REST/chromatin remodelers to repress those genes in neural stem cells, thereby maintaining the NSC state in a telomere-independent manner. They show that the loss of TRF2 derepresses TAN genes and promotes neuronal differentiation.

      However, key experiments are missing to fully support the claims: a genome-wide TRF2 ChIP-seq in NSC to validate binding beyond a restricted set of TAN genes, more robust evidence confirming the absence of telomeric dysfunction, and mechanistic clarification of the effects of G4 ligands on TRF2 binding.

      Major comments:

      • It is interesting that TRF2 loss not only fails to increase γH2AX/53BP1 levels but may even slightly reduce them (e.g., Fig. S2c and the IF images). While the main hypothesis is that TRF2 loss does not trigger telomere dysfunction in NSCs, this observation raises the possibility that TRF2 itself contributes to DDR signaling (ATM-P, γH2AX, 53BP1) in these cells and that in its absence, cells are not able to form those foci. To exclude the possibility that telomere-specific DDR is being missed due to an overall dampened DDR response in the absence of TRF2, it would be informative to induce exogenous DSBs in TRF2-depleted cells and test DDR competence (e.g., IF for γH2AX/53BP1). In other words, are those NSC lacking TRF2 even able to form H2AX/53BP1 foci when damaged? In addition, it would be interesting to perform telomere fusion analysis in TRF2 silenced cells (and TRF1 silenced cells as a positive control).
      • A TRF2 ChIP-seq should be performed in NSC as this list of genes (named TAN genes in the text) was determined using a ChIP performed in another cell line (HT1080). For the ChIP-qPCR in the various conditions, primers for negative control regions should be included to show the specific binding of TRF2 to the promoter of the genes associated with neuronal differentiation. For example, an intergenic region and/or promoters of genes that are not associated with neuronal differentiation (or don't contain a potential G4). The same comment goes true for the gene expression analysis: a few genes that are not bound by TRF2 should be included as negative controls to exclude a potential global effect of TRF2 loss on gene expression (ideally a RNA-seq would be performed instead).
      • A co-IP should be performed between the TRF2 PTM mutant K176R or WT TRF2 and REST and PRC2 components to directly show a defect of interaction between them when TRF2 is mutated (a co-IP with DNase/RNase treatment to exclude nucleic-acid bridging). The TRF2 PTM mutant T188N also seems to lead to an increased differentiation (Fig. S5a). Could the author repeat the measure of gene expression and co-IP with REST upon the overexpression of this mutant too?
      • The authors show that the G4 ligands SMH14.6 and Bis-indole carboxamide upregulate TAN genes and promote neuronal differentiation, but the underlying mechanism remains unclear. Bis-indole carboxamide is generally considered a G4 stabilizer, while SMH14.6 is less characterized and should be better introduced. The authors should clarify how G4 stabilization would interfere with TRF2 binding, it seems that it would likely be by blocking access. A more detailed discussion, and ideally TRF2 ChIP after ligand treatment and/or G4 helicase treatment, would strengthen the model.

      Minor comments:

      • Supp Figures related to the scRNA-seq are difficult to read (blurry).
      • Fig S1h: The red box mentioned in the legend is not visible
      • In the text, the Figures 1 f-g are misannotated as Fig 1m and l
      • The symbol γ of γH2AX is missing in the text
      • Fig.3d, please indicate in the legend that it is done in SH-SY5Y.
      • Fig. S3b: Please consider replotting this panel with an increased y-axis scale. As currently presented, the TRF2 ChIP-seq peaks at several promoters appear truncated by the scaling.
      • Fig S4b: the legends should be fixed, the figure shows TRF2 occupancy upon REST silencing and not the other way around.

      Significance

      Non-telomeric roles of TRF2 have been reported before: in repressing neuronal genes and promoting a stem-like state by stabilizing REST (PMID: 18818083), in promoter G4 binding and recruitment of chromatin repressors (previous studies from the same lab), and TRF2 was shown to be dispensable for telomere protection in pluripotent stem cells (ES). The novelty of the current study lies primarily in extending/combining these mechanisms to NSCs.

    1. But as generative AI becomes more ingrained into the workplace and higher education, a growing number of professors and industry experts believe this will be something all students need,

      The prominence of AI in education is beginning to give the impression that it is needed. The article points out that it is moving into the workplace and higher education. I think this is both good and bad, since AI is being heavily relied on but isn't accurate and doesn't need to be added to our everyday lives or to something as important as education.

    1. Vibe coding works because there are people who know how to program. A programmer who knows what they are doing can ask an AI to write some code and then review and correct its inevitable* errors. Or they can fix the errors of people who don't know how to program but used a chatbot to write code. In fact, there is a whole industry of programmers dedicated to making these fixes. Many software companies are no longer hiring junior programmers, on the idea that someone can produce code à la vibe coding and a more experienced programmer can then fix it. But what will they do when those expert programmers retire and the companies lose those skills? For now, many are trusting the artificial intelligence industry's promises of improvement*.

      What I found most interesting about this passage is how it shows that vibe coding only works because there are still people with real programming knowledge. It is striking to think that, if companies stop training new programmers and rely solely on artificial intelligence, there will come a time when nobody knows how to fix the errors that the AI itself makes. It is curious how a tool created to make work easier can end up weakening the human skills that sustain it. David Ramos

    2. Since I write for a living, I have often been asked whether I think I will be replaced by an artificial intelligence. I don't think so. Although many people will surely use these tools to write things, consider what would happen if all the text in the world were created by AI: the language models these tools are based on would simply regurgitate, endlessly, other texts that, while coherent, are of low quality and dubious credibility, already regurgitated by another artificial intelligence. Eventually there would be a market for some human to come in and, at the very least, review, edit, do something with the text. To write.

      AI can write, yes, but it cannot have something to say, and as long as there is someone who thinks, questions, lives, and feels, a human will need to be there, at least to review, reinterpret, and give soul to what is written. Sara Sarria

    3. For their part, social networks (in a broad sense that includes forums and blogs) atrophied our sense of inhabiting a common reality. But in exchange they gave us the possibility of changing the power dynamics of information. Now “anyone” (in the Ratatouille sense) can make their voice heard, not just the gatekeepers of information we had been used to. This has its good and bad sides, but it has undoubtedly changed how we live and interact.

      What the author says about social networks is very true: before, a few spoke and the rest listened; now anyone can voice an opinion, but that has also led each person to live in their own “bubble.” We gained a voice but lost some of our sense of shared reality: “Not all of us are seeing the same world.” Sara Sarria

    4. Artificial intelligence is very complex and has not yet shown us that it is justified enough to be inevitable and to leave us critics looking like Socrates. *The artificial intelligence industry argues that its product will improve so much that errors will indeed become avoidable.

      The author closes the text by returning to the parallel with Socrates and leaves an open question about the future of AI. It is a good point for a reflective moment. I like how the author ends by leaving us with a certain doubt, which invites us to consider whether AI will really be as necessary as writing. It suggests that perhaps AI's critics are not entirely wrong, but are instead warning about a change we do not yet fully understand. This reflection makes us think about the collective responsibility we have regarding the use and limits of AI.

    5. Socrates was not convinced by this business of writing. His main argument was that having ideas always at hand in a device external to the human mind would atrophy our memory: we would no longer make an effort to remember long epic poems or long lists of scientific facts. But neither would we make an effort to remember our own arguments on assorted disquisitions. Everything would be out there, on paper or in stone, ready to be consulted whenever we felt like it.

      This was interesting because Socrates, the philosopher, did not trust writing: he thought that by externalizing knowledge in texts, our memory would grow lazy and we would have only a simulation of knowing, not true knowledge. The irony is that we know this because Plato, his student, ended up writing it down. Writing prevailed despite the criticism, just as is happening today with artificial intelligence: plenty of people fear or criticize it, but the text suggests that, as with writing, AI will probably end up prevailing and changing everything, even if it is hard for us to see now.

      Maria Gabriela Quiroga B

    6. A disciple of Plato, Aristotle, is sometimes described as one of the last people who knew everything there was to know. Not because he was abreast of all knowledge in general, but because in his time writing was not yet so widespread and the amount of knowledge an individual could potentially access was still very limited. Perhaps he knew everything there was to know in his world, but that world was quite small. He was probably ignorant of knowledge from China or the Americas, but he could not know that he was ignorant of it.

      This part of the text shows that Aristotle was seen as someone who knew everything that could be known in his time, but that “everything” was very limited. In his day, knowledge was restricted because writing and contact with other peoples were scarce; so, even though he was very wise, he only knew what existed within his small world. The closing idea stresses that knowledge always depends on context and that even the wisest are ignorant of things they do not even know exist. Allison Marentes Reyes

    7. A disciple of Plato, Aristotle, is sometimes described as one of the last people who knew everything there was to know. Not because he was abreast of all knowledge in general, but because in his time writing was not yet so widespread and the amount of knowledge an individual could potentially access was still very limited. Perhaps he knew everything there was to know in his world, but that world was quite small. He was probably ignorant of knowledge from China or the Americas, but he could not know that he was ignorant of it. That is impossible to sustain now. No single person can hold all of human knowledge in their head. But they do have access, potentially, to all of this knowledge: on the internet, in books, even in ChatGPT. Each format with its own errors and biases.

      I found this a very interesting text because it reflects on how our relationship with knowledge has changed over time. In Aristotle's time, it was possible for one person to know almost everything that was known in their surroundings, since the known world was limited and information circulated more simply. Today, we live surrounded by a great deal of information that we can access easily thanks to the internet and technology. Even so, having so much information also has its downside, because we do not always know what is true or what is worth believing. That is why the hard part now is not knowing everything, but learning to properly understand what we find and use it in the best way. Stephania Parra

    8. For their part, social networks (in a broad sense that includes forums and blogs) atrophied our sense of inhabiting a common reality. But in exchange they gave us the possibility of changing the power dynamics of information. Now “anyone” (in the Ratatouille sense) can make their voice heard, not just the gatekeepers of information we had been used to. This has its good and bad sides, but it has undoubtedly changed how we live and interact.

      This paragraph recognizes that social networks transformed the way we relate to information, allowing more voices to be heard, although at the cost of losing a shared sense of reality. Personally, I think this change cannot be judged as entirely negative or positive. While it is true that social networks have fragmented collective perception and fostered disinformation, they have also democratized access to expression and public debate. In conclusion, their value depends on how people use them: they are a reflection of our social dynamics more than a direct cause of their deterioration.

    9. One of the criticisms often made of generative artificial intelligence (which, as I explained in another post, is a very specific branch of AI), and one that I make myself, is that it will atrophy our capacity to do and think about things critically. If you decide to program using only a chatbot (a practice known in English as “vibe coding”), you will constantly delegate not only the work but also the capacity to learn how to do it. You will never learn to program well. You will not even know how to fix the errors that come out of that vibe coding, because you will not know how to identify them. The same can happen with any human activity that is delegated to an artificial intelligence: writing, composing or playing music, thinking up arguments, whatever.

      The criticism seems very accurate to me because artificial intelligence, although it helps quite a lot, can also make us stop thinking for ourselves. If we use a chatbot for everything, we end up repeating what it gives us without understanding it, and that takes away our ability to learn or to solve problems on our own. The problem is not the tool, but letting it think for us. The text invites us to reflect on how technology can affect the way we learn and create. It is not about rejecting artificial intelligence, but about using its potential without losing critical thinking or human creativity, because in the end what sets us apart is precisely our way of questioning and understanding what we do.

      Laura Ximena Bolivar Roman

    10. Vibe coding works because there are people who know how to program. A programmer who knows what they are doing can ask an AI to write some code and can then review and correct its inevitable* errors. Or they can correct the errors of people who do not know how to program but used a chatbot to write code. In fact, there is a whole industry of programmers dedicated to making these fixes. Many software companies are now not hiring junior programmers, on the idea that someone can produce code à la vibe coding and a more expert programmer can then correct it. But what will they do when those expert programmers retire and the companies lose those skills? For now, many are trusting the artificial intelligence industry's promises of improvement*.

      The text raises a real concern about how artificial intelligence is changing the way we program. Today many people use chatbots to generate code without really understanding how it works, and that can be dangerous in the long run. Experienced programmers still correct those errors, but there will come a time when those people are no longer around. Then who will be left with the knowledge? More than a criticism of technology, the text makes us think about the importance of not losing the human skills behind digital progress. Daniel Camargo

    11. (with all the criticism they deserved and still deserve) would prevail as a technology, would change our way of living (and yes, I am citing this not as a good example, but as an example that one can use this argument for any innovation)

      What struck me as most important is how the text shows that we sometimes think of innovation only as highly complex technological advances, such as new, far more powerful and efficient hardware or "futuristic" machines like animatronic robots, and forget that the digital, such as social networks, is also a form of innovation that profoundly changes how we live. These digital innovations enter our culture more easily, become part of everyday life without our noticing much, and end up transforming our customs faster than many technological advances that seem “bigger”.

      Jose Sebastian Mosquera Dediego

    12. Unlike writing, it is not clear what concrete benefit artificial intelligence can bring us that would justify its eventual omnipresence (and the atrophy it implies). If absolutely everyone adopted it in every area of life, soon no one would have any skills.

      I agree with this part of the text. It is inevitable to think that artificial intelligence will change many aspects of our lives, and one of them is our ability to think critically, because this very tool will do everything for us without our having to make the effort to think.

    13. We cannot deny that artificial intelligence is here to stay. The question is how it will stay. Unlike writing, it is not clear what concrete benefit artificial intelligence can bring us that would justify its eventual omnipresence (and the atrophy it implies). If absolutely everyone adopted it in every area of life, soon no one would have any skills.

      I agree with this statement because, although AI offers many advantages, its excessive use can create a strong dependence that affects our abilities. If we let AI think, write, decide and solve everything for us, little by little we will lose the ability to reason, analyze and create on our own: "history shows that tools should complement human beings, not replace them". Ana Sofia Chacón Casasbuenas

    14. writing triumphed as a technology: almost every society on the planet has adopted it, and a large part of our knowledge, our communications and our life in general is based on this invention.

      Although Socrates did not trust writing, it ended up being an essential part of human life. Over time, almost every society adopted it, and thanks to it we can today store and share knowledge, communicate quickly and make our lives easier.

    15. having human skills is much more valuable.

      This sentence seems very important to me: it is more valuable to have the good fortune of thinking for ourselves. I have always thought that AI will never be able to replace us, because it is incapable of creating something. It works as a search engine and creates something out of thousands of things that already exist. Yes, that is a great advantage, and that is why it finds it so easy to "create", but there is nothing like a human brain researching, brainstorming, drawing inspiration from its own experiences, feeling something so strongly that it can produce a wonderful artistic, graphic or written piece. AI will never be able to replace this. Even if it is possible that one day AIs learn to create from scratch, they will never be a human brain full of memories, feelings, motivations and reasons to think. Besides, let us remember that AI will always need a human to tell it what to do.

    16. One of the criticisms often made of generative artificial intelligence (which, as I explained in another post, is a very specific branch of AI), and one that I make myself, is that it will atrophy our capacity to do and think about things critically.

      This part of the text caught my attention. Throughout the text the word "atrophy" is used to refer to losing the ability to remember something, in effect losing part of our memory. For as long as I can remember, the internet has been a fundamental part of my life, in most things and activities. A few years ago, to research a topic or learn about it, you would go through thousands of pages to gather useful information; now you just use a single tool for that, which is AI. This word, atrophy, gives a lot to think about: we really are losing the ability to do tasks as basic as summarizing a book. By this I mean that we are not aware of it and are not setting limits. As the learning machines that we are, built to improve and surpass our own learning capacities, we are becoming lazy in every sense; in every situation we turn to artificial intelligence to make our lives even easier. We are "atrophying" everything that we are, the pinnacle of intelligence on Earth.

      Juan Sebastian Quiroz Arroyo

    17. you will constantly delegate not only the work but also the capacity to learn how to do it. You will never learn to program well. You will not even know how to fix the errors that come out of that vibe coding, because you will not know how to identify them. The same can happen with any human activity that is delegated to an artificial intelligence: writing, composing or playing music, thinking up arguments, whatever.

      I find this part very interesting, since in a couple of lines it manages to explain in great detail one of the main and most dangerous problems of using artificial intelligence: because it works for us 99% of the time, we come to trust it blindly. That is the most dangerous thing, losing the ability to see what is being said or whether it is correct, because "if it was right before and did not fail me, why would it be different now?".

      Juan Castellanos

    18. Artificial intelligence is very complex and has not yet shown us that it is justified enough to be inevitable and to leave us, its critics, looking like Socrates.

      I am not sure one can say that, for the moment, its critics are not in the position Socrates was in, whether in his time or now. Socrates criticized one aspect of writing (which, admittedly, he took to the extreme of absolute rejection): dependence on a medium for memorization. But at no point does he address how writing nonetheless has other kinds of benefits, such as the ease of sharing information on a mass scale (although that would come much later, with Gutenberg, and be exploited by Luther). Socrates criticized one aspect of writing, which even today seems reasonable to me, as does the criticism of AI. In both cases, one specific aspect of the whole was judged, and both criticisms are logical and still valid today, in my view. For that reason I do not agree with the comparison drawn in the final conclusion, that it has not yet been shown that the critic, in this case the author, ends up like Socrates.

    19. Although many people will surely use these tools to write things, consider what would happen if all the text in the world were created by AI: the language models on which these tools are based

      This in particular reminds me of several things I read about how AI works when gathering information, and how AI often works in a way meant to please you or to get you interested in a result. Artificial intelligence actively swallows information from everywhere, translates it and stores it in all its virtual "knowledge". But what happens when this knowledge feeds back on itself?

      One of these pieces touched on how some AIs took information from articles written by other AIs. Is that information always so reliable?

      In my opinion, if something like this were to happen in the future, we would be condemned to a world even more full of disinformation, and it would be harder to work out what is real and what is not, causing many false claims to be taken as true in the collective mind.

    20. we, as people, will realize that acquiring skills is much more valuable than delegating them to a machine.

      This part gave me a moment of real reflection. The author states an idea that I have been debating in my head for quite some time now: how "superior", really, is digital and AI ability compared with human capacity and its well-known limits?

      I think it is something the text develops well, and, as an individual, it is something I have not yet managed to settle.

    21. For their part, social networks (in a broad sense that includes forums and blogs) atrophied our sense of inhabiting a common reality. But in exchange they gave us the possibility of changing the power dynamics of information. Now “anyone” (in the Ratatouille sense) can make their voice heard, not only the gatekeepers of information we have been used to. This has its good and bad sides, but it has undoubtedly changed how we live and interact.

      I agree with the idea that social networks, including forums and blogs, have had a double impact. On the one hand, as mentioned, I feel they have weakened our notion of living in a shared reality. However, the change they brought is radical. Before, information and public opinion were almost entirely controlled by a few: large media outlets, institutions and experts. The reference to Ratatouille is perfect: "anyone can cook", or in this case, anyone can have a voice. A testimony on a blog, a thread on Twitter or a video on TikTok can make visible an injustice or an idea that traditional media ignored. The democratization of voice comes with disinformation and noise. But despite that, it has fundamentally changed how we inform ourselves, how we relate to one another and even how we hold those in power accountable. 1025062039

    22. Although many people will surely use these tools to write things, consider what would happen if all the text in the world were created by AI

      I find this point very interesting. If a machine wrote everything, we would lose human creativity and originality. Fresh ideas, real emotions, the mistakes that are also part of writing... all of that is hard for an AI to reproduce. I like to think it is still worth writing ourselves.

    23. Vibe coding works because there are people who know how to program. A programmer who knows what they are doing can ask an AI to write some code and can then review and correct its inevitable* errors. Or they can correct the errors of people who do not know how to program but used a chatbot to write code. In fact, there is a whole industry of programmers dedicated to making these fixes. Many software companies are now not hiring junior programmers, on the idea that someone can produce code à la vibe coding and a more expert programmer can then correct it. But what will they do when those expert programmers retire and the companies lose those skills? For now, many are trusting the artificial intelligence industry's promises of improvement*.

      Vibe coding sounds good in the short term, but it relies on there still being people who really know how to program. If we stop training new programmers because 'the AI does it by itself', who is going to understand the code in a few years when the experts retire? It is like constructing a building from prefabricated parts without anyone knowing how the structure works: it may hold up for a while, but the day something fails, who fixes it?

    24. Unlike writing, it is not clear what concrete benefit artificial intelligence can bring us that would justify its eventual omnipresence (and the atrophy it implies).

      Artificial intelligence, rather than expanding our capacities, threatens to dull them. It accustoms us to delegating thought, learning and creation, to the point of confusing convenience with progress. It offers us immediate answers, but not understanding; efficiency, but not knowledge. Instead of propelling us toward a deeper intelligence, it risks making us dependent on an illusion of knowing, a brilliant but empty version of what it really means to think.

    25. One of the criticisms often made of generative artificial intelligence (which, as I explained in another post, is a very specific branch of AI), and one that I make myself, is that it will atrophy our capacity to do and think about things critically. If you decide to program using only a chatbot (a practice known in English as “vibe coding”), you will constantly delegate not only the work but also the capacity to learn how to do it. You will never learn to program well. You will not even know how to fix the errors that come out of that vibe coding, because you will not know how to identify them. The same can happen with any human activity that is delegated to an artificial intelligence: writing, composing or playing music, thinking up arguments, whatever.

      I identified strongly with the part that says that, if we let AI do everything, we will end up not knowing how to do anything ourselves. It even happens to me sometimes: I use ChatGPT or translators to write something quickly, but afterwards I realize that my own ability to write or think up arguments is getting rusty. I think the essay reminds us that human practice remains essential.

    26. Socrates was not convinced by this business of writing. His main argument was that having ideas always at hand in a device external to the human mind would atrophy our memory: we would no longer make an effort to remember long epic poems, or long lists of scientific facts. But neither would we make an effort to remember our own arguments on assorted disquisitions. Everything would be out there, on paper or in stone, ready to be consulted whenever we felt like it.

      I find it very interesting how the author compares writing with artificial intelligence. At first it seems an exaggerated analogy, but in the end it makes a lot of sense, since both are tools that change our way of thinking and, above all, of learning. What I liked most is that it does not demonize the use of AI, but invites us to use it consciously, as an extension of human thought and not as a replacement.

    27. This victory, despite the criticism of “traditionalists” like Socrates, has been set in parallel with the state of affairs with artificial intelligence: a new technology that has many critics, but that will eventually prevail and change our way of living completely

      In the past, Socrates (who was then a symbol of traditionalist thinkers) criticized an innovation, in this case writing, or a new form of knowledge, because he believed it corrupted human customs or human thought.

      Drawing the parallel with artificial intelligence, we are living through a similar situation, since AI, like that ancient innovation, has many critics and many associated fears, but at some point it will consolidate and radically transform our way of living and thinking, just as happened with the invention Socrates rejected.

      Like every great innovation, even if it generates resistance at first, it ends up changing the world.

    28. As a film student, this made me reflect on whether, in the future, we are going to let artificial intelligence construct our stories for us to convey in images. That would not be entirely bad, but there are things artificial intelligence will lack: lived experience and a bit of vision, which are a fundamental part of creating a story, because they are different perspectives. If we let AI make cinema, human beings will never be able to develop an artistic and creative mindset in the seventh art. CGI may save us some work, but the idea is to use it as a tool, not as a solution. What matters here is that we, as human beings, always remain the authors of our own stories.

      Dilan Alexander Ortiz

    29. Unlike writing, it is not clear what concrete benefit artificial intelligence can bring us that would justify its eventual omnipresence

      In this part of the text the author says something I find very interesting. It is true that artificial intelligence is quite a useful tool, but it has also made many people think less or challenge themselves less. Even so, I do not entirely agree with the idea that using it every day would make us lose our skills. For me, knowing how to use artificial intelligence well is also an important skill. If it is used as a support and not as a substitute, it can help us work more efficiently and make fewer mistakes, especially in the workplace. I believe that, rather than taking capacities away from us, it could enhance them if we learn to use it in the right way.

    30. But, in exchange, writing opened up the possibility of knowing far beyond what an individual human memory can hold

      I interpret this as the author's reflection on the double effect artificial intelligence can have on how we think and learn. Just as writing weakened memory, AI could make us rely less on our own cognitive abilities; however, it also gives us the opportunity to access far more information and knowledge than we could reach by ourselves. The author seems to want to show that every technological tool involves a loss but also a gain, and that what matters is striking a balance: taking advantage of its benefits without ceasing to exercise our human capacities.

    31. If you decide to program using only a chatbot (a practice known in English as “vibe coding”), you will constantly delegate not only the work but also the capacity to learn how to do it

      I interpret this as a warning from the author: by becoming dependent on artificial intelligence we could lose our capacity for critical thought and learning, in the same way that Socrates thought writing weakened memory.

    32. Writing was revolutionary, for all the reasons already mentioned; but artificial intelligence increasingly seems to be a “normal technology”

      Writing must remain a form of art, even in times of artificial intelligence. Although AI can generate texts or ideas, it lacks emotions, lived experience and consciousness, which are essential elements for creating true art. Writing is not only communicating, but expressing what we feel and think, transforming our experiences into words with human meaning. That is why we must learn to use technology as a supporting tool, without letting it replace our voice or our creativity. The balance lies in taking advantage of what AI offers while always contributing our personal, critical and sensitive touch, because only then does writing keep its artistic essence.

    33. A disciple of Plato, Aristotle, is sometimes described as one of the last people who knew everything there was to know. Not because he was abreast of all knowledge in general, but because in his time writing was not yet so widespread and the amount of knowledge an individual could potentially access was still very limited. Perhaps he knew everything there was to know in his world, but that world was quite small. He was probably unaware of knowledge from China or the Americas, but he could not know that he was unaware of it.

      Of course: saying that Aristotle was among the last to “know everything” makes sense if we understand that this “everything” was the knowledge accessible in his world. In Athens and the Hellenic sphere he could gather and organize a great deal of information, but beyond that horizon there was knowledge (for example from China or the Americas) that was not even imagined. That does not diminish his merit; rather, it reminds us that the breadth of knowledge is always limited by the tools and networks of its time, and that we should admire his achievement without forgetting intellectual modesty.

    34. Of course, many people use it anyway and will keep using it for activities that perhaps are not that important to them. We cannot deny that artificial intelligence is here to stay. The question is how it will stay.

      Artificial intelligence represents a new technological revolution that, like previous ones, demands from us a capacity for intelligent and critical adaptation. Throughout history, every advance (from the steam engine to the digital age) generated fear and resistance, but also drove positive transformations once we learned to integrate it without losing our human capacities. AI should not be seen as a replacement for thought, but as an extension of our possibilities. The real challenge is keeping the balance: using technology to boost creativity and perhaps productivity, without falling into passivity or absolute dependence. Adapting does not mean surrendering to the machine, but learning to live alongside it, using it with awareness and judgment so that we remain the protagonists of our own development.

    35. writing did atrophy our memory. Not everyone's, of course, but it undoubtedly relegated the act of remembering to the background, both individually

      That part where it says that "writing opened up the possibility of knowing far beyond what a human memory can hold" seems key to me. It is the best example that every technology has a trade-off. We lost some memory, but we gained the collective knowledge of humanity. The same goes for AI: the challenge is not to avoid it, but to use it to expand our intelligence without ceasing to exercise our critical thinking. It is about finding that middle ground between the tool and our autonomy.

    36. we, as people, will realize that acquiring skills is much more valuable than delegating them to a machine.

      This fragment really caught my attention because it reflects a fundamental idea about the role of artificial intelligence in our lives: the importance of continuing to develop our own human skills. In an age in which more and more tasks can be automated, this thought invites us not to lose sight of the value of learning, creativity and critical thinking.

      I find it interesting that the text not only criticizes technological dependence but also highlights the need for balance. Learning to use AI is important, but even more important is not letting it replace our capacity to think and create. As a student, this makes me reflect on how I want to use technology: not as a crutch, but as a tool to enhance my own skills.

    37. One of the criticisms often made of generative artificial intelligence (which, as I explained in another post, is a very specific branch of AI), and one that I make myself, is that it will atrophy our capacity to do and think about things critically. If you decide to program using only a chatbot (a practice known in English as “vibe coding”), you will constantly delegate not only the work but also the capacity to learn how to do it.

      This part leaves me thinking a lot. I feel there is something deeply true in it: when we let a machine think or create for us, we lose not only a task but a part of ourselves. It happens to me sometimes: when something goes wrong and I want to look for the quick solution on the internet or ask an AI to do it, I realize how easy it is to give in to convenience. But also how empty that “ease” can feel. Learning something new, making mistakes, even getting frustrated, has a value that a machine cannot give us. That struggle, that initial clumsiness, is where critical thinking is really formed, where curiosity awakens. If we let artificial intelligence do all our thinking for us, what does our mind become? Perhaps we will end up knowing more things, but feeling less. And that seems to me too great a loss, because what makes us human is not only what we know, but how we come to know it.

    38. Just as Socrates' criticism could not stop the success of writing, we could not stop the rise of social networks.

      This point seems essential to me to complement my earlier comments. As a society we have gone through so many changes that seemed hard to overcome or that we thought would change the way we see the world. And yes, the world has changed radically, but so far the exaggeration of thinking every change is the end of the world has gotten us nowhere: we always end up getting used to it, or even find a way to use these "extreme" changes to our advantage. Of course there are many downsides, and it is hard to adapt to something as new and different as artificial intelligence, but it is not going anywhere, and it is high time we looked for a way to use it to our advantage in a healthy way that does not hinder our progress. It is not about looking up every answer, but about leaning on it to broaden our knowledge.

    39. But rather one more technology, which will have its uses and applications, its consequences and effects, but will not change all of society from head to toe.

      Artificial intelligence is fascinating, and lately it has even become indecipherable to the observer. I agree with the author: it is not something that will completely change everything we know as "society", but it does lend itself to many misfortunes. I am not sure whether to consider it just one more technology, but I feel that little by little we will learn to live with artificial intelligence in everyday life.

    40. If absolutely everyone adopted it in every area of life, soon no one would have any skills

      Here the author makes a claim that I feel is worth commenting on. Of course, we cannot deny the undeniable: artificial intelligence is a very useful tool, but it has also contributed to people thinking less, or at least challenging themselves less. What I would like to comment on is the author's point that if artificial intelligence were used every day in every area, we would be left without skills. I do not entirely agree: using artificial intelligence correctly is itself a skill, and I even feel that if it were used in some areas, not all, as a support, it would increase efficiency and reduce the error rate at work.

    41. in doing so, the alternative is lost, which in this case is being able to do things ourselves

      The author says something very true: using AI has its price. Sometimes, without realizing it, we let it think for us, and that means we do not use our own heads as much. There is nothing wrong with leaning on it, but we should not leave all the work to it either. There are things one has to think through and solve oneself.

    42. we would no longer make an effort to remember long epic poems, or long lists of scientific facts. But neither would we make an effort to remember our own arguments on assorted disquisitions.

      He suggests that the dependence we have developed on technology reduces our intellectual effort, and from this we can pose an interesting question: is artificial intelligence making us mentally lazier, taking away our desire to think and question, or is it helping us in a positive way to improve our way of learning?

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      We thank the reviewers for their thoughtful and constructive feedback, which helped us strengthen the study on both the computational and biological sides. In response, we added substantial new analyses and results in a total of 26 new supplementary figures and a new supplementary note. Importantly, we demonstrated that our approach generalizes beyond tissue outcomes by predicting final-timepoint morphology clusters from early frames with good accuracy as new Figure 4C. Furthermore, we completely restructured and expanded the human expert panel: six experts now provided >30,000 annotations across evenly spaced time intervals, allowing us to benchmark human predictions against CNNs and classical models under comparable conditions. We verified that morphometric trajectories are robust: PCA-based reductions and nearest-neighbor checks confirmed that patterns seen in t-SNE/UMAP are genuine, not projection artifacts. To test whether z-stacks are required, we re-did all analyses with sum- and maximum-intensity projections across five slices; results were unchanged, showing that single-slice imaging is sufficient. From a bioinformatics perspective, we performed negative-label baselines, downsampling analyses to quantify dataset needs, and statistical tests confirming CNNs significantly outperform classical models. Biologically, we clarified that each well contains one organoid, further introduced the Latent Determination Horizon concept tied to expert visibility thresholds, and discussed limits in cross-experiment transfer alongside strategies for domain adaptation and adaptive interventions. Finally, we clarified methods, corrected terminology and a scaler leak, and made all code and raw data publicly available.

      Together, these revisions in our opinion provide an even clearer, more reproducible, and stronger case for the utility of predictive modeling in retinal organoid development.


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study presents predictive modeling for developmental outcome in retinal organoids based on high-content imaging. Specifically, it compares the predictive performance of an ensemble of deep learning models with classical machine learning based on morphometric image features and predictions from human experts for four different tasks: prediction of RPE presence and lens presence (at the end of development) as well as the respective sizes. It finds that the DL model outperforms the other approaches and is predictive from early timepoints on, strongly indicating a time-frame for important decision steps in the developmental trajectory.

      Response: We thank the reviewer for the constructive and thoughtful feedback. In response to the review as found below, we have made substantial revisions and additions to the manuscript. Specifically, we clarified key aspects of the experimental setup, changed terminology regarding training/validation/test sets, and restructured our human expert baseline analysis by collecting and integrating a substantially larger dataset of expert annotations according to suggestion. We introduced the Latent Determination Horizon concept with clearer rationale and grounding. Most importantly, we significantly expanded our interpretability analyses across three CNN architectures and eight attribution methods, providing comprehensive quantitative evaluations and supplementary figures that extend beyond the initial DenseNet121 examples (new Supplementary Figures S29-S37). We also ensured full reproducibility by making both code and raw data publicly available with documentation. While certain advanced interpretability methods (e.g., Discover) could not be integrated despite considerable effort, we believe the revised manuscript presents a robust, well-documented, and carefully qualified analysis of CNN predictions in retinal organoid development.

      Major comments: I find the paper over-all well written and easy to understand. The findings are relevant (see significance statement for details) and well supported. However, I have some remarks on the description and details of the experimental set-up, the data availability and reproducibility / re-usability of the data.

      1. Some details about the experimental set-up are unclear to me. In particular, it seems like there is a single organoid per well, as the manuscript does not mention any need for instance segmentation or tracking to distinguish organoids in the images and associate them over time. Is that correct? If yes, it should be explicitly stated so. Are there any specific steps in the organoid preparation necessary to avoid multiple organoids per well? Having multiple organoids per well would require the aforementioned image analysis steps (instance segmentation and tracking) and potentially add significant complexity to the analysis procedure, so this information is important to estimate the effort for setting up a similar approach in other organoid cultures (for example cancer organoids, where multiple organoids per well are common / may not be preventable in certain experimental settings).

      Response: We thank the reviewer for this question. We agree that these preprocessing steps would add more complexity to our presented preprocessing steps and would definitely be required in some organoid systems. In our experimental setup, there is only one organoid per well which forms spontaneously after cell seeding from (almost) all seeded cells. There are no additional steps necessary in order to ensure this behaviour in our setup. We amended the Methods section to now explicitly state this accordingly (paragraph ‘Organoid timelapse imaging’).

      The terminology used with respect to the test and validation set is contrary to the field, and reporting the results on the test set (should be called validation set), should be avoided since it is used to select models. In more detail: the terms "test set" and "validation set" (introduced in 213-221) are used with the opposite meaning to their typical use in the deep learning literature. Typically, the validation set refers to a separate split that is used to monitor convergence / avoid overfitting during training, and the test set refers to an external set that is used to evaluate the performance of trained models. The study uses these terms in an opposite manner, which becomes apparent from line 624: "best performing model ... judged by the loss of the test set.". Please exchange this terminology, it is confusing to a machine learning domain expert. Furthermore, the performance on the test set (should be called validation set) is typically not reported in graphs, as this data was used for model selection, and thus does not provide an unbiased estimate of model performance. I would remove the respective curves from Figures 3 and 4.

      Response: We are thankful for the reviewer's comments on this matter. Indeed, we were using terminology opposite to what is commonly used within the field. We have adjusted the Results, Discussion and Methods sections as well as the figures accordingly. Further, we added a corresponding disclaimer for the code base in the GitHub repository. However, we prefer not to remove the respective curves from the figures. We think that this information is crucial to interpret the variability in accuracy between organoids from the same experiments and organoids acquired from a different, independent experiment. The results suggest that the accuracy for organoids within the same experiments is still higher, indicating to users the potential accuracy drop resulting from independent experiments. As we think that this is crucial information for the interpretability of our results, we would like to still include it side-by-side with the test data in the figures.

      The experimental set-up for the human expert baseline is quite different to the evaluation of the machine learning models. The former is based on the annotation of 4,000 images by seven experts, the latter on a cross-validation experiment on a larger dataset. First of all, the details on the human expert labeling procedure are very sparse: I could only find a very short description in the paragraph 136-144, but did not find any further details in the methods section. Please add a methods section paragraph that explains in more detail how the images were chosen, how they were assigned to annotators, and if there was any redundancy in annotation, and if yes how this was resolved / evaluated. Second, the fact that the set-up for human experts and ML models is quite different means that these values are not quite comparable in a statistical sense. Ideally, human estimators would follow the same set-up as in ML (as in, evaluate the same test sets). However, this would likely be prohibitive in the required effort, so I think it's enough to state this fact clearly, for example by adding a comment on this to the captions of Figure 3 and 4.

      Response: We thank the reviewer for this constructive suggestion. We agree that the curves for human evaluations in the original draft were calculated differently from the curves for the classification algorithms, mostly because of what was feasible for dataset annotation at the time. To address this suggestion, we repeated the annotation with a substantially larger number of images and thus revised the full human expert annotation. Each of 6 human experts was asked to predict/interpret 6 images of each organoid within the full dataset. To select the images, we divided the time course (0-72h) into 6 evenly spaced intervals of 12 hours. For each interval, one image per organoid and human expert was randomly selected and assigned. This resulted in a total of 31,626 classified images (up from 4,000 in the original version of the manuscript); experts shared the same source intervals but were not assigned the same individual images. We then changed the calculation of the curves to be the same as for the classification analysis: F1 data were calculated for each experiment over 6 timeframes and all experts, and plotted within the respective figure. We have amended the Methods section accordingly and replaced the respective curves within Figures 3 and 4 and Supplementary Figures S1, S8 and S19.
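
      The selection procedure described in this response can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `assign_images`, the data layout, and the choice to draw distinct frames per expert within an interval are hypothetical.

```python
import random

def assign_images(organoid_frames, experts, n_intervals=6, t_max=72.0, seed=0):
    """Pick one frame per organoid, interval and expert (hypothetical helper).

    organoid_frames maps an organoid id to its list of timepoints in hours.
    Returns a list of (expert, organoid_id, interval_index, timepoint).
    """
    rng = random.Random(seed)
    width = t_max / n_intervals  # 12 h for a 0-72 h time course
    assignments = []
    for organoid_id, timepoints in organoid_frames.items():
        for k in range(n_intervals):
            lo, hi = k * width, (k + 1) * width
            last = k == n_intervals - 1
            # Half-open intervals; the last one also includes t_max itself.
            candidates = [t for t in timepoints if lo <= t < hi or (last and t == hi)]
            # Assumption: draw without replacement so that experts share the
            # interval but never the same individual frame.
            picks = rng.sample(candidates, min(len(experts), len(candidates)))
            for expert, t in zip(experts, picks):
                assignments.append((expert, organoid_id, k, t))
    return assignments

# Hypothetical organoid imaged every 30 minutes from 0 h to 72 h.
frames = {"organoid_1": [i * 0.5 for i in range(145)]}
result = assign_images(frames, ["expert_%d" % i for i in range(1, 7)])
```

      With six experts and six intervals this yields 36 assignments per organoid, provided each interval contains at least six frames.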

      It is unclear to me where the theoretical time window for the Latent Determination Horizon in Figure 5 (also mentioned in line 350) comes from? Please explain this in more detail and provide a citation for it.

      Response: We thank the reviewer for this important point. The Latent Determination Horizon (LDH) is a conceptual framework we introduced in this study to describe the theoretical period during which the eventual presence of a tissue outcome of interest (TOI) is being determined but not yet detectable. It is derived from two main observations in our dataset: (i) the inherent intra- and inter-experimental heterogeneity of organoid outcomes despite standardized protocols, and (ii) the progressive increase in predictive performance of our deep learning models over time, which suggests that informative morphological features only emerge gradually. We have now clarified this rationale in the manuscript (Discussion section) further and explicitly stated that the LDH is a concept we introduce here, rather than a previously described or cited term.

      The time window is defined by the TOI visibility, which is defined empirically as indicated by the results of our human expert panel (compare also Supplementary Figure S1).

      The interpretability analysis (Figure 4, 634-639) based on relevance backpropagation was performed based on DenseNet121 only. Why did you choose this model and not the ResNet / MobileNet? I think it is quite crucial to see if there are any differences between these models, as this would show how much weight can be put on the evidence from this analysis and I would suggest to add an additional experiment and supplementary figure on this.

      Response: We thank the reviewer for this important comment regarding the interpretability analysis and the choice of model. In the original submission, we restricted the attribution analyses shown in original Figure 4C to DenseNet121, which served as our main reference model throughout the study. This choice was made primarily for clarity and to avoid redundancy in the main figures, as all three convolutional neural network (CNN) architectures (DenseNet121, ResNet50, MobileNetV3_Large) achieved comparable classification performance on our tasks.

      In response to the reviewer’s concern, we have now extended the interpretability analyses to include all three CNN architectures and a total of eight attribution methods (new Supplementary Note 1). Specifically, we generated saliency maps for DenseNet121, ResNet50, and MobileNetV3_Large across multiple time points and evaluated them using a systematic set of metrics: pairwise method agreement within each model (new Supplementary Figure S29), cross-model consistency per method (new Supplementary Figure S34), entropy and diffusion of saliencies over time (new Supplementary Figure S35), regional voting overlap across methods (new Supplementary Figure S36), and spatial drift of saliency centers of mass (new Supplementary Figure S37).

      These pooled analyses consistently showed that attribution methods differ markedly in the regions they prioritize, but that their relative behaviors were mostly stable across the three CNN architectures. For example, Grad-CAM and Guided Grad-CAM exhibited strong internal agreement and progressively focused relevance into smaller regions, while gradient-based methods such as DeepLiftSHAP and Integrated Gradients maintained broader and more diffuse relevance patterns but were the most consistent across models. Perturbation-based methods like Feature Ablation and Kernel SHAP often showed decreasing entropy and higher spatial drift, again similarly across architectures.
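
      To make two of the metrics named above more concrete, the sketch below computes the entropy of a saliency map and the drift of its center of mass between two timepoints. The exact definitions used in the manuscript are not given in this response, so the formulas here (Shannon entropy over normalized absolute saliency, Euclidean distance between centers of mass) are assumptions chosen for illustration.

```python
import math

def saliency_entropy(saliency):
    """Shannon entropy (bits) of a 2D saliency map, normalized to sum to 1.

    High values mean diffuse relevance, low values mean focused relevance.
    """
    weights = [abs(v) for row in saliency for v in row]
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)

def center_of_mass(saliency):
    """(row, column) center of mass of the absolute saliency values."""
    total = sum(abs(v) for row in saliency for v in row)
    if total == 0:
        return None
    r = sum(i * abs(v) for i, row in enumerate(saliency) for v in row) / total
    c = sum(j * abs(v) for row in saliency for j, v in enumerate(row)) / total
    return (r, c)

def spatial_drift(saliency_a, saliency_b):
    """Euclidean distance between the centers of mass of two non-empty maps."""
    (r0, c0), (r1, c1) = center_of_mass(saliency_a), center_of_mass(saliency_b)
    return math.hypot(r1 - r0, c1 - c0)

diffuse = [[1.0, 1.0], [1.0, 1.0]]    # relevance spread over all pixels
focused = [[1.0, 0.0], [0.0, 0.0]]    # relevance on a single pixel
```

      A uniform map over four pixels has the maximal entropy of 2 bits, while a single-pixel map has entropy 0; falling entropy over time therefore corresponds to relevance concentrating into smaller regions.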

      To further address the reviewer’s point, we visualized the organoid depicted in original Figure 4C across all three CNNs and all eight attribution methods (new Supplementary Figures S30-S33). These comparisons confirm and extend analysis of the qualitative patterns described in original Figure 4C and show that they are not specific to DenseNet121, but are representative of the general behavior across architectures.

      In sum, we observed notable differences in how relevance was assigned and how consistently these assignments aligned. Highlighted organoid patterns were not consistent enough across attribution methods for us to be comfortable basing unequivocal biological interpretation on them. Nevertheless, we believe that the analyses in response to the reviewer’s suggestions (new Supplementary Note 1 and new Supplementary Figures S29-S37) add valuable context to what can be expected from machine learning models in an organoid research setting.

      As we did not base further unequivocal biological claims on the relevance backpropagation, we decided to move the analyses to the Supporting Information and now show a new model predicting organoid morphology by morphometrics clustering at the final imaging timepoint in new Figure 4C in line with suggestions by Reviewer #3.

      The code referenced in the code availability statement is not yet present. Please make it available and ensure a good documentation for reproducibility. Similarly, it is unclear to me what is meant by "The data that supports the findings will be made available on HeiDoc". Does this only refer to the intermediate results used for statistical analysis? I would also recommend to make the image data of this study available. This could for example be done through a dedicated data deposition service such as BioImageArchive or BioStudies, or with less effort via zenodo. This would ensure both reproducibility as well as potential re-use of the data. I think the latter point is quite interesting in this context; as the authors state themselves it is unclear if prediction of the TOIs isn't even possible at an earlier point that could be achieved through model advances, which could be studied by making this data available.

      Response: We thank the reviewer for this comment. We have now made the repository and raw data public on the suggested platform (Zenodo) and apologize for this oversight. The links are contained within the GitHub repository which is stated in the manuscript under “Data availability”.

      Minor comments:

      Line 315: Please add a citation for relevance backpropagation here.

      Response: We have included citations for all relevance backpropagation methods used in the paper.

      Line 591: There seems to be a typo: "[...] classification of binary classification [...]"

      Response: Corrected as suggested.

      Line 608: "[...] where the images of individual organoids served as groups [...]" It is unclear to me what this means.

      Response: We wanted to express that organoid images belonging to one organoid were assigned in full to a training/validation set. We have now stated this more clearly in the Methods section.
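
      A grouped split of this kind can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `group_split`, the validation fraction, and the data layout are hypothetical. Libraries such as scikit-learn offer grouped splitters (for example `GroupKFold`) for the same purpose.

```python
import random
from collections import defaultdict

def group_split(image_ids, organoid_of, val_fraction=0.2, seed=0):
    """Split images so that all frames of one organoid land in the same set.

    organoid_of maps an image id to the id of the organoid it shows.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for image_id in image_ids:
        groups[organoid_of[image_id]].append(image_id)

    organoids = sorted(groups)
    rng.shuffle(organoids)
    n_val = max(1, round(val_fraction * len(organoids)))
    val_organoids = set(organoids[:n_val])

    train = [img for o in organoids if o not in val_organoids for img in groups[o]]
    val = [img for o in organoids if o in val_organoids for img in groups[o]]
    return train, val

# Hypothetical dataset: 10 organoids with 5 timelapse frames each.
organoid_of = {f"org{o}_t{t}": f"org{o}" for o in range(10) for t in range(5)}
train_ids, val_ids = group_split(list(organoid_of), organoid_of)
```

      Splitting by organoid rather than by image prevents near-identical consecutive frames of one organoid from appearing on both sides of the split.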

      Reviewer #1 (Significance (Required)):

      General assessment: This study demonstrates that (retinal) organoid development can be predicted from early timepoints with deep learning, where these cannot be discerned by human experts or simpler machine learning models. This fact is very interesting in itself due to its implication for organoid development, and could provide a valuable tool for molecular analysis of different organoid populations, as outlined by the authors. The contribution could be strengthened by providing a more thorough investigation of what features in the image are predictive at early timepoints, using a more sophisticated approach than relevance backprop, e.g. Discover (https://www.nature.com/articles/s41467-024-51136-9). This could provide further biological insight into the underlying developmental processes and enhance the understanding of retinal organoid development.

      Response: We thank the reviewer for this assessment and suggestion. We agree that identifying image features predictive at early timepoints would add important biological context. We therefore attempted to apply Discover to our dataset. However, we were unable to get the system to run successfully. After considerable effort, we concluded that this approach could not be integrated into our current analysis. Instead, we report our substantially expanded results obtained with relevance backpropagation, which provided the most interpretable and reproducible insights for our study as described above (New Supplementary Note 1, new Supplementary Figures S29-S37).

      Advance: similar studies that predict developmental outcome based on image data, for example cell proliferation or developmental outcome, exist. However, to the best of my knowledge, this study is the first to apply such a methodology to organoids and convincingly shows its efficacy and argues for its potential practical benefits. It thus constitutes a solid technical advance that could be especially impactful if it could be translated to other organoid systems in the future.

      Response: We thank the reviewer for this positive assessment of our work and for highlighting its novelty and potential impact. We are encouraged that the reviewer recognizes the value of applying predictive modeling to organoids and the opportunities this creates for translation to other organoid systems.

      Audience: This research is of interest to a technical audience. It will be of immediate interest to researchers working on retinal organoids, who could adapt and use the proposed system to support experiments by better distinguishing organoids during development. To enable this application, code and data availability should be ensured (see above comments on reproducibility). It is also of interest to researchers in other organoid systems, who may be able to adapt the methodology to different developmental outcome predictions. Finally, it may also be of interest to image analysis / deep learning researchers as a dataset to improve architectures for predictive time series modeling.

      My research background: I am an expert in computer vision and deep learning for biomedical imaging, especially in microscopy. I have some experience developing image analysis for (cancer) organoids. I don't have any experience on the wet lab side of this work.

      Response: We thank the reviewer for this encouraging feedback and for recognizing the broad relevance of our work across retinal organoid research, other organoid systems, and the image analysis community. We are pleased that the potential utility of our dataset and methodology is appreciated by experts in computer vision and biomedical imaging. We have now made the repository and raw data public and apologize for this oversight. The links are provided in the manuscript under “Data availability”.

      Constantin Pape


      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from Medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence in time points earlier than either morphometrics or expert predictions. Our review is formatted based on the Review Commons recommendation.

      Response: We thank the reviewer for the detailed and constructive feedback, which has greatly improved the clarity and rigor of our manuscript. In response, we have corrected a potential data leakage issue, re-ran the affected analyses, and confirmed that results remain unchanged. We clarified the use of data augmentation in CNN training, tempered some claims throughout the text, and provided stronger justification for our discretization approach together with new supplementary analyses (New Supplementary Figures S26, S27). We substantially expanded our interpretability analyses across three CNN architectures and eight attribution methods, quantified their consistency and differences (new Supplementary Figures S29, S34-S37, new Supplementary Note 1), and added comprehensive visualizations (New S30-S33). We also addressed technical artifact controls, provided downsampling analyses to support our statement on sample size sufficiency (new Supplementary Figure S28), and included negative-control baselines with shuffled labels in Figures 3 and 4. Furthermore, we improved the clarity of terminology, figures, and methodological descriptions, and we have now made both code and raw data publicly available with documentation. Together, we believe these changes further strengthen the robustness, reproducibility, and interpretability of our study while carefully qualifying the claims.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?

      Response: We thank the reviewer for raising these important methodological points. As Reviewer #1 correctly noted, our use of the terms validation and test may have contributed to confusion. To clarify: in the original analysis the scalers were fitted on the training and validation data and then applied to the test data. This indeed constitutes a form of data leakage. We have corrected the respective code, re-ran all analyses that were potentially affected, and did not observe any meaningful change in the reported results. The Methods section has been amended to clarify this important detail.
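
      The difference between a leaky and a leak-free scaling step can be illustrated with a small sketch. The class `MinMaxScaler1D` and the toy values are hypothetical, and min-max scaling is assumed only because the quoted Methods text speaks of scaling "to the same range"; the actual scaler used in the study is not specified here.

```python
class MinMaxScaler1D:
    """Tiny min-max scaler for one feature (illustrative stand-in)."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = (self.hi - self.lo) or 1.0  # guard against constant features
        return [(v - self.lo) / span for v in values]

train = [2.0, 4.0, 6.0]
held_out = [1.0, 10.0]

# Leak-free: statistics come from the training data only and are then
# applied unchanged to the held-out data.
clean_scaler = MinMaxScaler1D().fit(train)
held_out_clean = clean_scaler.transform(held_out)

# Leaky: the held-out data contribute to the fitted range, so information
# about them reaches the preprocessing step before evaluation.
leaky_scaler = MinMaxScaler1D().fit(train + held_out)
held_out_leaky = leaky_scaler.transform(held_out)
```

      In the leak-free variant, held-out values may fall outside the range 0 to 1; this is expected and reflects that the scaler has never seen them.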

      For the neural networks, each image was normalized independently (per image), without using dataset-level statistics, thereby avoiding any risk of data leakage.
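
      Per-image normalization of this kind could look like the sketch below. The response does not state which normalization was applied, so zero-mean, unit-variance scaling is an assumption, and `normalize_image` is a hypothetical name.

```python
import math

def normalize_image(pixels):
    """Scale one image (list of rows) to zero mean and unit variance.

    Only statistics of this single image are used, so nothing is shared
    between training and held-out images.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    variance = sum((p - mean) ** 2 for p in flat) / len(flat)
    std = math.sqrt(variance) or 1.0  # constant images map to all zeros
    return [[(p - mean) / std for p in row] for row in pixels]

image = [[0.0, 2.0], [4.0, 6.0]]
normalized = normalize_image(image)
```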

      Regarding data augmentation, the convolutional neural network was indeed trained with augmentations. Early experiments without augmentation led to severe overfitting, confirming that the performance advantage would not hold without artificially increasing the effective sample size. We have added a clarifying statement in the Methods section to make this explicit.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Response: We believe our additionally performed computational experiments qualify all the claims we make in the revised version of the manuscript.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that regression frameworks can, in principle, preserve the full resolution of continuous outcome variables. However, in our setting we deliberately chose a discretization approach. First, the discretized outcome categories correspond to ranges of tissue sizes that are biologically meaningful and allow direct comparison to expert annotations. In practice, human experts also tend to judge tissue presence and size in categorical rather than strictly continuous terms, which was mirrored by our human expert annotation strategy. As we aimed to compare deep learning with classical machine learning models and with expert annotations across the same prediction tasks, a categorical outcome formulation provided the most consistent and fair framework. Secondly, the underlying outcome variables did not follow a normal distribution, but instead exhibited a skewed and heterogeneous spread. Regression models trained on such distributions often show biases toward the most frequent value ranges, which may obscure less common but biologically important outcomes. Discretization mitigated this issue by balancing the prediction task across defined size categories.

      In line with the reviewer’s request, we have now analyzed the performance in relation to the distance of each sample from the bin center. These results are provided as new Supplementary Figures S26 and S27. Interestingly, for the classical machine learning classifiers, F1 scores tended to be somewhat higher for samples close to bin edges. For the convolutional neural networks, however, F1 scores were more evenly distributed across distances from bin centers. While the reason for this difference remains unclear, the analysis demonstrates that the discretization did not obscure predictive signals in either framework. We have amended the results section accordingly.
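
      One way to express the distance of a sample from its bin center is sketched below. The bin edges, the normalization to the half-width of the bin, and the name `bin_center_distance` are assumptions for illustration; the manuscript's actual bin boundaries and distance measure are not given in this response.

```python
import bisect

def bin_center_distance(value, edges):
    """Return (bin_index, distance) for a value and sorted bin edges.

    The distance is normalized so that 0 is the bin center and 1 is a bin
    edge; values outside the outermost edges give distances above 1.
    """
    k = bisect.bisect_right(edges, value) - 1
    k = min(max(k, 0), len(edges) - 2)  # clamp to a valid bin
    lo, hi = edges[k], edges[k + 1]
    center = (lo + hi) / 2
    half_width = (hi - lo) / 2
    return k, abs(value - center) / half_width

# Hypothetical edges defining four size bins.
edges = [0.0, 10.0, 20.0, 30.0, 40.0]
central = bin_center_distance(5.0, edges)     # exactly at a bin center
near_edge = bin_center_distance(19.0, edges)  # close to the upper edge of bin 1
```

      Per-sample distances computed this way can then be grouped into ranges and an F1 score calculated for each range, which is the kind of comparison the response describes for Supplementary Figures S26 and S27.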

      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.

      Response: We thank the reviewer for raising these important concerns. In the initial version we showed examples of relevance backpropagation that suggested CNNs rely on visible RPE or lens tissue for their predictions (original Figure 4C). Following the reviewer’s comment, we expanded the analysis extensively across all models and attribution methods (compare new Supplementary Note 1), and quantified agreement, consistency, entropy, regional overlap, and drift (new Supplementary Figures S29 and S34-S37), as well as providing comprehensive visualizations across models and methods (new Supplementary Figures S30-S33).

      This extended analysis showed that attribution methods behave very differently from each other, but consistently so across the three CNN architectures. Each method displayed characteristic patterns, for example in entropy or center-of-mass drift, but the overlap between methods was generally low. While integrated gradients and DeepLiftSHAP tended to concentrate on tissue regions, other methods produced broader or shifting relevance patterns, and overall we could not establish robust or interpretable signals from a biological point of view that would support stronger conclusions.

      We have therefore revised the text to focus on descriptive results only, without making claims about early structural information or tissue-specific cues being used by the networks. We also added missing scale bars and clarified methodological details. Together, the revised section now reflects the extensive work performed while remaining cautious about what can and cannot be inferred from saliency methods in this setting.

      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?

      Response: We thank the reviewer for this comment. We have not performed any adjustment beyond manual quality control post organoid seeding. The aforementioned removal of technical artifacts included, among others, seeding at the same time of day, seeding and cell processing by the same investigator according to a standardized protocol, usage of reproducible chemicals (same lot, frozen only once, etc.) and temperature control during image acquisition. We adhered strictly to internal, previously published workflows that aim to reduce any variability due to technical variations during cell harvesting, organoid preparation and imaging. We have clarified this important point in the Methods section.

      • In line 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim? The authors could perform a downsampling analysis to determine tradeoff between performance and sample size.

      Response: We thank the reviewer for this important comment. To clarify, our statement regarding the sufficiency of ~500 organoids was based on a downsampling-style analysis we had already performed. In this analysis, we systematically reduced the number of experiments used for training and assessed predictive performance for both CNN- and classifier-based approaches (former Supplementary Figure S11, new Supplementary Figure S28). For CNNs, performance curves plateaued at approximately six experiments (corresponding to ~500 organoids), suggesting that increasing the sample size further only marginally improved prediction accuracy. In contrast, we did not observe a clear plateau for the machine learning classifiers, indicating that these models can achieve comparable performance with fewer training experiments. We have revised the manuscript text to clarify that this conclusion is derived from these analyses, and continue to include Supplementary Figure S11 as new Supplementary Figure S28 for transparency (compare Supplementary Note 1).
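
      As an illustration only, a downsampling analysis of this kind can be sketched as follows. The synthetic data, the logistic regression model, the single held-out experiment and all variable names are assumptions made for this sketch; they are not taken from the manuscript's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
n_experiments, per_experiment, n_features = 10, 80, 8

# Synthetic stand-in data: each row is one organoid, grouped by experiment.
weights = rng.normal(size=n_features)
X = rng.normal(size=(n_experiments * per_experiment, n_features))
y = (X @ weights + rng.normal(size=len(X)) > 0).astype(int)
experiment = np.repeat(np.arange(n_experiments), per_experiment)

# Hold out the last experiment entirely for evaluation.
test_mask = experiment == n_experiments - 1

curve = {}
for k in range(1, n_experiments):
    # Train on the first k experiments only, then score on the held-out one.
    train_mask = experiment < k
    model = LogisticRegression(max_iter=1000).fit(X[train_mask], y[train_mask])
    curve[k] = f1_score(y[test_mask], model.predict(X[test_mask]))

for k, score in curve.items():
    print(f"{k} training experiments: F1 = {score:.2f}")
```

      Plotting `curve` against `k` gives the performance-versus-sample-size curve in which a plateau, if present, can be read off.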

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments. Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Response: We confirm that the suggested experiments are realistic in terms of time and resources and have been able to complete them within 6 months.

      Are the data and the methods presented in such a way that they can be reproduced? No, the code is not currently available. We were not able to review the source code.

      Response: We have now made the repository public. We apologize for this initial oversight. The links are provided in the revised version of the manuscript under “Data availability”.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.

      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.

      Response: We thank the reviewer for this comment. We have calculated the respective curves with neural networks and machine learning classifiers that were trained on data with shuffled labels and have included these results as a separate curve in the respective Figures 3 and 4. We have also amended the Methods section accordingly.
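
      A minimal sketch of such a shuffled-label negative control is given below. The synthetic feature table, the random forest and the train/test split are assumptions for illustration and do not reproduce the models or data used in the manuscript.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a morphometrics table: 400 organoids x 10 features,
# with the label depending on the first two features only.
X = rng.normal(size=(400, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

def fit_and_score(train_labels):
    """Train on the given labels and report F1 on the untouched test labels."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X_train, train_labels)
    return f1_score(y_test, model.predict(X_test))

f1_real = fit_and_score(y_train)
# Negative control: permuting the training labels destroys any feature-label link.
f1_shuffled = fit_and_score(rng.permutation(y_train))

print(f"real labels: {f1_real:.2f}, shuffled labels: {f1_shuffled:.2f}")
```

      If the shuffled-label score approached the real-label score, that would point to leakage or an inflated metric rather than learned signal.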

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define a comprehensive dataset, significant, and provide clarity on their morphometrics feature space. They should elaborate on what they mean by "confounding factor of heterogeneity".

      Response: We thank the reviewer for highlighting the need to clarify terminology. We have revised the manuscript accordingly. Specifically, we now explicitly define comprehensive dataset as longitudinal brightfield imaging of ~1,000 organoids from 11 independent experiments, imaged every 30 minutes over several days, covering a wide range of developmental outcomes at high temporal resolution. Furthermore, we replaced the term significantly with wording that avoids implying statistical significance, where appropriate. We have clarified the morphometrics feature space in the Methods section in a more detailed fashion, describing the custom parameters that we used to enhance the regionprops_table function of skimage.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • Figure 2C describes a distance between what? The y axis is likely too simple. Same confusion over Figure 2D. Was distance computed based on tSNE coordinates?

      Response: We thank the reviewer for pointing out this potential source of confusion. The distances shown in original Figures 2C and 2D were not calculated in tSNE space. Instead, morphometrics features were first Z-scaled, and then dimensionality reduction by PCA was applied, with the first 20 principal components retaining ~93% of the variance. Euclidean distances were subsequently computed in this 20-dimensional PC space. For inter-organoid distances (Figure 2C), we calculated mean pairwise Euclidean distances between all organoids at each imaging time point, capturing the global divergence of organoid morphologies over time in an experiment-specific manner. For intra-organoid distances (Figure 2D), we calculated Euclidean distances between consecutive time points (n vs. n+1) for each individual organoid, thereby quantifying the extent of morphological change within organoids over time. We have revised the Figure legend and Methods section to make these definitions clearer.
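
      The distance definitions above can be sketched as follows. This is a minimal illustration on synthetic random-walk data; the array layout (organoid, time point, feature) and the choice to fit the scaler and PCA once across all time points are assumptions of the sketch, not statements about the manuscript's implementation.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_organoids, n_timepoints, n_features = 12, 6, 165

# Synthetic features, shape (organoid, timepoint, feature):
# each organoid follows its own random walk through feature space.
features = rng.normal(size=(n_organoids, n_timepoints, n_features)).cumsum(axis=1)

# Z-scale every feature, then keep the first 20 principal components.
flat = features.reshape(-1, n_features)
scaled = StandardScaler().fit_transform(flat)
pca = PCA(n_components=20)
pcs = pca.fit_transform(scaled).reshape(n_organoids, n_timepoints, 20)

# Inter-organoid distance: mean pairwise Euclidean distance at each time point.
inter = np.array([pdist(pcs[:, t, :]).mean() for t in range(n_timepoints)])

# Intra-organoid distance: Euclidean distance between consecutive time points
# (n vs. n+1) for each organoid.
intra = np.linalg.norm(np.diff(pcs, axis=1), axis=2)

print("retained variance:", pca.explained_variance_ratio_.sum())
print("inter-organoid distance per time point:", inter.round(2))
```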

      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.

      Response: We thank the reviewer for this comment. In our initial machine learning analyses, we systematically benchmarked a broad set of classifiers on the morphometrics feature space, using cross-validation and hyperparameter tuning where appropriate. The classifiers that we ultimately focused on were those that consistently achieved the best performance in these comparisons. This process is described in the Methods and summarized in the Supplementary Figures S4 and S15 (for sum- and maximum-intensity z-projections new Supplementary Figures S5/6 and S16/17), which show the results of the benchmarking. We have clarified the text to state that the selected classifiers were chosen on the basis of their superior performance in these evaluations.

      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating?

      Can the authors' approach predict an entire non-emergence experiment? The authors tried to standardize the protocol, but ultimately, if it is yielding this much heterogeneity, then how well it will actually generalize to a different lab is a limitation.

      Response: We thank the reviewer for these thoughtful questions. The retinal organoids in our study were embedded in low concentrations of Matrigel and remained relatively stable in position throughout imaging. We did not observe substantial displacement or lateral movement of organoids, and no systematic rotation could be detected in our dataset. Small morphological rearrangements within organoids were observed, but the gross positioning of organoids within the wells remained consistent across time-lapse recordings.

      Regarding generalization across laboratories, we agree with the reviewer that this is an important limitation. While we minimized technical variability by adhering to a highly standardized, published protocol (see Methods), considerable heterogeneity remained at both intra- and inter-experimental levels. This variability likely reflects inherent properties of the system, similar to reports in the literature across organoid systems, rather than technical artifacts, and poses a potential challenge for applying our models to independently generated datasets. We therefore highlight the need for future work to test the robustness of our models across laboratories, which will be essential to determine the true generalizability of our approach. We have amended the Discussion accordingly.

      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.

      Response: We thank the reviewer for this comment. We agree that the individual image analysis steps we used, such as morphometric feature extraction, are based on well-established algorithms. By referring to “advanced image analysis,” we intended to highlight not the novelty of each single algorithm, but rather the way in which we systematically combined a large number of quantitative parameters and leveraged them through machine learning models to generate predictive insights into organoid development.

      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.

      Response: We thank the reviewer for pointing out this ambiguity. To clarify, the ground truth definition at the final time point was established by two experts who annotated all organoids. These two annotators were part of the larger group of six experts who contributed to the earlier human expert annotation tasks. Thus, while six experts provided annotations for subsets of images during the expert prediction experiments, the final annotation for every single organoid at its last time frame was consistently performed by the same two experts to ensure a uniform ground truth. We have amended this in the revised manuscript to make this distinction clear.

      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?

      Response: We thank the reviewer for this comment. To clarify, we extracted 165 morphometric features per segmented organoid, combining standard scikit-image region properties with custom implementations (e.g., blur quantified as the variance of the Laplace filter response within the organoid mask). All metrics, including blur, were calculated per segmented organoid rather than per full field of view. This broad feature space was deliberately chosen to capture size, shape, and intensity distributions in a comprehensive and unbiased manner. A more detailed description of the preprocessing steps, the full feature list, and the exact code implementations are now provided in the Methods section (“Large-scale time-lapse Image analysis”) of the revised manuscript as well as in the source code GitHub repository.
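
      For readers unfamiliar with this blur measure, a minimal sketch is given below. The function name `laplacian_variance`, the use of `scipy.ndimage.laplace` and the synthetic test image are assumptions for illustration; the implementation in the repository may differ.

```python
import numpy as np
from scipy import ndimage as ndi

def laplacian_variance(regionmask, intensity_image):
    """Variance of the Laplace filter response inside the mask (higher = sharper)."""
    response = ndi.laplace(intensity_image.astype(float))
    return float(response[regionmask].var())

rng = np.random.default_rng(0)

# Synthetic image with fine detail, and a Gaussian-blurred copy of it.
sharp = rng.random((64, 64))
blurred = ndi.gaussian_filter(sharp, sigma=2)

# Hypothetical organoid mask: a square region in the image centre.
mask = np.zeros((64, 64), dtype=bool)
mask[16:48, 16:48] = True

sharp_score = laplacian_variance(mask, sharp)
blurred_score = laplacian_variance(mask, blurred)
print(f"sharp: {sharp_score:.4f}, blurred: {blurred_score:.4f}")
```

      The `(regionmask, intensity_image)` signature was chosen because it is the form that scikit-image expects for callables passed through the `extra_properties` argument of `regionprops_table`, which is one way such a custom metric could be computed per segmented organoid.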

      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?

      Response: We thank the reviewer for this comment. The reported image count includes slice 3 only, which we based our models on. The five z-slices that we used to create the MAX- and SUM-intensity z-projections would increase this number 5-fold. While we agree that the number of organoids and time points are highly informative metrics and have provided these details in the manuscript, we also believe that reporting the image count is valuable, as it directly reflects the size of the dataset processed by our analysis pipelines. For this reason, we prefer to keep the current description.

      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?

      Response: We thank the reviewer for this valuable suggestion. To address this point, we repeated all analyses using both sum- and maximum-intensity z-projections and have included the results as new Supplementary Figures S8-S10, S13/S14 for TOI emergence and new Supplementary Figures S19-S21, S24/S25 for TOI sizes (classifier benchmarking and hyperparameter tuning in new Supplementary Figures S5/S6 and S16/S17). These additional analyses did not reveal a noticeable improvement in performance, suggesting that projections incorporating all slices are not strictly necessary in our setting. An analysis that included all five z-slices separately for classification would indeed be of interest, but was not feasible within the scope of this study, as it would substantially increase the computational demands beyond the available resources and timeframe.
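
      For clarity, the three image representations compared here can be sketched as follows. The random five-slice stack and the array layout (slice, row, column) are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic z-stack: 5 slices of a 32 x 32 brightfield image, layout (z, y, x).
stack = rng.integers(0, 256, size=(5, 32, 32), dtype=np.uint8)

# Single middle slice (slice 3 of 5, i.e. index 2).
middle_slice = stack[2]

# Maximum-intensity projection: brightest value along z for every pixel.
max_projection = stack.max(axis=0)

# Sum-intensity projection: cast to float first so the summed values
# are not tied to the 8-bit range of the input.
sum_projection = stack.astype(np.float64).sum(axis=0)

print(middle_slice.shape, max_projection.shape, sum_projection.shape)
```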

      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?

      Response: We thank the reviewer for raising this important point. The current study relied on expert visual review, which is time-intensive, but our findings suggest several ways to streamline future work. For instance, model-assisted prelabeling could be used to automatically accept high-confidence cases while routing only uncertain cases to experts. Active sampling strategies, focusing expert review on boundary cases or rare classes, as well as programmatic checks from morphometrics (e.g., blur or contrast to flag low-quality frames), could further reduce effort. Consensus annotation could be reserved only for cases where the model and expert disagree or confidence is low. Finally, new experiments could be bootstrapped with a small seed set of annotated organoids for fine-tuning before switching to such a model-assisted workflow. These possibilities are enabled by our approach, where organoids are imaged individually, morphometrics provide automated quality indicators, and the CNN achieves reliable performance at early developmental stages, making model-in-the-loop annotation a feasible and efficient strategy for future studies. We have added a clarifying paragraph to the Discussion accordingly.
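
      The model-assisted prelabeling idea can be sketched as a simple confidence gate. The function name `route_for_review`, the 0.95 threshold and the example probabilities are hypothetical and would need to be calibrated on real data.

```python
import numpy as np

def route_for_review(probabilities, accept_threshold=0.95):
    """Split binary predictions into auto-accepted and expert-review cases.

    probabilities: predicted probability that the tissue of interest is present.
    Returns the predicted labels and a boolean mask of auto-accepted cases.
    """
    probabilities = np.asarray(probabilities, dtype=float)
    # Confidence of the predicted class, whichever class that is.
    confidence = np.maximum(probabilities, 1.0 - probabilities)
    labels = (probabilities >= 0.5).astype(int)
    auto_accept = confidence >= accept_threshold
    return labels, auto_accept

probs = [0.99, 0.50, 0.03, 0.80, 0.97]
labels, auto_accept = route_for_review(probs)

for p, label, accepted in zip(probs, labels, auto_accept):
    target = "auto-accept" if accepted else "expert review"
    print(f"p={p:.2f} -> label {label}, {target}")
```

      Only the cases with `auto_accept` equal to `False` would be routed to an expert, which is where the reduction in annotation effort would come from.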

      Reviewer #2 (Significance (Required)):

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field. The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids)

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.

      • The work uses standard convolutional neural networks.

      Response: We thank the reviewer for this assessment. We agree that our work represents one of the early attempts in this direction, applying a straightforward pipeline with standard convolutional neural networks, and we appreciate the reviewer’s acknowledgment of how the study has been placed in context within the Introduction.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.

      • Retinal organoid biologists

      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Response: We thank the reviewer for outlining the relevant audiences. We agree that the reported findings will be of interest to data scientists working on image-based profiling, retinal organoid biologists, and more broadly to organoid researchers facing long culture times with uncertain developmental outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics

      • Organoid image analysis

      • Computational biology

      • Cell biology

      • Data science/machine learning

      • Software

      This is a signed review:

      Gregory P. Way, PhD

      Erik Serrano

      Jenna Tomkinson

      Michael J. Lippincott

      Cameron Mattson

      Department of Biomedical Informatics, University of Colorado


      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      Summary:

      This manuscript by Afting et al. addresses the challenge of heterogeneity in retinal organoid development by using deep learning to predict eventual tissue outcomes from early-stage images. The central hypothesis is that deep learning can forecast which tissues an organoid will form (specifically retinal pigmented epithelium, RPE, and lens) well before those tissues become visibly apparent. To test this, the authors assembled a large-scale time-lapse imaging dataset of ~1,000 retinal organoids (~100,000 images) with expert annotations of tissue outcomes. They characterized the variability in organoid morphology and tissue formation over time, focusing on two tissues: RPE (which requires induction) and lens (which appears spontaneously). The core finding is that a deep learning model can accurately predict the emergence and size of RPE and lens in individual organoids at very early developmental stages. Notably, a convolutional neural network (CNN) ensemble achieved high predictive performance (F1-scores ~0.85-0.9) hours before the tissues were visible, significantly outperforming human experts and classical image-analysis-based classifiers. This approach effectively bypasses the issue of stochastic developmental heterogeneity and defines an early "determination window" for fate decisions. Overall, the study demonstrates a proof-of-concept that artificial intelligence can forecast organoid differentiation outcomes non-invasively, which could revolutionize how organoid experiments are analyzed and interpreted.

      Recommendation:

      While this manuscript addresses an important and timely scientific question using innovative deep learning methodologies, it cannot be recommended for acceptance in its present form. The authors must thoroughly address several critical limitations highlighted in this report. In particular, significant issues remain regarding the generalizability of the predictive models across different experimental conditions, the interpretability of deep learning predictions, and the use of Euclidean distance metrics in high-dimensional morphometric spaces, potentially leading to distorted interpretations of organoid heterogeneity. These revisions are essential for validating the general applicability of their approach and enhancing biological interpretability. After thoroughly addressing these concerns, the manuscript may become suitable for future consideration.

      Response: We thank the reviewer for the thoughtful and constructive comments. In response, we expanded our analyses in several key ways. We clarified limitations regarding external datasets. Interpretability analyses were greatly extended across three CNN architectures and eight attribution methods (new Supplementary Figures S29-S37, new Supplementary Note 1), showing consistent but method-specific behaviors; as no reproducible biologically interpretable signals emerged, we now present these results descriptively and clearly state their limitations. We further demonstrated the flexibility of our framework by predicting morphometric clusters in addition to tissue outcomes (new Figure 4C), confirmed robustness of the morphometrics space using PCA and nearest-neighbor analyses (new Supplementary Figure S3), and added statistical tests confirming CNNs significantly outperform classical classifiers (Supplementary File 1). Finally, we made all code and raw data publicly available, clarified species context, and added forward-looking discussion on adaptive interventions. We believe these revisions now further improve the rigor and clarity of our work.

      Major Issues (with Suggestions):

      1. Generalization to Other Batches or Protocols: The drop in performance on independent validation experiments suggests the model may partially overfit to specific experimental conditions. A major concern is how well this approach would work on organoids from a different batch or produced by a slightly different differentiation protocol. Suggestion: The authors should clarify the extent of variability between their "independent experiment" and training data (e.g., were these done months apart, with different cell lines or minor protocol tweaks?). To strengthen confidence in the model's robustness, I recommend testing the trained model on one or more truly external datasets, if available (for instance, organoids generated in a separate lab or under a modified protocol). Even a modest analysis showing the model can be adapted (via transfer learning or re-training) to another dataset would be valuable. If new data cannot be added, the authors should explicitly discuss this limitation and perhaps propose strategies (like domain adaptation techniques or more robust training with diverse conditions) to handle batch effects in future applications.

      Response: We thank the reviewer for this important comment. We fully agree with the reviewer that this would be an amazing addition to the manuscript. Unfortunately we are not able to obtain the requested external data set. Although retinal organoid systems exist and are widely used across different species lines, to the best of our knowledge our laboratory is the only one currently raising retinal organoids from primary embryonic pluripotent stem cells of Oryzias latipes and there is currently only one known (and published) differentiation protocol which allows the successful generation of these organoids. We note that our datasets were collected over the course of nine months, which already introduces variability across time and thus partially addresses concerns regarding batch effects. While we did not have access to truly external datasets (e.g., from other laboratories), we have clarified this limitation as suggested in the revised version of the manuscript and outlined strategies such as domain adaptation and training on more diverse conditions as promising future directions to improve robustness.

      Biological Interpretation of Early Predictive Features: The study currently concludes that the CNN picks up on complex, non-intuitive features that neither human experts nor conventional analysis could identify. However, from a biological perspective, it would be highly insightful to know what these features are (e.g., subtle texture, cell distribution patterns, etc.). Suggestion: I encourage the authors to delve deeper into interpretability. They might try complementary explainability techniques (for example, occlusion tests where parts of the image are masked to see if predictions change, or activation visualization to see what patterns neurons detect) beyond GradientSHAP. Additionally, analyzing false predictions might provide clues: if the model is confident but wrong for certain organoids, what visual traits did those have? If possible, correlating the model's prediction confidence with measured morphometrics or known markers (if any early marker data exist) could hint at what the network sees. Even if definitive features remain unidentified, providing the reader with any hypothesis (for instance, "the network may be sensing a subtle rim of pigmentation or differences in tissue opacity") would add value. This would connect the AI predictions back to biology more strongly.

      Response: We thank the reviewer for this thoughtful suggestion. We agree that linking CNN predictions to specific biological features would be highly valuable. In response, we expanded our interpretability analyses beyond GradientSHAP to a broad set of attribution methods and quantified their behavior across models and timepoints (new Supplementary Figures S29-S37, new Supplementary Note 1). While some methods (e.g., Integrated Gradients, DeepLiftSHAP) occasionally highlighted visible tissue regions, others produced diffuse or shifting relevance, and overall overlap was low. Therefore, our results did not yield reproducible, interpretable biological signals.

      Given these results, we have refrained from speculating about specific early image features and now present the interpretability analyses descriptively. We agree that future studies integrating imaging with molecular markers will be required to directly link early predictive cues to defined biological processes.

      Expansion to Other Outcomes or Multi-Outcome Prediction: The focus on RPE and lens is well-justified, but these are two outcomes within retinal organoids. A major question is whether the approach could be extended to predict other cell types or structures (e.g., presence of certain retinal neurons, or malformations) or even multiple outcomes at once. Suggestion: The authors should discuss the generality of their approach. Could the same pipeline be trained to predict, say, photoreceptor layer formation or other features if annotated? Are there limitations (like needing binary outcomes vs. multi-class)? Even if outside the scope of this study, a brief discussion would reassure readers that the method is not intrinsically limited to these two tissues. If data were available, it would be interesting to see a multi-label classification (predict both RPE and lens presence simultaneously) or an extension to other organoid systems in future. Including such commentary would highlight the broad applicability of this platform.

      Response: We thank the reviewer for this helpful and important suggestion. While our study focused on RPE and lens as the most readily accessible tissues of interest in retinal organoids, our new analyses demonstrate that the pipeline is not limited to these outcomes. In addition to tissue-specific predictions, we trained both a convolutional neural network (on image data) and a decision tree classifier (on morphometrics features) to predict more abstract morphological clusters defined at the final timepoint using the morphometrics features, showing that both approaches could successfully capture non-tissue features from early frames (new Figure 4C). This illustrates that the framework can be extended beyond binary tissue outcomes to multi-class problems, and predict relevant outcomes like the overall organoid morphology. Given appropriate annotations, the framework could in principle be trained to detect additional structures such as photoreceptor layers or malformations. Furthermore, the CNN architecture we employed and the morphometrics feature space are compatible with multi-label classification, meaning simultaneous prediction of several outcomes would also be feasible. We have clarified this point in the discussion to highlight the methodological flexibility and potential generality of our approach and are excited to share this very interesting, additional model with the readership.

      Curse of high dimensionality: Using Euclidean distance in a 165-dimensional morphometric space likely suffers from the curse of dimensionality, which diminishes the meaning of distances as dimensionality increases. In such high-dimensional settings, the range of pairwise distances tends to collapse, undermining the ability to discern meaningful intra- vs. inter-organoid differences. Suggestion: To address this, I would encourage the authors to apply principal component analysis (PCA) in place of (or prior to) tSNE. PCA would reduce the data to a few dominant axes of variation that capture most of the morphometric variance, directly revealing which features drive differences between organoids. These principal components are linear combinations of the original 165 parameters, so one can examine their loadings to identify which morphometric traits carry the most information - yielding interpretable axes of biological variation (e.g., organoid size, shape complexity, etc.). In addition, I would like to mention an important cautionary remark regarding tSNE embeddings. tSNE does not preserve global geometry of the data. Distances and cluster separations in a tSNE map are therefore not faithful to the original high-dimensional distances and should be interpreted with caution. See Chari T, Pachter L (2023), The specious art of single-cell genomics, PLoS Comput Biol 19(8): e1011288, for an enlightening discussion in the context of single cell genomics. The authors have shown that extreme dimensionality reduction to 2D can introduce significant distortions in the data's structure, meaning the apparent proximity or separation of points in a tSNE plot may be an artifact of the algorithm rather than a true reflection of morphometric similarity. Implementing PCA would mitigate high-dimensional distance issues by focusing on the most informative dimensions, while also providing clear, quantitative axes that summarize organoid heterogeneity. 
      This change would strengthen the analysis by making the results more robust (avoiding distance artifacts) and biologically interpretable, as each principal component can be traced back to specific morphometric features of interest.

      Response: We thank the reviewer for raising this point. Indeed, high dimensionality and dimensionality reduction can lead to false interpretations. We approached this issue as follows: first, we recalculated the tSNE projections and distances using the first 20 PCs and supplied these data as the new Figure 2 and new Supplementary Figure 2. While the scale of the data shifted slightly, there were no differences in the data distribution that would contradict our prior conclusions.

      To confirm these findings and further support the validity of our dimensionality reduction, we compared the 30 nearest neighbors of each data point in raw data space (or PCA space) with its 30 nearest neighbors in reduced space, and quantified their overlap with the Jaccard index. We did this for both tSNE and UMAP, because we wanted to show that the result was not specific to tSNE projections and also holds for a dimensionality reduction that is better known for preserving global rather than local structure. As shown in the new Supplementary Figure S3 (A-D), the high Jaccard index confirmed that our projections accurately reflect the data’s structure obtained from raw distance measurements. Moreover, the Jaccard index generally increased over time, which is best explained by a stronger morphological similarity of organoids at timepoint 0 and reflected by the dense point cloud in the tSNE projections at that timepoint. The described effects were independent of whether the data were derived from 20 PCs or from all 165 dimensions.

      We next wanted to confirm the conclusion that, at later timepoints, data points obtained from the same organoid were more closely related to each other than to data points from different organoids. We therefore identified the 30 nearest neighbors of each data point, showing that at later timepoints these 30 nearest neighbors were almost all attributable to the same organoid (new Supplementary Figure S3 E/F). The only exceptions were experiments that lacked intermediate timepoints (E007 and E002), which misaligned the organoids in the reduced space and confounded the nearest-neighbor analysis.
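
      A minimal sketch of a nearest-neighbor overlap comparison of this kind is given below. To keep the example fast and dependency-free, a 2D PCA projection is used as a stand-in for the tSNE or UMAP embedding; the synthetic data and the helper names `knn_indices` and `mean_knn_jaccard` are likewise assumptions of the sketch.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point, excluding the point itself."""
    nn = NearestNeighbors(n_neighbors=k).fit(points)
    # Calling kneighbors() without a query returns neighbors of the fitted
    # points and does not count a point as its own neighbor.
    return nn.kneighbors(return_distance=False)

def mean_knn_jaccard(high_dim, low_dim, k=30):
    """Mean Jaccard index between k-NN sets in the original and reduced space."""
    neighbors_high = knn_indices(high_dim, k)
    neighbors_low = knn_indices(low_dim, k)
    scores = []
    for row_high, row_low in zip(neighbors_high, neighbors_low):
        shared = len(np.intersect1d(row_high, row_low))
        scores.append(shared / (2 * k - shared))  # |A and B| / |A or B|
    return float(np.mean(scores))

rng = np.random.default_rng(0)
data = rng.normal(size=(300, 50))                     # stand-in for the feature space
embedding = PCA(n_components=2).fit_transform(data)   # stand-in for tSNE/UMAP

score_self = mean_knn_jaccard(data, data)
score_embedded = mean_knn_jaccard(data, embedding)
print(f"identical spaces: {score_self:.2f}, 2D embedding: {score_embedded:.2f}")
```

      A value of 1 means that every point keeps exactly the same 30 neighbors after the reduction, and values near 0 mean that the local neighborhoods are largely rearranged.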

      We have included the respective new Figures and new Supplementary Figures and linked them in the main manuscript.

      Statistical Reporting and Significance: The manuscript focuses on F1-score as the metric to report accuracy over time, which is appropriate. However, it's not explicitly stated whether any statistical significance tests were performed on the differences between methods (e.g., CNN vs human, CNN vs classical ML). Suggestion: The authors could report statistical significance of the performance differences, perhaps using a permutation test or McNemar's test on predictions. For example, is the improvement of the CNN ensemble over the Random Forest/QDA classifier statistically significant across experiments? Given the n of organoids, this should be assessable. Demonstrating significance would add rigor to the analysis.

      Response: We thank the reviewer for this helpful suggestion. Following the recommendation, we quantified per-experiment differences in predictive performance by calculating the area under the F1-score curves (AUC) for each classifier and experiment. We then compared methods using paired Wilcoxon signed-rank tests across experiments, with Holm-Bonferroni correction for multiple comparisons. This analysis confirmed that the CNN consistently and significantly outperformed the baseline models and classical machine learning classifiers in validation and test organoids, while for RPE area and lens sizes in test organoids the CNNs performed notably but not significantly better than the machine learning classifiers. In summary, these tests add the requested statistical rigor to our analysis. The results are now provided in the Supplementary Material as Supplementary File 1.
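
Two of the steps named above, the area under an F1-over-time curve and the Holm-Bonferroni adjustment, can be sketched in a few lines. This is an illustrative sketch only: the function names and all numbers are hypothetical, and the Wilcoxon signed-rank test itself is left out (in practice its p-values would come from a statistics library).

```python
def f1_auc(timepoints, f1_scores):
    """Trapezoidal area under an F1-over-time curve."""
    area = 0.0
    for i in range(1, len(timepoints)):
        width = timepoints[i] - timepoints[i - 1]
        area += width * (f1_scores[i] + f1_scores[i - 1]) / 2.0
    return area

def holm_bonferroni(p_values):
    """Return Holm-adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical numbers for illustration only.
auc = f1_auc([0, 1, 2], [0.5, 0.7, 0.9])
adj = holm_bonferroni([0.01, 0.04, 0.03])
```

The smallest p-value is multiplied by the number of comparisons, the next by one fewer, and so on, with each adjusted value never allowed to fall below the previous one.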

      Minor Issues (with Suggestions):

      1. Data Availability: Given the resource-intensive nature of the work, the value to the community will be highest if the data is made publicly available. I understand that this is of course at the behest of the authors and they do mention that they will make the data available upon publication of the manuscript. For the time being, the authors can consider sharing at least a representative subset of the data or the trained model weights. This will allow others to build on their work and test the method in other contexts, amplifying the impact of the study.

      Response: We have now made the repository and raw data public and apologize for this oversight. The link to the GitHub repository is now provided in the manuscript under “Data availability”, and the links to the datasets are contained within the GitHub repository.

      Discussion - Future Directions: The Discussion does a good job of highlighting applications (like guiding molecular analysis). One minor addition could be speculation on using this approach to actively intervene: for example, could one imagine altering culture conditions mid-course for organoids predicted not to form RPE, to see if their fate can be changed? The authors touch on reducing variability by focusing on the window of determination; extending that thought to an experimental test (though not done here) would inspire readers. This is entirely optional, but a sentence or two envisioning how predictive models enable dynamic experimental designs (not just passive prediction) would be a forward-looking note to end on.

      Response: We thank the reviewer for this constructive suggestion. We have expanded the discussion to briefly address how predictive modeling could go beyond passive observation. Specifically, we now discuss that predictive models may enable dynamic interventions, such as altering culture conditions mid-course for organoids predicted not to form RPE, to test whether their developmental trajectory can be redirected. While outside the scope of the present work, this forward-looking perspective emphasizes how predictive modeling could inspire adaptive experimental strategies in future studies.

      I believe with the above clarifications and enhancements - especially regarding generalizability and interpretability - the paper will be suitable for broad readership. The work represents an exciting intersection of developmental biology and AI, and I commend the authors for this contribution.

      Response: We thank the reviewer for the positive assessment and their encouraging remarks regarding the contribution of our work to these fields.

      Novelty and Impact:

      This work fills an important gap in organoid biology and imaging. Previous studies have used deep learning to link imaging with molecular profiles or spatial patterns in organoids, but there remained a "notable gap" in predicting whether and to what extent specific tissues will form in organoids. The authors' approach is novel in applying deep learning to prospectively predict organoid tissue outcomes (RPE and lens) on a per-organoid basis, something not previously demonstrated in retinal organoids. Conceptually, this is a significant advance: it shows that fate decisions in a complex 3D culture model can be predicted well in advance, suggesting the existence of subtle early morphogenetic cues that only a sophisticated model can discern. The findings will be of broad interest to researchers in organoid technology, developmental biology, and biomedical AI.

      Response: We thank the reviewer for this thoughtful and encouraging assessment. We agree that our study addresses an important gap by prospectively predicting tissue outcomes at the single-organoid level, and we appreciate the recognition that this represents a conceptual advance with relevance not only for retinal organoids but also for broader applications in organoid biology, developmental biology, and biomedical AI.

      Methodological Rigor and Technical Quality:

      The study is methodologically solid and carefully executed. The authors gathered a uniquely large dataset under consistent conditions, which lends statistical power to their analyses. They employ rigorous controls: an expert panel provided human predictions as a baseline, and a classical machine learning pipeline using quantitative image-derived features was implemented for comparison. The deep learning approach is well-chosen and technically sound. They use an ensemble of CNN architectures (DenseNet121, ResNet50, and MobileNetV3) pre-trained on large image databases, fine-tuning them on organoid images. The use of image segmentation (DeepLabV3) to isolate the organoid from background is appropriate to ensure the models focus on the relevant morphology. Model training procedures (data augmentation, cross-entropy loss with class balancing, learning rate scheduling, and cross-validation) are thorough and follow best practices. The evaluation metrics (primarily F1-score) are suitable for the imbalanced outcomes and emphasize prediction accuracy in a biologically relevant way. Importantly, the authors separate training, test, and validation sets in a meaningful manner: images of each organoid are grouped to avoid information leakage, and an independent experiment serves as a validation to test generalization. The observation that performance is slightly lower on independent validation experiments underscores both the realism of their evaluation and the inherent heterogeneity between experimental batches. In addition, the study integrates interpretability (using GradientSHAP-based relevance backpropagation) to probe what image features the network uses. Although the relevance maps did not reveal obvious human-interpretable features, the attempt reflects a commendable thoroughness in analysis. Overall, the experimental design, data analysis, and reporting are of high quality, supporting the credibility of the conclusions.

      Response: We thank the reviewer for their very positive and detailed assessment. We appreciate the recognition of our efforts to ensure methodological rigor and reproducibility, and we agree that interpretability remains an important but challenging area for future work.

      Reviewer #3 (Significance (Required)):

      Scientific Significance and Conceptual Advances:

      Biologically, the ability to predict organoid outcomes early is quite significant. It means researchers can potentially identify when and which organoids will form a given tissue, allowing them to harvest samples at the right moment for molecular assays or to exclude organoids that will not form the desired structure. The manuscript's results indicate that RPE and lens fate decisions in retinal organoids are made much earlier than visible differentiation, with predictive signals detectable as early as ~11 hours for RPE and ~4-5 hours for lens. This suggests a surprising synchronization or early commitment in organoid development that was not previously appreciated. The authors' introduction of deep learning-derived determination windows refines the concept of a developmental "point of no return" for cell fate in organoids. Focusing on these windows could help in pinpointing the molecular triggers of these fate decisions. Another conceptual advance is demonstrating that non-invasive imaging data can serve a predictive role akin to (or better than) destructive molecular assays. The study highlights that classical morphology metrics and even expert eyes capture mainly recognition of emerging tissues, whereas the CNN detects subtler, non-intuitive features predictive of future development. This underlines the power of deep learning to uncover complex phenotypic patterns that elude human analysis, a concept that could be extended to other organoid systems and developmental biology contexts. In sum, the work not only provides a tool for prediction but also contributes conceptual insights into the timing of cell fate determination in organoids.

      Response: We thank the reviewer for this thoughtful and positive assessment. We agree that the determination windows provide a valuable framework to study early fate decisions in organoids, and we have emphasized this point in the discussion to highlight the biological significance of our findings.

      Strengths:

      The combination of high-resolution time-lapse imaging with advanced deep learning is innovative. The authors effectively leverage AI to solve a biological uncertainty problem, moving beyond qualitative observations to quantitative predictions. The study uses a remarkably large dataset (1,000 organoids, >100k images), which is a strength as it captures variability and provides robust training data. This scale lends confidence that the model isn't overfit to a small sample. By comparing deep learning with classical machine learning and human predictions, the authors provide context for the model's performance. The CNN ensemble consistently outperforms both the classical algorithms and human experts, highlighting the value added by the new method. The deep learning model achieves high accuracy (F1 > 0.85) at impressively early time points. The fact that it can predict lens formation just ~4.5 hours into development with confidence is striking. Performance remained strong and exceeded human capability at all assessed times. Key experimental and analytical steps (segmentation, cross-validation between experiments, model calibration, use of appropriate metrics) are executed carefully. The manuscript is transparent about training procedures and even provides source code references, enhancing reproducibility. The manuscript is generally well-written with a logical flow from the problem (organoid heterogeneity) to the solution (predictive modeling) and clear figures referenced.

      Response: We thank the reviewer for this very positive and encouraging assessment of our study, particularly regarding the scale of our dataset, the methodological rigor, and the reproducibility of our approach.

      Weaknesses and Limitations:

      Generalizability Across Batches/Conditions: One limitation is the variability in model performance on organoids from independent experiments. The CNN did slightly worse on a validation set from a separate experiment, indicating that differences in the experimental batch (e.g., slight protocol or environmental variations) can affect accuracy. This raises the question of how well the model would generalize to organoids generated under different protocols or by other labs. While the authors do employ an experiment-wise cross-validation, true external validation (on a totally independent dataset or a different organoid system) would further strengthen the claim of general applicability.

      Response: We thank the reviewer for this important point. We agree that generalizability across batches and experimental conditions is a key consideration. We have carefully revised the discussion to explicitly address this limitation and to highlight the variability observed between independent experiments.

      Interpretability of the Predictions: Despite using relevance backpropagation, the authors were unable to pinpoint clear human-interpretable image features that drive the predictions. In other words, the deep learning model remains somewhat of a "black box" in terms of what subtle cues it uses at early time points. This limits the biological insight that can be directly extracted regarding early morphological indicators of RPE or lens fate. It would be ideal if the study could highlight specific morphological differences (even if minor) correlated with fate outcomes, but currently those remain elusive.

      Response: We thank the reviewer for raising this important point. Indeed, while our models achieved robust predictive performance, the underlying morphological cues remained difficult to interpret using relevance backpropagation. We believe this limitation reflects both the subtlety of the early predictive signals and the complexity of the features captured by deep learning models, which may not correspond to human-intuitive descriptors. We have clarified this limitation in the Discussion and Supplementary Note 1 and emphasize that further methodological advances in interpretability, or integration with complementary molecular readouts, will be essential to uncover the precise morphological correlates of fate determination.

      Scope of Outcomes: The study focuses on two particular tissues (RPE and lens) as the outcomes of interest. These were well-chosen as examples (one induced, one spontaneous), but they do not encompass the full range of retinal organoid fates (e.g., neural retina layers). It's not a flaw per se, but it means the platform as presented is specialized. The method might need adaptation to predict more complex or multiple tissue outcomes simultaneously.

      Response: We agree with the reviewer that our study focuses on two specific tissues, RPE and lens, which served as proof-of-concept outcomes representing both induced and spontaneous differentiation events. While this scope is necessarily limited, we believe it demonstrates the general feasibility of our approach. We have clarified in the Discussion that the same framework could, in principle, be extended to additional retinal fates such as neural retina layers, or even to multi-label prediction tasks, provided appropriate annotations are available. We now provide additional experiments showing that even abstract morphological classes are well predictable. This will be an important next step to broaden the applicability of our platform.

      Requirement of Large Data and Annotations: Practically, the approach required a very large imaging dataset and extensive manual annotation; each organoid's RPE and lens outcome, plus manual masking for training the segmentation model. This is a substantial effort that may be challenging to reproduce widely. The authors suggest that perhaps ~500 organoids might suffice to achieve similar results, but the data requirement is still high. Smaller labs or studies with fewer organoids might not immediately reap the full benefits of this approach without access to such imaging throughput.

      Response: We thank the reviewer for highlighting this important point. We agree that the generation of a large imaging dataset and the associated annotations represent a substantial investment of time and resources. At the same time, we consider this effort highly relevant, as it reflects the intrinsic heterogeneity of organoid systems rather than technical artifacts, and therefore ensures robust model training. We have clarified this limitation in the discussion. While our full dataset included ~1,000 organoids, our downsampling analysis suggests that as few as ~500 organoids may already be sufficient to reproduce the key findings, which we believe makes the approach feasible for many organoid systems (compare new Supplementary Note 1). Moreover, as we outline in the Discussion, future refinements such as combining image- and tabular-based features or incorporating fluorescence data could further enhance predictive power and reduce annotation effort.

      Medaka Fish vs. Other Systems: The retinal organoids in this study appear to be from medaka fish, whereas much organoid research uses human iPSC-derived organoids. It's not fully clear in the manuscript as to how the findings translate to mammalian or human organoids. If there are species-specific differences, the applicability to human retinal organoids (which are important for disease modeling) might need discussion. This is a minor point if the biology is conserved, but worth noting as a potential limitation.

      Response: We thank the reviewer for pointing out this important consideration. We have now explicitly clarified in the Discussion that our proof-of-concept study was performed in medaka organoids, which offer high reproducibility and rapid development. While species-specific differences may exist, the predictive framework is not inherently restricted to medaka and should, in principle, be transferable to mammalian or human iPSC/ESC-derived organoids, provided sufficiently annotated datasets are available. We have amended the Discussion accordingly.

      Predicting Tissue Size is Harder: The model's accuracy in predicting how much tissue (relative area) an organoid will form, while good, is notably lower than for simply predicting presence/absence. Final F1 scores for size classes (~0.7) indicate moderate success. This implies that quantitatively predicting organoid phenotypic severity or extent is more challenging, perhaps due to more continuous variation in size. The authors do acknowledge the lower accuracy for size and treat it carefully.

      Response: We thank the reviewer for this observation and agree with their interpretation. We have already acknowledged in the manuscript that predicting tissue size is more challenging than predicting tissue presence/absence, and we believe we have treated these results with appropriate caution in the revised version of the manuscript.

      Latency vs. Determination: While the authors narrow down the time window of fate determination, it remains somewhat unclear whether the times at which the model reaches high confidence truly correspond to the biological "decision point" or are just the earliest detection of its consequences. The manuscript discusses this caveat, but it's an inherent limitation that the predictive time point might lag the actual internal commitment event. Further work might be needed to link these predictions to molecular events of commitment.

      Response: We agree with the reviewer. As noted in the Discussion, the time points identified by our models likely reflect the earliest detectable morphological consequences of fate determination, rather than the exact molecular commitment events themselves. Establishing a direct link between predictive signals and underlying molecular mechanisms will require future experimental work.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #2

      Evidence, reproducibility and clarity

      Summary: Afting et al. present a computational pipeline for analyzing timelapse brightfield images of retinal organoids derived from Medaka fish. Their pipeline processes images along two paths: 1) morphometrics (based on computer vision features from skimage) and 2) deep learning. They discovered, through extensive manual annotation of ground truth, that their deep learning method could predict retinal pigmented epithelium and lens tissue emergence at earlier time points than either morphometrics or expert predictions. Our review is formatted based on the Review Commons recommendation.

      Major comments:

      Are the key conclusions convincing?

      Yes, the key conclusion that deep learning outperforms morphometric approaches is convincing. However, several methodological details require clarification. For instance, were the data splitting procedures conducted in the same manner for both approaches? Additionally, the authors note in the methods: "The validation data were scaled to the same range as the training data using the fitted scalers obtained from the training data." This represents a classic case of data leakage, which could artificially inflate performance metrics in traditional machine learning models. It is unclear whether the deep learning model was subject to the same issue. Furthermore, the convolutional neural network was trained with random augmentations, effectively increasing the diversity of the training data. Would the performance advantage still hold if the sample size had not been artificially expanded through augmentation?
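
To make the quoted scaling step concrete, here is a minimal sketch of one reading of it: min-max parameters are estimated from training rows only and then reused unchanged on validation rows. The function names and toy values are hypothetical, and the text above does not state whether the authors' code follows this pattern.

```python
def fit_min_max(rows):
    """Estimate per-feature minimum and maximum from training rows only."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, mins, maxs):
    """Scale rows with previously fitted parameters; no statistics are recomputed here."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled

# Hypothetical feature rows.
train = [[1.0, 10.0], [3.0, 30.0], [5.0, 20.0]]
valid = [[2.0, 40.0]]

mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
valid_scaled = apply_min_max(valid, mins, maxs)
```

In this variant, validation values can fall outside the 0 to 1 range because their own minimum and maximum are never used.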

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether? Their claims are currently preliminary, pending increased clarity and additional computational experiments described below.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      • The authors discretize continuous variables into four bins for classification. However, a regression framework may be more appropriate for preserving the full resolution of the data. At a minimum, the authors should provide a stronger justification for this binning strategy and include an analysis of bin performance. For example, do samples near bin boundaries perform comparably to those near the bin centers? This would help determine whether the discretization introduces artifacts or obscures signals.
      • The relevance backpropagation interpretation analysis is not convincing. The authors argue that the model's use of pixels across the entire image (rather than just the RPE region) indicates that the deep learning approach captures holistic information. However, only three example images are shown out of hundreds, with no explanation for their selection, limiting the generalizability of the interpretation. Additionally, it is unclear how this interpretability approach would work at all in earlier time points, particularly before the model begins making confident predictions around the 8-hour mark. It is also not specified whether the input used for GradSHAP matches the input used during CNN training. The authors should consider expanding this analysis by quantifying pixel importance inside versus outside annotated regions over time. Lastly, Figure 4C is missing a scale bar, which would aid in interpretability.
      • The authors claim that they removed technical artifacts to the best of their ability, but it is unclear if the authors performed any adjustment beyond manual quality checks for contamination. Did the authors observe any illumination artifacts (either within a single image or over time)? Any other artifacts or procedures to adjust?
      • In lines 434-436 the authors state "In this work, we used 1,000 organoids in total, to achieve the reported prediction accuracies. Yet, we suspect that as little as ~500 organoids are sufficient to reliably recapitulate our findings." It is unclear what evidence the authors use to support this claim. The authors could perform a downsampling analysis to determine the tradeoff between performance and sample size.
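
The boundary-versus-center check suggested in the first bullet above could be sketched as follows. This is only an illustration under stated assumptions: the bin edges, the margin, the sample values, and the predictions are hypothetical, and the function names are not taken from the manuscript.

```python
import bisect

def assign_bin(value, edges):
    """Return the index of the bin containing value, given sorted inner edges."""
    return bisect.bisect_right(edges, value)

def distance_to_nearest_edge(value, edges):
    """Absolute distance from value to the closest bin boundary."""
    return min(abs(value - e) for e in edges)

def accuracy_by_edge_proximity(values, predicted_bins, edges, margin):
    """Split accuracy into samples near a boundary (closer than margin) and the rest."""
    hits = {"near": [], "far": []}
    for v, pred in zip(values, predicted_bins):
        group = "near" if distance_to_nearest_edge(v, edges) < margin else "far"
        hits[group].append(pred == assign_bin(v, edges))
    return {g: (sum(h) / len(h) if h else None) for g, h in hits.items()}

# Hypothetical relative areas, inner bin edges, and model outputs.
edges = [0.25, 0.5, 0.75]  # three inner edges give four bins
values = [0.10, 0.24, 0.26, 0.60, 0.74, 0.90]
predicted = [0, 1, 1, 2, 3, 3]
result = accuracy_by_edge_proximity(values, predicted, edges, margin=0.05)
```

A clearly lower accuracy in the "near" group than in the "far" group would suggest that the discretization itself contributes to the errors.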

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      Yes, we believe all experiments are realistic in terms of time and resources. We estimate all experiments could be completed in 3-6 months.

      Are the data and the methods presented in such a way that they can be reproduced?

      No, the code is not currently available. We were not able to review the source code.

      Are the experiments adequately replicated and statistical analysis adequate?

      • The experiments are adequately replicated.
      • The statistical analysis (deep learning) is lacking a negative control baseline, which would be helpful to observe if performance is inflated.

      Minor comments:

      Specific experimental issues that are easily addressable.

      Are prior studies referenced appropriately?

      Yes.

      Are the text and figures clear and accurate?

      The authors must improve clarity on terminology. For example, they should define "comprehensive dataset" and "significant", and clarify their morphometrics feature space. They should also elaborate on what they mean by "confounding factor of heterogeneity".

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      • Figure 2C describes a distance between what? The y axis is likely too simple. Same confusion over Figure 2D. Was distance computed based on tsne coordinates?
      • The authors perform a Herculean analysis comparing dozens of different machine learning classifiers. They select two, but they should provide justification for this decision.
      • It would be good to get a sense for how these retinal organoids grow - are they moving all over the place? They are in Matrigel so maybe not, but are they rotating? Can the authors' approach predict an entire non-emergence experiment? The authors tried to standardize the protocol, but if it still yields this much heterogeneity, then how well the approach will generalize to a different lab is a limitation.
      • The authors should dampen claims throughout. For example, in the abstract they state, "by combining expert annotations with advanced image analysis". The image analysis pipelines use common approaches.
      • The authors state: "the presence of RPE and lenses were disagreed upon by the two independently annotating experts in a considerable fraction of organoids (3.9 % for RPE, 2.9% for lenses).", but it is unclear why there were two independently annotating experts. The supplements say images were split between nine experts for annotation.
      • Details on the image analysis pipeline would be helpful to clarify. For example, why did they choose to measure these 165 morphology features? Which descriptors were used to quantify blur? Did the authors apply blur metrics per FOV or per segmented organoid?
      • The description of the number of images is confusing and distracts from the number of organoids. The number of organoids and number of timepoints used would provide a better description of the data with more value. For example, does this image count include all five z slices?
      • The authors should consider applying a maximum projection across the five z slices (rather than the middle z) as this is a common procedure in image analysis. Why not analyze three-dimensional morphometrics or deep learning features? Might this improve performance further?
      • There is a lot of manual annotation performed in this work, the authors could speculate how this could be streamlined for future studies. How does the approach presented enable streamlining?
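
The maximum-intensity projection mentioned in the list above can be illustrated with a short sketch that contrasts it with taking the middle slice. The stack below is a hypothetical 5-slice toy example with 2x2 images, and the function names are illustrative only.

```python
def max_projection(z_stack):
    """Collapse a z-stack (list of 2D images) into one image by per-pixel maximum."""
    height = len(z_stack[0])
    width = len(z_stack[0][0])
    return [
        [max(plane[r][c] for plane in z_stack) for c in range(width)]
        for r in range(height)
    ]

def middle_slice(z_stack):
    """Return the central plane of the stack."""
    return z_stack[len(z_stack) // 2]

# Hypothetical 5-slice stack of 2x2 images.
stack = [
    [[0, 1], [2, 3]],
    [[4, 0], [1, 1]],
    [[1, 5], [0, 2]],
    [[2, 2], [6, 0]],
    [[0, 0], [1, 7]],
]
projected = max_projection(stack)
central = middle_slice(stack)
```

The projection keeps the brightest value seen at each pixel across all slices, whereas the middle slice discards everything outside one focal plane.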

      Significance

      Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.

      The paper's advance is technical (providing new methods for organoid quality control) and conceptual (providing proof of concept that earlier time points contain information to predict specific future outcomes in retinal organoids)

      Place the work in the context of the existing literature (provide references, where appropriate).

      • The authors do a good job of placing their work in context in the introduction.
      • The work presents a simple image analysis pipeline (using only the middle z slice) to process timelapse organoid images. So not a 4D pipeline (time and space), just 3D (time). It is likely that more and more of these approaches will be developed over time, and this article is one of the early attempts.
      • The work uses standard convolutional neural networks.

      State what audience might be interested in and influenced by the reported findings.

      • Data scientists performing image-based profiling for time lapse imaging of organoids.
      • Retinal organoid biologists
      • Other organoid biologists who may have long growth times with indeterminate outcomes.

      Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      • Image-based profiling/morphometrics
      • Organoid image analysis
      • Computational biology
      • Cell biology
      • Data science/machine learning
      • Software

      This is a signed review: Gregory P. Way, PhD; Erik Serrano; Jenna Tomkinson; Michael J. Lippincott; Cameron Mattson. Department of Biomedical Informatics, University of Colorado

  3. accessmedicina.mhmedical.com
    1. McQuaid KR. Apendicitis. In: Papadakis MA, Rabow MW, McQuaid KR, Gandhi M, eds. Diagnóstico clínico y tratamiento 2025. McGraw Hill Education; 2025. Accessed October 18, 2025. https://accessmedicina.mhmedical.com/content.aspx?bookid=3530&sectionid=294839301

      hj

    1. By leveraging transformer-based language models, we aim to capture semantic nuances potentially missed by conventional scoring, enriching the assessment with comprehensive text features

      A Transformer is a neural network architecture introduced by Google researchers in 2017. It revolutionized how machines understand and generate sequences.

      Instead of processing words one at a time, transformers look at all words in a sequence simultaneously and use a mechanism called self-attention to understand how each word relates to every other word.

      Self-Attention Mechanism Each token “looks” at other tokens in the sentence and assigns attention weights — numbers that represent how important each word is to understanding the current one.

      Example: In the sentence “The patient who had pneumonia was discharged.”, the word “was” should pay more attention to “patient” than to “pneumonia.” The self-attention mechanism captures this context automatically.
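
The attention weights described above can be sketched with scaled dot-product attention in plain Python. This is a simplified illustration, not a full Transformer layer: the token vectors are hypothetical, and the learned query, key, and value projections of a real model are omitted, so each token vector is used directly in all three roles.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention; returns (outputs, attention weights)."""
    d_k = len(keys[0])
    dim = len(values[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this token to every token, scaled by sqrt(d_k).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
```

Each row of `weights` sums to 1 and says how strongly one token attends to every token in the sequence, itself included.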

      1. Stacked Layers

      Many layers of self-attention and feed-forward networks are stacked.

      Each layer learns increasingly abstract relationships — syntax, semantics, and even reasoning patterns.

    1. notable progress in the qualification of human resources, extending even to the technical and university levels, something very advanced compared with what was happening in developing countries.

      This can be thought of from the perspective of evolutionism

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2024-02830

      Corresponding author(s): Julien, Sage

      1. General Statements

      We thank the Reviewers for a fair review of our work and helpful suggestions. We have significantly revised the manuscript in response to these suggestions. We provide a point-by-point response to the Reviewers below but wanted to highlight in our response a recurring concern related to the strong cell cycle arrest observed upon the acute FAM53C knock-down being different than the limited phenotypes in other contexts, including the knockout mice and DepMap data.

      First, we now show that we can recapitulate the strong G1 arrest resulting from the FAM53C knock-down using two independent siRNAs in RPE-1 cells, supporting the specificity of the effects.

      Second, the G1 arrest that results from the FAM53C knock-down is also observed in cells with inactive p53, suggesting it is not due to a non-specific stress response due to “toxic” siRNAs. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype.

      Third, we have performed experiments in other human cells, including cancer cell lines. As would be expected for cancer cells, the G1 arrest is less pronounced but is still significant, indicating that the G1 arrest is not unique to RPE-1 cells.

      Fourth, it is not unexpected that compensatory mechanisms would be activated upon loss of FAM53C during development or in cancer – which may explain the lack of phenotypes in vivo or upon long-term knockout. This has been true for many cell cycle regulators, either because of compensation by other family members that have overlapping functions, or by a larger scale rewiring of signaling pathways.

      2. Point-by-point description of the revisions

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53 interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53 has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C knockdown. They show convincing evidence that this is not a P53-dependent response. But the claim that “p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells” (line 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.

      The Reviewer raises a great point. Our initial statement needed to be clarified and also needed more experimental support. We have performed experiments where we knocked down FAM53C and p21 individually, as well as in combination, in RPE-1 cells. These experiments show that p21 knock-down is not sufficient to negate the cell cycle arrest resulting from the FAM53C knock-down in RPE-1 cells (Figure 4B,C and Figure S4C,D).

      We have now extended these experiments to conditions where we inhibited DYRK1A, and we also compared these data to experiments in p53-null RPE-1 cells. Altogether, these experiments point to activation of p53 downstream of DYRK1A activation upon FAM53C knock-down, and indicate that p21 is not the only critical p53 target in the cell cycle arrest observed in FAM53C knock-down cells (Figure 4 and Figure S4).

      The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that FAM53C knockdown leads to (DYRK1A-dependent) increases in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.

      We appreciate these comments from the Reviewer and have significantly revised the manuscript to address them.

      The analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We removed the previous panel 4B from the revised manuscript. For panels 4E and S4B (now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      The representative Western blot images for 5C-D (now 5F-G) in the original submission are shown in Figure 5E; we apologize if this was not clear. The differences are small, which we acknowledge in the revised manuscript. Note that several factors can affect Cyclin D levels in cells, including the growth rate and the stage of the cell cycle. Our FACS analysis shows that normal organoids have ~63% of cells in G1 and ~13% in S phase; the overall lower proportion of S-phase cells in organoids may make the immunoblot difference appear smaller, with fewer cycling cells resulting in decreased Cyclin D phosphorylation.

      Nevertheless, the Reviewer brings up a good point, and comments from this Reviewer and the others made us re-think how to best interpret our results. As discussed above, we carefully re-read the Meyer paper and think that FAM53C’s role and DYRK1A activity in cells may be understood when considering levels of both CycD and p21 at the same time in a continuum. While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is likely that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also perform cell counting experiments as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?

      We repeated the experiments with the DYRK1A inhibitor and counted the cells. In p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells.

      The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.

      We apologize for these duplicated panels in the original submission. We now replaced the wrong panel with the correct data (Fig. 5F,G).

      Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and the body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to take away claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.

      We agree with the Reviewer that, although we observed significant p-values, this original statement may not be appropriate in the biological sense. We made sure in the revised manuscript to carefully present these data.

      Minor comments:

      Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.

      To address this point, we updated Table S1 (2nd tab) to provide a better rationale for the 38 factors chosen. Our focus was on the canonical RB pathway and we included RB binding proteins whose function had suggested they may also be playing a role in the G1/S transition. We do agree that there is some bias in this selection (e.g., there are more RB binding factors described) but we hope the Reviewer will agree with us that this list and the subsequent analysis identified expected factors, including FAM53C. Future studies using this approach and others will certainly identify new regulators of cell cycle progression.

      Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain in more detail.

      We agree with the Reviewer that this panel was not necessarily useful and possibly in the wrong place, and we removed it from the manuscript. We replaced it with a cartoon of top hits in the screen.

      The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.

      We re-graphed these panels.

      Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.

      We changed the header to “Consequences of FAM53C inactivation in human cortical organoids in culture”.

      Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?

      Thank you for your feedback. The subG1 population in the original Figure S5F represents cells that died during the dissociation step of the organoids for FACS analysis. To address this point, we performed live/dead staining to exclude dead cells and provide clearer data. We refined the gating strategy for better clarity in the new S5F panel.

      Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      We fixed this mistake, thank you.

      Reviewer #1 (Significance (Required)):

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Summary

      In this study Hammond et al. investigated the role of Dual-specificity Tyrosine Phosphorylation regulated Kinase 1A (DYRK1) in G1/S transition. By exploiting the Dependency Map portal, they identified a previously unexplored protein, FAM53C, as a potential regulator of G1/S transition. Using RNAi, they confirmed that depletion of FAM53C suppressed proliferation of human RPE1 cells and that this phenotype was dependent on the presence of the RB protein. In addition, they noted increased levels of CDKN1A transcript and p21 protein that could explain the G1 arrest of FAM53C-depleted cells but, surprisingly, they did not observe activation of other p53 target genes. Proteomic analysis identified DYRK1 as one of the main interactors of FAM53C and the interaction was confirmed in vitro. Further, they showed that purified FAM53C blocked the ability of DYRK1 to phosphorylate cyclin D in vitro although the activity of DYRK1 was likely not inhibited (judging from the modification of FAM53C itself). Instead, it seems more likely that FAM53C competes with cyclin D in this assay. Authors claim that the G1 arrest caused by depletion of FAM53C was rescued by inhibition of DYRK1 but this was true only in cells lacking functional p53. This is quite confusing as DYRK1 inhibition reduced the fraction of G1 cells in p53 wild type cells as well as in p53 knock-outs, suggesting that FAM53C may not be required for regulation of DYRK1 function. Instead of focusing on the impact of FAM53C on cell cycle progression, authors moved towards investigating its potential (and perhaps more complex) roles in differentiation of IPSCs into cortical organoids and in mice. They observed a lower level of proliferating cells in the organoids but whether that reflects an increased activity of DYRK1 or is just an off-target effect of the genetic manipulation remains unclear. Even less clear is the phenotype in FAM53C knock-out mice. Authors did not observe any significant changes in survival nor in organ development but they noted some behavioral differences. Whether and how these are connected to the rate of cellular proliferation was not explored. In summary, the study identified a previously unknown role of FAM53C in proliferation but failed to explain the mechanism and its physiological relevance at the level of tissues and organism. Although some of the data might be of interest, in its current form the data are too preliminary to justify publication.

      Major points

      1. Whole study is based on one siRNA to Fam53C and its specificity was not validated. Level of the knock down was shown only in the first figure and not in the other experiments. The observed phenotypes in the cell cycle progression may be affected by variable knock-down efficiency and/or potential off target effects.

      We thank the Reviewer for raising this important point. First, we need to clarify that our experiments were performed with a pool of siRNAs (not one siRNA). Second, commercial antibodies against FAM53C are not of the best quality and it has been challenging to detect FAM53C using these antibodies in our hands – the results are often variable. In addition, to better address the Reviewer’s point and control for the phenotypes we have observed, we performed two additional series of experiments: first, we have confirmed G1 arrest in RPE-1 cells with individual siRNAs, providing more confidence for the specificity of this arrest (Fig. S1B); second, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (Fig. S1E,F and Fig. 4F).

      Experiments focusing on the cell cycle progression were done in a single cell line RPE1 that showed a strong sensitivity to FAM53C depletion. In contrast, phenotypes in IPSCs and in mice were only mild suggesting that there might be large differences across various cell types in the expression and function of FAM53C. Therefore, it is important to reproduce the observations in other cell types.

      As mentioned above, we have new data indicating that other cell lines arrest in G1 upon FAM53C knock-down (three cancer cell lines) (Fig. S1E,F and Fig. 4F).

      Authors state that FAM53C is a direct inhibitor of DYRK1A kinase activity (Line 203), however this model is not supported by the data in Fig 4A. FAM53C seems to be a good substrate of DYRK1 even at high concentrations when phosphorylation of cyclin D is reduced. It rather suggests that DYRK1 is not inhibited by FAM53C but perhaps FAM53C competes with cyclin D. Further, authors should address if the phosphorylation of cyclin D is responsible for the observed cell cycle phenotype. Is this Cyclin D-Thr286 phosphorylation, or are there other sites involved?

      We revised the text of the manuscript to include the possibility that FAM53C could act as a competitive substrate and/or an inhibitor.

      We removed most of the Cyclin D phosphorylation/stability data from the revised manuscript. As the Reviewers pointed out, some of these data were statistically significant but the biological effects were small. As discussed above in our response to Reviewer #1, the analysis of Cyclin D phosphorylation and stability is complicated by the upregulation of p21 upon FAM53C knock-down, in particular because p21 can be part of Cyclin D complexes, which may affect its protein levels in cells (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). Instead of focusing on Cyclin D levels and stability, we refocused the manuscript on RB and p53 downstream of FAM53C loss.

      We note, however, that we used specific Thr286 phospho-antibodies, which have been used extensively in the field. Our data in Figure 1 with palbociclib place FAM53C upstream of Cyclin D/CDK4,6. We performed Cyclin D overexpression experiments but RPE-1 cells did not tolerate high expression of Cyclin D1 (T286A mutant) and we have not been able to conduct more ‘genetic’ studies.

      At many places, information on statistical tests is missing and SDs are not shown in the plots. For instance, what statistics was used in Fig 4C? Impact of FAM53C on cyclin D phosphorylation does not seem to be significant. In the same experiment, does DYRK1 inhibitor prevent modification of cyclin D?

      As discussed above, we removed some of these data and re-focused the manuscript on p53-p21 as a second pathway activated by loss of FAM53C.

      Validation of SM13797 compound in terms of specificity to DYRK1 was not performed.

      This is an important point. We had cited an abstract from the company (Biosplice) but we agree that providing data is critical. We have now revised the manuscript with a new analysis of the compound’s specificity using kinase assays. These data are shown in Fig. S3F-H.

      A fraction of cells in G1 is a very easy readout but it does not measure progression through the G1 phase. Extension of the S phase or G2 delay would indirectly also result in reduction of the G1 fraction. Instead, authors could measure the dynamics of entry to S phase in cells released from a G1 block or from mitotic shake off.

      The Reviewer made a good point. As discussed in our response to Reviewer #1, with p53-null RPE-1 cells, we found that cell numbers do not increase in these conditions where we had observed a cell cycle re-entry (Fig. 4E), which was accompanied by apoptotic cell death (Fig. S4I). Thus, cells re-enter the cell cycle but die as they progress through S-phase and G2/M. We note that inhibition of DYRK1A has been shown to decrease expression of G2/M regulators (PMID: 38839871), which may contribute to the inability of cells treated with DYRK1Ai to divide. Because our data in RPE-1 cells showed that p21 knock-down was not sufficient to allow the FAM53C knock-down cells to re-enter the cell cycle, we did not further analyze p21 in HCT-116 cells. These data indicate that cell cycle re-entry measured by flow cytometry will not always translate into proliferation.

      Other points:

      Fig. 2C, 2D, 2E graphs should begin with 0

      We remade these graphs.

      Fig. 5D shows that the difference in p21 levels is not significant in FAM53C-KO cells but difference is mentioned in the text.

      We replaced the panel by the correct panel; we apologize for this error.

      Fig. 6D comparison of datasets of extremely different sizes does not seem to be appropriate

      We agree and revised the text. We hope that the Reviewer will agree with us that it is worth showing these data, which are clearly preliminary but provide evidence of a possible role for FAM53C in the brain.

      Could there be alternative splicing in mice generating a partially functional protein without exon 4? Did authors confirm that the animal model does not express FAM53C?

      We performed RNA sequencing of mouse embryonic fibroblasts derived from control and mutant mice. We clearly identified fewer reads in exon 4 in the knockout cells, and no other obvious change in the transcript (data not shown). However, immunoblot with mouse cells for FAM53C never worked well in our hands. We made sure to add this caveat to the revised manuscript.

      Reviewer #2 (Significance (Required)):

      Main problem of this study is that the advanced experimental models in IPSCs and mice did not confirm the observations in the cell lines and thus the whole manuscript does not hold together. Although I acknowledge the effort the authors invested in these experiments, the data do not contribute to the main conclusion of the paper that FAM53C/DYRK1 regulates G1/S transition.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

      This paper identifies FAM53C as a novel regulator of cell cycle progression, particularly at the G1/S transition, by inhibiting DYRK1A. Using data from the Cancer Dependency Map, the authors suggest that FAM53C acts upstream of the Cyclin D-CDK4/6-RB axis by inhibiting DYRK1A.

      Specifically, their experiments suggest that FAM53C Knockdown induces G1 arrest in cells, reducing proliferation without triggering apoptosis. DYRK1A Inhibition rescues G1 arrest in P53KO cells, suggesting FAM53C normally suppresses DYRK1A activity. Mass Spectrometry and biochemical assays confirm that FAM53C directly interacts with and inhibits DYRK1A. FAM53C Knockout in Human Cortical Organoids and Mice leads to cell cycle defects, growth impairments, and behavioral changes, reinforcing its biological importance.

      Strength of the paper:

      The study introduces a novel cell cycle control signalling module upstream of CDK4/6 in G1/S regulation which could have significant impact. The identification of FAM53C using a depmap correlation analysis is a nice example of the power of this dataset. The experiments are carried out mostly in a convincing manner and support the conclusions of the manuscript.

      Critique:

      1) The experiments rely heavily on siRNA transfections without the appropriate controls. There are so many cases of off-target effects of siRNA in the literature, and specifically for a strong phenotype on S-phase as described here, I would expect to see solid results by additional experiments. This is especially important since the ko mice do not show any significant developmental cell cycle phenotypes. Moreover, FAM53C does not show a strong fitness effect in the depmap dataset, suggesting that it is largely non-essential in most cancer cell lines. For this paper to reach publication in a high-standard journal, I would expect that the authors show a rescue of the S-phase phenotype using an siRNA-resistant cDNA, and show similar S-phase defects using an acute knock out approach with lentiviral gRNA/Cas9 delivery.

      We thank the Reviewer for this comment. Please refer to the initial response to the three Reviewers, where we discuss our use of single siRNAs and our results in multiple cell lines. Briefly, we can recapitulate the G1 arrest upon FAM53C knock-down using two independent siRNAs in RPE-1 cells. We also observe the same G1 arrest in p53 knockout cells, suggesting it is not due to a non-specific stress response. In addition, the arrest is dependent on RB, which fits with the genetic and biochemical data placing FAM53C upstream of RB, further supporting a specific phenotype. Human cancer cell lines also arrest in G1 upon FAM53C knock-down, not just RPE-1 cells. Finally, we hope the Reviewer will agree with us that compensatory mechanisms are very common in the cell cycle – which may explain the lack of phenotypes in vivo or upon long-term knockout of FAM53C.

      2) The S-phase phenotype following FAM53C should be demonstrated in a larger variety of TP53WT and mutant cell lines. Given that this paper introduces a new G1/S control element, I think this is important for credibility. Ideally, this should be done with acute gRNA/Cas9 gene deletion using a lentiviral delivery system; but if the siRNA rescue experiments work and validate an on-target effect, siRNA would be an appropriate alternative.

      We now show data with three cancer cell lines (U2OS, A549, and HCT-116 – Fig. S1E,F and Fig. 4F), in addition to our results in RPE-1 cells and in human cortical organoids. We note that the knock-down experiments are complemented by overexpression data (Fig. 1G-I), by genetic data (our original DepMap screen), and our biochemical data (showing direct binding of FAM53C to DYRK1A).

      3) The western blot images shown in the MS appear heavily over-processed and saturated (See for example S4B, 4A, B, and E). Perhaps the authors should provide the original un-processed data of the entire gels?

      For several of our panels (e.g., 4E and S4B, now panels S3J and S3K), we used a true “immunoassay” (as indicated in the legend – not an immunoblot), which is much more quantitative and avoids error-prone steps in standard immunoblots (“Western blots”). Briefly, this system was developed by ProteinSimple. It uses capillary transfer of proteins and ELISA-like quantification with up to 6 logs of dynamic range (see their web site https://www.proteinsimple.com/wes.html). The “bands” we show are just a representation of the luminescence signals in capillaries. We made sure to further clarify the figure legends in the revised manuscript.

      Data in 4A are also not a western blot but a radiograph.

      For immunoblots, we will provide all the source data with uncropped blots with the final submission.

      4) A critical experiment for the proposed mechanism is the rescue of the FAM53C S-phase reduction using DYRK1A inhibition shown in Figure 4. The legend here states that the data were extracted from BrdU incorporation assays, but in Figure S4D only the PI histograms are shown, and the S-phase population is not quantified. The authors should show the BrdU scatterplot and quantify the phenotype using the S-phase population in these plots. G1 measurements from PI histograms are not precise enough to allow for conclusions. Also, why are the intensities of the PI peaks so variable in these plots? Compare, for example, the HCT116 upper and lower panels where the siRNA appears to have caused an increase in ploidy.

      We apologize for the confusion and have fixed these errors. For most of the analyses, we used PI to measure G1 and S-phase entry. We added relevant flow cytometry plots to supplemental figures (Fig. S1G, H, I, as well as Fig. S4E and S4K, and Fig. S5F).

      5) There's an apparent contradiction in how RB deletion rescues the G1 arrest (Figure 2) while p21 seems to maintain the arrest even when DYRK1A is inhibited. Is p21 not induced when FAM53C is depleted in RB ko cells? This should be measured and discussed.

      This comment and comments from the two other Reviewers made us reconsider our model. We carefully re-read the Meyer paper and think that DYRK1A activity may be understood when considering levels of both CycD and p21 at the same time in a continuum (as was nicely shown in a previous study from the lab of Tobias Meyer – Chen et al., Mol Cell, 2013). While our genetic and biochemical data support a role for FAM53C in DYRK1A inhibition, it is obvious that the regulation of cell cycle progression by FAM53C is not exclusively due to this inhibition. As discussed above and below, we noted an upregulation of p21 upon FAM53C knock-down, and activation of p53 and its targets likely contributes significantly to the phenotypes observed. We added new experiments to support this more complex model (Figure 4 and Figure S4, with new model in S4L).

      Reviewer #3 (Significance (Required)):

      In conclusion, I believe that this MS could potentially be important for the cell cycle field and also provide a new target pathway that could be relevant for cancer therapy. However, the paper has quite a few gaps and inconsistencies that need to be addressed with further experiments. My main worry is that the acute depletion phenotypes appear so strong, while the gene is non-essential in mice and shows only a minor fitness effect in the depmap screens. More convincing controls are necessary to rule out experimental artefacts that misguide the interpretation of the results.

      We appreciate this comment and hope that the Reviewer will agree it is still important to share our data with the field, even if the phenotypes in mice are modest.

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Taylar Hammond and colleagues identified new regulators of the G1/S transition of the cell cycle. They did so by screening publicly available data from the Cancer Dependency Map, and identified FAM53C as a positive regulator of the G1/S transition. Using biochemical assays they then show that FAM53C interacts with the DYRK1A kinase to inhibit its function. DYRK1A in turn is known to induce degradation of cyclin D, leading the authors to propose a model in which DYRK1A-dependent cyclin D degradation is inhibited by FAM53C to permit S-phase entry. Finally the authors assess the effect of FAM53C deletion in a cortical organoid model, and in Fam53c knockout mice. Whereas proliferation of the organoids is indeed inhibited, mice show virtually no phenotype.

      Major comments:

      The authors show convincing evidence that FAM53C loss can reduce S-phase entry in cell cultures, and that it can bind to DYRK1A. However, FAM53C has multiple other binding partners and I am not entirely convinced that negative regulation of DYRK1A is the predominant mechanism to explain its effects on S-phase entry. Some of the claims that are made based on the biochemical assays, and on the physiological effects of FAM53C, are overstated. In addition, some choices made in methodology and data representation need further attention.

      1. The authors do note that P21 levels increase upon FAM53C knockdown. They show convincing evidence that this is not a P53-dependent response. But the claim that "p21 upregulation alone cannot explain the G1 arrest in FAM53C-deficient cells" (lines 138-139) is misleading. A p53-independent p21 response could still be highly relevant. The authors could test if FAM53C knockdown inhibits proliferation after p21 knockdown or p21 deletion in RPE1 cells.
      2. The authors do not convincingly show that FAM53C acts as a DYRK1A inhibitor in cells. Figures 4B+C and S4B+C show extremely faint P-CycD1 bands, and tiny differences in ratios. The P values are hovering around 0.05, so n=3 is clearly underpowered here. Total CycD1 levels also correlate with FAM53C levels, which seems to affect the ratios more than the tiny pCycD1 bands. Why is there still a pCycD1 band visible in 4B in the GFP + BTZ + DYRK1Ai condition? And if I look at the data points I honestly don't understand how the authors can conclude from S4C that knockdown of FAM53C causes a (DYRK1A-dependent) increase in pCycD1 (relative to total CycD1). In figure 5C, no blot scans are even shown, and again the differences look tiny. So the authors should either find a way to make these assays more robust, or alter their claims appropriately.
      3. The experiments to test if DYRK1A inhibition could rescue the G1 arrest observed upon FAM53C knockdown are not entirely convincing either. It would be much more convincing if they also performed cell counting experiments, as they have done in Figures 1F and 1G, to complement the flow cytometry assays. I suggest that the authors do these cell counting experiments in RPE1 +/- P53 cells as well as HCT116 cells. In addition, did the authors test if P21 is induced by DYRK1Ai in HCT116 cells?
      4. The data in Figure 5C and 5D are identical, although they are supposed to represent either pCycD1 ratios or p21 levels. This is a problem because at least one of the two cannot be true. Please provide the proper data and show (representative) images of both data types.
      5. Line 246: "Fam53c knockout mice display developmental and behavioral defects." I don't agree with this claim. The mutant mice are born at almost the expected Mendelian ratios, and body weight development is not consistently altered. But more importantly, no differences in adult survival or microscopic pathology were seen. The authors put strong emphasis on the IMPC behavioral analysis, but they should be more cautious. The IMPC mouse cohorts are tested for many other phenotypes related to behavior and neurological symptoms and apparently none of these other traits were changed in the IMPC Fam53c-/- cohort. Thus, the decreased exploration in a new environment could very well be a chance finding. The authors need to remove claims about developmental and behavioral defects from the abstract, results and discussion sections; the data are just too weak to justify this.
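
      One way to make the concern about n=3 in point 2 concrete is an exact permutation test, sketched below in Python. This is an illustrative assumption, not the test used in the manuscript: with three replicates per condition there are only 20 ways to split the six values, so the smallest attainable two-sided p-value is 2/20 = 0.1. The function name and the example values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen_set]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical, perfectly separated ratios (not data from the manuscript).
control = [1.0, 2.0, 3.0]
knockdown = [10.0, 11.0, 12.0]
p_value = exact_permutation_p(control, knockdown)
```

      Even with complete separation between conditions, three replicates per group cannot push this particular test below 0.05, which is why more replicates or a more robust readout would be needed.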

      Minor comments:

      1. Can the authors provide a rationale for each of the proteins they chose to generate the list of the 38 proteins in the DepMap analysis? I looked at the list and it seems to me that they do not all have described functions in the G1/S transition. The analysis may thus be biased.
      2. Figure 1B is confusing to me. Are these just some (arbitrarily) chosen examples? Consider leaving this heatmap out altogether, or explain it in more detail.
      3. The y-axes in Figures 2C, 2D, 2E, and 4D are misleading because they do not start at 0. Please let the axis start at 0, or make axis breaks.
      4. Line 229: " Consequences ... brain development." This subheader is misleading, because the in vitro cortical organoid system is a rather simplistic model for brain development, and far away from physiological brain development. Please alter the header.
      5. Figure S5F: the gating strategy is not clear to me. In particular, how do the authors know the difference between subG1 and G1 DAPI signals? Do they interpret the subG1 as apoptotic cells? If yes, why are there so many? Are the culturing or harvesting conditions of these organoids suboptimal? Perhaps the authors could consider doing IF stainings on EdU or BrdU on paraffin sections of organoids to obtain cleaner data?
      6. Figure S6A; the labeling seems incorrect. I would think that red is heterozygous here, and grey mutant.

      Significance

      The finding that the poorly studied gene FAM53C controls the G1/S transition in cell lines is novel and interesting for the cell cycle field. However, the lack of phenotypes in Fam53c-/- mice makes this finding less interesting for a broader audience. Furthermore, the mechanisms are incompletely dissected. The importance of a p53-independent induction of p21 is not ruled out. And while the direct inhibitory interaction between FAM53C and DYRK1A is convincing (and also reported by others; PMID: 37802655), the authors do not (yet) convincingly show that DYRK1A inhibition can rescue a cell proliferation defect in FAM53C-deficient cells.

      Altogether, this study can be of interest to basic researchers in the cell cycle field.

      I am a cell biologist studying cell cycle fate decisions, and adaptation of cancer cells & stem cells to (drug-induced) stress. My technical expertise aligns well with the work presented throughout this paper, although I am not familiar with biolayer interferometry.

    1. Author response:

      Reviewer #1:

      We agree with the reviewer that a limitation of our study is its focus on cell-based assays rather than in vivo experiments. We did consider evaluating the effects of statins on B cell responses in vivo; however, this approach is complicated by findings that statins can influence antigen presentation by dendritic cells, thereby impacting antibody responses (Xia et al, 2018). One possible solution would be to use B cell-specific conditional knockout models to study the roles of the identified proteins in an in vivo context. However, we currently do not have access to these models and were therefore unable to include such experiments within a feasible timeframe. We will revise the discussion section to acknowledge these points.

      The reviewer also noted that our study assessed the roles of HMGCR, SQLE, and prenylation in B cell activation using pharmacological inhibitors and genetic knockdown/out approaches. Loss-of-function techniques such as RNAi, siRNA, and CRISPR can be challenging to apply to primary B cells, but we are exploring their feasibility for future revisions. While we acknowledge the limitations of using pharmacological inhibitors, we have taken several steps to mitigate these, including targeting multiple steps in the cholesterol biosynthetic pathway using structurally distinct inhibitors and conducting rescue experiments by supplementing downstream metabolites. To further investigate potential off-target effects of statins, we have recently performed proteomic analysis of B cells treated with and without fluvastatin. The data suggest that fluvastatin primarily affects cholesterol metabolism and does not cause widespread off-target effects. We will include this new data in the revised manuscript.

      Reviewer #2:

      The reviewer suggested that the study would be strengthened by determining whether the observed changes are specific to LPS + IL-4 stimulation or represent a more general B cell response to mitogenic signals.

      A complementary study by James et al. (James et al, 2024) investigated murine B cells stimulated via the B cell receptor (BCR) and CD40, using anti-IgM and anti-CD40 antibodies alongside IL-4. Their proteomic analysis showed that such co-stimulation induces a fivefold increase in total cellular protein mass within 24 hours, mirroring our findings with LPS + IL-4. They also reported upregulation of proteins associated with cell cycle progression, ribosome biogenesis, and amino acid transport. Furthermore, by using SLC7A5 knockout mice, they demonstrated that this transporter is required for B cell activation. We will expand our discussion to include these findings. We will also expand on the final figure in our paper showing that the effects of statins are not limited to LPS.

      References:

      James O, Sinclair LV, Lefter N, Salerno F, Brenes A & Howden AJM (2024) A proteomic map of B cell activation and its shaping by mTORC1, MYC and iron. bioRxiv 2024.12.19.629506 doi:10.1101/2024.12.19.629506 [PREPRINT]

      Xia Y, Xie Y, Yu Z, Xiao H, Jiang G, Zhou X, Yang Y, Li X, Zhao M, Li L, et al (2018) The Mevalonate Pathway Is a Druggable Target for Vaccine Adjuvant Discovery. Cell 175: 1059-1073.e21

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #2

      Evidence, reproducibility and clarity

      Summary:

      This manuscript studies the effects of genotoxic stress using zeocin, a bleomycin-family drug, in the tardigrade species H. exemplaris. In a first experimental set, the authors evaluate the survival of the organisms as well as the levels of DNA damage.

      An RT-qPCR analysis of a set of DNA repair genes identified in a previous study by another group (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol. 34, Issue 9, 1819-1830.e6) and a comet assay reveal the damage observed during treatment.

      Experiments on fasting animals show variations in animal size that overlap with those seen in groups of animals treated with the genotoxic drug. Physiological variations are also observed, such as lipid loss and cuticle alteration.

      In a subsequent experimental set, the authors indicate that the genotoxic drug blocks DNA replication and activates DNA repair systems in various tissues, particularly the digestive tissue, which appears to be specifically targeted in terms of its replicative capacity following DNA damage caused by the drug. A sensitivity study of tardigrade embryo development then shows that their proliferative capacity, which is highly dependent on replication, mobilizes different sets of DNA repair genes that may be more closely associated with replication than in adults.

      Finally, a comparative study of the development of two organisms (C. elegans and planarian) also shows sensitivity to drugs that disrupt the replication process during development.

      The authors conclude from all of this work that the cells of the animals' intestines are the main target of the genotoxic stress induced by the drug. The effects of disruption of the normal replication process in intestinal cells are thought to be the cause of the observed loss of tissue homeostasis (loss of lipids and tissue renewal capacity).

      Major comments:

      1. Zeocin is a drug derived from bleomycin but has not yet been extensively studied. Could you give examples of the use/validation of zeocin as a radiomimetic in other biological systems?

      2. Similarities in transcriptional responses between UV and dehydration genotoxic stresses have already been observed (Yoshida et al., 2022; BMC Genomics 23, 405) in a tardigrade species closely related to H. exemplaris (R. varieornatus). However, no correlation in transcriptional responses could be observed after treating H. exemplaris with genotoxic stresses such as desiccation and 500 Gy gamma ray irradiation (Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6). These results indicate that, depending on the type of genotoxic stress, transcriptomic responses can appear to be very different and sometimes uncorrelated, particularly in the species H. exemplaris. Bleomycin has been studied in previous reports (refs Yoshida Y, et al. Proc Jpn Acad Ser B Phys Biol Sci. 2024 100(7):414-428; Clark-Hachtel, Courtney M. et al.; Curr Biol, Vol 34, Issue 9, 1819 - 1830.e6; Marwan Anoud et al., 2024, eLife 13:RP92621), which used a transcriptomic study to confirm that it behaves as a radiomimetic for the species H. exemplaris.

      On the other hand, since zeocin is a bleomycin-family drug, it is possible that its effects may differ slightly from those of bleomycin, exhibiting specific effects as observed by comparison of chemical radiomimetic and radiation treatments.

      A control experiment comparing the effects of bleomycin and zeocin using RNAseq would validate that their use is equivalent.

      3. A major conclusion of the manuscript is that DNA damage induced by the genotoxic drug disrupts replication mechanisms and leads to the observed effects. Are the repair genes analyzed by RT-qPCR induced directly by the drug itself or indirectly through its effect on replication?

      It would be interesting to block replication in embryos and assess whether the same sets of DNA repair genes are induced as with zeocin treatment alone. Additionally, it would be interesting to repeat the same DNA replication block experiments with additional zeocin treatment and compare the induced sets of DNA repair genes. This would help to identify the effects that are directly attributable to zeocin.

      Minor comments:

      The data are well presented, and the experiments are well described for general understanding. Previous studies in this field have been well referenced. However, the link between DNA damage caused by the drug and its impact on replication needs to be better explained.

      Finally, the use of the drug zeocin should be validated in this system by comparison with bleomycin.

      Significance

      This study evaluates the resistance of a species of tardigrades to genotoxic stress. Several previous studies have conducted this type of experiment using the same species, with consistent results and using the same type of genotoxic chemical drug: bleomycin. In this study, a new genotoxic drug is evaluated for its effects on DNA damage as well as on the survival of organisms and their embryonic development. Definitive validation experiments of this new genotoxic chemical tool are necessary to determine its similarities with drugs already known for their effects in the literature.

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.


      Referee #3

      Evidence, reproducibility and clarity

      Summary:

      The authors present MAVISp, a tool for viewing protein variants heavily based on protein structure information. The authors have done a very impressive amount of curation on various protein targets, and should be commended for their efforts. The tool includes a diverse array of experimental, clinical, and computational data sources that provides value to potential users interested in a given target.

      Major comments:

      Unfortunately I was not able to get the website to work properly. When selecting a protein target in simple mode, I was greeted with a completely blank page in the app window, and in ensemble mode, there was no transition away from the list of targets at all. I'm using Firefox 140.0.2 (64-bit) on Ubuntu 22.04. I would have liked to be able to explore the data myself and provide feedback on the user experience and utility.

      I have some serious concerns about the sustainability of the project and think that additional clarifications in the text could help. Currently is there a way to easily update a dataset to add, remove, or update a component (for example, if a new predictor is published, an error is found in a predictor dataset, or a predictor is updated)? If it requires a new round of manual curation for each protein to do this, I am worried that this will not scale and will leave the project with many out of date entries. The diversity of software tools (e.g., three different pipeline frameworks) also seems quite challenging to maintain.

      On the same theme, according to the GitHub repository, the program relies on Python 3.9, which reaches end of life in October 2025. It has been tested against Ubuntu 18.04, which left standard support in May 2023. The authors should update the software to more modern versions of Python to promote the long-term health and maintainability of the project.

      I appreciate that the authors have made their code and data available. These artifacts should also be versioned and archived in a service like Zenodo, so that researchers who rely on or want to refer to specific versions can do so in their own future publications.

      In the introduction of the paper, the authors conflate the clinical challenges of variant classification with evidence generation, and it's quite muddled together. They should strongly consider splitting the first paragraph into two paragraphs - one about challenges in variant classification/clinical genetics/precision oncology and another about variant effect prediction and experimental methods. The authors should also note that there are many predictors other than AlphaMissense, and may want to cite the ClinGen recommendations (PMID: 36413997) in the intro instead.

      Also in the introduction on lines 21-22 the authors assert that "a mechanistic understanding of variant effects is essential knowledge" for a variety of clinical outcomes. While this is nice, it is clearly not the case as we are able to classify variants according to the ACMG/AMP guidelines without any notion of specific mechanism (for example, by combining population frequency data, in silico predictor data, and functional assay data). The authors should revise the statement so that it's clear that mechanistic understanding is a worthy aspiration rather than a prerequisite.

      In the structural analysis section (page 5, lines 154-155 and elsewhere), the authors define cutoffs with convenient round numbers. Is there a citation for these values or were these arbitrarily chosen by the authors? I would have liked to see some justification that these assignments are reasonable. Also there seems to be an error in the text where values between -2 and -3 kcal/mol are not assigned to a bin (I assume they should also be uncertain). There are other similar seemingly-arbitrary cutoffs later in the section that should also be explained.
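
      The unassigned interval noted above is the kind of gap a small coverage check can catch. The Python sketch below assumes half-open bins and uses hypothetical cutoffs chosen only to reproduce a gap between -3 and -2 kcal/mol; the bin labels, cutoffs, and the function name `find_gaps` are illustrative and are not taken from the manuscript.

```python
import math


def find_gaps(bins):
    """Return the intervals not covered by a list of (low, high, label) bins.

    Bins are treated as half-open intervals [low, high).
    """
    gaps = []
    cursor = -math.inf  # everything below the cursor is already covered
    for low, high, _label in sorted(bins, key=lambda b: b[0]):
        if low > cursor:
            gaps.append((cursor, low))
        cursor = max(cursor, high)
    if cursor < math.inf:
        gaps.append((cursor, math.inf))
    return gaps


# Hypothetical cutoffs in kcal/mol (NOT the manuscript's values).
example_bins = [
    (-math.inf, -3.0, "stabilizing"),
    (-2.0, 2.0, "neutral"),
    (2.0, math.inf, "destabilizing"),
]
gaps = find_gaps(example_bins)
```

      Running such a check whenever cutoffs are edited would flag any value range that falls through without a classification.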

      On page 9, lines 294-298 the authors talk about using the PTEN data from ProteinGym, rather than the actual cutoffs from the paper. They get to the latter later on, but I'm not sure why this isn't first? The ProteinGym cutoffs are somewhat arbitrarily based on the median rather than expert evaluation of the dataset and I'm not sure why it's even worth mentioning them when proper classifications are available. Regarding PTEN, it would be quite interesting to see a comparison of the VAMP-seq PTEN data and the Mighell phosphatase assay, which is cited on page 9 line 288 but is not actually a VAMP-seq dataset. I think this section could be interesting but it requires some additional attention.

      The authors mention "pathogenicity predictors" and otherwise use pathogenicity incorrectly throughout the manuscript. Pathogenicity is a classification for a variant after it has been curated according to a framework like the ACMG/AMP guidelines (Richards 2015 and amendments). A single tool cannot predict or assign pathogenicity - the AlphaMissense paper was wrong to use this nomenclature and these authors should not compound this mistake. These predictors should be referred to as "variant effect predictors" or similar, and they are able to produce evidence towards pathogenicity or benignity but not make pathogenicity calls themselves. For example, in Figure 4e, the terms "pathogenic" and "benign" should only be used here if these are the classifications the authors have derived from ClinVar or a similar source of clinically classified variants.

      Minor comments:

      The target selection table on the website needs some kind of text filtering option. It's very tedious to have to find a protein by scrolling through the table rather than typing in the symbol. This will only get worse as more datasets are added.

      The data sources listed on the data usage section of the website are not concordant with what is in the paper. For example, MaveDB is not listed.

      I found Figure 2 to be a bit confusing in that it partially interleaves results from two different proteins. I think this would be nicer as two separate figures, one on each protein, or just of a single protein.

      Figure 3 panel b is distractingly large and I wonder if the authors could do a little bit more with this visualization.

      Capitalization is inconsistent throughout the manuscript. For example, page 9 line 288 refers to VampSEQ instead of VAMP-seq (although this is correct elsewhere). MaveDB is referred to as MAVEdb or MAVEDB in various places. AlphaMissense is referred to as Alphamissense in the Figure 5 legend. The authors should make a careful pass through the manuscript to address these kinds of issues.

      MaveDB has a more recent paper (PMID: 39838450) that should be cited instead of/in addition to Esposito et al.

      On page 11, lines 338-339 the authors mention some interesting proteins including BCL2, which has base editor data available (PMID: 35288574). Are there plans to incorporate this type of functional assay data into MAVISp?

      Significance

      General assessment:

      This is a nice resource and the authors have clearly put a lot of effort in. They should be celebrated for their achievements in curating the diverse datasets, and the GitBooks are a nice approach. However, I wasn't able to get the website to work and I have raised several issues with the paper itself that I think should be addressed.

      Advance:

      New ways to explore and integrate complex data like protein structures and variant effects are always interesting and welcome. I appreciate the effort towards manual curation of datasets. This work is very similar in theme to existing tools like Genomics 2 Proteins portal (PMID: 38260256) and ProtVar (PMID: 38769064). Unfortunately as I wasn't able to use the site I can't comment further on MAVISp's position in the landscape.

      Audience:

      MAVISp could appeal to a diverse group of researchers who are interested in the biology or biochemistry of proteins that are included, or are interested in protein variants in general either from a computational/machine learning perspective or from a genetics/genomics perspective.

      My expertise:

      I am an expert in high-throughput functional genomics experiments and am an experienced computational biologist with software engineering experience.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1

      Summary:

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.


      Major comments:

      1.) It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      We appreciate the reviewer's thoughtful suggestion to compare non-transformed and transformed cell lines to evaluate importin α1 localization in MN. Given that HeLa cells are derived from cervical cancer rather than the mammary epithelium, we considered it inappropriate to directly compare them with non-transformed mammary epithelial MCF10A cells. Therefore, HeLa cells were analyzed separately to assess the effects of reversine treatment on importin α1 localization. The results indicated no significant difference between the treated and untreated HeLa cells (Supplemental Fig. S2F in the revised manuscript). Regarding the comparison between MCF10A and the two cancer cell lines, MCF7 and MDA-MB-231, the proportion of importin α1-positive MN did not significantly differ across the cell lines, regardless of reversine treatment (Supplemental Fig. S3B, Untreated: p = 0.9850 and 0.5533; Reversine: p = 0.2218 and 0.9392). These results suggest that there is no clear difference in the localization of importin α1 in MN between the transformed and non-transformed cell lines tested. However, we acknowledge that this does not exclude the possibility that importin α1 localization to MN is linked to genomic instability under specific conditions.

      2.) While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.

      We thank the reviewer for the constructive suggestion to quantify nuclear envelope integrity more comprehensively. In response, we compared laminB1 localization at the MN membrane between importin α1-positive and -negative MN in MCF10A, MCF7, MDA-MB-231, and HeLa cells, and included these results in the revised manuscript (Fig. 4C). For each cell, the laminB1 intensity in the MN was normalized to that of the primary nucleus (PN). This analysis showed that laminB1 intensity was significantly lower in importin α1-positive MN across all cell lines, including non-transformed MCF10A cells. These findings support a close association between aberrant importin α1 accumulation and compromised nuclear envelope integrity, a key factor potentially linking MN to chromothripsis and cGAS-STING-mediated genomic instability.
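
      The per-cell normalization described above amounts to dividing the laminB1 signal of each micronucleus by that of the primary nucleus in the same cell. A minimal Python sketch follows; the function name and intensity values are hypothetical and only illustrate the ratio, not the authors' image-analysis pipeline.

```python
def normalized_mn_intensity(mn_intensity, pn_intensity):
    """Return the MN laminB1 intensity relative to the primary nucleus (PN)."""
    if pn_intensity <= 0:
        raise ValueError("PN intensity must be positive to normalize against it")
    return mn_intensity / pn_intensity


# Hypothetical (MN, PN) laminB1 intensities in arbitrary units, one pair per cell.
cells = [(40.0, 100.0), (90.0, 120.0)]
ratios = [normalized_mn_intensity(mn, pn) for mn, pn in cells]
```

      A ratio below 1 means the laminB1 signal at the MN is weaker than at the PN of the same cell, which makes values comparable across cells with different overall staining intensity.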

      3.) The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. Although it is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344), the authors assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they fail to provide a clear or even hypothetical explanation regarding how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).

      We agree that the previous schematic illustration (former Fig. 8) did not adequately summarize our findings and may have overstated our conclusions. Accordingly, we have removed this figure from the revised manuscript.

      To address the reviewer's concern, we performed additional analyses and included the results in the new Figure 8. These data show that, in addition to RAD51, both RPA2 and cGAS display mutually exclusive localization with importin α1 in MN. RPA2, a single-stranded DNA-binding protein, stabilizes damaged DNA and enables RAD51 filament assembly during homologous recombination repair. Previous studies have demonstrated that RPA2 accumulates in ruptured MN in a CHMP4B-dependent manner (PMID: 32601372). Likewise, cGAS is a cytosolic DNA sensor that localizes to ruptured MN and activates innate immune signaling through the cGAS-STING pathway, as widely reported (PMID: 28738408; 28759889; see also PMID: 32494070; 27918550).

      Our findings suggest an alternative scenario: even when nuclear envelope rupture occurs, importin α1-positive MN may remain inaccessible to DNA repair and sensing factors such as RPA2 and cGAS. This supports the view that importin α1 defines a distinct MN subset, separate from those characterized by the canonical DNA damage response or innate immune signaling factors. Furthermore, our overexpression experiments with EGFP-importin α1 (Fig. 7G, 7H) raise the possibility that importin α1 enrichment may impede the recruitment of DNA-binding proteins.

      Taken together, these results support the conclusion that importin α1 marks a unique MN state and provides a molecular framework for distinguishing between different MN environments. At the reviewer's suggestion, we have cited all the recommended references (PMID: 32601372, 32494070, 27918550, 28738408, and 28759889) in the revised manuscript to better contextualize our findings. We are grateful for the reviewer's thoughtful suggestions and literature recommendations, which helped us clarify the implications of our findings within the broader context of chromothripsis and cGAS-STING-mediated genomic instability.

      4.) Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 seem to be all deeply blue.

      We sincerely thank the reviewer for pointing out the important limitations of the original version of Fig. 4D, as also raised in minor comment #5. As the reviewer correctly noted, this figure was intended to demonstrate that importin α1 preferentially localizes to euchromatin regions (H3K4me3 and H3K36me3) rather than heterochromatin (H3K9me3 and H3K27me3). However, we acknowledge that in the original figure, the predominantly blue tone of the heatmap made this interpretation unclear and that the Spearman's correlation coefficient for H3K36me3 was missing. In response, we have substantially revised the figure (now shown as Fig. 5E in the revised manuscript). Specifically, we improved the color scale for better visual distinction, added the missing Spearman's coefficients for H3K36me3, and strengthened the analysis by incorporating ChIP-seq data obtained with two independent antibodies against importin α1 (Ab1 and Ab2). We believe that these revisions provide a clearer and more accurate representation of the euchromatin enrichment of importin α1, as originally intended.

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Following the reviewer's suggestion, we carefully revised the manuscript to ensure that our statements are consistent with the scope of the data and do not overstate our conclusions. As part of this effort, we removed the schematic illustration (former Fig. 8), which might have overstated our findings, and refined the relevant text to prevent overinterpretation.

      To our knowledge, this study is the first to report the specific accumulation of importin α in MN. Our results suggest a previously unrecognized function of importin α beyond its canonical transport role and add to the growing list of nuclear proteins that exhibit abnormal behavior in MN. We hope that these findings will provide a conceptual and experimental basis for future studies aimed at clarifying the biological significance of MN heterogeneity and quality control in cancer biology.


      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.

      As part of our response to Major Comment 1, we conducted additional experiments to quantitatively compare importin α1 localization in MN between non-transformed MCF10A cells, breast cancer cell lines (MCF7 and MDA-MB-231), and HeLa cells. These results have been included in the revised manuscript (Supplemental Fig. S2F and Fig. S3B). The analyses showed no significant differences in the proportion of importin α1-positive MN among these cell lines, consistent with the reviewer's request for a more comprehensive evaluation.

      The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, providing averages of IF distributions from multiple cells across independent experiments would be beneficial to draw more robust conclusions.

      We have quantified the co-localization of importin α1 with the euchromatin marker H3K4me3 and the heterochromatin marker H3K9me3 in micronuclei (MN) across four human cell lines (MCF10A, MCF7, MDA-MB-231, and HeLa). The results of this statistical analysis are included in the revised manuscript in Fig. 5C. These data provide quantitative evidence from independent experiments showing that importin α1 preferentially localizes to euchromatic regions within the MN, thereby supporting our initial observation.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions (likely less than 5% for HeLa cells, as shown in Figure S2D), the relevance of these data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN.

      As noted, these experiments were performed on whole-cell populations of MCF7 cells and therefore reflect the overall chromatin landscape, not specifically that of the MN. We fully acknowledge that MN constitute only a small fraction of the cell population under standard culture conditions (Supplemental Fig. S2D), and thus, the relevance of ChIP-seq data to MN must be interpreted with caution.

      Nevertheless, our intention in presenting these data was to illustrate that importin α1 preferentially associates with euchromatin regions marked by H3K4me3. To examine this more directly, we analyzed importin α1 localization in MN using immunofluorescence with histone modification markers across multiple cell lines. These analyses, together with the quantitative results now included in the revised manuscript (Fig. 5C), confirm that importin α1 preferentially localizes to euchromatic regions within MN.

      Taken together, although the ChIP-seq data were derived from whole-cell populations, the combined results from IF imaging and quantitative analysis support our interpretation that importin α1 retains its euchromatin-associating property within MN. We hope that these additional data will address the reviewer's concerns.

      To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. This additional data would enhance confidence in their findings regarding RAD51 accessibility inhibition by importin α1.

      Following the reviewer's suggestion, we have added a new graph (Fig. 7F) in the revised manuscript. This figure presents the quantified frequency of RAD51-positive MN among importin α1-negative and importin α1-positive MN, analyzed across six microscopy fields (n = 6) from three independent experiments.

      To improve clarity and consistency, we reorganized the panels: representative RAD51 images are now shown in Fig. 7B, and the Cell #1 (low RAD51) vs. Cell #2 (high RAD51) classification with etoposide responsiveness is summarized in Fig. 7C. As illustrated in Figs. 7D and 7E, importin α1 and RAD51 exhibit mutually exclusive localization in MN. Fig. 7F provides a unified statistical summary at the population level.

      The results showed that the proportion of RAD51-positive MN was significantly lower among importin α1-positive MN than among importin α1-negative MN, providing robust quantitative support for the proposed mutual exclusivity between importin α1 localization and RAD51 accessibility in MN.

      We are grateful to the reviewer for this constructive suggestion, which helped us clarify and better support the central message of our study.


      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      We appreciate the reviewer's thoughtful consideration of the feasibility of the additional experiments.

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Following the reviewer's suggestion, we have substantially revised the Materials and Methods sections in the main and supplemental manuscripts to provide detailed descriptions of the optical microscopy procedures, including the specifications of the imaging equipment, acquisition settings, and image processing parameters. These revisions follow the best practices recommended by Heddleston et al. (2021, J. Cell Sci., doi:10.1242/jcs.254144).

      We have also expanded the description of our quantitative image analysis using ImageJ, providing details on the parameters for MN identification and the measurement of colocalization rates between importin α and histone modifications. These additions improve reproducibility and clarity.

      We believe that these modifications will enhance the reproducibility of our results and increase the value of our study for the research community. We sincerely appreciate the reviewer's helpful suggestions.


      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      We sincerely appreciate the reviewer's constructive comments highlighting the importance of transparent and rigorous statistical analyses. In response, we have carefully revised all figure panels, figure legends, and the Materials and Methods (Statistical Analysis) section in both the main and the supplementary manuscripts.

      In the revised figure legends, we now provide the number of independent experiments and sample sizes (n), statistical tests applied (e.g., unpaired or paired two-tailed t-test, one-way ANOVA with Tukey's post-hoc test, two-way ANOVA with Sidak's multiple comparisons), data presentation format (mean ± SD), and corresponding p-values or significance indicators (*, **, ***). The Statistical Analysis section was also expanded to explain the rationale for selecting each statistical test, the criteria for significance, and the reporting of the replicates. These revisions ensure clarity, reproducibility, and transparency throughout the manuscript, directly addressing the reviewers' concerns. We are grateful for this valuable suggestion, which has significantly improved the rigor of our study.
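
      As an illustration only, the kind of summary described above (mean ± SD per group and an unpaired comparison between two groups) can be sketched as follows. The replicate values are hypothetical placeholders, not data from the study, and the helper names `mean_sd` and `unpaired_t` are invented for this sketch. It computes Student's t statistic under an equal-variance assumption; a two-tailed p-value would then be read from a t distribution with the returned degrees of freedom, for example with a statistics package.

```python
import math
import statistics


def mean_sd(values):
    """Return (mean, sample SD) for one group of replicate measurements."""
    return statistics.mean(values), statistics.stdev(values)


def unpaired_t(a, b):
    """Student's unpaired t statistic (equal variances assumed) and its df."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled * (1 / na + 1 / nb)
    )
    return t, na + nb - 2


# Hypothetical replicate values (e.g. percent of positive MN per experiment).
group_a = [10.0, 12.0, 14.0]
group_b = [20.0, 22.0, 24.0]

mean_a, sd_a = mean_sd(group_a)
mean_b, sd_b = mean_sd(group_b)
t_stat, dof = unpaired_t(group_a, group_b)
print(f"A: {mean_a:.1f} ± {sd_a:.1f}, B: {mean_b:.1f} ± {sd_b:.1f}")
print(f"t = {t_stat:.3f}, df = {dof}")
```

      Reporting the group sizes alongside the test statistic, as in the revised legends, lets readers judge whether the chosen test suits the design.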

      Minor comments:

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.

      We thank the reviewer for their constructive suggestions regarding our FRAP analysis. To address the concern that the original comparison between PN and MN might have been biased by differences in bleaching areas, we performed new experiments in which both PN and MN were fully bleached within the same cells (Fig. 3A and 3C). This approach allowed for a more direct comparison of importin α1 dynamics under equivalent conditions.

      These experiments revealed a markedly slower fluorescence recovery in MN than in PN, indicating reduced nuclear import and/or recycling efficiency of importin α1 in MN. In addition, we retained our original analysis to further characterize the heterogeneous mobility patterns of importin α1 in MN, identifying three distinct mobility classes: high, intermediate, and low (Fig. 3B and 3D). Together, these results support our observation that importin α1 mobility is restricted in MN, likely due to altered nuclear transport dynamics.
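
      For readers less familiar with FRAP, a comparison of recovery between compartments is commonly based on normalized recovery curves. The following is a minimal sketch of one common normalization, not necessarily the procedure used in the manuscript. The trace values are hypothetical, and the function names `normalize_frap` and `half_recovery_frame` are invented for illustration.

```python
def normalize_frap(intensity, pre_bleach, bleach_index):
    """Scale a trace so the first post-bleach frame is 0 and pre-bleach is 1.

    Returns only the frames from the first post-bleach frame onward.
    """
    f0 = intensity[bleach_index]
    span = pre_bleach - f0
    if span <= 0:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return [(f - f0) / span for f in intensity[bleach_index:]]


def half_recovery_frame(normalized):
    """Index of the first frame reaching half of the final plateau, or None."""
    plateau = normalized[-1]
    if plateau <= 0:
        return None
    for i, value in enumerate(normalized):
        if value >= plateau / 2:
            return i
    return None


# Hypothetical trace: two pre-bleach frames at 100, bleach at index 2.
trace = [100, 100, 20, 40, 60, 70, 75, 76]
curve = normalize_frap(trace, pre_bleach=100, bleach_index=2)
print(curve[-1])                   # final plateau (recovered fraction)
print(half_recovery_frame(curve))  # frames needed to reach half the plateau
```

      When an entire compartment is bleached, as in the whole-PN and whole-MN experiments, the plateau and half-recovery time of such a curve reflect exchange with the rest of the cell rather than diffusion within the compartment.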

      As suggested by the reviewer, we attempted partial bleaching of MN to assess intranuclear mobility. However, owing to the small size of MN, partial bleaching is technically challenging and inconsistent, with some MN recovering even during the bleaching process. Therefore, reliable quantification was not possible. For transparency, these data are provided as a Reviewer-only Figure but were not included in the revised manuscript.

      Finally, while we agree that examining other nuclear transport factors (e.g., RAN, CAS, RCC1) would be informative, our study focused on importin α1 dynamics. We consider these additional factors to be important directions for future investigations.


      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to put the present findings in frame with previous works which link the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      We thank the reviewer for carefully pointing out the key references that are highly relevant to framing our findings in the context of previous studies on micronuclear instability, chromothripsis and inflammation. We fully agree with this suggestion.

      In the revised manuscript, we have cited these studies in both the Introduction and Discussion sections. Specifically, we incorporated these studies when discussing the structural fragility of MN, aberrant DNA replication, and the exposure of micronuclear DNA to cytoplasmic sensors, which mechanistically link MN rupture to chromothripsis and cGAS-STING-mediated immune activation. For example, we now refer to the study demonstrating RPA2 recruitment to ruptured MN in a CHMP4B-dependent manner (PMID: 32601372), reports showing defective replication and DNA damage responses in MN (PMID: 32494070; 27918550), and seminal studies establishing cGAS localization to ruptured MN and activation of innate immune signaling (PMID: 28738408; 28759889).

      By incorporating these references, we more clearly position our findings that importin α1 defines a distinct subset of MN lacking access to DNA repair and sensing factors such as RAD51, RPA2, and cGAS. This contextualization emphasizes that our data add to and extend the established view that compromised MN integrity underlies chromothripsis and inflammation by identifying importin α1 as a novel marker of an alternative MN microenvironment. We are grateful for this constructive recommendation, which has allowed us to strengthen the framing of our study in the existing literature.


      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      In response, we have revised the relevant figure panels and their legends to clearly display the statistical significance, including p-values, where appropriate. Specifically, we added statistical annotations (p-values or significance markers such as asterisks) directly on the plots or in the corresponding legends, and clarified the number of replicates, statistical tests used, and definitions of error bars (mean ± SD). We believe that these revisions improve the interpretability and transparency of our results and strengthen the overall presentation of the data.

      1.) In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.

      Following the reviewer's comments, we revised Supplemental Fig. S2F to show a direct comparison of the proportion of importin α1-positive MN between untreated and reversine-treated HeLa cells based on indirect IF analysis. The Results section was updated accordingly (page 8, Lines 148-150): "We then examined whether reversine treatment affected the proportion of importin α1-positive MN. The results revealed that the MN formation rate for either untreated or treated cells was 36.2% ± 7.8 or 38.3% ± 8.8, respectively, with no significant difference (Fig. S2F)."

      We believe that this revision addresses the reviewer's concern by providing relevant quantitative data for the untreated condition.

      2.) In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.

      In this section, we aimed to clarify that the quantitative analysis focused exclusively on cells harboring MN, as the purpose of the analysis was to compare the localization of EGFP-importin α1 between MN and PN. We excluded cells that contained no MN and showed EGFP-importin α1 localization only in the PN. This criterion was consistently applied to both wild-type and mutant constructs. To avoid confusion, we have removed the sentence "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." from the revised manuscript.

      3.) The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.

      We agree with the reviewer that the statement "However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)" was not essential for understanding the rationale of our study and could be misleading. In response, we have removed this sentence from the revised manuscript, along with the corresponding Supplementary Fig. S4.

      4.) Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included; only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.

      In the revised manuscript, we performed one-way ANOVA followed by Holm-Sidak's multiple comparisons test to evaluate the MN localization ratio of EGFP-NES between Imp-α1-negative and Imp-α1-positive MN. This analysis revealed a statistically significant difference (**p

      5.) In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.

      We thank the reviewer for this insightful comment. As addressed in response to Major comment #4, we have substantially revised Fig. 5 and added the missing Spearman's correlation coefficient value for H3K36me3 (now shown in Fig. 5E). These revisions, together with the overall improvements to the figure, more clearly illustrate the euchromatin enrichment of importin-α1.
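
      For reference, Spearman's correlation coefficient is the Pearson correlation computed on ranks rather than on raw values. The sketch below shows the calculation for two signal vectors (for example, ChIP-seq signal per genomic bin for two marks). The input numbers are hypothetical and the helper names are invented for this illustration; it is not the pipeline used to generate Fig. 5E.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed inputs."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical per-bin signals for two tracks.
track_a = [0.2, 1.5, 3.1, 4.8]
track_b = [10.0, 20.0, 30.0, 40.0]
print(spearman(track_a, track_b))
```

      Because only ranks enter the calculation, the coefficient is insensitive to differences in signal scale between antibodies or marks.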

      6.) For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.

      We sincerely apologize for the misstatements in the legends of the original Fig. 5C. The correct description is that this experiment was performed using MCF7 cells, and we have revised the legend accordingly in the revised manuscript (now Fig. 6C). In addition, because the original data in Fig. 5D were obtained from HeLa cells, we repeated this experiment using MCF7 cells and replaced the panel with new data (now Fig. 6D).

      7.) To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.

      As described above, we addressed this point by adding a new quantification and statistical analysis in Fig. 7F, based on six microscopy fields across three independent experiments. This analysis directly supports our claim that importin α1 inhibits RAD51 accessibility in the MN.

      We would also like to clarify that although the reviewer referred to Figs 7D and 7E, these two panels were designed to illustrate the same phenomenon, the mutually exclusive localization of importin α1 and RAD51 to distinct MN, shown in different contexts. Specifically, Fig. 7D presents examples from separate cells, each with MN containing either importin α1 or RAD51, while Fig. 7E shows a single cell containing two distinct MN, one enriched with importin α1 and the other with RAD51. Because both panels serve as illustrative examples of the same phenomenon, it would not be meaningful to quantify them independently as parallel datasets. Instead, we integrated the statistical analysis into a unified graph (Fig. 7F), which summarizes the frequency of RAD51-positive MN in relation to importin α1 status across the cell population, thereby supporting our interpretation that importin α1-positive MN represent a distinct subset that is less accessible to RAD51.

      8.) The meaning of lines 336-338 ("Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity") is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.

      We appreciate the reviewer's comment regarding the clarity of our statement in the Discussion (former lines 336-338). We agree that the original phrasing is ambiguous. To improve clarity and align with our results, we revised this section to emphasize that importin α1-positive MN represent a restricted environment from which DNA repair and sensing factors are excluded. Specifically, RAD51, RPA2, and cGAS showed mutually exclusive localization with importin α1, indicating that these MN are largely inaccessible to DNA-binding proteins (pages 20-21). This rephrasing removes the unclear phrase "protect its activity" and directly reflects our experimental findings, presenting a clearer interpretation that is consistent with the Results.

      9.) Fig. 1D: Numbers on the y-axis are missing, x-axis labeling is too small

      We appreciate the reviewer's careful examination of the figure. In the revised manuscript, we added numerical tick labels to both the x- and y-axes and increased the label font size to ensure clear readability, as shown in Fig. 1D. We also applied the same improvements to other fluorescence intensity plots, including Figs. 4A, 4B, 5A, 5B, 7H, and Supplemental Fig. S4C and S5A-S5F to ensure consistency in readability across the manuscript. We thank the reviewer for helping us improve the clarity and accuracy of our figure presentations.

      10.) Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.

      Upon re-examination of the source data, we identified and corrected a minor calculation error in one subset and regenerated the panel. After correction, the three independent PN/MN ratios were 3.1%, 2.9%, and 2.6%, rather than being identical. These corrected values were proportional to the corresponding PN and MN measurements and preserved the expected relationship between their distributions. Although the numerical differences were small, they demonstrated high reproducibility across independent experiments. These corrections do not alter the interpretation of Fig. 1F, and the distribution of PN/MN values is now consistent with the paired PN and MN data presented in the revised manuscript.

      Significance

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways.

      The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A).

      While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway, and its implications for genomic instability in cancer cells, remains lacking.

      Overall, it is not entirely clear what significance this advance holds for the field; while there are conceptual contributions made by this work, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen their conclusions regarding importin α1's role in MN dynamics and genomic instability.

      We sincerely appreciate the reviewer's thoughtful and constructive evaluation of the significance of our study. We agree that in the original submission, the conceptual contribution was not fully supported by sufficient evidence. In the revised manuscript, we have substantially strengthened our findings by incorporating new data on RPA2 and cGAS, in addition to RAD51. These results consistently show that importin α1-positive MN are largely inaccessible to multiple DNA-recognizing proteins, including DNA repair factors (RAD51 and RPA2) and the innate immune sensor cGAS, whereas importin α1-negative MN readily recruit these proteins. This broader dataset reinforces the concept that importin α defines a distinct and restricted MN subset, extending beyond our initial observation of RAD51 exclusion.

      By framing importin α as a molecular marker that discriminates between functionally distinct MN environments, our study conceptually advances the understanding of MN heterogeneity. This adds to the prior literature showing that defective nuclear envelope integrity underlies chromothripsis and cGAS-STING activation and positions importin α as a new marker for identifying MN that are refractory to these DNA repair and sensing pathways. While we agree that further work is necessary to directly link importin α enrichment to downstream genomic instability or inflammation in cancer, we believe that our revised data now provide a robust foundation for future investigations.

      Taken together, the revised manuscript presents a clearer and more comprehensive conceptual advance: importin α-positive MN represent a previously unrecognized molecular environment distinct from MN characterized by canonical DNA repair or sensing factors. We are grateful to the reviewer, whose constructive comments greatly improved the clarity, robustness, and overall impact of our study. We believe that these findings will be of particular interest to researchers studying the mechanisms of genomic instability, chromothripsis, and cancer biology.


      Reviewer #2

      Summary:

      The authors have shown that Importin α1, a nuclear transport factor, is enriched in subsets of micronuclei (MN) of cancer cells (MCF7 and HeLa) and, using FRAP, that it has altered dynamics in MN. Moreover, the authors have shown that these levels of Importin α1 in the MN are likely not due to its traditional role in signal-dependent protein transport, as suggested by immunofluorescence of other factors important for this function. Additionally, cargo dynamics carrying NLS or NES signals were disrupted in Importin α1-positive micronuclei. Importin α1-positive micronuclei also appear to have a disrupted nuclear envelope, potentially explaining some of these cargo disruptions. Using RIME followed by immunofluorescence, the authors also demonstrated that Importin α colocalizes with proteins important for DNA replication and p53 signaling. Lastly, the authors show that Importin α and RAD51 have mutual exclusivity in the micronuclei.

      Major comments:

      1) A key issue is that very few statistical tests are used in this study, and these are crucial to the interpretation of the data. We strongly urge the authors to re-analyze the data using appropriate statistical analyses. Along those lines, in many figures 1 or 2 images are shown without stating how many biological or technical replicates they are representative of, and without showing quantification of the analyses. In general, the authors' statements would be strengthened by showing more examples and/or stating "N" in the figure legends or supplement.

      We sincerely thank the reviewer for emphasizing the importance of including sufficient statistical analyses and replication information. As noted in our response to Reviewer #1, we have carefully revised the manuscript to enhance statistical rigor and transparency throughout. Specifically, we expanded the Statistical Analysis section in the Materials and Methods section to provide a clear description of the statistical approaches used. In addition, all figure legends have been revised to explicitly state the number of biological replicates, sample sizes, statistical tests applied, and corresponding p-values or significance indicators. Representative images are consistently accompanied by quantitative analyses derived from multiple independent experiments.

      We believe that these comprehensive revisions directly address the reviewer's concerns and substantially improve the rigor, clarity, and interpretability of our manuscript.

      2) Using RIME and immunofluorescence, the authors identify factors that co-localize with Importin α1 in subsets of micronuclei (Figure 5), which is interesting, but there is no functional data associated with this result. Are the authors stating that these differences account for altered DNA damage or replication? It is unclear what the conclusion is beyond "some MN are different than others." Could the authors knockdown/knockout these factors to determine if they recruit Importin α1 into MN or the reciprocal? For many of these factors, they appear to be broadly present throughout the entire primary nucleus as well, indicating there is nothing unique about their MN localization.

      We agree that our original RIME and indirect IF analyses were primarily descriptive and lacked functional validation. To strengthen this aspect, we added new IF and quantification data (now presented in Fig. 8) showing that importin α1-positive MN are largely mutually exclusive with DNA repair and sensing factors such as RAD51, RPA2, and cGAS, whereas importin α1 frequently co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings indicate that importin α1-positive MN define a distinct molecular environment enriched in replication- and chromatin-associated regulators but inaccessible to canonical DNA repair and sensing proteins.

      This combination of mutual exclusivity with DNA repair/sensing factors and frequent co-localization with chromatin regulators underscores the biological significance of importin α1 localization in MN, as it may contribute to localized chromatin stabilization through association with chromatin regulators while simultaneously restricting access to DNA repair and sensing factors. Thus, importin α1-positive MN represent a restricted subset with potential implications for genome stability and immune signaling, going beyond the descriptive notion that "some MN are different than others."

      Moreover, many chromatin regulators identified by RIME contain classical nuclear localization signals (NLSs), raising the possibility that importin α1 interacts with these proteins via their NLS sequences. We fully agree with the reviewer that knockdown or knockout experiments would be highly valuable to clarify whether such interactions actively recruit importin α1 into MN or occur reciprocally, and we regard this as an important direction for future investigations.

      3) In line 274, the authors state that MN highly enriched for Importin α1 inhibit RAD51 accessibility, but this is an overstatement of the data. Instead, the authors show that RAD51 and Importin α1 do not colocalize in micronuclei, albeit without quantification, which weakens their argument. Also, the consequence of this "mutual exclusivity" is unclear. Can the authors inhibit or knock down Importin α1 and show that RAD51 goes to all micronuclei? And how is this different from the data shown for factors in Figure 5? Some of those show colocalization with Importin α1-positive micronuclei and others do not. Could you perform live imaging of labeled Importin α1 and RAD51 and show that, as Importin α1 accumulates in MN, RAD51 or other DNA repair factors are exported? An alternative experiment would be to show that the C-mutant, which is defective in nuclear export, now colocalizes with RAD51 in MN. Please reconcile this or show experiments to prove the statement above.

      We agree that our original wording "inhibits RAD51 accessibility" was not sufficiently supported by direct evidence, as it was based solely on the immunofluorescence data. Therefore, we have removed this statement from the Results section of the revised manuscript. To strengthen this point, we added a quantitative analysis (Fig. 7F) showing that RAD51 signals were significantly reduced in importin α1-enriched MN.

      Regarding the suggestion to perform knockdown experiments, we note that the depletion of KPNA2 (gene name of importin α1) has been reported to cause severe cell-cycle arrest (Martinez-Olivera et al, 2018; Wang et al, 2012). Consistent with these reports, we also found that siRNA-mediated knockdown of KPNA2 in our system strongly reduced MN induction upon reversine treatment, making it technically unfeasible to analyze RAD51 localization under these conditions. We also sincerely thank the reviewer for suggesting the live imaging experiments. We fully agree that such experiments would provide valuable mechanistic insights, and we regard this as an important direction for future research.

      In addition, to address the reviewer's concern about other DNA repair factors, we added new data (Fig. 8) showing that importin α1-positive MN are mutually exclusive with RPA2 and cGAS. RPA2 is a canonical single-strand DNA (ssDNA)-binding protein that stabilizes exposed ssDNA and facilitates RAD51 recruitment. It has been reported to accumulate in ruptured MN in a CHMP4B-dependent manner (Vietri et al, 2020). cGAS is a cytosolic DNA sensor that detects ruptured MN and activates innate immune signaling via the cGAS-STING pathway. Together with our RAD51 results, these data show that importin α1-positive MN are consistently segregated from multiple DNA-recognizing factors, including RAD51. Simultaneously, importin α1 co-localizes with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings support the view that importin α1-positive MN define a distinct molecular environment enriched in chromatin regulators but largely inaccessible to DNA repair and sensing factors. While the precise mechanism remains unclear, one possibility is that importin α1-associated chromatin interactions limit the access of DNA repair and sensing proteins. However, this interpretation is speculative and requires further investigation.

      4) In the Discussion, line 343-344 states that "importin α1 is uniquely distributed and alters the nuclear/chromatin status when enriched in MN," however this is not currently supported by the present data. The data presented shows correlation (albeit weak) between euchromatic modifications and Importin α1, and it does not definitively show that importin α1 is sufficient to alter the nuclear-chromatin status when enriched in the MN. More substantial experiments would be required to show whether Importin α1 plays an active role in these modifications.

      Following the reviewer's suggestion, in the revised manuscript, we removed this overstatement and rephrased the relevant sections of the Discussion. Rather than implying a causal role, we now describe the mutually exclusive localization of importin α1 with DNA repair and sensing factors (RAD51, RPA2, and cGAS), emphasize its preferential association with euchromatin regions marked by H3K4me3, and note its frequent co-localization with chromatin regulators identified by RIME, such as PARP1 and SUPT16H/FACT. These findings suggest that importin α1-positive MN define a distinct subset characterized by limited accessibility to DNA repair and sensing proteins, whereas cGAS-positive ruptured MN exemplify a state in which these proteins can accumulate.

      We also added a concluding statement that frames importin α1 as defining a previously unrecognized MN subset that is distinct from conventional ruptured MN. This revision provides a more accurate and appropriately cautious interpretation of our data while underscoring the conceptual advance of our study by clarifying how importin α1 localization reveals MN heterogeneity.

      Minor Comments

      1) Summary statement (page 3 Line 40): The use of "their" is confusing. Whose microenvironment are you referring to?

      We have rephrased the sentence as follows: The accumulation of importin α in micronuclei, followed by modulation of the microenvironment of the micronuclei, suggests the non-canonical function of importin α in genomic instability and cancer development. Thank you for this useful suggestion.

      2) In Abstract and introduction (page 4, Line 44 and page 5, line 59) it states that MN are membrane enclosed structures, but this is not always the case (see https://doi.org/10.1038/nature23449 as one example).

      While MN are typically surrounded by a nuclear envelope at the time of their formation during mitosis, we agree that this envelope can later rupture or fail to assemble completely, thereby exposing micronuclear DNA to the cytoplasm. To clarify this point, we revised the Introduction to explicitly acknowledge that MN may lose nuclear envelope integrity, which can have important consequences for genomic instability and immune activation. Specifically, we have added the following sentence to the Introduction (page 4, lines 77-80): "The nuclear envelope of MN can be partially or completely disrupted, allowing cytoplasmic DNA sensors, such as cyclic GMP-AMP synthase (cGAS), to access micronuclear DNA and trigger innate immune responses via the cGAS-STING pathway (Harding et al, 2017; Li & Chen, 2018; Mackenzie et al, 2017)."

      We hope this addition appropriately addresses the concerns raised by Reviewer #2 while incorporating the valuable suggestions from Reviewer #1 without altering the overall structure and flow of the manuscript.

      3) Given the fact that the RIME result identified proteins involved in DNA replication to be enriched with Importin α1, are these MN enriched in factors described in Fig. 5 simply localizing to MN that are in S phase, as described previously (doi: 10.1038/nature10802)?

      We sincerely thank the reviewer for raising this constructive perspective regarding the potential relationship between importin α1 enrichment in micronuclei (MN) and the S phase. Our RIME analysis identified chromatin-associated proteins, such as PARP1 and SUPT16H/FACT, which are often activated during replication stress and frequently function in the S phase. However, importin α1-positive MN were not exclusively associated with S-phase-specific molecules, and our data do not indicate that these MN are restricted to the S phase.

      Previous studies (e.g., Crasta et al, 2012) have established that MN are prone to replication defects and represent hotspots of genomic instability. The recovery of replication stress-responsive molecules, such as PARP1 and FACT, by RIME is therefore consistent with the biology of MN. Based on this valuable suggestion, we have revised the Discussion (page 19) to explicitly mention the potential involvement of replication-related proteins in importin α1-positive MN, as well as the possibility that importin α1 accumulation may contribute to replication defects in these structures.

      We are grateful to the reviewer for this important comment, which has allowed us to place our findings in a broader mechanistic context and outline directions for future research, including testing the relationship between importin α1-positive MN and established S-phase markers such as PCNA.

      4) The FRAP data is not very compelling. While it is clear there are differences between the PN and MN dynamics, what is driving these differences? Are these differences meaningful to the biology of the MN or PN? It is unclear what this data is contributing to the conclusions of the paper. Also, if the mobility of the MN is plotted on the same graph as the PN, the differences in MN mobility might not look as compelling.

      We respectfully emphasize that FRAP analysis is a key component of our study, as it provides important insights into the distinct dynamics of importin α1 in MN compared to PN.

      In the revised manuscript, we included new experiments (now shown in Fig. 3A and 3C) that directly compare the recovery kinetics of importin α1 in PN and MN in the same cells. By plotting the PN and MN recovery curves side by side, we aimed to improve clarity and provide a direct visualization of the pronounced differences in importin α1 dynamics between these compartments.

      Our FRAP results showed that importin α1 accumulated in both PN and MN but exhibited markedly reduced mobility in MN. These findings suggest that, unlike in the PN, canonical nucleocytoplasmic recycling of importin α1 is impaired in MN. Furthermore, the reduced mobility indicates that importin α1 is stably associated with chromatin or chromatin-associated factors in MN, consistent with our additional biochemical and imaging data showing preferential association with euchromatin (e.g., H3K4me3) and chromatin regulators.

      Taken together, the FRAP data provide functional evidence that complements our structural and molecular analyses, supporting our central conclusion that importin α1 accumulation in MN defines a restricted chromatin environment that influences the accessibility of DNA repair and sensing factors.

      5) In Results (line 117), you state that "the cytoplasm of those cell lines emitted quite strong signals" for Importin α1, but that phrasing is a little confusing. Yes, Importin α1 is present in the cytoplasm in most cells, but it appears you are referring to the enrichment in MN. I would recommend re-phrasing this statement to make your intent clearer.

      As the reviewer rightly noted, the original phrasing, "the cytoplasm of those cell lines emitted quite strong signals," was misleading, as it could suggest a broad cytoplasmic distribution of importin α1. Our observations showed that importin α1 accumulated specifically in MN located within the cytoplasm, rather than throughout the cytoplasm itself. To clarify this, we revised the Results section (page 7, lines 125-127) to read: "Next, we performed indirect immunofluorescence (IF) analysis on human cancer cell lines, including MCF7 and HeLa cells. Notably, we found that importin α1 accumulated prominently in MN located within the cytoplasm (MCF7 cells, Fig. 1B; HeLa cells, Fig. 1C; yellow arrowhead)."

      We believe that this revised wording more accurately reflects our findings and addresses the reviewer's concerns.

      6) In Results (line 135, Figure S2E,F), the ratio of high, low or no Importin α1 intensity is confusing. Is this percentage relative to the total number of MN? It is unclear what is meant by "whole number" of MN. Is Importin α1 intensity quantified or is it subjective?

      We apologize for the confusing terminology used in the original manuscript for Supplemental Fig. S2 and thank the reviewer for pointing it out. Although the reviewer did not specifically comment on the classification of importin α1 signal intensity as "high" or "low," we recognized that this approach relied on subjective visual assessment and lacked clearly defined thresholds. To improve clarity and objectivity, we have removed this classification and now analyze importin α1 localization in MN as simply positive or negative (revised Supplemental Fig. S2E). The previous graph (original Fig. S2F) was deleted. In addition, the frequency of Importin α1-positive MN has been reported in the Results section of the main text (page 8). We believe that these revisions have improved the clarity and reproducibility of our data presentation.

      7) Figure 2C is confusing. Are you counting MN with co-localization of Importin α1 and these factors? Please clarify.

      Figure 2C shows the percentage of importin α1-positive MN that displayed localization of importin β1, CAS, or Ran based on IF analysis. In other words, it represents the co-localization rates of these transport factors specifically within the subset of MN positive for importin α1. To improve clarity, we revised the y-axis label in Fig. 2C to "Localization in Impα1-positive MN (%)" and modified the figure legend accordingly. We have clarified this point in the Results section (page 9). We believe that these revisions resolve the confusion and clarify the scope of the analysis.

      8) Figure S3D quantification is very confusing and unclear. Also, how is this normalized? Are you controlling for total signal in each cell? And can the results of this experiment give you any mechanistic insight as to what is regulating MN localization beyond the interpretation of "MN localization is distinct from PN localization"? The "C-mutant" appears quite a bit different than the others. What might that indicate about the role of CAS/CSE1L in MN enrichment?

      We apologize for the confusion caused by the quantification in the Supplemental Fig. S3D (now revised as Fig. S4D). This figure shows the relative enrichment of EGFP-importin α1 in MN compared with that in PN for wild-type and mutant constructs. To control for nuclear size, fluorescence intensity was measured using a fixed circular ROI (1.5-2.0 µm in diameter) placed in both the MN and PN of the same cell, and MN/PN intensity ratios were directly plotted for individual cells (n = 8 per condition). This procedure is described in detail in the Results section (page 10).
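
      As an illustration of the fixed-ROI measurement described above, the following is a minimal sketch, not the analysis pipeline used in the study. It assumes a single-channel image stored as a list of rows, a known pixel size, and manually chosen ROI centers; the function names `circular_roi_mean` and `mn_pn_ratio` and the parameter `um_per_px` are hypothetical.

```python
def circular_roi_mean(image, center_row, center_col, radius_px):
    """Mean intensity of pixels whose centers lie within radius_px of the ROI center."""
    total, count = 0.0, 0
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_px ** 2:
                total += value
                count += 1
    if count == 0:
        raise ValueError("ROI contains no pixels")
    return total / count


def mn_pn_ratio(image, mn_center, pn_center, roi_diameter_um, um_per_px):
    """MN/PN intensity ratio using the same fixed circular ROI in both compartments.

    mn_center and pn_center are (row, col) pixel coordinates (assumed to be
    picked by the user); roi_diameter_um is the ROI diameter in micrometres.
    """
    radius_px = (roi_diameter_um / 2.0) / um_per_px
    mn_mean = circular_roi_mean(image, mn_center[0], mn_center[1], radius_px)
    pn_mean = circular_roi_mean(image, pn_center[0], pn_center[1], radius_px)
    if pn_mean == 0:
        raise ValueError("PN ROI has zero mean intensity")
    return mn_mean / pn_mean


# Synthetic example: left half of the image is twice as bright as the right half.
image = [[20.0] * 10 + [10.0] * 10 for _ in range(20)]
ratio = mn_pn_ratio(image, mn_center=(10, 4), pn_center=(10, 15),
                    roi_diameter_um=1.5, um_per_px=0.25)
```

      Using the same ROI size in both compartments means the ratio compares local signal density rather than total signal, so it does not scale with the size of the nucleus. A ratio that falls because the PN mean rises is then indistinguishable from one that falls because the MN mean drops, so reporting the two means separately is also useful.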

      Regarding the C-mutant, the reduced MN/PN ratio primarily reflects increased importin α1 accumulation in the PN rather than a reduced retention in the MN. As discussed in the revised manuscript (page 18), this suggests that CAS/CSE1L-mediated nuclear export is active in the PN but may be impaired or uncoupled in the MN, possibly due to differences in nuclear envelope integrity or chromatin context. We believe that this clarification addresses the reviewer's concerns and highlights the mechanistic implications of the C-mutant phenotype.

      9) For Figures 3A,B and S4, are these images of single z-slices or projections? It would be helpful to clarify this for your interpretation of whether the signal is truly partial or diffuse, or whether the membrane is in another z-plane. Also, how is the localization of Importin α1 different from or similar to that of other factors that localize to MN with a compromised nuclear envelope, such as cGAS? If it is based on epigenetic marks, it should be different from cGAS, which primarily binds non-chromatinized DNA.

      We thank the reviewer for this valuable suggestion. All images shown in Figs 3A, 3B, and S4 in the original manuscript (now revised as Fig. 4A and 4B, with the original Fig. S4 omitted) were derived from single optical sections rather than projections. We would like to emphasize that similar discontinuities in signals for lamin proteins (including laminB1 and laminA/C) were consistently observed across multiple cells and independent experiments, indicating that these observations are not due to an artifact of image acquisition or a missing z-plane, but rather reflect a genuine partial loss of the MN membrane.

      In contrast to cGAS, which predominantly binds non-chromatinized DNA in ruptured MN, our data indicate that importin α1 preferentially localizes to MN regions enriched in euchromatin-associated histone modifications, such as H3K4me3. The new data presented in Fig. 8 further strengthen this point by directly comparing importin α1 with DNA-recognizing proteins such as cGAS and RPA2, which preferentially localize to MN lacking importin α1. Together, these results highlight that importin α1-positive MN constitute a distinct subset characterized by chromatin-associated localization and reduced accessibility to DNA repair and sensing proteins.

      10) In Results, it is unclear how Fig. 7B was calculated. Are the authors qualitatively assessing if RAD51 is there or looking for MN enrichment relative to PN? Additionally, in Fig. 7C, RAD51 localization is diffuse. It should be enriched in foci. I would recommend the authors repeat this experiment using pre-extraction then quantify RAD51 foci number and/or intensity.

      For the quantification shown in Fig. 7B of the original manuscript, we acquired images containing approximately 15-50 cells per condition and counted all the micronuclei (MN) in those fields. The percentage of RAD51-positive MN relative to the total MN was calculated. In the revised manuscript, we further refined this analysis by classifying RAD51-positive MN into two categories based on signal intensity: weak (Cell #1 type) and strong (Cell #2 type). For each condition, nine independent fields were analyzed (302 MN in untreated cells and 213 MN in etoposide-treated cells). This quantification revealed that etoposide treatment preferentially increased the proportion of MN with strong RAD51 accumulation (Fig. 7C, right panels), indicating enhanced DNA damage in MN. Thus, our analysis was quantitative rather than qualitative, based on systematic counting across multiple fields.

      Regarding the reviewer's suggestion of pre-extraction, we believe that this approach is technically difficult because MN are structurally fragile. Importantly, in the subset of MN with strong RAD51 accumulation, RAD51 was clearly present in foci rather than diffuse signals, as shown in the high-magnification images (Fig. 7E).

      Finally, in response to Reviewer #1, we performed a new quantitative analysis (Fig. 7F) focusing on the frequency of strongly RAD51-positive MN in relation to importin α1 status. This analysis confirmed the mutually exclusive relationship between RAD51 and importin α1 in MN and further strengthened our conclusions.
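
      The relationship between RAD51 and importin α1 status in MN can be summarized as a 2×2 table of counts. The sketch below shows one way such a table could be tested for association, using a two-sided Fisher's exact test; this choice of test is an assumption made for illustration, since the response does not state which test was applied to Fig. 7F. The function name and the counts in the usage line are hypothetical placeholders, not data from the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for a 2x2 table of MN counts.

    a: importin α1-positive, RAD51-positive    b: importin α1-positive, RAD51-negative
    c: importin α1-negative, RAD51-positive    d: importin α1-negative, RAD51-negative
    Returns the probability of a table at least as extreme as the observed one,
    with all row and column totals held fixed.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x double-positive MN given the margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(p_value, 1.0)


# Hypothetical counts, for illustration only.
p = fisher_exact_two_sided(1, 19, 12, 8)
```

      An exact test is a reasonable default here because some cells of the table, such as MN that are positive for both markers, may contain very few events.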

      11) In line 264, "notably" is misspelled.

      Thank you for pointing this out. We have corrected the spelling.

      12) In line 303, "scenarios" should be changed to the singular form.

      Thank you for this correction. We have changed this to "scenario".

      13) In Figure legend, line 571-582, H3K27me3 is shown in Figure 4D, but the written legend does not mention this mark.

      We have added the marks in the legend for Fig. 5E.


      Significance: Overall, this paper shows compelling evidence for micronuclear localization of regulators of nuclear export, notably Importin α1. Of note, this occurs in subsets of MN that lack an intact nuclear envelope. And while it has been appreciated that compromised micronuclear envelopes lead to genomic instability, this is one of the first studies to demonstrate that alterations in the nuclear envelope may disrupt import or export of nuclear proteins into micronuclei.

      A limitation of the study is that much of the work is based on immunofluorescence and lacks mechanism. While there is much correlative data showing that Importin α1 localizes to micronuclei with compromised envelopes, it is unclear whether Importin α1 drives micronuclear collapse or it is downstream of this process. Additionally, Importin α1 micronuclear localization anti-correlates with RAD51 but does colocalize with other DNA replication factors, yet it is unclear whether their localization is dependent on Importin α1 or its role in nuclear export. Currently, the audience for this manuscript would be focused to those interested in micronuclei. If these concerns about an active role for Importin α1 in micronuclear export are resolved, it would greatly increase the impact of this manuscript to those interested more broadly in genomic instability, DNA repair, and cancer.

      We thank the reviewer for positively evaluating our study and highlighting the importance of defining the biological significance of our findings. In the revised manuscript, we incorporated new data (Fig. 8) demonstrating that importin α1-positive MN are mutually exclusive not only with RAD51 but also with RPA2 and cGAS. These results clearly establish importin α1-positive MN as a distinct subset, defined by the enrichment of chromatin-associated proteins, while being largely inaccessible to canonical DNA repair and DNA-sensing factors.

      Consistent with this, our FRAP experiments and analysis of the CAS/CSE1L-binding mutant (C-mut) further indicated that the recycling dynamics of importin α1 were altered in MN compared to PN. In addition, importin α1 was enriched in lamin-deficient areas of MN, where electron microscopy revealed a fragile nuclear envelope morphology. Together with prior evidence, discussed in the revised manuscript, that recombinant importin α can inhibit nuclear envelope assembly in Xenopus egg extracts (Hachet et al, 2004), these findings raise the possibility that high local concentrations of importin α1 may actively contribute to impaired nuclear envelope formation or stability in MN.

      Such a distinct MN state may have important biological consequences. By limiting the access of DNA repair and DNA-sensing proteins, importin α1 accumulation may influence chromothripsis and immune activation, which, in turn, could play a role in tumor progression and genome instability. We believe that the identification of importin α1 as a marker defining such a restricted MN environment represents a conceptual advance that extends the relevance of our study beyond the MN field to the broader areas of genome instability, DNA repair, and cancer biology. We are grateful to the reviewer for encouraging us to strengthen the framing of our work, which has helped us clarify the novelty and impact of our findings.

      Reviewer #3

      Summary:

      This study reports that importin alpha isoforms enrich strongly in a subset of micronuclei in cancer cells and uses mutagenesis and immunostaining to define how this localization relates to importin alpha's nuclear transport function. This enrichment occurs even though importin-alpha-positive micronuclei also contain Ran and the importin alpha export factor CSE1L, indicating that importin alpha enrichment is not simply a consequence of the absence of components of the nuclear transport machinery that control its localization. Mutagenesis of importin alpha indicates that Mn enrichment persists even when the importin beta binding and NLS binding capacities of importin alpha are impaired. Potential importin alpha interacting proteins are identified by proteomics, although the relationship of these potential binding partners to micronucleus localization is unclear.


      1. In Figure S3, the authors show that mutagenesis of importin alpha's CSE1L binding domain decreases the ratiometric enrichment in Mn vs. Pn. However, is this effect occurring because the CSE1L binding mutant decreases Mn enrichment, or increases Pn enrichment? It seems that the latter is possible based on the images shown. If the Pn specifically becomes brighter on average in cells expressing the C-mut, while Mn remain similar in fluorescence intensity, that might suggest that CSE1L has less of an effect on importin alpha export in Mn compared to Pn.

      We appreciate the reviewer's insightful observations. In the revised analysis (now presented in Supplemental Fig. S4D), we quantified EGFP-importin α1 intensities in both PN and MN using fixed circular regions of interest. This revealed that the reduced MN/PN ratio observed in the CSE1L-binding mutant (C-mut) was mainly due to an increase in the PN signal rather than a decrease in the MN signal. These results are consistent with the reviewer's suggestion and indicate that CSE1L-mediated nuclear export is functional in PN but has a limited impact on MN.

      Importantly, this interpretation is supported by our FRAP experiments (Fig. 3), which show that importin α1 recycles normally in the PN but exhibits markedly reduced mobility in the MN. Together with our proteomic and colocalization analyses (Fig. 6), which identified importin α1 association with chromatin regulators such as PARP1 and SUPT16H/FACT, these findings suggest that importin α1 accumulates in MN not only because the recycling machinery is uncoupled but also because it forms stable interactions with chromatin-associated proteins. As discussed in the revised manuscript, this dual mechanism provides a plausible explanation for the persistent retention of importin α1 in MN and its role in defining a distinct MN environment.

      It is unclear from the text or the methods whether RIME identification of importin-alpha binding partners is performed in reversine-treated cells, which would increase the proportion of importin alpha in Mn, or in untreated cells. In either case, it seems likely that the majority of interactors identified would be cargoes that rely on importin alpha for import into the Pn. The rationale for linking these potential interactions to the Mn is unclear. While some of these factors are indeed shown enriched in Mn in Figure 5, the significance of this is also unclear. These points should be clarified.

      We thank the reviewer for raising this important point. The RIME assay was performed using whole-cell extracts from untreated wild-type MCF7 cells, which primarily identified importin α1-associated nuclear cargo proteins. To assess their potential relevance to MN, we screened the RIME candidates using immunofluorescence data provided by the Human Protein Atlas database and experimentally validated those showing clear MN localization by colocalization with importin α1. This two-step approach enabled us to highlight importin α1 interactors that are functionally relevant to MN biology rather than general nuclear cargoes.

      In response to the reviewer's concerns, we revised the Results section to clarify this rationale. Specifically, we added the explanation that "As importin α1 interactors are typically nuclear proteins, it is plausible that they reside not only in the primary nucleus but also in the MN. To test this possibility, we screened the identified candidates for MN localization using immunofluorescence images provided by the Human Protein Atlas (HPA) database (Pontén et al, 2008; Thul et al, 2017)." (page 14, lines 294-297).

      This is consistent with the idea that a wide range of nuclear proteins carrying NLS motifs can recruit importin α1 into the micronuclei, where they reside. This protein-driven enrichment of importin α1 may create a restricted microenvironment in which canonical DNA repair and sensing proteins, including RAD51, RPA2, and cGAS, are excluded, thereby defining a distinct subset of micronuclei with limited genome surveillance capacity.

      In Figure 6, the authors perform FRAP of importin alpha in Mn and show that it recovers much more slowly in Mn than in Pn. However, it appears from the images shown that the entire Mn was photobleached in each FRAP experiment. It thus is unclear whether the slow FRAP recovery is limited by slow diffusion of importin alpha within Mn/on Mn chromatin or impaired trafficking of importin alpha into and out of Mn. These distinct outcomes have distinct implications: either importin alpha is immobilized on Mn (eu)chromatin, or alternatively importin alpha is poorly transported into / out of Mn. This ambiguity could be resolved by bleaching a portion of a Mn and testing whether importin alpha diffuses within a single Mn.

      We thank the reviewer for this insightful comment regarding the interpretation of the FRAP data. As the reviewer rightly pointed out, the original FRAP design, in which the entire MN was photobleached, does not allow clear discrimination between intranuclear immobilization of importin α1 and impaired trafficking into or out of the MN.

      In line with a similar suggestion from Reviewer #1, we attempted partial photobleaching of MN to evaluate whether importin α1 can diffuse within MN independently of nucleocytoplasmic transport. However, due to the small size of MN, precise targeting is technically challenging and recovery is often unreliable, with some MN even exhibiting partial recovery during the bleaching process itself. These data were not included in the revised figures; however, we provide representative examples as reviewer-only figures to illustrate these technical limitations.

      To further clarify the nuclear transport dynamics of importin α1, we redesigned our FRAP experiments to fully photobleach both the PN and MN within the same cells under identical conditions. These results, presented in revised Fig. 3A and 3C, demonstrate a markedly slower recovery of importin α1 in MN compared to PN, strongly suggesting that nucleocytoplasmic recycling of importin α1 is impaired in MN. Moreover, the reduced mobility of importin α1 in the MN is consistent with stable chromatin binding, limiting its ability to diffuse freely within the nuclear space.

      We believe that this additional analysis, prompted by the reviewer's comment, significantly strengthens the mechanistic interpretation of our FRAP data.

      References

      Crasta K, Ganem NJ, Dagher R, Lantermann AB, Ivanova EV, Pan Y, Nezi L, Protopopov A, Chowdhury D, Pellman D (2012) DNA breaks and chromosome pulverization from errors in mitosis. Nature 482: 53-58

      Hachet V, Kocher T, Wilm M, Mattaj IW (2004) Importin α associates with membranes and participates in nuclear envelope assembly in vitro. EMBO J 23: 1526-1535

      Martinez-Olivera R, Datsi A, Stallkamp M, Köller M, Kohtz I, Pintea B, Gousias K (2018) Silencing of the nucleocytoplasmic shuttling protein karyopherin a2 promotes cell-cycle arrest and apoptosis in glioblastoma multiforme. Oncotarget 9: 33471-33481

      Vietri M, Schultz SW, Bellanger A, Jones CM, Petersen LI, Raiborg C, Skarpen E, Pedurupillay CRJ, Kjos I, Kip E, Timmer R, Jain A, Collas P, Knorr RL, Grellscheid SN, Kusumaatmaja H, Brech A, Micci F, Stenmark H, Campsteijn C (2020) Unrestrained ESCRT-III drives micronuclear catastrophe and chromosome fragmentation. Nat Cell Biol 22: 856-867

      Wang CI, Chien KY, Wang CL, Liu HP, Cheng CC, Chang YS, Yu JS, Yu CJ (2012) Quantitative proteomics reveals regulation of karyopherin subunit alpha-2 (KPNA2) and its potential novel cargo proteins in nonsmall cell lung cancer. Mol Cell Proteomics 11: 1105-1122

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      Summary:

      Provide a short summary of the findings and key conclusions (including methodology and model system(s) where appropriate).

      Miyamoto et al. report that importin α1 is highly enriched in a subfraction of micronuclei (about 40%), which exhibit defective nuclear envelopes and compromised accessibility of factors essential for the damage response associated with homologous recombination DNA repair. The authors suggest that the unequal localization and abnormal distribution of importin α1 within these micronuclei contribute to the genomic instability observed in cancer.

      Major comments:

      Are the key conclusions convincing?

      The conclusions drawn by the authors would benefit from additional supportive experiments and a more detailed explanation.

      1. It is crucial to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells compared to transformed cell lines (MCF7, HeLa, and MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells.
      2. While the authors provide some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Furthermore, according to the figure legends, the data presented in both figures stem from a single experiment. Current literature suggests that compromised nuclear envelope integrity is one of the major contributors to genomic instability, mediated through mechanisms such as chromothripsis and cGAS-STING-mediated inflammation arising from MN. Therefore, a more comprehensive quantification of nuclear envelope integrity, ideally comparing non-transformed MCF10A cells with transformed cell lines (MCF7, HeLa, and MDA-MB-231), is necessary to substantiate the connection between aberrant importin α1 behavior in MN and chromothripsis processes, as well as regulation of the cGAS-STING pathway linked to genomic instability in cancer cells.
      3. The schematic illustration presented in Figure 8 does not adequately summarize all findings from this study, nor does it clarify how the localization of importin α1 within MN might hypothetically influence genome stability. Although it is reasonable to propose that "importin α can serve as a molecular marker for characterizing the dynamics of MN" (Line 344), the authors assert (Line 325) that their findings, along with others, have "potential implications for the induction of chromothripsis processes and regulation of the cGAS-STING pathway in cancer cells." However, they fail to provide a clear or even hypothetical explanation regarding how their findings contribute to these molecular events. To address this gap, it would be essential for them to contextualize their results within existing literature that explores and links structural integrity deficits or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation (e.g., PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889).
      4. Fig. 4D does not support the idea that importin α1 is euchromatin enriched: H3K9me3, H3K4me3 and H3K37me3 seem to be all deeply blue.

      Should the authors qualify some of their claims as preliminary or speculative, or remove them altogether?

      Indeed, the data presented by the authors do not adequately support a direct link between the presence of importin α1 in MN and genomic instability in human cancer cells. While the experimental correlations provided may not substantiate this connection definitively, they do lay a foundation for a grounded hypothesis and suggest the need for further research to explore this topic in greater depth. Additionally, it is worth noting that the evidence contributes to the growing list of nuclear proteins exhibiting abnormal behavior in micronuclei (MN). This highlights the significance of studying such proteins to understand their roles in genomic stability and cancer progression.

      Would additional experiments be essential to support the claims of the paper? Request additional experiments only where necessary for the paper as it is, and do not ask authors to open new lines of experimentation.

      Additional experiments are necessary to quantitatively assess the localization of importin α1 in micronuclei (MN) across non-transformed MCF10A cells and transformed cell lines (MCF7, HeLa, MDA-MB-231). This analysis would help determine whether the localization of importin α1 in MN correlates with genomic stability in human cancer cells. The authors claim that importin α1 preferentially localizes to euchromatic areas rather than heterochromatic regions within MN. While this assertion is supported by the immunofluorescence (IF) images presented in Figures 4A/B and S5A/B, it remains less clear for Figure S5C/B. To strengthen this claim, the authors should provide averages of IF distributions from multiple cells across independent experiments, which would allow more robust conclusions.

      Furthermore, ChIP-seq data are presented to support the idea that importin α1 preferentially distributes over euchromatin areas in MN. However, as described, the epigenetic chromatin status indicated by these ChIP-seq experiments reflects that of the principal nucleus (PN), not specifically the status within MN in MCF7 cells. Given that MN represent only a small fraction of the cell population under normal culture conditions-likely less than 5% for HeLa cells as shown in Figure S2D-the relevance of this data is limited. Additionally, according to data presented in Figure 1B, importin α1 does not localize or distribute within the PN as it does in MN in MCF7 cells. Therefore, further experiments should be conducted to substantiate that importin α1 preferentially targets euchromatin areas within MN and to compare this distribution with that observed in the principal nucleus. Such studies could reveal potential abnormalities regarding the correlation between epigenetic chromatin status and importin α distribution in MN. To support the hypothesis that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should be supplemented with thorough quantification and statistical analysis based on at least three independent experiments. This additional data would enhance confidence in their findings regarding RAD51 accessibility inhibition by importin α1.

      Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated cost and time investment for substantial experiments.

      The additional experiments proposed are controls and direct comparisons using the same techniques and experimental designs used by the authors, so it is reasonable that the authors can carry them out within a realistic timeframe.

      Are the data and the methods presented in such a way that they can be reproduced?

      Given the importance of reproducibility and the need to evaluate results based on imaging and quantitation, I strongly recommend that the authors include a detailed description of the optical microscopy procedures utilized in their study. This should encompass imaging conditions, acquisition settings, and the specific equipment used. Providing this information will enhance transparency and facilitate reproducibility. For reference, some valuable guidance on essential parameters for reproducibility can be found in Heddleston et al. (2021) (doi:10.1242/jcs.254144). Incorporating these details will not only strengthen the manuscript but also support other researchers in reproducing the findings accurately.

      Are the experiments adequately replicated and statistical analysis adequate?

      Many of the plots and values in the manuscript lack appropriate statistical analysis, including p-values, which are not detailed in the figures or their legends. Furthermore, the Statistical Analysis section does not provide adequate information regarding the specific statistical tests employed or the criteria used to determine which analyses were applied in each case. To enhance the rigor and clarity of the study, it is essential that these issues be addressed prior to publication. A comprehensive presentation of statistical analysis will improve the reliability of the findings and allow readers to better understand the significance of the results. I recommend that the authors revise this section to include detailed explanations of all statistical methods used, along with corresponding p-values for all relevant comparisons.

      Minor comments:

      Specific experimental issues that are easily addressable.

      The authors claim that importin α1 exhibits remarkably low mobility in the micronuclei (MN) compared to its mobility in the principal nucleus (PN), as illustrated in Figure 1. However, based on the experimental design, this conclusion may not be appropriate. In the current setup, the FRAP experiment conducted in the PN measures the mobility of importin α1 molecules within the cell nucleus, where the influence of nuclear transport is likely negligible. Conversely, in the MN experiments shown, all molecules of importin α1 are bleached within a given MN. Consequently, what is being measured here primarily reflects the effects of nuclear transport rather than intrinsic molecular mobility. To accurately compare kinetics of nuclear transport, it would be essential to completely bleach the entire PN. If measuring molecular mobility between MN and PN is desired, only a small fraction of either MN or PN area/volume should be bleached during FRAP analysis. Additionally, it would be beneficial to include measurements of mobility for other canonical nuclear transport factors (e.g., RAN, CAS, RCC1) for comparative purposes. This broader context would allow for a more comprehensive understanding of importin α1 behavior relative to other factors involved in nuclear transport. Finally, utilizing cells that exhibit importin α1 signals in both PN and MN could further strengthen comparisons and provide more robust conclusions regarding its mobility dynamics.

      Are prior studies referenced appropriately?

      Prior studies are referenced appropriately in general, but the authors missed some references (PMID: 32601372; PMID: 32494070; PMID: 27918550; PMID: 28738408; PMID: 28759889) that I consider key to framing the present findings within previous work linking the lack of structural integrity and/or aberrant DNA replication/damage responses in MN with chromothripsis and inflammation.

      Are the text and figures clear and accurate?

      The figures presented in the manuscript are clear; however, where plots are included, they require appropriate statistical analysis. It is essential to display p-values on the plots or within their legends to provide readers with information regarding the significance of the results. Including this statistical information will enhance the interpretability of the data and strengthen the overall findings of the study. I recommend that the authors revise these sections accordingly before publication.

      Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

      1. In lines 134-135, it is stated that "up to 40% of the MN showed importin α1 accumulation under both standard culture conditions and the reversine treatment (Fig. S2F)." However, Figure S2F only displays percentages for reversine-treated cells, and there is no mention in the text or figures regarding the percentage of importin α1-positive MN determined by immunofluorescence (IF) under standard culture conditions. This discrepancy should be addressed.
      2. In line 170, the authors state that "Cells in which overexpressed EGFP-importin α1 localized only in PN were excluded from the analysis (see Fig. 1E, top panels)." It is unclear why this exclusion was made. The authors should clarify whether they are referring to all constructs or only to the wild-type (WT) construct when mentioning EGFP-importin α1 localization solely in PN. This clarification is important as it may affect the results highlighted in line 173.
      3. The statement in line 191 ("However, this antibody could not be further used in this context due to cross-reactivity with highly concentrated importin α1 in MN (Fig. S4)") is somewhat misleading. While it hints at a technical issue, it does not provide additional relevant information for understanding its implications for the rationale of the research. Moreover, Figure S4 is referenced but appears to refer specifically to panels S4D and E, which are not mentioned in the text. I recommend clarifying this point or removing it altogether.
      4. Lines 197-199 contain a sentence that could be misleading and would benefit from clearer explanation. Although Figure 3D provides some clarity on this matter, no statistical analysis is included-only a bar plot is presented. A proper statistical analysis should be provided here to enhance understanding.
      5. In lines 218-221, it states that importin α1 associates with euchromatin regions characterized by H3K4me3 and H3K36me3; however, Figure 4D lacks the Spearman's correlation coefficient value for H3K36me3 within the matrix. This omission needs correction.
      6. For consistency in the experimental design aimed at identifying potential importin α1-interacting proteins, it would be more appropriate for Figures 5C/D to show IF data from MCF7 cells rather than HeLa cells.
      7. To substantiate claims that importin α1 inhibits RAD51 accessibility within MN, Figures 7D and E should include thorough quantitation and statistical analysis based on at least three independent experiments.
      8. The meaning of lines 336-338-"Therefore, the enrichment of importin α1 in MN, along with its interaction with chromatin, may regulate the accessibility of RAD51 to DNA/chromatin fibers in MN and protect its activity"-is unclear. I suggest rephrasing this sentence for improved clarity and comprehension.
      9. Fig. 1D: Numbers on the y-axis are missing, x-axis labeling is too small
      10. Fig. 1F: As the PN/MN values of the three experiments are seemingly identical (third column) the distribution of the three individual data of the PN (first column) should mirror the distribution of the three individual data of the MN (second column). The authors might want to check why this is not the case.

      Significance

      • Describe the nature and significance of the advance (e.g. conceptual, technical, clinical) for the field.
      • Place the work in the context of the existing literature (provide references, where appropriate).

      Micronuclei (MN) primarily arise from defects in mitotic progression and chromatin segregation, often associated with chromatin bridges and/or lagging chromosomes. MN frequently exhibit DNA replication defects and possess a rupture-prone nuclear envelope, which has been linked to genomic instability. The nuclear envelope of MN is notably deficient in crucial factors such as lamin B and nuclear pore complexes (NPCs). This deficiency may be attributed to the influence of microtubules and the gradient of Aurora B activity at the mitotic midzone, which inhibits the recruitment of proper nuclear envelope components. Additionally, several other factors may contribute to this process: for instance, PLK1 controls the assembly of NPC components onto lagging chromosomes; chromosome size and gene density positively correlate with the membrane stability of MN; and abnormal accumulation of the ESCRT complex on MN exacerbates DNA damage within these structures, triggering pro-inflammatory pathways. The work presented by Dr. Miyamoto and colleagues reveals the abnormal behavior of importin α1 in MN during interphase. According to their findings, it is reasonable to consider importin α1 as a molecular marker for characterizing MN dynamics. Furthermore, it could serve as a potential clinical marker if the authors provide additional experiments demonstrating significantly different localization patterns of importin α1 in transformed cells (e.g., MCF7, HeLa, MDA-MB-231) compared to non-transformed cells (e.g., MCF10A). While the authors present some evidence indicating partial disruption of nuclear envelopes in MN (Figures 3 and S4), it is noteworthy that this phenomenon also occurs in importin α1-negative MN. Moreover, according to the figure legends, data for both figures originate from a single experiment. As such, convincing evidence linking the aberrant behavior of importin α1 in MN with chromothripsis processes or regulation of the cGAS-STING pathway, and its implications for genomic instability in cancer cells, remains lacking. Overall, it is not entirely clear what significance this advance holds for the field; while there are conceptual contributions made by this work, they do not appear sufficiently robust at this time. Further research is needed to clarify these connections and strengthen their conclusions regarding importin α1's role in MN dynamics and genomic instability.

      • State what audience might be interested in and influenced by the reported findings.

      Scientists and health care professionals who research mechanisms of genomic instability and cancer.

      • Define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

      Mitosis, mitotic chromatin decondensation, nuclear reformation, hematopoietic cancers, light microscopy, image analysis.

    1. This is the representation in the complex plane of H(jω) as ω varies from 0 to infinity. Image tikz Plots in the Nyquist plane \caption{Example of plots in the Nyquist plane} \label{exemple_nyquist}

      There is a display problem here, as well as in the following section.
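
      As a minimal sketch of the definition quoted above, the following Python snippet evaluates H(jω) over a frequency sweep and collects the real and imaginary parts that a Nyquist plot would display. The first-order transfer function H(s) = gain / (1 + tau·s) and the names first_order_response, nyquist_points, gain and tau are illustrative assumptions; they do not come from the annotated course.

```python
def first_order_response(omega, gain=1.0, tau=1.0):
    """Return H(j*omega) for the assumed system H(s) = gain / (1 + tau*s)."""
    return gain / (1 + 1j * tau * omega)


def nyquist_points(omegas, gain=1.0, tau=1.0):
    """Return one (real part, imaginary part) pair of H(j*omega) per frequency."""
    points = []
    for omega in omegas:
        h = first_order_response(omega, gain, tau)
        points.append((h.real, h.imag))
    return points


# Logarithmic sweep from 0.01 to 100 rad/s; omega = 0 and omega -> infinity
# are the two limits of the curve and are only approached by the sweep.
omegas = [10 ** (k / 10) for k in range(-20, 21)]
locus = nyquist_points(omegas)
```

      For this assumed first-order system with gain = 1, the points lie on a half circle of radius 0.5 centred at (0.5, 0) in the lower half of the complex plane, starting near 1 at low frequency and tending to the origin at high frequency.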

    1. THIS IS IT!

      Can you add a short section above this final invite to join that touches on the promise of the course? 'This course will take you from x to y / This program will help you... / Imagine having all the resources and guidance to confidently ...'

    1. Thus, Tier Two sources can provide quality information that is more accessible to non-academics. There are three main categories. First, official reports from government agencies or major international institutions like the World Bank or the United Nations; these institutions generally have research departments staffed with qualified experts who seek to provide rigorous, even-handed information to decision-makers. Second, feature articles from major newspapers and magazines like the New York Times, Wall Street Journal, London Times, or The Economist are based on original reporting by experienced journalists (not press releases) and are typically 1500+ words in length. Third, there are some great books from non-academic presses that cite their sources; they’re often written by journalists. All three of these sources are generally well researched descriptions of an event or state of the world, undertaken by credentialed experts who generally seek to be even-handed.

      Tier Two sources: official reports from government agencies or major institutions; studies that submit statistics within communities? Magazines; books from non-academic presses that cite their sources, written by journalists.

    1. The largest part of the clitoris is hidden; we can barely see a small tip

      The female body has almost always been understood through a patriarchal visual framework (it exists only for reproductive, aesthetic, and/or medical purposes), and whatever falls outside this gaze becomes nonexistent. This is not accidental; it is entirely political, since there is a relationship between what is not represented and what cannot be experienced. That it has been made invisible for so long is a way of controlling pleasure and of denying the body's own anatomy. This also works as a metaphor for systems of power in general, in which everything that departs from the norm is covered up, erased, or turned into a threat. Deprogramming would then be a way of reconfiguring our knowledge through what has been denied to us or hidden from us.

    2. There is no need, because out there everything is already solved, because someone has solved it for my ‘convenience’.

      This implies that the answers to our questions are heavily conditioned. A good example is SEM (Search Engine Marketing): the order and visibility of the results depend not simply on relevance but on the budget each page can invest. Thus, "our answers" arrive already mediated by economic interests, not by our curiosity or freedom to explore.

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


      Reply to the reviewers

      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction

      * Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      TAS predictions were derived only from insect-stage RNA-seq data because a previous study showed that there are no significant differences in 5’ UTR processing among T. cruzi life stages (https://doi.org/10.3389/fgene.2020.00166). We did not test an additional transcriptome here because the robustness of the software was already demonstrated in the original article in which UTRme was described (Radio S, 2018, doi:10.3389/fgene.2018.00671).

      Results - "There is a distinctive average nucleosome arrangement at the TASs in TriTryps":

      * You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.

      The reviewer has a good point. We based our statement on the value of the maximum peak of the sequenced DNA molecules, which in general is a good indicator of the extent of digestion achieved in a sample (Cole H, NAR, 2011).

      As the reviewer correctly points out, we should also have considered the length of the DNA molecules in each percentile. However, in this case both the T. brucei and the L. major samples were gel purified before sequencing, and it is hard to know exactly which fragments were left behind in each case. It is therefore better not to overinterpret in this regard.

      We have now commented on this in the main manuscript, and we have clarified in the figure legends which data set was used in each case.

      * It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm.

      The replicates used to construct each figure are explicitly indicated in Table S1. Although the table details the original publication, the project, and the accession number for each data set, the reviewer is correct that it was still not completely clear which length distribution heatmap each sample was associated with. To avoid this confusion, we have now added the accession number for each data set to the figure legends and clarified Table S1. Regarding the reviewer’s comment on the correspondence between the observed TAS protection and the extent of sample digestion, the reviewer is correct that we would expect a clearer NDR for a more digested sample. In this case, the difference in the extent of digestion between these two samples is minor: the main peak in the length distribution histogram of sequenced DNA molecules falls at the same length in both. These two samples, GSM5363006 (represented in Fig. 1b) and GSM5363007 (represented in Suppl. Fig. S2), come from the same original paper (Maree et al 2017), and both were gel purified before sequencing. Therefore, any difference between them could result not only from a minor difference in the digestion level achieved in each experiment but also from a bias in which fragments were retained during gel purification. We would therefore not overinterpret TAS protection from this comparison. We have now included a brief comment on this in the discussion of the figure.

      * The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      We appreciate the reviewer's suggestion. We cannot determine whether this difference is due to technical or biological reasons, but there is evidence that the L. major genome has a different dinucleotide content, which might affect nucleosome assembly. We have now added a comment about this observation in the final discussion of the manuscript.

      Results - "An MNase sensitive complex occupies the TASs in T. brucei":

      * The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.
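
      The two example metrics named in this comment (median fragment length and the fraction of fragments between 120 and 180 bp) could be computed along the following lines. This is only a sketch: the function name digestion_metrics, the inclusive window bounds, and the example fragment lengths are assumptions made for illustration, not values or code from the manuscript.

```python
from statistics import median


def digestion_metrics(fragment_lengths, window=(120, 180)):
    """Summarise a fragment-length distribution with two simple metrics.

    fragment_lengths: iterable of fragment sizes in bp (one per sequenced fragment).
    window: inclusive (low, high) size range counted as the nucleosome-sized window.
    """
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("no fragment lengths supplied")
    low, high = window
    in_window = sum(1 for n in lengths if low <= n <= high)
    return {
        "median_length": median(lengths),
        "fraction_in_window": in_window / len(lengths),
    }


# Made-up fragment lengths, used only to show the call.
example_lengths = [95, 130, 147, 147, 150, 165, 210, 320]
metrics = digestion_metrics(example_lengths)
```

      Reporting such numbers per data set would give a common scale for comparing samples, although the choice of window remains a convention.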

      As the reviewer suggests, the ideal experiment would be a time course of the MNase reaction with all the samples in parallel, or a fixed time point with increasing amounts of MNase. However, even with controlled experimental time points, one needs to check the length distribution histogram of the sequenced DNA molecules to be sure which level of digestion has been achieved.

      In this particular case, we used publicly available data sets for the analysis. We made an arbitrary definition of low, intermediate and high levels of digestion, not as an absolute level of digestion but as a comparative output among the tested samples. We based our definition on the comparison of the main peak in length distribution heatmaps because this parameter is the best metric to estimate the level of digestion of a given sample. It represents the percentage of the total DNA sequenced that contains the predominant length in the sample tested. Hence, we considered:

      Low digestion: the main peak is longer than the expected protection for a nucleosome (longer than 150 bp). We expect such a sample to contain additional longer bands that correspond to less digested material.

      Intermediate digestion: the main peak matches the expected nucleosome core protection (~146-150 bp).

      High digestion: the main peak is shorter than that (shorter than 146 bp). This case is normally accompanied by a greater dispersion in fragment sizes.
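
      The three categories above can be sketched as a small classifier. The thresholds (146 and 150 bp) are the ones stated in the reply; the function names main_peak and classify_digestion, and the use of the most frequent fragment length as the "main peak", are assumptions made for illustration rather than the authors' actual procedure.

```python
from collections import Counter


def main_peak(fragment_lengths):
    """Return (most frequent fragment length in bp, fraction of fragments at that length)."""
    counts = Counter(fragment_lengths)
    if not counts:
        raise ValueError("no fragment lengths supplied")
    length, count = counts.most_common(1)[0]
    return length, count / sum(counts.values())


def classify_digestion(peak_length, core_range=(146, 150)):
    """Classify a sample by its main peak relative to the nucleosome core range."""
    low, high = core_range
    if peak_length > high:
        return "low"           # main peak longer than nucleosome protection
    if peak_length < low:
        return "high"          # main peak shorter than nucleosome protection
    return "intermediate"      # main peak within the core-protection range


# Made-up fragment lengths, used only to show the call.
peak, fraction = main_peak([147, 147, 160, 130])
level = classify_digestion(peak)
```

      Because the classification is relative to a fixed range, it is only meaningful for comparing samples processed and sequenced in comparable ways, which matches the comparative use described in the reply.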

      For this analysis, we chose samples that show different MNase protection of the TAS when all the sequenced DNA molecules are plotted relative to this point, and we used this protection as a predictor of the extent of sample digestion (Figure 2). To corroborate our hypothesis that the degree of TAS protection was indeed related to the extent of MNase digestion of a given sample, we examined the length distribution histogram of the sequenced DNA molecules in each case. This is the best measurement of the extent of digestion achieved, especially when the whole sample is sequenced without any gel purification and all reads are represented in the analysis, as we did. The only caveat concerns the sample called “intermediate digestion 1”, which belongs to the original work of Maree 2017, since only this data set was gel purified.

      The sample used in Figure 1 (from Maree 2017) is also from the same lab and is an MNase-seq data set. Strictly speaking, there is no methodological difference between MNase-seq and the input of a native MNase-ChIP-seq, since the input does not undergo the IP.

      * Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.

      The sharp cutoff is due neither to gel purification nor to bioinformatic filtering; it simply reflects the length of the paired-end reads used in each case. In earlier works it was most common to sequence only 50 bp; as technologies improved, this went up to 75, 100 or 125 bp. We have now clarified in Table S1 the length of the paired-end reads used in each case, when possible.

      * Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      As explained above, it's a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and the sequence preference of the MNase enzyme, which preferentially cuts AT-rich sequences.

      The rationale is as follows: when you digest chromatin with MNase and the objective is to map nucleosomes genome-wide, the ideal situation would be to get all the material contained in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. And here comes the main point: even in the best scenario, you always get some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, giving rise to a greater dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, or that have a low AT content, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up observing a greater dispersion in length distributions in the final material. Bottom line: it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Results - "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones": * The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.

      What we have learned from other, deeply studied eukaryotic organisms such as yeast is that NDRs are normally generated at regulatory points in the genome. In this sense, yeast tRNA genes have a complex with a bootprint smaller than a nucleosome, formed by TFIIIB-TFIIIC (Nagarajavel, doi: 10.1093/nar/gkt611). On the other hand, many promoter regions have an MNase-sensitive complex with a nucleosome-size footprint that does not contain histones (Chereji, et al 2017, doi:10.1016/j.molcel.2016.12.009). The reviewer is right that from Figures 1 and S2 we can observe that the footprint of whatever occupies the TAS region, especially in T. brucei, is nucleosome-size. However, this only shows the size; it does not prove the nature of its components. Those are only MNase-seq data sets: since they do not include precipitation with specific antibodies, we cannot confirm that the protecting complex is made up of histones. In parallel, a complementary study by Wedel 2017, from Siegel’s lab, shows that when using a properly digested sample and further immunoprecipitating with an anti-H3 antibody, the TAS is not protected by nucleosomes, at least not when analyzing nucleosome-size DNA molecules. Besides, Briggs et al. 2018 (doi: 10.1093/nar/gky928) showed that, at least at intergenic regions, H3 occupancy goes down while R-loop accumulation increases. We have now added a supplemental figure associated with Figure 3 (new Supplemental 5), replotting R-loops and MNase-ChIP-seq for H3 relative to our predicted TAS, showing this anti-correlation and how it partly correlates with MNase protection as well. As a control, we show that the Rpb9 trend resembles that of H3, as Siegel’s lab has shown in Wedel 2018.

      * Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.

      In most of our analyses we used true replicate experiments. Such is the case for the MNase-seq data used in Figure 1, with the corresponding replicate experiments used in Figure S2, and for the T. cruzi MNase-ChIP-seq data used in Figures 3b and 4a, with the respective replicates used in Figures S4 and S5 (now S6 in the revised manuscript). The only case in which we used experiments coming from two different laboratories is the MNase-ChIP-seq for H3 from T. brucei. Unfortunately, there are only two public data sets, each coming from a different laboratory. The samples used in Fig 3 come from Siegel’s lab, whereas the H3 IP represented in S4 and S5 (S6 in the updated version) comes from another lab (Patterton’s). To be more rigorous, we now call them data 1 and 2 when comparing this particular case.

      The reviewer is right that in this particular case one is native chromatin (Patterton’s) while the other one is crosslinked (Siegel’s). We have now clarified in the main text that, unfortunately, we do not have a replicate, but under both conditions the result remains the same. This is compatible with my own experience, where crosslinking does not affect the global nucleosome patterns (compare nucleosome organization from crosslinked chromatin MNase-seq inputs in Chereji, Mol Cell, 2017 doi: 10.1016/j.molcel.2016.12.009 and native MNase-seq from Ocampo, NAR, 2016 doi: 10.1093/nar/gkw068).

      * Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      We have only filtered adapter dimers or overrepresented sequences when needed. In Figures 2 and S3 we represented all the sequenced reads. In other figures, when we sort fragment sizes in silico (such as nucleosome, dinucleosome or subnucleosome size ranges), we make a note in the figure legends. What the reviewer points out is related to the length of the sequenced DNA fragments in each experiment. As we explained above, the older data sets were generated with 50 bp paired-end reads, while the newer ones use 75, 100 or 125 bp. This information is now clarified in Table S1.

      __Results - "The TASs of single and multi-copy genes are differentially protected by nucleosomes": __

      * Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.

      We have shown chromatin organization for T. brucei in S5b to show that there is a similar trend. Unfortunately, we did not obtain a list of multi-copy genes for T. brucei as robust as the one we obtained for T. cruzi; therefore, we do not want to overinterpret by showing the RNA-seq for these subsets of genes. The limitation is related to the fact that UTRme restricts the search and is extremely strict when calling sites at repetitive regions.

      * Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.

      The mapping of occurrence and annotations that belong to repetitive regions has great complexity. UTRme is specially designed to avoid overcalling those sites. In other words, there is a chance that we could be underestimating the number of predicted TASs at multi-copy genes. Regarding the impact on chromatin analysis, we cannot rule out that it might have an impact, but the observation favors our conclusion, since even when some TASs at multi-copy genes can remain elusive, we observe more nucleosome density at those places.

      * The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.

      We have fixed this now in the preliminary revised version

      * How were multi-copy genes defined in T. brucei? Include the classification method in Methods.

      This classification was done the same way it was explained for T. cruzi

      Genomes and annotations: * If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      The most appropriate way to analyze high-throughput data is to align it to the genome of the strain in which the experiments were conducted. This was clearly illustrated in a previous publication from our group, where we explained how data from the hybrid CL Brener strain should be analyzed. A common practice in the past was to use only the Esmeraldo-like genome for simplicity, but this resulted in output artifacts. Therefore, we aligned the data to the CL Brener genome and then focused the main analysis on the Esmeraldo haplotype (Beati, PLoS ONE, 2023). Ideally, we should have had transcriptomic data for the same strain (CL Brener or Esmeraldo). Since this was not the case at that moment, we used data from the Y strain, which belongs to the same DTU as Esmeraldo.

      In the case of T. brucei, when we started our analysis and the software code for UTRme was written, only the previous version of the genome was available. Once the 2018 version came out, we checked chromatin parameters and observed that it did not change the main observations. Therefore, we continued working with our previous setup.

      Reproducibility and broader integration: * Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.

      We are preparing a full pipeline on GitHub. We will make it available before the full revision of the manuscript.

      * As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      We are now including a new supplemental figure including DRIP-seq and Rpb9 ChIP-seq (revised S5). Additionally, we added a new panel c to Figure 4, representing FAIRE-seq data for T. cruzi for single- and multi-copy genes.

      We are working on ATAC-seq analysis and BSF MNase-seq

      Optional analyses that would strengthen the study: * Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.

      We have now included a panel in supplemental figure 5 (now revised S6), showing the concordance for chromatin organization of genes stratified by RNA-seq levels relative to the TAS.
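
      For readers who want to reproduce this kind of stratification, a minimal Python sketch is given below. It is not the code behind the new panel: splitting genes into rank-based tertiles is an assumption, as are the function names and the input structures (a dictionary of expression values per gene and a dictionary of per-gene occupancy profiles, each profile being a list of values at fixed positions relative to the TAS).

```python
def stratify_by_expression(expression):
    """Assign each gene to 'low', 'medium' or 'high' by rank tertiles.

    expression: dict mapping gene id -> expression value (e.g. normalized
    RNA-seq counts). Tertiles are a hypothetical choice for this sketch.
    """
    # Sort by expression; the gene id breaks ties deterministically.
    ranked = sorted(expression, key=lambda g: (expression[g], g))
    n = len(ranked)
    labels = {}
    for rank, gene in enumerate(ranked):
        if rank * 3 < n:
            labels[gene] = "low"
        elif rank * 3 < 2 * n:
            labels[gene] = "medium"
        else:
            labels[gene] = "high"
    return labels


def mean_profile_by_group(labels, profiles):
    """Average per-gene occupancy profiles within each expression group.

    profiles: dict mapping gene id -> list of occupancy values, all of the
    same length and aligned on the TAS. Genes without a label are skipped.
    """
    sums, counts = {}, {}
    for gene, profile in profiles.items():
        group = labels.get(gene)
        if group is None:
            continue
        if group not in sums:
            sums[group] = [0.0] * len(profile)
            counts[group] = 0
        sums[group] = [s + v for s, v in zip(sums[group], profile)]
        counts[group] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}
```

      Comparing the depth of the average profile at the TAS between the three groups then gives a first indication of whether expression level and NDR depth go together.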

      __Minor / editorial comments: __ * In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.

      We have clarified this in the preliminary revised version

      * Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.

      The dotted line is just to indicate where the maximum peak is located. It is now clarified in figure legends.

      * In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.

      We have now fixed the figure. Thanks for noticing this mistake.

      * Typo in the Introduction: "remodellingremodeling" → "remodeling"

      Thanks for noticing this mistake, it is fixed in the current version of the manuscript

      **Referee cross-commenting** Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment indicating that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with figure 1 showing an MNase-protected TAS for T. brucei as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIP-seq of H3 showing nucleosome depletion at TAS, but both results are not necessarily contradictory: There could still be something else (which does not contain H3) sitting on the TAS protecting it from MNase digestion.

      Reviewer #1 (Significance (Required)):

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation. The significance lies in three aspects: 1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing. 2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids. 3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data

      __Reviewer #2 (Evidence, reproducibility and clarity (Required)): __

      Siri et al. perform a comparative analysis using publicly available MNase-seq data from three trypanosomatids (T. brucei, T. cruzi, and Leishmania), showing that a similar chromatin profile is observed at TAS (trans-splicing acceptor site) regions. The original studies had already demonstrated that the nucleosome profile at TAS differs from the rest of the genome; however, this work fills an important gap in the literature by providing the most reliable cross-species comparison of nucleosome profiles among the tritryps. To achieve this, the authors applied the same computational analysis pipeline and carefully evaluated MNase digestion levels, which are known to influence nucleosome profiling outcomes.

      In my view, the main conclusion is that the profiles are indeed similar, even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. The manuscript could be improved with some clarifications and adjustments:

      1. The authors state from the beginning that available MNase data indicate altered nucleosome occupancy around the TAS. However, they could also emphasize that the conclusions across the different trypanosomatids are inconsistent and even contradictory: NDR in T. cruzi versus protection (in different locations) in T. brucei and Leishmania.

      We start our manuscript by referring to the first MNase-seq data sets publicly available for each TriTryp, and we point out that one of the main observations in each of them is the occurrence of a change in nucleosome density or occupancy at intergenic regions. In T. cruzi, in a previous publication from our group, we established that this intergenic drop in nucleosome density occurs near the trans-splicing acceptor site. In this work, we extend our study to the other members of the TriTryps: T. brucei and L. major.

      In T. brucei, the papers from Patterton’s lab and Siegel’s lab came out almost simultaneously in 2017. Hence, they do not comment on each other’s work. The first one claims the presence of a well-positioned nucleosome at the TAS by using MNase-seq, while the second one shows an NDR at the TAS by using MNase-ChIP-seq. However, we do not think they are contradictory or inconsistent. We brought them together throughout the manuscript because we think these works can provide complementary information.

      On the one hand, we infer that the data from Patterton’s lab are slightly less digested than the sample from Siegel’s lab. Therefore, we discuss that this moderate digestion must be the reason why they managed to detect an MNase-protecting complex sitting at the TAS (Figure 1). On the other hand, Siegel’s lab includes an additional step by performing MNase-ChIP-seq, showing that when analyzing nucleosome-size fragments, histones are not detected at the TAS. Here, we take this analysis further in Figure 3, showing that only when looking at subnucleosome-size fragments are we able to detect histone H3. This is also true for T. cruzi.

      By integrating every analysis in this work and the previous ones, we propose that TASs are protected by an MNase-sensitive complex (probed in Figure 2). This complex is most likely only partly formed by histones, since only when analyzing subnucleosome-size DNA molecules can we detect histone H3 (Figure 3). To be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs, 2018 doi: 10.1093/nar/gky928) and that R-loops have plenty of interacting proteins (Girasol, 2023 10.1093/nar/gkad836). Therefore, most likely, this MNase-sensitive complex has a hybrid nature, made up of H3 and some other regulatory molecules, possibly involved in trans-splicing. We have now added a new figure S5 showing R-loop co-localization with the NDR.

      Regarding the comparison between different organisms, after explaining the sensitivity to MNase of the TAS protecting complex, we discuss that when comparing equally digested samples T. cruzi and T. brucei display a similar chromatin landscape with a mild NDR at the TAS (See T. cruzi represented in Figure 1 compared to T. brucei represented in Intermediate digestion 2 in Figure 2, intermediate digestion in the revised manuscript). Unfortunately, we cannot make a good comparison with L. major, since we do not count on a similar level of digestion.

      Another point that requires clarification concerns what the authors mean in the introduction and discussion when they write that trypanosomes have "...poorly organized chromatin with nucleosomes that are not strikingly positioned or phased." On the other hand, they also cite evidence of organization: "...well-positioned nucleosome at the spliced-out region.. in Leishmania (ref 34)"; "...a well-positioned nucleosome at the TASs for internal genes (ref37)"; "...a nucleosome depletion was observed upstream of every gene (ref 35)." Aren't these examples of organized chromatin with at least a few phased nucleosomes? In addition, in ref 37, figure 4 shows at least two (possibly three to four) nucleosomes that appear phased. In my opinion, the authors should first define more precisely what they mean by "poorly organized chromatin" and clarify that this interpretation does not contradict the findings highlighted in the cited literature.

      For a better understanding of nucleosome positioning and phasing, I recommend the review by Clark 2010 (doi:10.1080/073911010010524945), Figure 4. Briefly, in a cell population there are different alternative positions that a given nucleosome can adopt. However, some are more favorable. When talking about favorable positions, we refer to the coordinates in the genome that are most likely covered by a nucleosome and are predominant in the cell population. Additionally, nucleosomes may or may not be phased. This refers not only to the position in the genome, but also to the distance relative to a given point. In yeast, or in highly transcribed genes of more complex eukaryotes, nucleosomes are regularly spaced and phased relative to the transcription start site (TSS) or to the +1 nucleosome (Ocampo, NAR, 2016, doi:10.1093/nar/gkw068). In trypanosomes, nucleosomes show some regular distribution upon browser inspection but, given that they are not properly phased with respect to any point, it is almost impossible to estimate spacing from paired-end data. This is also consistent with a chromatin that is transcribed in an almost constitutive manner.

      As the reviewer mentions, we do cite evidence of organization. We think the original observations are correct, but we do not fully agree with some of the original statements. In this manuscript, our aim is to take the best of what we learned from the original works and to make a constructive contribution that adds to the original discussions. In this regard, in trypanosomes there are some conserved patterns in the chromatin landscape, but their nucleosomes are far from being well-positioned or phased. For a better understanding, compare the variations observed on the y axis when representing average nucleosome occupancy in yeast with those observed in trypanosomes, and you will see that the troughs and peaks are much more prominent in yeast than the ones observed in any TriTryp member.

      Following the reviewer’s suggestion we have now clarified this in the main text

      The paper would also benefit from the inclusion of a schematic figure to help readers visualize and better understand the findings. What is the biological impact of having nucleosomes, di-nucleosomes, or sub-nucleosomes at TAS? This is not obvious to readers outside the chromatin field. For example, the following statement is not intuitive: "We observed that, when analyzing nucleosome-size (120-180 bp) DNA molecules or longer fragments (180-300 bp), the TASs of either T. cruzi or T. brucei are mostly nucleosome-depleted. However, when representing fragments smaller than a nucleosome-size (50-120 bp) some histone protection is unmasked (Fig. 3 and Fig. S4). This observation suggests that the MNase sensitive complex sitting at the TASs is at least partly composed of histones." Please clarify.

      We appreciate the reviewer’s suggestion to make a schematic figure. We are working on this, and it will be added to the manuscript upon final revision.

      Regarding the biological impact of having mono-, di- or subnucleosome fragments, it is important to determine the fragment size of the protected DNA in order to infer the nature of the protecting complex. In the case of tRNA genes in yeast, footprints smaller than a nucleosome were found at pol III promoters, and these turned out to be TFIIIB-TFIIIC (Nagarajavel, doi: 10.1093/nar/gkt611). Therefore, detecting something smaller than a nucleosome might suggest the binding of trans-acting factors other than histones, or involving histones in a mixed complex. Such mixed complexes have also been observed, as in the case of the centromeric nucleosome, which has a very peculiar composition (Ocampo and Clark, Cell Reports, 2015). On the other hand, if instead we detect bigger fragments, this could indicate the presence of bigger protecting molecules, or that those regions are part of a higher-order chromatin organization that is still inaccessible to MNase linker digestion.

      Here we show on 2D plots that the complex or components protecting the TAS are nucleosome-size, but we cannot be sure they are entirely made up of histones since, only when looking at subnucleosome-size fragments, are we able to detect histone H3. We have now added part of this explanation to the discussion.
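
      To make the in silico size selection concrete, the following Python sketch sorts fragments into the three size ranges quoted above from the manuscript (50-120 bp, 120-180 bp and 180-300 bp). It is an illustration under stated assumptions, not our pipeline: the names `SIZE_CLASSES`, `size_class` and `split_by_size` are hypothetical, fragments are assumed to be `(chrom, start, end)` tuples with half-open coordinates, and a length that falls exactly on a boundary is assigned to the larger class.

```python
# Size ranges in bp, taken from the manuscript text quoted by the reviewer.
# Treating each range as [lower, upper) is an assumption of this sketch.
SIZE_CLASSES = {
    "subnucleosome": (50, 120),
    "nucleosome": (120, 180),
    "longer": (180, 300),
}


def size_class(length):
    """Return the size-class name for a fragment length, or None."""
    for name, (lower, upper) in SIZE_CLASSES.items():
        if lower <= length < upper:
            return name
    return None


def split_by_size(fragments):
    """Group (chrom, start, end) fragments by size class.

    Fragments outside every range (shorter than 50 bp or 300 bp and
    longer) are dropped.
    """
    bins = {name: [] for name in SIZE_CLASSES}
    for chrom, start, end in fragments:
        name = size_class(end - start)
        if name is not None:
            bins[name].append((chrom, start, end))
    return bins
```

      Each resulting group can then be plotted separately relative to the TAS, which is what reveals H3 signal in the subnucleosome class but not in the nucleosome-size class.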

      By integrating every analysis in this work and the previous ones, we propose that the TAS is protected by an MNase-sensitive complex (Figure 2). This complex is most likely only partly formed by histones, since only when analyzing subnucleosome-size DNA molecules can we detect histone H3 (Figure 3). As explained above, to be absolutely sure that the complex is not entirely made up of histones, future studies should perform MNase-ChIP-seq with less digested samples. However, it was previously shown that R-loops are enriched at those intergenic NDRs (Briggs 2018) and that R-loops have plenty of interacting proteins (Girasol, 2023). Therefore, most likely, this MNase-sensitive complex has a hybrid nature, made up of H3 and some other regulatory molecules. We have now added a new S5 figure showing R-loop co-localization.

      Some references are missing or incorrect:

      We will make a thorough revision.

      "In trypanosomes, there are no canonical promoter regions." - please check Cordon-Obras et al. (Navarro's group).

      Thank you for the appropriate suggestion. We have now added this reference.

      Please, cite the study by Wedel et al. (Siegel's group), which also performed MNase-seq analysis in T. brucei.

      We understand that Reviewer #2 missed that we cited this reference and that we did use the raw data from the manuscript of Wedel et al. 2017 from Siegel’s group. We used the MNase-ChIP-seq data set of histone H3 in our analysis for Figures 3, S4b and S5b (S6c in the revised version), as also detailed in Table S1. To be even more explicit, we have now included the accession number of each data set in the figure legends.

      Figure-specific comments: Fig. S3: Why does the number of larger fragments increase with greater MNase digestion? Shouldn't the opposite be expected?

      This is a good observation. As we also explained to Reviewer #1:

      It's a common observation in MNase digestion of chromatin that more extensive digestion can still result in a broad range of fragment sizes, including some longer fragments. This seemingly counter-intuitive result is primarily due to the non-uniform accessibility of chromatin and the sequence preference of the MNase enzyme.

      The rationale is as follows: when you digest chromatin with MNase and the objective is to map nucleosomes genome-wide, the ideal situation would be to get all the material contained in the mononucleosome band. Given that MNase digests protected DNA less efficiently but, if the reaction proceeds further, always ends up destroying part of it, the result is always far from perfect. The best situation we can achieve is to obtain samples where ~80% of the material is contained in the mononucleosome band. And here comes the main point: even in the best scenario, you always have some additional longer bands, such as those for di- or tri-nucleosomes. If you keep digesting, you will get less than 80% in the nucleosome band, and the remaining DNA fragments that used to contain di- and tri-nucleosomes start getting digested as well, giving rise to a greater dispersion in fragment sizes. How do we explain the persistence of long fragments? The longest fragments (di-, tri-nucleosomes) that persist in a highly digested sample are the ones that were originally most highly protected by proteins or higher-order structure, making their linker DNA extremely resistant to initial cleavage. Once the majority of the genome is fragmented, these few resistant longer fragments become a more visible component of the remaining population, contributing to a broader size dispersion. Hence, you end up with a greater dispersion in length distributions in the final material. Bottom line: it is not good practice to work with under- or over-digested samples. Our main point is to emphasize that, especially when comparing samples, it is important to compare those with comparable levels of digestion. Otherwise, a different sampling of the genome will be represented in the remaining sequenced DNA.

      Fig. S5B: Why not use MNase conditions under which T. cruzi and T. brucei display comparable profiles at TAS? This would facilitate interpretation.

      The reviewer made a reasonable observation. The reason why we used MNase-ChIP-seq instead of just MNase-seq to test occupancy at the TAS in the subsets of genes is that we wanted to be more certain whether we were looking at the presence of histones or of something else. By using IP for histone H3, we can see that at multi-copy genes this protein is present when looking at nucleosome-size fragments. Additionally, as shown in Figure S4b, the length distribution histograms are also similar for the compared IPs.

      Minor points:

      There are several typos throughout the manuscript.

      Thanks for the observation. We will check carefully.

      Methods: "Dinucelotide frecuency calculation."

      We will add the code to GitHub.
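
      Until that code is available, the general idea of a position-wise dinucleotide frequency calculation can be sketched as follows. This is only an illustration and should not be read as the method used in the manuscript: the function name `dinucleotide_frequency`, the requirement that all sequences have the same length and are aligned on the TAS, and the example dinucleotide sets are assumptions.

```python
def dinucleotide_frequency(sequences, dinucleotides):
    """Fraction of sequences carrying one of the given dinucleotides at
    each position.

    sequences: list of equal-length DNA strings aligned on a reference
    point (for example the TAS). dinucleotides: iterable of 2-letter
    strings. Returns a list of length len(sequence) - 1, where entry i is
    the frequency of the dinucleotide starting at position i.
    """
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    targets = {d.upper() for d in dinucleotides}
    frequencies = []
    for i in range(length - 1):
        hits = sum(1 for seq in sequences if seq[i:i + 2].upper() in targets)
        frequencies.append(hits / len(sequences))
    return frequencies


# Hypothetical groupings used only for this example.
AT_DINUCLEOTIDES = {"AA", "AT", "TA", "TT"}
GC_DINUCLEOTIDES = {"CC", "CG", "GC", "GG"}
```

      Running the function once per dinucleotide set and once per gene subset (single-copy versus multi-copy) yields the profiles that can be compared, ideally after smoothing and with a note on how many loci contribute to each average.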

      Reviewer #2 (Significance (Required)):

      In my view, the main conclusion is that the profiles are indeed similar, even when comparing T. brucei and T. cruzi. This was not clear in previous studies (and even appeared contradictory, reporting nucleosome depletion versus enrichment) largely due to differences in chromatin digestion across these organisms. Audience: basic science and specialized readers.

      Expertise: epigenetics and gene expression in trypanosomatids.

      __Reviewer #3 (Evidence, reproducibility and clarity (Required)): __

      The authors analysed publicly accessible MNase-seq data in TriTryps parasites, focusing on the chromatin structure around trans-splicing acceptor sites (TASs), which are vital for processing gene transcripts. They describe a mild nucleosome depletion at the TAS of T. cruzi and L. major, whereas a histone-containing complex protects the TASs of T. brucei. In the subsequent analysis of T. brucei, they suggest that an MNase-sensitive complex is localised at the TASs. For single-copy versus multi-copy genes, the authors show different di-nucleotide patterns and chromatin structures. Accordingly, they propose this difference could be a novel mechanism to ensure the accuracy of trans-splicing in these parasites.

      Before providing an in-depth review of the manuscript, I note that some missing information would have helped in assessing the study more thoroughly; however, in the light of the available information, I provide the following comments for consideration.

      The numbering of the figures, including the figure legends, is missing in the PDF file. This is essential for assessing the provided information.

      We apologize for not including the figure numbers in the main text, although the figures are located in the right place when called in the text. The omission was made unintentionally when the figure legends were moved to the bottom of the main text. This is now fixed in the updated version of the manuscript.

      The publicly available MNase-seq data are manifold, with multiple datasets available for T. cruzi, for example. It is unclear from the manuscript which dataset was used for which figure. This must be clarified.

      This was detailed in Table S1. We have now replaced the table with an improved version, and we have also included the accession number of each data set used in the figure legends.

      Why do the authors start in figure 1 with the description of an MNase-protected TAS for T. brucei, given that it has been clearly shown by the Siegel lab that there is a nucleosome depletion similar to other parasites?

      We did not want to ignore the paper from Patterton’s lab because it was the first to map nucleosomes genome-wide in T. brucei, and its main finding was the existence of a well-positioned nucleosome at intergenic regions, which we thought was a point worth discussing. While Patterton’s work uses MNase-seq from gel-purified samples and provides replicated experiments sequenced at very good depth, Siegel’s lab uses MNase-ChIP-seq of histone H3 but performs only one experiment, and its input was not sequenced. So, each work has its own caveats and provides different information; together they contribute to a more comprehensive study. We think that bringing both data sets into the discussion, as we have done in Figures 1 and 3, helps us and the community working in the field to enrich the discussion.

      If the authors re-analyse the data, they should compare their pipeline to those used in the other studies, highlighting differences and potential improvements.

      We are working on this point. We will provide a more detailed description in the final revision.

      Since many figures resemble those in already published studies, there seems little reason to repeat and compare without a detailed comparison of the pipelines and their differences.

      Following the reviewer's advice, we are now working on highlighting the main differences that justify analyzing the data the way we did; these will be added to the Methods section in the final revision.

      At first glance, some of our figures might look similar to those in the original manuscripts. However, a careful and detailed reading of our manuscript shows that we have added several analyses that reveal information not disclosed before.

      First, we perform a systematic comparison, analyzing every data set the same way from beginning to end; the main difference from previous studies is the thorough and precise prediction of TASs for the three organisms. Second, we represent the average chromatin organization relative to those predicted TASs for TriTryps and discuss their global patterns. Third, by representing the average chromatin organization as heatmaps, we show for the very first time that those average nucleosome landscapes are not just an average: they keep a similar organization in most of the genome. This was not done in any of the previous manuscripts except for our own (Beati, PLOS One 2023). Additionally, we introduce the discussion of how the extent of the MNase reaction can affect the output of these experiments, and we show 2D plots and length distribution heatmaps to discuss this point (a point completely ignored in all the chromatin literature for trypanosomes). Furthermore, we made a far-reaching analysis by considering the contributions of each published work, even when addressed by different techniques. Finally, we discuss our findings in the context of a topic of current interest in the field, such as TriTryps genome compartmentalization.

      Several previous MNase-seq analysis studies addressing chromatin accessibility emphasized the importance of using varying degrees of chromatin digestion, from low to high digestion (30496478, 38959309, 27151365).

      The reviewer is correct, and this point is exactly what we intended to illustrate in Figure 2. We appreciate the suggested references, which we now cite in the final discussion. Just to clarify, using varying degrees of chromatin digestion is useful for drawing conclusions about a given organism, but when comparing samples, strains, histone marks, etc., it is extremely important to do so using samples with similar levels of digestion.

      No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies. This key information cannot be inferred from the length distribution of the sequenced reads.

      The reviewer is correct that “No information on the extent of DNA hydrolysis is provided in the original MNase-seq studies”, and this is another reason why it is important that our analysis be published and discussed by the scientific community working on trypanosomes. We disagree with the reviewer's second statement, since the level of digestion of a sequenced sample is actually assessed by plotting the length distribution of the total sequenced DNA. It is true that before sequencing you can, and should, check the level of digestion of the purified samples in an agarose gel and/or on a Bioanalyzer. It can also be checked after library preparation, but before sequencing, where the fragment sizes are expected to increase by the length of the library adapters. However, the final test of success when working with MNase-digested samples is to analyze the length of the DNA molecules by plotting length distribution histograms of the sequenced DNA molecules. Remarkably, different samples can sometimes look very similar when run on a gel yet yield different length distribution histograms; this is because the nucleosome core may be intact while the associated linker DNA has been trimmed to different extents, or the core may even have been chewed internally (see Cole Hope 2011, section 5.2, doi: 10.1016/B978-0-12-391938-0.00006-9, for a detailed explanation).
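
      To make the idea of a length distribution histogram concrete, the following minimal sketch bins a list of sequenced fragment lengths and returns the fraction of fragments per bin. It is an illustration under stated assumptions, not the pipeline used in the manuscript: the function name `length_histogram`, the 10 bp bin width, and the toy lengths are all hypothetical, and fragment lengths are assumed to have already been extracted from paired-end alignments.

```python
from collections import Counter


def length_histogram(fragment_lengths, bin_width=10):
    """Return {bin_start: fraction_of_fragments}, ordered by bin start.

    Hypothetical helper: `fragment_lengths` is any iterable of fragment
    sizes in bp; `bin_width` is an arbitrary illustrative choice.
    """
    lengths = list(fragment_lengths)
    if not lengths:
        return {}
    counts = Counter((length // bin_width) * bin_width for length in lengths)
    total = len(lengths)
    return {start: counts[start] / total for start in sorted(counts)}


# Toy data: mostly mononucleosome-sized fragments plus two longer ones.
toy_lengths = [147, 150, 152, 160, 300, 320]
histogram = length_histogram(toy_lengths)
```

      Two samples that look alike on a gel can then be compared bin by bin, which is the kind of comparison the paragraph above describes.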

      As the input material are selected, in part gel-purified mono-nucleosomal DNA bands. Furthermore, the datasets are not directly comparable, as some use native MNase, while others employ MNase after crosslinking; some involve short digestion times at 37 °C, while others involve longer digestion at lower temperatures. Combining these datasets to support the idea of an MNase-sensitive complex at the TAS of T. brucei therefore may not be appropriate, and additional experiments using consistent methodologies would strengthen the study's conclusions.

      In my opinion, describing an MNase-sensitive complex based solely on these data is not feasible. It requires specifically designed experiments using a consistent method and well-defined MNase digestion kinetics.

      As the reviewer suggests, the ideal experiment would be to perform a time course of the MNase reaction with all the samples in parallel, or to work with a fixed time point while adding increasing amounts of MNase. However, the information obtained from a detailed analysis of the length distribution histogram of the sequenced DNA molecules is the best test of the real outcome. In fact, the samples with different digestion levels were probably not generated on purpose.

      The only data sets that were gel purified are those from Maree 2017 (Patterton’s lab), used in Figures 1, S1 and S2, and those from L. major shown in Fig. 1. Gel purification was common practice in those years; we have since learned that it is not necessary, since fragment sizes can be sorted later in silico when needed.
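
      As an illustration of what sorting fragment sizes in silico can mean, the sketch below keeps only fragments whose length falls in a chosen window. It is a hypothetical example rather than the authors' procedure: the function name `select_by_size`, the tuple layout (chromosome, start, end, with 0-based half-open coordinates), and the 120-180 bp default window are assumptions made for the example.

```python
def select_by_size(fragments, min_len=120, max_len=180):
    """Keep fragments whose length lies within [min_len, max_len].

    Hypothetical helper: each fragment is a (chrom, start, end) tuple with
    0-based, half-open coordinates, so its length is end - start.
    """
    return [
        (chrom, start, end)
        for chrom, start, end in fragments
        if min_len <= end - start <= max_len
    ]


# Toy fragments: subnucleosomal, mononucleosome-sized and dinucleosome-sized.
toy_fragments = [
    ("chr1", 100, 190),   # 90 bp
    ("chr1", 500, 647),   # 147 bp
    ("chr2", 1000, 1310), # 310 bp
]
mononucleosomal = select_by_size(toy_fragments)
```

      Changing `min_len` and `max_len` gives the subnucleosomal or di-nucleosomal populations from the same sequenced sample, without a gel purification step.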

      As we explained to Reviewer #1, to avoid this conflict, we decided to remove these data from Figures 2 and S3. In summary, the three remaining samples come from the same lab and belong to the same publication (Maree 2022). These samples are the inputs of native MNase-ChIP-seq, obtained in the same way and fully comparable with each other.

      Reviewer #3 (Significance (Required)):

      Due to the lack of controlled MNase digestion, use of heterogeneous datasets, and absence of benchmarking against previous studies, the conclusions regarding MNase-sensitive complexes and their functional significance remain speculative. With standardized MNase digestion and clearly annotated datasets, this study could provide a valuable contribution to understanding chromatin regulation in TriTryps parasites.

      As we explained in the previous point, our conclusions are valid since we do not compare samples from different treatments in any figure. The only exception could be Figure 3, when discussing MNase-ChIP-seq. We have now added a clear and explicit comment in that section and in the discussion that, despite subtle differences in experimental procedures, we arrive at the same results. This is the case for the T. cruzi IP, run from crosslinked chromatin, compared to the T. brucei IP, run from native chromatin.

      Over the years, it has been observed in the chromatin field that nucleosomes are so tightly bound to DNA that crosslinking is not necessary. However, it is still common practice, especially when performing IPs. In our own hands, we did not observe any difference at the global level, either in T. cruzi or in my previous work with yeast.

      ...

    2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      This study explores chromatin organization around trans-splicing acceptor sites (TASs) in the trypanosomatid parasites Trypanosoma cruzi, T. brucei and Leishmania major. By systematically re-analyzing MNase-seq and MNase-ChIP-seq datasets, the authors conclude that TASs are protected by an MNase-sensitive complex that is, at least in part, histone-based, and that single-copy and multi-copy genes display differential chromatin accessibility. Altogether, the data suggest a common chromatin landscape at TASs and imply that chromatin may modulate transcript maturation, adding a new regulatory layer to an unusual gene-expression system.

      I value integrative studies of this kind and appreciate the careful, consistent data analysis the authors implemented to extract novel insights. That said, several aspects require clarification or revision before the conclusions can be robustly supported. My main concerns are listed below, organized by topic/result section.

      TAS prediction:

      • Why were TAS predictions derived only from insect-stage RNA-seq data? Restricting TAS calls to one life stage risks biasing predictions toward transcripts that are highly expressed in that stage and may reduce annotation accuracy for lowly expressed or stage-specific genes. Please justify this choice and, if possible, evaluate TAS robustness using additional transcriptomes or explicitly state the limitation.

      Results

      • "There is a distinctive average nucleosome arrangement at the TASs in TriTryps":
      • You state that "In the case of L. major the samples are less digested." However, Supplementary Fig. S1 suggests that replicate 1 of L. major is less digested than the T. brucei samples, while replicate 2 of L. major looks similarly digested. Please clarify which replicates you reference and correct the statement if needed.
      • It appears you plot one replicate in Fig. 1b and the other in Suppl. Fig. S2. Please indicate explicitly which replicate is in each plot. For T. brucei, the NDR upstream of the TAS is clearer in Suppl. Fig. S2 while the TAS protection is less prominent; based on your digestion argument, this should correspond to the more-digested replicate. Please confirm. The protected region around the TAS appears centered on the TAS in T. brucei but upstream in L. major. This is an interesting difference. If it is technical (different digestion or TAS prediction offset), explain why; if likely biological, discuss possible mechanisms and implications.

      Results

      • "An MNase sensitive complex occupies the TASs in T. brucei":
      • The definition of "MNase activity" and the ordering of samples into Low/Intermediate/High digestion are unclear. Did you infer digestion levels from fragment distributions rather than from controlled experimental timepoints? In Suppl. Fig. S3a it is not obvious how "Low digestion" was defined; that sample's fragment distribution appears intermediate. Please provide objective metrics (e.g., median fragment length, fraction 120-180 bp) used to classify digestion levels.
      • Several fragment distributions show a sharp cutoff at ~100-125 bp. Was this due to gel purification or bioinformatic filtering? State this clearly in Methods. If gel purification occurred, that can explain why some datasets preserve the MNase-sensitive region.
      • Please reconcile cases where samples labeled as more-digested contain a larger proportion of >200 bp fragments than supposedly less-digested samples; this ordering affects the inference that digestion level determines the loss/preservation of TAS protection. Based on the distributions I see, "Intermediate digestion 1" appears most consistent with an expected MNase curve - please confirm and correct the manuscript accordingly.

      Results

      • "The MNase sensitive complexes protecting the TASs in T. brucei and T. cruzi are at least partly composed of histones":
      • The evidence that histones are part of the MNase-sensitive complex relies on H3 MNase-ChIP signal in subnucleosomal fragment bins. This seems to conflict with the observation (Fig. 1) that fragments protecting TASs are often nucleosome-sized. Please reconcile these points: are H3 signals confined to subnucleosomal fragments flanking the TAS while the TAS itself is depleted of H3? Provide plots that compare MNase-seq and H3 ChIP signals stratified by consistent fragment-size bins to clarify this.
      • Please indicate which datasets are used for each panel in Suppl. Fig. S4 (e.g., Wedel et al., Maree et al.), and avoid calling data from different labs "replicates" unless they are true replicates.
      • Several datasets show a sharp lower bound on fragment size in the subnucleosomal range (e.g., ~80-100 bp). Is this a filtering artifact or a gel-size selection? Clarify in Methods and, if this is an artifact, consider replotting after removing the cutoff.

      Results

      • "The TASs of single and multi-copy genes are differentially protected by nucleosomes":
      • Please include T. brucei RNA-seq data in Suppl. Fig. S5b as you did for T. cruzi.
      • Discuss how low or absent expression of multigene families affects TAS annotation (which relies on RNA-seq) and whether annotation inaccuracies could bias the observed chromatin differences.
      • The statement that multi-copy genes show an "oscillation" between AT and GC dinucleotides is not clearly supported: the multi-copy average appears noisier and is based on fewer loci. Please tone down this claim or provide statistical support that the pattern is periodic rather than noisy.
      • How were multi-copy genes defined in T. brucei? Include the classification method in Methods.
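
      The objective metrics requested above for classifying digestion levels (median fragment length and the fraction of fragments between 120 and 180 bp) could be computed along the following lines. This is a minimal sketch under assumptions: the function name `digestion_metrics` and the toy data are hypothetical, the window bounds are treated as inclusive, and no thresholds for calling a sample "Low", "Intermediate" or "High" are implied.

```python
from statistics import median


def digestion_metrics(fragment_lengths, low=120, high=180):
    """Summarize a fragment-length distribution with two simple metrics.

    Hypothetical helper: returns the median fragment length and the
    fraction of fragments whose length lies within [low, high] bp.
    """
    lengths = list(fragment_lengths)
    if not lengths:
        raise ValueError("fragment_lengths must not be empty")
    in_window = sum(1 for n in lengths if low <= n <= high)
    return {
        "median_length": median(lengths),
        "fraction_in_window": in_window / len(lengths),
    }


# Toy sample: three of the five fragments fall in the 120-180 bp window.
metrics = digestion_metrics([100, 147, 150, 160, 320])
```

      Reporting these two numbers per sample would let readers check the ordering of samples by digestion level independently of the plots.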

      Genomes and annotations:

      • If transcriptomic data for the Y strain was used for T. cruzi, please explain why a Y strain genome was not used (e.g., Wang et al. 2021 GCA_015033655.1), or justify the choice. For T. brucei, consider the more recent Lister 427 assembly (Tb427_2018) from TriTrypDB. Use strain-matched genomes and transcriptomes when possible, or discuss limitations.

      Reproducibility and broader integration:

      • Please share the full analysis pipeline (ideally on GitHub/Zenodo) so the results are reproducible from raw reads to plots.
      • As an optional but helpful expansion, consider including additional datasets (other life stages, BSF MNase-seq, ATAC-seq, DRIP-seq) where available to strengthen comparative claims.

      Optional analyses that would strengthen the study:

      • Stratify single-copy genes by expression (high / medium / low) and examine average nucleosome occupancy at TASs for each group; a correlation between expression and NDR depth would strengthen the functional link to maturation.

      Minor / editorial comments:

      • In the Introduction, the sentence "transcription is initiated from dispersed promoters and in general they coincide with divergent strand switch regions" should be qualified: such initiation sites also include single transcription start regions.
      • Define the dotted line in length distribution plots (if it is not the median, please clarify) and consider placing it at 147 bp across plots to ease comparison.
      • In Suppl. Fig. 4b "Replicate2" the x-axis ticks are misaligned with labels - please fix.
      • Typo in the Introduction: "remodellingremodeling" → "remodeling."

      Referee cross-commenting

      Comment 1: I think Reviewer #2 and Reviewer #3 missed that the authors of this manuscript do cite and consider the results from Wedel et al. 2017. They even re-analysed their data (e.g. Figure 3a). I second Reviewer #2's comment indicating that the inclusion of a schematic figure to help readers visualize and better understand the findings would be an important addition.

      Comment 2: I agree with Reviewer #3 that the use of different MNase digestion procedures in the different datasets has to be considered. On the other hand, I don't think there is a problem with figure 1 showing an MNase-protected TAS for T. brucei, as it is based on MNase-seq data and reproduces the reported results (Maree et al. 2017). What the Siegel lab did in Wedel et al. 2017 was MNase-ChIP-seq of H3 showing nucleosome depletion at TAS, but the two results are not necessarily contradictory: there could still be something else (which does not contain H3) sitting on the TAS protecting it from MNase digestion.

      Significance

      This study provides a systematic comparative analysis of chromatin landscapes at trans-splicing acceptor sites (TASs) in trypanosomatids, an area that has been relatively underexplored. By re-analyzing and harmonizing existing MNase-seq and MNase-ChIP-seq datasets, the authors highlight conserved and divergent features of nucleosome occupancy around TASs and propose that chromatin contributes to the fidelity of transcript maturation.

      The significance lies in three aspects:

      1. Conceptual advance: It broadens our understanding of gene regulation in organisms where transcription initiation is unusual and largely constitutive, suggesting that chromatin can still modulate post-transcriptional processes such as trans-splicing.
      2. Integrative perspective: Bringing together data from T. cruzi, T. brucei and L. major provides a comparative framework that may inspire further mechanistic studies across kinetoplastids.
      3. Hypothesis generation: The findings open testable avenues about the role of chromatin in coordinating transcript maturation, the contribution of DNA sequence composition, and potential interactions with R-loops or RNA-binding proteins. Researchers in parasitology, chromatin biology, and RNA processing will find it a useful resource and a stimulus for targeted experimental follow-up.

      My expertise is in gene regulation in eukaryotic parasites, with a focus on bioinformatic analysis of high-throughput sequencing data

    1. That said, the proposals presented differ in important ways. First, ELSOC provides information that allows observation at a finer level of granularity and, consequently, more precise identification of the dimensions and subdimensions of social cohesion than the proposal constructed for Latin America with LAPOP, whose perspective is more minimalist. In turn, the ELSOC version considers several subdimensions that are absent from the Latin American index owing to decisions based on theoretical and/or statistical inconsistency, such as the prosocial behavior factor, which we attempted to integrate using WVS data; although the factor's internal consistency was acceptable, it did not correlate with the other factors in its dimension. Finally, it should be noted that the proposals share none of the factors in their subdimensions. Although both include the interpersonal trust index in their measurement, ELSOC treats it as a second-order factor, whereas in the LAPOP proposal it is a first-order factor.

      This goes to the next section

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      This manuscript reports a descriptive study of changes in gene expression after knockdown of the nuclear envelope proteins lamin A/C and Nesprin2/SYNE2 in human U2OS cells. The readout is RNA-seq, which is analyzed at the level of gene ontology and focused investigation of isoform variants and non-coding RNAs. In addition, the mobility of telomeres is studied after these knockdowns, although the rationale in relation to the RNA-seq analyses is rather unclear.

      We sincerely thank the reviewer for the thoughtful summary and valuable feedback. Regarding the telomere mobility analyses, our intention was to provide additional evidence supporting the hypothesis that knockdown of lamins and nesprins disrupts nuclear architecture. Although the connection to the RNA-seq data was not explicitly detailed, we believe that the increased telomere mobility may reflect broader changes in chromatin organization, which could contribute to the observed differential gene expression. We have revised the manuscript to clarify this rationale and improve the integration between the two analyses.

      RNA-seq after knockdown of lamin proteins has been reported many times, and the current study does not provide significant new insights that help us to understand how lamins control gene expression. This is particularly because the vast majority of the observed effects on gene expression appear to occur in regions that are not bound by lamin A. It seems likely that these effects are indirect. There is also virtually no overlap between genes affected by laminA/C and by SYNE2, which remains unexplained; for example, it would be good to know whether laminA/C and SYNE2 bind to different genomic regions. The claim in the Title and Abstract that LMNA governs gene expression / acts through chromatin organization appears to be based only on an enrichment of gene ontology terms "DNA conformation change" and "covalent chromatin conformation" in the RNA-seq data. This is a gross over-interpretation, as no experimental data on chromatin conformation are shown in this study. The analyses of transcript isoform switching and ncRNA expression are potentially interesting but lack a mechanistic rationale: why and how would these nuclear envelope proteins regulate these aspects of RNA expression? The effects of lamin A on telomere movements have been reported before; the effects of SYNE2 on telomere mobility are novel (to my knowledge), but should be discussed in the light of previously documented effects of SUN1/2 on the dynamics of dysfunctional telomeres (Lottersberger et al, Cell 2015).

      We sincerely thank the reviewer for this thoughtful and detailed critique. We agree that RNA-seq following knockdown of lamin proteins has been previously reported and appreciate the concern regarding the novelty and mechanistic interpretation of our findings. However, our study reveals distinct isoform switching and lncRNA changes caused by depletion of lamins and nesprins, which have not been reported in previous studies. Furthermore, we show that not only lamin A but also nesprin-2 can affect chromatin mobility.

      Regarding the analysis of LMNA ChIP-seq data from human fibroblasts (Kohta Ikegami, 2021), those data revealed that Lamin A/C modulates gene expression through interactions with enhancers. The pathogenesis of disorders associated with LMNA mutations may stem primarily from disruptions in this gene regulatory function, rather than from impaired tethering of chromatin to LADs.

      We acknowledge the reviewer’s concern that gene ontology enrichment related to chromatin conformation alone is insufficient to support claims about chromatin structural changes. We have therefore revised the “Title” and “Abstract” to avoid overstating conclusions and to more accurately reflect the scope of our data.

      Regarding telomere dynamics, while Lamin A's role has indeed been previously documented, our study provides evidence that SYNE2/Nesprin-2 also regulates telomere mobility. We have now expanded the discussion to include prior work, particularly the findings of Lottersberger et al. (Cell, 2015), to better contextualize our results and distinguish the contributions of SYNE2.

      Finally, we appreciate the reviewer’s suggestion about transcript isoform and noncoding RNA expression. While our study primarily provides descriptive data, we agree that further mechanistic investigation is warranted. We have clarified this point in the “Discussion” and framed our findings as a foundation for future studies exploring the broader regulatory roles of nuclear envelope proteins.

      We are grateful for the reviewer’s comments, which have helped us improve the clarity and rigor of our manuscript. Please see the highlighted changes in our revised manuscript.

      As indicated below, I have substantial concerns about the experimental design of the knockdown experiments.

      Altogether, the results presented here are primarily descriptive and do not offer a significant advance in our understanding of the roles of LaminA and SYNE2 in gene regulation or chromatin biology, because the results remain unexplained mechanistically and functionally. Furthermore, the RNAseq datasets should be interpreted with caution until off-target effects of the shRNAs can be ruled out.

      We fully acknowledge that the original version of our manuscript lacked sufficient mechanistic insight. In response, we have revised the manuscript to include additional analyses and explanations that clarify the potential functional relevance of our findings. For example, we added the following text: “These findings further underscore the functional relevance of lamin A in coordinating transcriptional programs through modulation of nuclear architecture. In contrast, LMNA knockdown led to differential expression of genes enriched in pathways related to chromatin organization, suggesting potential disruptions in chromatin regulatory networks. Although direct measurements of chromatin conformation were not performed, these transcriptional changes indicate that LMNA may contribute to maintaining nuclear architecture and genomic stability, which aligns with its established involvement in laminopathies and genome integrity disorders.” More analyses can be found in the main text.

      Regarding the concern about off-target effects of the shRNA-based knockdowns, we agree that this is an important consideration. While shRNA approaches inherently carry the risk of off-target effects, we have now performed additional analyses that help address this issue. These analyses support the specificity of our observations and suggest that the majority of gene expression changes are likely to be directly related to the targeted knockdown. Nonetheless, we have clearly stated the limitations of the approach in the revised discussion and emphasized the need for future validation using complementary methods.

      We hope that these revisions strengthen the overall impact and interpretability of our study.

      Specific comments:

      (1) Knockdowns were only monitored by qPCR. Efficiency at the protein level (e.g., Western blots) needs to be determined.

      We agree that complementary protein-level validation (e.g., by Western blot) would strengthen the findings, and we are in the process of obtaining suitable reagents to address this point in future experiments. We have now clarified this limitation in the revised manuscript.

      (2) For each knockdown, only a single shRNA was used. shRNAs are infamous for off-target effects; therefore, multiple shRNAs for each protein, or an alternative method such as CRISPR deletion or degron technology, must be tested to rule out such off-target effects.

      We fully acknowledge the concern regarding the use of only a single shRNA per knockdown and agree that shRNAs are prone to off-target effects. We recognize the importance of validating our findings using multiple independent shRNAs or alternative knockdown strategies, such as CRISPR deletion or degron-based approaches, to ensure specificity. To address this concern, we have used qPCR to confirm the knockdown of the target genes observed in the RNA-seq data, further supporting the validity of our data. In line with this, we are currently optimizing an auxin-inducible degron system (AtAFB2) for targeted and controlled depletion of lamin C. Our preliminary results indicate approximately a 40% knockdown efficiency after 16 hours of auxin induction, highlighting the necessity for further system optimization (Author response image 1). Future experiments will integrate this improved degron technology alongside multiple independent approaches to rigorously address and mitigate concerns about off-target effects, thereby enhancing the robustness and reproducibility of our data.

      Author response image 1.

      FACS analysis of the lamin C degron system at 0, 1, 3, and 16 hours post-induction with 500 μM indole-3-acetic acid (IAA) (Sigma).

      (3) It is not clear whether the replicate experiments are true biological replicates (i.e., done on different days) or simply parallel dishes of cells done in a single experiment (= technical replicates). The extremely small standard deviations in the RT-qPCR data suggest the latter, which would not be adequate.

      We appreciate the reviewer’s insightful comment regarding the nature of our replicates. The RT-qPCR experiments were indeed performed as true biological replicates, with samples collected on different days and from independently cultured cell batches. We have added this to the Methods section of the manuscript. While we observed some variability in the Scramble control group, the low standard deviations in the shRNA-treated samples likely reflect the consistent and efficient knockdown of target genes.

      For the RNA-seq experiments, samples were collected as two batches during RNA extraction and library preparation. The samples still represent biological replicates, as they were derived from independently prepared cultures in separate experimental setups. This approach was chosen to strike a balance between biological variation and technical consistency, thereby improving the reliability of the RNA-seq results.

      Reviewer #2 (Public review):

      Summary:

      This study focused on the roles of the nuclear envelope proteins lamin A and C, as well as nesprin-2, encoded by the LMNA and SYNE2 genes, respectively, on gene expression and chromatin mobility. It is motivated by the established role of lamins in tethering heterochromatin to the nuclear periphery in lamina-associated domains (LADs) and modulating chromatin organization. The authors show that depletion of lamin A, lamin A and C, or nesprin-2 results in differential effects on mRNA and lncRNA expression, primarily affecting genes outside established LADs. In addition, the authors used fluorescent dCas9 labeling of telomeric genomic regions combined with live-cell imaging to demonstrate that depletion of either lamin A, lamin A/C, or nesprin-2 increased the mobility of chromatin, suggesting an important role of lamins and nesprin-2 in chromatin dynamics.

      We sincerely appreciate the reviewer’s thoughtful summary of our study and the key findings. Our work is indeed motivated by the well-established roles of lamin A/C in chromatin tethering at the nuclear periphery and the emerging understanding of their broader influence on chromatin organization and gene regulation. In our study, we aimed to further explore these roles by examining the consequences of depleting lamin A, lamin A/C, and nesprin-2 (SYNE2) on both gene expression and chromatin mobility.

      As the reviewer accurately notes, we observed differential effects on mRNA and lncRNA expression, with many changes occurring outside of previously defined LADs. This finding suggests that lamins and nesprin-2 may also influence transcriptional regulation through mechanisms beyond direct LAD association. Furthermore, using live-cell imaging of fluorescently labeled telomeric regions, we demonstrated that loss of these nuclear envelope components leads to increased chromatin mobility, supporting their role in maintaining chromatin stability and nuclear architecture.

      We thank the reviewer for highlighting these aspects, which we believe contribute to a more nuanced understanding of how nuclear envelope proteins modulate chromatin behavior and gene regulation.

      Strengths:

      The major strength of this study is the detailed characterization of changes in transcript levels and isoforms resulting from depletion of either lamin A, lamin A/C, or nesprin-2 in human osteosarcoma (U2OS) cells. The authors use a variety of advanced tools to demonstrate the effect of protein depletion on specific gene isoforms and to compare the effects on mRNA and lncRNA levels.

      The TIRF imaging of dCas9-labeled telomeres allows for high-resolution tracking of multiple telomeres per cell, thus enabling the authors to obtain detailed measurements of the mobility of telomeres within living cells and the effect of lamin A/C or nesprin-2 depletion.

      We are grateful that the reviewer recognized the comprehensive analysis of transcript and isoform changes upon depletion of lamin A, lamin A/C, or nesprin-2 in U2OS cells. We also thank the reviewer for acknowledging our use of advanced tools to investigate isoform-specific effects and to distinguish between changes in mRNA and lncRNA expression.

      Furthermore, we are pleased that the reviewer highlighted the strength of our TIRF imaging approach using dCas9-labeled telomeres. This technique enabled us to capture high-resolution, multi-locus dynamics within single living cells, and we agree that it is instrumental in revealing the impact of lamin A/C and nesprin-2 depletion on telomere mobility.

      Weaknesses:

      Although the findings presented by the authors overall confirm existing knowledge about the ability of lamins A/C and nesprin to broadly affect gene expression, chromatin organization, and chromatin dynamics, the specific interpretation and the conclusions drawn from the data presented in this manuscript are limited by several technical and conceptual challenges.

      One major limitation is that the authors only assess the knockdown of their target genes on the mRNA level, where they observe reductions of around 70%. Given that lamins A and C have long half-lives, the effect at the protein level might be even lower. This incomplete and poorly characterized depletion on the protein level makes interpretation of the results difficult. The description for the shRNA targeting the LMNA gene encoding lamins A and C given by the authors is at times difficult to follow and might confuse some readers, as the authors do not clearly indicate which regions of the gene are targeted by the shRNA, and they do not make it obvious that lamin A and C result from alternative splicing of the same LMNA gene. Based on the shRNA sequences provided in the manuscript, one can conclude that the shLaminA shRNA targets the 3' UTR region of the LMNA gene specific to prelamin A (which undergoes posttranslational processing in the cell to yield lamin A). In contrast, the shRNA described by the authors as 'shLMNA' targets a region within the coding sequence of the LMNA gene that is common to both lamin A and C, i.e., the region corresponding to amino acids 122-129 (KKEGDLIA) of lamin A and C. The authors confirm the isoform-specific effect of the shLaminA isoform, although they seem somewhat surprised by it, but do not confirm the effect of the shLMNA construct. Assessing the effect of the knockdown on the protein level would provide more detailed information both on the extent of the actual protein depletion and the effect on specific lamin isoforms. Similarly, given that nesprin-2 has numerous isoforms resulting from alternative splicing and transcription initiation, it remains unclear in the current form of the manuscript which specific nesprin-2 isoforms were depleted, and to what extent (on the protein level).

      We have revised the Methods section to include a clearer and more detailed description of the shRNA design, including the specific regions of the LMNA gene targeted by each construct, as well as the relationship between lamin A and C isoforms resulting from alternative splicing. We agree that this clarification will help prevent confusion for readers.

      Regarding the shLMNA construct, we acknowledge the importance of confirming the knockdown at the protein level, especially given the long half-lives of lamin proteins. In our revised manuscript, we now refer to Supplementary Figure S2, which demonstrates that the shLMNA construct effectively reduces both lamin A and lamin C transcript levels. While we initially focused on mRNA quantification, we recognize that additional protein-level validation is valuable and have accordingly emphasized this point in the revised discussion.

      We also appreciate the comment on nesprin-2 isoforms. Given the complexity of nesprin-2 splicing, we are currently working to further characterize the specific isoforms affected and will aim to include protein-level data in a future study. 

      Another substantial limitation of the manuscript is that the current analysis, with the exception of the chromatin mobility measurements, is exclusively based on transcriptomic measurements by RNA-seq and qRT-PCR, without any experimental validation of the predicted protein levels or proposed functional consequences. As such, conclusions about the importance of lamin A/C on RNA synthesis and other functions are derived entirely from gene ontology terms and are not sufficiently supported by experimental data. Thus, the true functional consequences of lamin A/C or nesprin depletion remain unclear. Statements included in the manuscript such as "our findings reveal that lamin A is essential for RNA synthesis, ..." (Lines 79-80) are thus either inaccurate or misleading, as the current data do not show that lamin A is ESSENTIAL for RNA synthesis, and lamin A/C and lamin A deficient cells and mice are viable, suggesting that they are capable of RNA synthesis.

      We agree that our current data do not support the claim that lamin A is essential for RNA synthesis, and we acknowledge the importance of distinguishing between correlation and causation in our conclusions. In light of this, we have revised the statement in the manuscript to more accurately reflect our findings:

      “Our findings suggest that lamin A contributes to RNA synthesis, supports chromatin spatial organization through LMNA, and that SYNE2 influences chromatin modifications as reflected in transcript levels.”

      We hope this revision better aligns with the limitations of our dataset and addresses the reviewer’s concerns regarding the interpretation of functional consequences based solely on transcriptomic data.

      Another substantial weakness is that the data and analysis presented in the manuscript raise some concerns about the robustness of the findings. Given that the 'shLMNA' construct is expected to deplete both lamin A and C, i.e., its effect encompasses the depletion of lamin A, which is achieved by the 'shLaminA' construct, one would expect a substantial overlap between the DEGs in the shLMNA and shLaminA conditions, with the shLMNA depletion producing a broader effect as it targets both lamin A and C. However, the Venn Diagram in Figure 4a, the genomic loci distribution in Figure 4b, and the correlation analysis in Supplementary Figure S2 show little overlap between the shLMNA and shLaminA conditions, which is quite surprising. In the mapping of the DEGs shown in Figure 4b, it is also surprising not to see the gene targeted by the shRNA, LMNA, found on chromosome 1, in the results for the shLMNA and shLaminA depletion.

      We have added the following discussion to the revised manuscript: “Interestingly, although both shLMNA and shLaminA constructs target lamin A, with shLMNA additionally depleting lamin C, the DEGs identified under these two conditions show limited overlap. This unexpected finding suggests that depletion of lamin C in the shLMNA condition may trigger distinct or compensatory transcriptional responses that are not elicited by lamin A knockdown alone. Furthermore, variation in shRNA efficiency or off-target effects may contribute to these differences. Notably, despite directly targeting LMNA, the overlap in DEGs between the two conditions remained limited under our stringent threshold criteria. Together, these observations highlight the complex and non-linear regulatory roles of lamin isoforms in gene expression and underscore the need for further mechanistic studies to dissect their individual and combined contributions [28,29].”

      The correlation analysis in Supplementary Figure S2 raises further questions. The authors use dox-inducible shRNA constructs to target lamin A (shLaminA), lamin A/C (shLMNA), or nesprin-2 (shSYNE2). Thus, the no-dox control (Ctr) for each of these constructs would be expected to be very similar to the non-target scrambled controls (Ctrl.shScramble and Dox.shScramble). However, in the correlation matrix, each of the no-dox controls clusters more closely with the corresponding dox-induced shRNA condition than with the Ctrl.shScramble or Dox.shScramble conditions, suggesting either a very leaky dox-inducible system, strong effects from clonal selection, or substantial batch effects in the processing. Any of these scenarios could substantially affect the interpretation of the findings. For example, differences between different clonal cell lines used for the studies, independent of the targeted gene, could explain the limited overlap between the different shRNA constructs and result in apparent differences when comparing these clones to the scrambled controls, which were derived from different clones.

      We thank the reviewer for this thoughtful observation. We would like to clarify that the samples shown in Supplementary Figure S2 were processed and sequenced in two separate batches, and the data presented in the correlation matrix are unnormalized. As such, batch effects are indeed present and likely contribute to the clustering pattern observed, particularly the closer similarity between the dox-induced and no-dox samples for each individual shRNA construct.

      Importantly, our analyses focus on within-construct comparisons (i.e., doxycycline-treated vs untreated samples for the same shRNA), rather than direct comparisons across different constructs or scrambled controls. Each experimental pair (dox vs no-dox) was processed in parallel within its respective batch to ensure internal consistency. Thus, while the global clustering pattern may reflect batch-related differences or baseline variations between independently derived cell lines, these factors do not affect the main conclusions drawn from the within-construct differential expression analysis.
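
      The reasoning behind a within-construct comparison can be illustrated with a small sketch. It assumes that a batch effect scales the dox-treated and untreated counts of a gene by the same factor, in which case the factor cancels in the ratio. This is not the differential expression pipeline used in the study: the function name `paired_log2_fold_changes`, the `pseudocount` parameter, and all counts are hypothetical and chosen only for illustration.

```python
import math


def paired_log2_fold_changes(dox_counts, nodox_counts, pseudocount=1.0):
    """Return log2(dox / no-dox) for pairs processed within the same batch.

    Hypothetical helper: each position in the two lists is one dox/no-dox
    pair from a single batch. The pseudocount guards against division by
    zero for unexpressed genes; set it to 0.0 only when all counts are > 0.
    """
    if len(dox_counts) != len(nodox_counts):
        raise ValueError("each dox sample needs a matching no-dox sample")
    return [
        math.log2((dox + pseudocount) / (nodox + pseudocount))
        for dox, nodox in zip(dox_counts, nodox_counts)
    ]


# Made-up counts for one gene. Batch 2 is assumed to yield three times as
# many reads as batch 1 for both members of the pair.
batch1_dox, batch1_nodox = 50, 200
batch2_dox, batch2_nodox = 150, 600

fold_changes = paired_log2_fold_changes(
    [batch1_dox, batch2_dox], [batch1_nodox, batch2_nodox], pseudocount=0.0
)
print(fold_changes)  # the shared batch factor cancels: [-2.0, -2.0]
```

      The cancellation holds only under the stated assumption of a shared multiplicative offset within each pair. Comparisons against scrambled controls from another batch or clone would not benefit from it.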

      The manuscript also contains several factually inaccurate or incorrect statements or depictions. For example, the depiction of the nuclear envelope in Figure 1 shows a single bilipid layer, instead of the actual double bi-lipid layer of the inner and outer nuclear membranes that span the nuclear lumen. The depiction further lacks SUN domain proteins, which, together with nesprins, form the LINC complex essential to transmit forces across the nuclear envelope. The statement in line 214 that "Linker of nucleoskeleton and cytoskeleton (LINC) complex component nesprin-2 locates in the nuclear envelope to link the actin cytoskeleton and the nuclear lamina" is not quite accurate, as nesprin-2 also links to microtubules via dynein and kinesin.

      We sincerely thank the reviewer for pointing out these important inaccuracies. In response, we have revised Figure 1 to accurately depict the nuclear envelope as a double bi-lipid membrane and included SUN domain proteins to better reflect the structural components of the LINC complex. Additionally, we have updated the statement and citations.

      The revised text incorporated into the manuscript reads: “The linker of nucleoskeleton and cytoskeleton (LINC) complex component nesprin-2 is a nuclear envelope protein that connects the nucleus to the cytoskeleton by interacting not only with actin filaments but also with microtubules through motor proteins such as dynein and kinesin. This structural linkage contributes to cellular architecture and facilitates mechanotransduction between the nuclear interior and the extracellular matrix (ECM) [8,21].”

      We appreciate the reviewer’s insights, which have helped improve the accuracy and clarity of our manuscript.

      The statement that "Our data show that Lamin A knockdown specifically reduced the usage of its primary isoform, suggesting a potential role in chromatin architecture regulation, while other LMNA isoforms remained unaffected, highlighting a selective effect" (lines 407-409) is confusing, as the 'shLaminA' shRNA specifically targets the 3' UTR of lamin A that is not present in the other isoforms. Thus, the observed effect is entirely consistent with the shRNA-mediated depletion, independent of any effects on chromatin architecture.

      We have rephrased the statement as follows: “Our data show that knockdown with shLaminA, which specifically targets the 3' UTR unique to the lamin A isoform, selectively reduced lamin A expression without affecting other LMNA isoforms.”

      The premise of the authors that lamins would only affect peripheral chromatin and genes at LADs neglects the fact that lamins A and C are also found in the nuclear interior, where they form stable structures and influence chromatin organization, and the fact that lamins A and C and nesprins additionally interact with numerous transcriptional regulators such as Rb, c-Fos, and beta-catenins, which could further modulate gene expression when lamins or nesprins are depleted.

      Based on the reviewer’s comment, we have added the following statement to the Discussion: “Beyond their well-established role in tethering heterochromatin at the nuclear periphery through lamina-associated domains (LADs), A-type lamins (lamins A and C) also localize to the nuclear interior, where they contribute to chromatin organization and gene regulation independently of LADs [27,28]. Nuclear lamins can form intranuclear foci that associate with active chromatin and are implicated in supporting transcriptional activity. Additionally, both lamins and nesprins participate in diverse protein-protein interactions that may influence transcriptional regulation. For example, lamin A/C interacts with the retinoblastoma protein (Rb) to modulate E2F-dependent transcription [29], and with c-Fos to regulate its nuclear retention and activity [30]. While β-catenin acts as a co-activator in Wnt signaling relies on nuclear translocation and interaction with transcriptional complexes, and evidence suggests that nuclear architecture and envelope components, including nesprins, can influence this process [31]. Therefore, the observed gene expression changes following depletion of lamins or nesprins are likely not restricted to genes located within lamina-associated domains (LADs), but may also result from broader perturbations in nuclear architecture and transcriptional regulatory networks. This is consistent with our findings that lamins and nesprins influence gene expression in distal, non-LAD regions.”

      The comparison of the identified DEGs to genes contained in LADs might be confounded by the fact that the authors relied on the identification of LADs from a previous study (ref #28), which used a different human cell type (human skin fibroblasts) instead of the U2OS osteosarcoma cells used in the present study. As LADs are often highly cell-type specific, the use of the fibroblast data set could lead to substantial differences in LADs.

      DamID in various mammalian cell types has shown that some LADs are cell-type invariant (constitutive LADs [cLADs]), while others interact with the NL in only certain cell types (facultative LADs [fLADs]) (Bas van Steensel, 2017). We agree that fLADs, which comprise approximately half of all LADs, are often highly cell-type specific. We acknowledge that this specificity may influence the interpretation of our findings. At present, publicly available LAD datasets for U2OS cells are limited to those associated with LMNB. We concur that generating LMNA-specific LAD maps in U2OS cells would enhance the accuracy and relevance of our analyses, and we view this as an important direction for future research.

      Another limitation of the current manuscript is that, in the current form, some of the figures and results depicted in the figures are difficult to interpret for a reader not deeply familiar with the techniques, based in part on the insufficient labeling and figure legends. This applies, for example, to the isoform use analysis shown in Figure 3d or the GenometriCorr analysis quantifying spatial distance between LADs and DEGs shown in Figure 4c.

      For Figure 3, we added the following text to the caption to make the figure more readable: “Isoform switching analysis reveals differential expression of alternative transcript variants between conditions, highlighting a shift in predominant isoform usage.” For Figure 4c, we added the following text to the caption: “GenometriCorr analysis was used to quantify the spatial relationship between LADs and DEGs, evaluating whether the observed genomic proximity deviates from random expectation through empirical distribution-based statistical testing of pairwise distances between genomic intervals.” We also added this description to the Methods.
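
      For readers unfamiliar with this kind of test, the general idea of comparing observed distances against an empirical null distribution can be sketched as follows. This is a simplified permutation test written for illustration, not the GenometriCorr implementation and not the analysis performed in the study. It assumes a single chromosome, represents each DEG by one coordinate, and draws random positions uniformly; all function names, coordinates, and the chromosome length are hypothetical.

```python
import bisect
import random


def nearest_distance(pos, starts, ends):
    """Distance from pos to the nearest interval (0 if pos lies inside one).

    starts and ends describe sorted, non-overlapping intervals.
    """
    i = bisect.bisect_right(starts, pos) - 1  # last interval starting at or before pos
    best = float("inf")
    if i >= 0:
        if pos <= ends[i]:
            return 0
        best = pos - ends[i]
    if i + 1 < len(starts):
        best = min(best, starts[i + 1] - pos)
    return best


def mean_nearest_distance(positions, intervals):
    """Mean distance from each query position to its nearest interval."""
    intervals = sorted(intervals)
    starts = [start for start, _ in intervals]
    ends = [end for _, end in intervals]
    return sum(nearest_distance(p, starts, ends) for p in positions) / len(positions)


def proximity_permutation_test(positions, intervals, chrom_length, n_perm=1000, seed=0):
    """One-sided empirical p-value for 'queries lie closer to intervals than chance'."""
    rng = random.Random(seed)
    observed = mean_nearest_distance(positions, intervals)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(chrom_length) for _ in positions]
        if mean_nearest_distance(shuffled, intervals) <= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Made-up example: two LAD-like intervals and three DEG-like positions.
lads = [(100, 200), (500, 600)]
degs = [150, 250, 450]
observed, p_value = proximity_permutation_test(degs, lads, chrom_length=10_000)
print(observed, p_value)
```

      A small p-value here would indicate that the query positions sit closer to the reference intervals than uniformly placed positions do. A real analysis would also need to account for multiple chromosomes, gene density, and mappability.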

      Overall appraisal and context:

      Despite its limitations, the present study further illustrates the important roles the nuclear envelope proteins lamin A, lamin C, and nesprin-2 have in chromatin organization, dynamics, and gene expression. It thus confirms results from previous studies (not always fully acknowledged in the current manuscript) previously reported for lamin A/C depletion. For example, the effect of lamin A/C depletion on increasing mobility of chromatin had already been demonstrated by several other groups, such as Bronshtein et al. Nature Comm 2015 (PMID: 26299252) and Ranade et al. BMC Mol Cel Biol 2019 (PMID: 31117946). Additionally, the effect of lamin A/C depletion on gene and protein expression has already been extensively studied in a variety of other cell lines and model systems, including detailed proteomic studies (PMIDs 23990565 and 35896617).

      We have added the following discussion: “Our findings reinforce the pivotal roles of nuclear envelope proteins lamin A, LMNA and nesprin 2 in regulating chromatin organization, chromatin mobility, and gene expression. These results are consistent with and extend prior studies investigating the consequences of lamin depletion. For instance, increased chromatin mobility following the loss of lamin A/C has been previously demonstrated using live-cell imaging approaches [26,35], supporting our observations of nuclear structural relaxation and chromatin redistribution. Additionally, proteomic profiling following lamin A depletion has been extensively documented across both cellular and mouse models, providing valuable insights into the molecular consequences of nuclear envelope disruption [36,37]. While these earlier studies provide a strong foundation, our work contributes novel insights by integrating isoform-specific perturbations with spatial chromatin measurements. This approach emphasizes context-dependent regulatory mechanisms that involve not only lamina-associated regions but also nesprin-associated domains and distal genomic loci, thereby expanding the current understanding of nuclear envelope protein function in gene regulation.”

      The finding that lamin A/C or nesprin depletion not only affects genes at the nuclear periphery but also the nuclear interior is not particularly surprising given the previous studies and the fact that lamins A and C are also found within the nuclear interior, where they affect chromatin organization and dynamics, and that lamins A/C and nesprins directly interact with numerous transcriptional regulators that could further affect gene expression independent from their role in chromatin organization.

      We have added the following statement to the Discussion: “Beyond their well-established role in tethering heterochromatin at the nuclear periphery through lamina-associated domains (LADs), A-type lamins (lamins A and C) also localize to the nuclear interior, where they contribute to chromatin organization and gene regulation independently of LADs [27,28]. Nuclear lamins can form intranuclear foci that associate with active chromatin and are implicated in supporting transcriptional activity. Additionally, both lamins and nesprins participate in diverse protein-protein interactions that may influence transcriptional regulation. For example, lamin A/C interacts with the retinoblastoma protein (Rb) to modulate E2F-dependent transcription [29], and with c-Fos to regulate its nuclear retention and activity [30]. While β-catenin acts as a co-activator in Wnt signaling relies on nuclear translocation and interaction with transcriptional complexes, and evidence suggests that nuclear architecture and envelope components, including nesprins, can influence this process [31]. Therefore, the observed gene expression changes following depletion of lamins or nesprins are likely not restricted to genes located within lamina-associated domains (LADs), but may also result from broader perturbations in nuclear architecture and transcriptional regulatory networks. This is consistent with our findings that lamins and nesprins influence gene expression in distal, non-LAD regions.”

      The authors provide a detailed analysis of isoform switching in response to lamin A/C or nesprin depletion, but the underlying mechanism remains unclear. Similarly, their analysis of the genomic location of the observed DEGs shows the wide-ranging effects of lamin A/C or nesprin depletion, but leaves the reader wondering how these effects are mediated. A more in-depth analysis of predicted regulatory factors and their potential interaction with lamins A/C or nesprin would be beneficial in gaining more mechanistic insights.

      We agree that the current findings, while highlighting the broad impact of lamin A/C or nesprin depletion on isoform usage and gene expression, do not fully elucidate the underlying regulatory mechanisms. We acknowledge the importance of identifying upstream regulators and understanding their potential interactions with lamins and nesprins. Future investigations integrating epigenetic approaches, such as ChIP-seq for transcription factors and chromatin-associated proteins, will be essential to clarify how lamins and nesprins contribute to isoform switching and to uncover the mechanistic basis of these regulatory effects.

      Reviewer #3 (Public review):

      Summary:

      This manuscript describes DOX inducible RNAi KD of Lamin A, LMNA coded isoforms as a group, and the LINC component SYNE2. The authors report on differentially expressed genes, on differentially expressed isoforms, on the large numbers of differentially expressed genes that are in iLADs rather than LADs, and on telomere mobility changes induced by 2 of the 3 knockdowns.

      Strengths:

      Overall, the manuscript might be useful as a description for reference data sets that could be of value to the community.

      We acknowledge that the initial version of our manuscript lacked comprehensive comparisons with previous studies. In our revised manuscript, we have included more detailed discussions highlighting how our findings complement and extend existing knowledge. Specifically, our study presents novel insights into the role of lamins and nesprins in regulating non-coding RNAs and isoform switching, areas that have not been extensively explored in the prior literature. We hope these additions will clarify the contribution of our work and demonstrate its potential value to the field.

      Weaknesses:

      The results are presented as a type of data description without formulation of models or explanations of the questions being asked and without follow-up. Thus, conceptually, the manuscript doesn't appear to break new ground.

      In our study, we proposed a conceptual model in which gene expression changes are linked to RNA synthesis, chromatin conformation alterations, and chromatin modifications, potentially mediated by lamin A, LMNA, and nesprin-2 at the transcriptional level. However, we acknowledge that this model remains preliminary and largely unexplored. We agree that additional mechanistic insights and identification of specific regulatory factors are needed to strengthen this framework. Future studies will aim to experimentally validate these hypotheses and clarify the pathways and regulators involved.

      Not discussed is the previous extensive work by others on the nucleoplasmic forms of LMNA isoforms. Also not discussed are similar experiments, for instance, gene expression changes others have seen after lamin A knockdowns or knockouts, or the effect of lamina on chromatin mobility, including telomere mobility; see, for example, a review by Roland Foisner (doi.org/10.1242/jcs.203430) on nucleoplasmic lamina. The authors need to do a thorough search of the literature and compare their results as much as possible with previous work.

      We sincerely thank the reviewer for pointing out the important body of previous work on the nucleoplasmic forms of LMNA isoforms and the impact of lamin A depletion on gene expression and chromatin mobility. In the revised version, we have now included relevant citations. Please see the highlights in the Discussion.

      The authors don't seem to make any attempt to explore the correlation of their findings with any of the previous data or correlate their observed differential gene expression with other epigenetic and chromatin features. There is no attempt to explore the direction of changes in gene expression with changes in nuclear positioning or to ask whether the genes affected are those that interact with nucleoplasmic pools of LMNA isoforms. The authors speculate that the DEG might be related to changing mechanical properties of the cells, but do not develop that further.

      We sincerely appreciate the reviewer’s insightful comments. In our revised manuscript, we have addressed this concern by comparing our telomere mobility results with previously published data (Bronshtein et al., 2015), and we observe consistent findings showing that lamin A depletion leads to increased telomere motility. Furthermore, our study provides novel evidence that nesprin-2 depletion similarly enhances telomere migration, suggesting a broader role for nuclear envelope components in chromatin dynamics.

      We acknowledge the importance of integrating gene expression data with epigenetic and chromatin features. However, to our knowledge, such datasets are currently limited for U2OS cells, particularly in the context of lamin and nesprin perturbation. We agree that understanding the correlation between differentially expressed genes and nuclear positioning or interactions with nucleoplasmic pools of LMNA isoforms is a promising direction. We are actively planning future studies that include chromatin profiling and mechanical perturbation assays to further explore these mechanisms.

      The technical concerns include: 1) Use of only one shRNA per target. Use of additional shRNAs would have reduced concern about possible off-target knockdown of other genes; 2) Use of only one cell clone per inducible shRNA construct. Here, the concern is that some of the observed changes with shRNA KDs might show clonal effects, particularly given that the cell line used is aneuploid. 3) Use of a single, "scrambled" control shRNA rather than a true scrambled shRNA for each target shRNA.

      (1) Regarding the use of a single shRNA per target, we agree that utilizing multiple independent shRNAs would strengthen the conclusions. In our study, we selected validated shRNA sequences with minimal predicted off-targets and confirmed knockdown efficiency at the mRNA level (by qPCR).

      (2) As for the use of a single cell clone per inducible construct, we understand the concern that clonal variability, particularly in an aneuploid cell line, could influence the observed phenotypes. To clarify this, we have revised the manuscript to state: “Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.” To further mitigate this concern, we ensured consistent results across biological replicates and used inducible systems to reduce variability introduced by random integration.

      (3) We also acknowledge that the use of a single scrambled shRNA control, rather than matched scrambled controls for each construct, is a limitation. While we used a standard non-targeting scrambled shRNA commonly applied in similar studies, we understand that distinct scrambled sequences might better control for construct-specific effects.

      Reviewer #1 (Recommendations for the authors):

      Please make the processed RNA-seq data available for each individual experiment, not only the raw reads and averaged data.

      In response to your suggestion, we have now included the raw count data for each individual experiment in Supplementary Table S5 to enhance transparency and reproducibility.   

      Reviewer #2 (Recommendations for the authors):

      The current text contains numerous typos, and some of the text could benefit from additional editing for clarity and conciseness. In addition, several statements, particularly in the section encompassing lines 321-329, lack supporting references.

      In our revised version, we have carefully edited the text for clarity and conciseness.

      We have included related citations for lines 321-329: “The majority of genes located within LADs tend to be transcriptionally repressed or expressed at low levels. This is because LADs are associated with heterochromatin, a tightly packed form of DNA that is generally inaccessible to the cellular machinery required for gene expression [12,23]. Lamin mutations and altered lamin levels have been shown to disrupt LAD organization and gene expression, which have been implicated in various diseases, including cancer and laminopathies [24,25].”

      The figures would benefit from better labeling, including a clear schematic of which specific regions of the LMNA and SYNE2 genes are targeted by the different shRNA constructs, and by labeling the different isoforms in Figure S1 with the common names. Furthermore, note that lamin A arises from posttranslational processing of prelamin A, not from a different transcript. Likely, the "different LMNA genes" shown in Supplementary Figure S1 are just different annotations, with the exceptions of the splice isoforms lamin C and lamin delta10.

      In the Methods, we have described the design of the corresponding shRNAs as suggested: “The shRNA designated as shLMNA targets a region within the coding sequence of LMNA that is shared by both lamin A and lamin C, corresponding to amino acids 122–129 (KKEGDLIA) of lamin A/C (RefSeq: NM_001406985.1). The shRNA against SYNE2 (shSYNE2) targets a sequence encoding amino acids 5133–5140 (KRYERTEF) of the SYNE2 protein (RefSeq: NM_182914.3).”

      For Figure S1, we have added common isoform names to the figure and caption: “lamin A (ENST00000368300.9), LMNA 227 (ENST00000675431.1), pre-lamin A/C (ENST00000676385.2), and lamin C (ENST00000677389.1).”

      Several statements about the novelty of the findings or approach are inaccurate. For example, the authors state in the introduction that "However, whether lamins and nesprins actively govern chromatin remodeling and isoform switching beyond their well-characterized functions in mechanotransduction remains an open question", as several previous studies have provided detailed characterization of lamin A/C depletion or mutations on chromatin organization, mobility, and gene expression. The authors should revise these statements and better acknowledge the previous work.

      We have added the citations of previous works and revised the text “While significant progress has been made in understanding the role of lamins in genome organization, the precise mechanisms by which lamins and nesprins regulate gene expression through distal chromatin interactions remain incompletely understood [10,11]. Notably, recent evidence suggests a reciprocal interplay between transcription and chromatin conformation, where gene activity can influence chromatin folding and vice versa [12]. However, whether lamins and nesprins actively govern chromatin remodeling and isoform switching beyond their well-characterized functions in mechanotransduction remains an open question.”

      Reviewer #3 (Recommendations for the authors):

      Overall, the manuscript might be useful as a description for reference data sets that could be of value to the community. Otherwise, I did not derive meaningful biological insights from the manuscript. It was not clear to me also how much might be repeating previous work already reported in the literature (see below). For example, I cited a review on nucleoplasmic lamins by Roland Foisner at the end of the specific comments - scanning it very quickly shows that there are already papers on increased chromatin mobility after lamin perturbations, including telomeres. I know there have also been studies of changes in gene expression after lamin A and B KD. The authors need to do a thorough search of the literature and compare their results as much as possible with previous work.

      We acknowledge that the roles of lamins in regulating chromatin dynamics and gene expression, including the effects of lamin perturbations on chromatin mobility and telomere behavior, have been previously reported. In response, we have revised the manuscript to incorporate relevant citations and to better contextualize our results within the existing literature. Importantly, to our knowledge, the finding that nesprin-2 influences telomere mobility has not been previously reported, and we have highlighted this novel observation in the revised text.

      In response, we have now conducted a more comprehensive literature review and revised the manuscript accordingly to better contextualize our findings. Specifically, we have added comparisons to prior studies reporting chromatin mobility changes following lamin A/C depletion. We also now emphasize the novel aspects of our study, such as the isoform-specific perturbations and the integration of spatial chromatin organization with transcriptomic outcomes.

      We hope these revisions strengthen the manuscript’s contribution as both a useful resource and a mechanistic investigation.

      Not even acknowledged is the previous extensive work on the nucleoplasmic forms of LMNA isoforms - I know Robert Goldman published extensively on this, implicating lamin A, for example, on DNA replication in the nuclear interior as well as transcription. More recently, Roland Foisner worked on this, including with molecular approaches. For example, a 2017 review mentions previous ChIP-seq mapping of lamin A binding to iLAD genes and also describes previous work on chromatin mobility, including telomere mobility. Yet the entire writing in the manuscript seems to only discuss the role of LMNA isoforms in the nuclear lamina per se, explaining the surprise in seeing many iLAD genes differentially expressed after KD.

      We have added related studies as suggested by the reviewer and added the following statement: “Nucleoplasmic lamins bind to chromatin and have been indicated to regulate chromatin accessibility and spatial chromatin organization [24]. Lamins in the nuclear interior regulate gene expression by dynamically binding to heterochromatic and euchromatic regions, influencing epigenetic pathways and chromatin accessibility. They also contribute to chromatin organization and may mediate mechanosignaling [25]. However, the contribution of nesprins and lamins to isoform switching and chromatin dynamics has not been fully understood [7,10,26].”

      Overall, I found a surprising lack of review and citation of previous work (see Specific comments below), including the lack of citations for various declarative statements about previous conclusions in the field about lamin A.

      (1) Introduction:

      "However, the contribution of nesprins and lamins to gene expression has not been fully understood."

      There is a literature about changes in gene expression- at least for lamin KD and KO- both in vitro and in vivo- that the authors could and should review and summarize here.

      To address this, we have now revised the manuscript to include a more comprehensive discussion of the relevant literature and added appropriate citations in the corresponding section. We hope this addition provides better context for our current findings and clarifies the contribution of lamins and nesprins to gene regulation.

      (2) Results:

      "A fragment of shRNA that targeting 3' untranslated region (UTR) in LMNA genes was chosen to knockdown lamin A (shLaminA). A fragment of shRNA that targeting coding sequence (CDS) region in LMNA genes was chosen to knockdown LMNA (shLMNA)". The authors should explain more - does one KD both lamin A and C (shLMNA), versus the other being specific to lamin A but not lamin C? It appears so from later text, but the authors should explicitly explain their targeting strategy right at the beginning to make this clear.

      To make the method clearer, we have added the text: “The shRNA against lamin A (shLaminA) targets the 3′ untranslated region (UTR) of the LMNA gene, specific to prelamin A, which is post-translationally processed into mature lamin A. The shRNA designated as shLMNA targets a region within the coding sequence of LMNA that is shared by both lamin A and lamin C, corresponding to amino acids 122–129 (KKEGDLIA) of lamin A/C (RefSeq: NM_001406985.1). The shRNA against SYNE2 (shSYNE2) targets a sequence encoding amino acids 5133–5140 (KRYERTEF) of the SYNE2 protein (RefSeq: NM_182914.3).”

      But more importantly, the convention with RNAi is to demonstrate consistent results with at least two different small RNAs. This is to rule out that a physiological result is due to the KD of a non-target gene(s) rather than the target gene. The scrambled shRNA controls are not sufficient for this as they test a general effect of the shRNA culture conditions, including transfection and dox treatment, etc, rather than a specific KD of a different gene(s) than the target due to off-target RNAi.

      We fully acknowledge the concern regarding the use of only a single shRNA per knockdown and agree that shRNAs are prone to off-target effects. However, we have conducted qPCR confirmation of key RNA-seq findings, which supports the validity of our observed results. Additionally, we recognize the importance of validating our findings using multiple independent shRNAs or alternative knockdown strategies, such as CRISPR deletion or degron-based approaches. To address this rigorously, we are currently optimizing an auxin-inducible degron system (AtAFB2) for targeted depletion of lamin C. Our preliminary data indicate approximately 40% knockdown efficiency after 16 hours of auxin induction, highlighting ongoing optimization efforts (Author response image 1). Future experiments will integrate this improved degron system and multiple independent shRNAs to further substantiate our results and rule out potential off-target effects, thereby enhancing the robustness and reproducibility of our data.

      (3) "Single-cell clones were subsequently isolated and expanded in the presence of 2 μg ml⁻¹ puromycin to establish doxycycline-inducible shRNA-knockdown stable cell lines."

      The authors need to describe explicitly in the Results how exactly they did these experiments. Did they do their analysis using a single clone from each lentivirus shRNA transduction? Did they do analysis - ie RNA-seq- on several clones from the same shRNA transduction and compare? Did they pool clones together?

      In our study, single-cell clones were isolated following lentiviral transduction with doxycycline-inducible shRNA constructs and selection with 2 μg/ml puromycin. For each shRNA, we screened multiple clones for knockdown efficiency and selected three clones exhibiting robust knockdown. We pooled these three clones, and all functional analyses, including RNA-seq, were performed on the pooled clones. We have now revised the Methods section to explicitly describe this experimental design: “Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.”

      One confounding problem is that there are clonal differences among cells cloned from a single cell line. This is particularly true for aneuploid cell lines like U2OS. Ideally, they would use mixed clones, but if not, they should at least explain what they did.

      We added the following text to the Methods: “Three single-cell clones exhibiting robust knockdown efficiency were individually expanded and subsequently pooled. The pooled clones were maintained in medium containing 2 µg ml⁻¹ puromycin to establish stable cell lines with doxycycline-inducible shRNA expression. Multiple independent clones per shRNA were screened for knockdown efficiency using reverse transcription quantitative real-time PCR (RT-qPCR). Three clones demonstrating robust and consistent knockdown were selected and expanded. These clones were subsequently pooled to minimize clonal variability and used for downstream analyses, including RNA-seq.”
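
      The response does not state how knockdown efficiency was calculated from the RT-qPCR data. As an illustration only, the sketch below assumes the widely used 2^-ΔΔCt (Livak) method with a single reference gene; the function name and all Ct values are hypothetical and are not taken from the manuscript.

```python
def knockdown_efficiency(ct_target_kd, ct_ref_kd, ct_target_ctrl, ct_ref_ctrl):
    """Fraction of target mRNA lost, estimated with the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs,
    which is the standard simplification behind this formula.
    """
    d_ct_kd = ct_target_kd - ct_ref_kd        # normalise knockdown sample to reference gene
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl  # normalise control sample to reference gene
    dd_ct = d_ct_kd - d_ct_ctrl
    relative_expression = 2 ** (-dd_ct)       # target level relative to control
    return 1.0 - relative_expression


# Hypothetical Ct values: target amplifies two cycles later after knockdown.
efficiency = knockdown_efficiency(
    ct_target_kd=26.0, ct_ref_kd=18.0,
    ct_target_ctrl=24.0, ct_ref_ctrl=18.0,
)
print(f"Estimated knockdown: {efficiency:.0%}")
```

      A two-cycle shift corresponds to a fourfold drop in transcript level, which is why the hypothetical example yields 75% knockdown.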

      (4) I am confused by their shScramble control. This is typically done for each shRNA- ie, a separate scrambled control for each of the different target shRNAs. This is because there are nucleotide composition effects, so the scrambled idea is to keep the nucleotide composition the same.

      However, looking at STable 1 and SFig. 2- shows they used a single scrambled control, thus not controlling for different nucleotide composition among the three shRNAs that they used.

      In our study, we used a single non-targeting shRNA (shScramble) as a control to account for potential effects of the shRNA vector and delivery system. This approach is commonly accepted in the field when the scrambled sequence is validated as non-targeting and does not share significant homology with the genes of interest. While we acknowledge that using separate scrambled controls matched in nucleotide composition for each targeting shRNA can further minimize sequence-dependent effects, we believe that the use of a single validated scramble control is appropriate for the scope of this study.

      (5) In Figure 2 - what is on the x-axis? Number of DEG? Please state this explicitly in the figure legend.

      We have added “Counts” as figure legend, and added the caption “Gene counts are displayed on the x-axis.”

      (6) More importantly, in Figure 2 they only show pathway analysis of DEG. They should show more: a) Fold-change of DEG displayed for all DEG; b) Same for genes in LADs vs iLADs. More explicitly, are the DEG primarily in LADs or iLADs, or a mix? Are the DEGs in LADs biased towards increased expression, as might be expected for LAD derepression? Conversely, what about iLADs - is there a bias towards increased or decreased expression?

      We agree that a more detailed characterization of the differentially expressed genes (DEGs) will strengthen the conclusions. In response, we have revised the manuscript as follows: “Furthermore, differential expression analysis revealed that the majority of DEGs following depletion of lamins and nesprins were located outside lamina-associated domains (non-LADs). Specifically, for shLaminA knockdown, 8 DEGs within LADs were downregulated and 8 were upregulated, whereas 59 non-LAD DEGs were downregulated and 79 were upregulated. For shLMNA, 7 LAD-associated DEGs were downregulated and 15 were upregulated, with 88 downregulated and 140 upregulated DEGs in non-LAD regions. In the case of shSYNE2 knockdown, 161 LAD DEGs were downregulated and 108 were upregulated, while 2,009 non-LAD DEGs were downregulated and 1,851 were upregulated (Figure 2d). These results indicate that the transcriptional changes resulting from the loss of lamins or nesprins predominantly occur at non-LAD genomic regions.”
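
      The counts quoted above can be turned into the share of DEGs that fall outside LADs for each knockdown. The short sketch below only re-tabulates the numbers given in the quoted text; the dictionary layout and function name are illustrative choices.

```python
# DEG counts quoted in the revised text, stored as (downregulated, upregulated).
deg_counts = {
    "shLaminA": {"LAD": (8, 8), "non-LAD": (59, 79)},
    "shLMNA": {"LAD": (7, 15), "non-LAD": (88, 140)},
    "shSYNE2": {"LAD": (161, 108), "non-LAD": (2009, 1851)},
}


def non_lad_fraction(counts):
    """Fraction of all DEGs for one knockdown that lie outside LADs."""
    lad_total = sum(counts["LAD"])
    non_lad_total = sum(counts["non-LAD"])
    return non_lad_total / (lad_total + non_lad_total)


for knockdown, counts in deg_counts.items():
    print(f"{knockdown}: {non_lad_fraction(counts):.1%} of DEGs lie outside LADs")
```

      These shares do not by themselves demonstrate enrichment: that comparison also needs the number of expressed genes in LADs and non-LADs, which the response does not report.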

      We appreciate the reviewer’s comments, which helped improve the clarity and depth of our analysis.

      (7) Is there a scientific rationale for the authors' focus on DE of isoforms? Is this somehow biologically meaningful and different from the overall DE of all genes? The authors should explain in the Results section what their motivation was in deciding to do this analysis.

      We have added the following statement in response to the reviewer: “To uncover transcript-specific regulatory changes, we performed isoform-level differential expression analysis. Many genes produce functionally distinct isoforms, and shifts in their usage can occur without changes in total gene expression, making isoform-level analysis essential for detecting subtle but meaningful transcriptional regulation. Our analysis demonstrated that depletion of lamins and nesprins induced significant alterations in specific transcript isoforms, indicating regulatory changes in alternative splicing or transcription initiation that are not captured by gene-level differential expression analysis.”

      (8) "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 seldom intersected with genes in LADs (Figure 4a)."

      Why was this expected? The authors have only cited one review paper. Others have seen significant numbers of genes in LADs that are DE after KD of lamina proteins. What was the fold cutoff used for DE? Was there a cutoff for the level of expression prior to KD? The authors should cite relevant primary literature showing that there are active genes in LADs and that some perturbations of the lamina proteins do result in DE of genes in LADs.

      We acknowledge the reviewer's concerns regarding our statement: "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 seldom intersected with genes in LADs (Figure 4a)." To clarify, this expectation stems from previous observations that LAD-associated genes are typically transcriptionally silent or expressed at very low levels (Guelen et al., 2008). However, dynamic changes in LADs and gene expression status do occur during cellular differentiation (Peric-Hupkes et al., 2010), and some LAD-resident genes can become active and transcriptionally responsive under specific conditions, such as T cell activation. We applied specific fold-change and baseline expression level thresholds in our analysis, as detailed in the Methods section. We added the following text to the Methods: “Differential gene expression analysis was performed using thresholds of baseMean > 50, absolute log fold change > 0.5, and p-value < 0.05.” We agree that additional relevant primary literature demonstrating active gene expression changes within LADs upon perturbation of lamina proteins should be cited, and we have added the following statement:

      “LADs exhibit dynamic reorganization and changes in gene expression during cellular differentiation [30]. Although genes within LADs are generally transcriptionally silent or expressed at low levels [31], some LAD-resident genes remain active and can be transcriptionally modulated in response to specific stimuli, such as T cell activation [32].”
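
      The DEG thresholds quoted in the response above (baseMean > 50, absolute log fold change > 0.5, p-value < 0.05) amount to a three-way filter on the differential expression results table. The sketch below is a minimal illustration, assuming column names in the style of DESeq2 output (`baseMean`, `log2FoldChange`, `pvalue`); the gene names and values are hypothetical, and the response does not confirm which software produced the table.

```python
def is_deg(row, min_base_mean=50, min_abs_lfc=0.5, max_pvalue=0.05):
    """Return True if a result row passes all three quoted thresholds."""
    return (
        row["baseMean"] > min_base_mean
        and abs(row["log2FoldChange"]) > min_abs_lfc
        and row["pvalue"] < max_pvalue
    )


# Hypothetical result rows, one per gene.
results = [
    {"gene": "GENE_A", "baseMean": 120.0, "log2FoldChange": 1.2, "pvalue": 0.001},
    {"gene": "GENE_B", "baseMean": 30.0, "log2FoldChange": 2.0, "pvalue": 0.001},  # too lowly expressed
    {"gene": "GENE_C", "baseMean": 200.0, "log2FoldChange": -0.3, "pvalue": 0.01},  # change too small
    {"gene": "GENE_D", "baseMean": 80.0, "log2FoldChange": -0.9, "pvalue": 0.2},  # not significant
    {"gene": "GENE_E", "baseMean": 75.0, "log2FoldChange": -0.8, "pvalue": 0.03},
]

degs = [row["gene"] for row in results if is_deg(row)]
print(degs)
```

      The baseMean cutoff addresses the reviewer's point that fold changes of very lowly expressed genes are sensitive to noise.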

      (9) "Expectedly, the DEGs from depletion of lamin A, LMNA, and SYNE2 were seldomly intersected with genes in LADs (Figure 4a)." I disagree with the wording of "seldom" which by definition means rarely. I don't see that this applies to the significant number of genes that are in LADs that are DE as shown in the Venn diagram, Fig. 4a. For example, this includes 57 genes for the shLamin A and ~400 genes for the shSYNE2.

      Is there anything of note about which genes are DE within LADs?

      We have rephrased the text as follows: “The Venn diagram analysis revealed limited overlap between DEGs resulting from knockdown of lamin A (shLaminA), LMNA (shLMNA), or SYNE2 (shSYNE2) and genes located within lamina-associated domains (LADs). Specifically, only a small subset of DEGs intersected with LAD-associated genes across all three knockdowns, suggesting that the majority of transcriptional changes occur outside LAD regions.” The DEGs in LADs and non-LADs are shown in Supplementary Table S4.

      (10) "The relative distance from DE genes (query features) to LADs (reference feature) is plotted by GenometriCorr package (v 1.1.24). The color depicting deviation from the expected distribution and the line indicating the density of the data at relative distance are shown." The authors should explicitly describe what the reference "expected distribution" was based on. This is all very cryptic right now, so we can't assess the biological possible significance. Third, they should clearly explain what is plotted on the x and y axes of Figure 4C. I really don't have a clue. I assume the x-axis is some measure of "relative distance" but what on earth does that mean? I really don't understand this plot, which is crucial to the whole story. What is on the y-axis? Density of DEGs? What? And they need to explain not only what is plotted on the x and y axes but also provide units.

      We have revised the text to clarify that the GenometriCorr analysis (v1.1.24) was used to assess the spatial association between differentially expressed genes (DEGs, query features) and lamina-associated domains (LADs, reference features). Specifically, this method evaluates whether the observed distances between query and reference genomic intervals significantly deviate from a null distribution generated by random permutation of query features across the genome, while preserving size and chromosomal context.

      In the revised figure legend and main text, we now clarify that the x-axis represents the relative genomic distance between each differentially expressed gene (DEG) and the nearest LAD, scaled between –1 and 1, where values near 0 indicate close proximity, and values approaching –1 or 1 reflect greater distances on either side of the LADs. The y-axis denotes the density (or proportion) of query features (DEGs) at each relative distance bin. The color gradient overlays the plot to indicate deviation from the expected null distribution (based on randomized query positions): red indicates enrichment (closer than expected), while blue indicates depletion (further than expected).

      “GenometriCorr analysis (v1.1.24) was used to assess the spatial relationship between DEGs (query) and LADs (reference) [48]. The x-axis shows the relative genomic distance between each DEG and the nearest LAD, scaled from –1 (far upstream) to 1 (far downstream), with 0 indicating closest proximity. The y-axis represents the density of DEGs at each distance bin. A color gradient indicates deviation from a randomized null distribution: red signifies enrichment (closer than expected), and blue signifies depletion. Statistical significance was determined using the Jaccard test (p < 0.05).”
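
      The idea of comparing observed DEG-to-LAD distances against a null built from randomized query positions can be sketched in a few lines. The code below is not GenometriCorr's implementation and does not reproduce its relative-distance scaling or its Jaccard test; it is a simplified, assumed stand-in that treats DEGs as points on one hypothetical chromosome and uses the mean distance to the nearest LAD as the test statistic. All coordinates and names are illustrative.

```python
import bisect
import random


def nearest_distance(pos, starts, ends):
    """Distance in bp from pos to the nearest interval (0 if pos is inside one).

    starts and ends describe sorted, non-overlapping reference intervals.
    """
    i = bisect.bisect_right(starts, pos) - 1  # last interval starting at or before pos
    best = float("inf")
    if i >= 0:
        best = 0 if pos <= ends[i] else pos - ends[i]
    if i + 1 < len(starts):
        best = min(best, starts[i + 1] - pos)
    return best


def permutation_test(query, starts, ends, chrom_len, n_perm=1000, seed=0):
    """Empirical p-value that query points lie closer to the intervals than chance."""
    rng = random.Random(seed)
    observed = sum(nearest_distance(q, starts, ends) for q in query) / len(query)
    hits = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(chrom_len) for _ in query]  # randomized query positions
        mean = sum(nearest_distance(q, starts, ends) for q in shuffled) / len(shuffled)
        if mean <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical LADs at 100-200 and 500-600 on a 1,000-bp toy chromosome.
lad_starts, lad_ends = [100, 500], [200, 600]
observed, p_value = permutation_test([150, 550, 180], lad_starts, lad_ends, chrom_len=1000)
print(observed, p_value)
```

      A small p-value here would correspond to the "closer than expected" (enrichment) case described in the legend; testing for depletion would count permutations whose mean distance is at least as large as the observed one.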

      Second, to correlate with other features and to give more meaning, the authors should show the chromosome location of the DEGs and scale this by the actual DNA sequence distances. This will be needed to correlate with other features from other studies.

      The genomic positions of DEGs have now been displayed in Figure 4b, with distances shown in base pairs to facilitate cross-reference with other features in future studies.

      Third, they should attempt some kind of analysis themselves to try to understand what might correlate with the DEGs. To begin with, they might try to correlate with lamin A ChIP-seq or other molecular proximity assays. Others in fact have shown that lamin A interacts with 5' regulatory regions of a subset of genes- presumably this is the diffuse nucleoplasmic pool of lamin A that has been studied by others in the past.

      We agree that understanding potential regulatory mechanisms underlying DEG distribution is essential. In response, we have expanded our analysis (Figure 2d) to highlight that a substantial portion of DEGs are located outside of LADs, suggesting potential regulation by the nucleoplasmic pool of lamin A. This is consistent with previous studies showing lamin A interaction with regulatory elements such as 5′ UTRs and enhancers, independent of LAD localization. We have now cited relevant literature to support this hypothesis.

      Fourth, in the table, they should go beyond just giving the fold change in expression. Particularly for genes that are expressed at very low levels, this is not particularly meaningful as it is very sensitive to noise. They should provide a metric related to levels of expression both before and after the KD.

      We acknowledge the reviewer’s concern regarding fold-change interpretation for low-abundance transcripts. To improve clarity and interpretability, we have now included Supplementary Table S4, which provides the raw counts and baseMean values (average normalized expression across all samples) for all DEGs. Additionally, we note that in our differential expression analysis, only genes with baseMean > 50 and absolute log₂ fold change > 0.5 were retained, to reduce potential noise from low-expression genes.

      (11) The figure legend and description in the Results section were completely inadequate. I had little understanding of what was being plotted. It is not sufficient to simply state the name of some software package that they used to measure "XYZ" and to show the results. It has no meaning for the average reader.

      Without some type of explanation of rationale, questions being asked, and conclusions made of biological relevance, this section made zero impact on me.

      Yes- details can be provided in the Methods. But conceptually, the methods and the conceptual underpinnings of the approach and as the question being asked and the rationale for the approach, with the significance of the results, need to be developed in the Results section.

      In response, we have revised the “Results” section to better articulate the rationale behind the analysis, the specific biological questions we aimed to address, and the conceptual relevance of the method used. We have also clarified the meaning of the plotted data and how it supports our conclusions.

      While technical details remain in the “Methods” section, we now provide a more accessible narrative in the Results to guide the reader through the approach and highlight the biological significance of our findings. We hope these revisions make the section more informative and impactful.

      (12) The telomere movement part of the manuscript seems to come out of nowhere. Why telomeres? Where are telomeres normally positioned, particularly relative to the nuclear lamina? Does this change with the KDs - particularly for those that increase motion? The MSD for SYNE2 appears unconstrained- they should explore longer delta time periods to see if it reaches a point of constrained movement.

      If the telomeres are simply tethered at the nuclear lamina, then is that the explanation- that they become untethered? But if they are not typically at the periphery, then where are they relative to other nuclear compartments? And why is there mobility changing? Is it related to the loss of nuclear lamina positioning of adjacent LAD regions to the telomeres? Is it an indirect, secondary effect? What would they see after an acute KD? What about other chromosome regions? Again, there is little explanation for the rationale for these observations. It is one of many possible experiments they could have done. Why did they do this one?

      We added the following explanations: “Although telomeres are not uniformly tethered to the nuclear lamina, they can transiently associate with the nuclear periphery, particularly during post-mitotic nuclear reassembly, through interactions involving SUN1 and RAP1 [36]. Given that lamins and nesprins are key components of the nuclear envelope that regulate chromatin organization and mechanics [37,38], we examined telomere dynamics as a proxy for changes in nuclear architecture. Using EGFP-tagged dCas9 to label telomeric regions in live U2OS cells, we assessed whether knockdown of these proteins leads to increased telomere mobility, reflecting a loss of structural constraint or altered chromatin–nuclear envelope interactions [17].” and “To probe how nuclear envelope components regulate chromatin dynamics, we tracked telomeres as a representative genomic locus whose mobility reflects changes in nuclear mechanics and chromatin organization. Although telomeres are not stably tethered to the nuclear lamina, their motion can be influenced by nuclear architecture and transient peripheral associations [36]. Upon depletion of lamin A, LMNA, or SYNE2, we observed significantly increased telomere mobility and nuclear area explored, quantified by mean square displacement and net displacement (Figure 6b–c, Supplementary Movie S1). These changes likely reflect altered chromatin–lamina interactions or disrupted nuclear mechanical constraints, consistent with prior studies showing that lamins modulate chromatin dynamics and nuclear stiffness [37,38,39]. Thus, our findings support a role for lamins and nesprins in constraining chromatin motion through nuclear structural integrity.”
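
      For readers unfamiliar with the two mobility metrics named above, the sketch below shows one standard way to compute a time-averaged mean square displacement (MSD) and the net displacement from a single 2D track. It is an assumed, generic formulation, not the authors' tracking pipeline; the function names and the example track are hypothetical.

```python
import math


def mean_square_displacement(track, lag):
    """Time-averaged MSD of one track at a given lag (in frames).

    track is a list of (x, y) positions sampled at a constant frame interval.
    """
    if not 0 < lag < len(track):
        raise ValueError("lag must be between 1 and len(track) - 1")
    squared = [
        (x2 - x1) ** 2 + (y2 - y1) ** 2
        for (x1, y1), (x2, y2) in zip(track, track[lag:])
    ]
    return sum(squared) / len(squared)


def net_displacement(track):
    """Straight-line distance between the first and last positions."""
    (x0, y0), (x1, y1) = track[0], track[-1]
    return math.hypot(x1 - x0, y1 - y0)


# Hypothetical track: a spot moving one unit per frame along x.
track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
msd_curve = [mean_square_displacement(track, lag) for lag in (1, 2, 3)]
print(msd_curve, net_displacement(track))
```

      An MSD curve that keeps rising with lag, as in this directed toy track, is what the reviewer calls unconstrained; a constrained locus would plateau at longer lags.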

      (13) "Notably, Lamin A depletion led to enrichment of pathways associated with RNA biosynthesis, supporting its previously suggested role in transcriptional activation and ribonucleotide metabolism."

      There is a literature on this. Say more and cite the references.

      Notably, lamin A depletion led to enrichment of pathways associated with RNA biosynthesis, supporting its previously suggested role in transcriptional activation and ribonucleotide metabolism [45].

      (14) "This aligns with prior studies indicating that Lamin A contributes to chromatin accessibility and RNA polymerase activity." Again, there is a literature on this. Say more and cite the references.

      This aligns with prior studies indicating that lamin A contributes to chromatin accessibility and RNA polymerase activity [46]. These findings further underscore the functional relevance of lamin A in coordinating transcriptional programs through modulation of nuclear architecture.

      (15) "In contrast, LMNA knockdown was linked to alterations in chromatin conformation." No. The authors show gene ontology and implicate perturbed RNA levels for genes implicated in "chromatin conformation". That is not the same thing as measuring chromatin conformation, which is not done, and showing changes in conformation.

      Based on the reviewer’s comment, we have revised the text as follows: “In contrast, LMNA knockdown led to differential expression of genes enriched in pathways related to chromatin organization, suggesting potential disruptions in chromatin regulatory networks. Although direct measurements of chromatin conformation were not performed, these transcriptional changes indicate that LMNA may contribute to maintaining nuclear architecture and genomic stability, which aligns with its established involvement in laminopathies and genome integrity disorders.”

      (16) "The findings that DEGs are predominantly located in non-LAD regions highlight a unique regulatory aspect of lamins and nesprins, emphasizing their spatial specificity in gene expression". Is this novel? Can the authors separate direct from indirect effects? Is the percentage of genes in LADs that are altered in expression different from the percentage of genes in iLADs that are altered in expression? There are many more active genes in iLADs, so one expects more DEGs in iLADs even if this is random. Also - how does this correlate with lamin A binding near 5' regulatory regions detected by ChIP-seq? See the following review for references to this question and also previous work on lamin A versus chromatin mobility, including telomeres. J Cell Sci (2017) 130 (13): 2087-2096. https://doi.org/10.1242/jcs.203430

      We appreciate the reviewer’s valuable comments and have revised the manuscript as follows to address them: “Furthermore, differential expression analysis revealed that the majority of DEGs following depletion of lamins and nesprins were located outside lamina-associated domains (non-LADs). Specifically, for shLaminA knockdown, 8 DEGs within LADs were downregulated and 8 were upregulated, whereas 59 non-LAD DEGs were downregulated and 79 were upregulated. For shLMNA, 7 LAD-associated DEGs were downregulated and 15 were upregulated, with 88 downregulated and 140 upregulated DEGs in non-LAD regions. In the case of shSYNE2 knockdown, 161 LAD DEGs were downregulated and 108 were upregulated, while 2,009 non-LAD DEGs were downregulated and 1,851 were upregulated (Figure 2d, Supplementary Table S4). These results indicate that the transcriptional changes resulting from the loss of lamins or nesprins predominantly occur at non-LAD genomic regions.

      The percentage of DEGs was consistently higher in non-LADs, which are gene rich and transcriptionally active, whereas LADs, known to be enriched for silent or lowly expressed genes, showed fewer expression changes. These findings are consistent with previous studies demonstrating that active genes are more prevalent in non-LADs and that LAD associated genes are generally repressed or less responsive to perturbation [27,28]. Together, these results support a model in which lamins and nesprins influence gene expression through both structural organization and promoter proximal interactions, particularly within euchromatic nuclear regions [10,26,29].”
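
      As a rough check on the counts quoted above, the share of DEGs that fall inside LADs can be tallied directly. The sketch below uses only the numbers stated in the response; the dictionary and function names are illustrative. It does not answer the reviewer's question about the fraction of all LAD genes versus all non-LAD genes that change expression, because the total number of genes in each compartment is not given in this text.

```python
# (LAD down, LAD up, non-LAD down, non-LAD up), copied from the response above.
deg_counts = {
    "shLaminA": (8, 8, 59, 79),
    "shLMNA": (7, 15, 88, 140),
    "shSYNE2": (161, 108, 2009, 1851),
}


def lad_share(lad_down, lad_up, non_lad_down, non_lad_up):
    """Return (DEGs in LADs, total DEGs, fraction of DEGs in LADs)."""
    lad = lad_down + lad_up
    total = lad + non_lad_down + non_lad_up
    return lad, total, lad / total


for name, counts in deg_counts.items():
    lad, total, share = lad_share(*counts)
    print(f"{name}: {lad}/{total} DEGs in LADs ({share:.1%})")
```

      With these counts, roughly 7 to 10% of the DEGs in each knockdown lie within LADs. Whether that differs from random expectation depends on the background gene totals per compartment, which would be needed for an enrichment test.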

  4. bafybeid7gjtxre33jbpmnzs6avjyvludaufoeubczurkrkrncisabad7w4.ipfs.localhost:8080 bafybeid7gjtxre33jbpmnzs6avjyvludaufoeubczurkrkrncisabad7w4.ipfs.localhost:8080
    1. The same morphic page

      that is the capability and the information as a unit would be different in another browser profile

      As it is, the content is available only on my machine. Need to add the ability to get a publicly shared version to show up

      with the ability for anyone to make it their own and work on it and share their version back

      In all this the common social sharing is mediated via hypothesis

      I am now using the ready availability of annotations

      even on pages that are constantly being changed

      as a way to

      write on the margins to facilitate the formulative thinking as developing IT to make it all work

    1. Author response:

      The following is the authors’ response to the original reviews

      We thank all the reviewers for their constructive comments. We have carefully considered your feedback and revised the manuscript accordingly. The major concern raised was the applicability of SegPore to the RNA004 dataset. To address this, we compared SegPore with f5c and Uncalled4 on RNA004, and found that SegPore demonstrated improved performance, as shown in Table 2 of the revised manuscript.

      Following the reviewers’ recommendations, we updated Figures 3 and 4. Additionally, we added one table and three supplementary figures to the revised manuscript:

      · Table 2: Segmentation benchmark on RNA004 data

      · Supplementary Figure S4: RNA translocation hypothesis illustrated on RNA004 data

      · Supplementary Figure S5: Illustration of Nanopolish raw signal segmentation with eventalign results

      · Supplementary Figure S6: Running time of SegPore on datasets of varying sizes

      Below, we provide a point-by-point response to your comments.

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors describe a new computational method (SegPore), which segments the raw signal from nanopore-direct RNA-Seq data to improve the identification of RNA modifications. In addition to signal segmentation, SegPore includes a Gaussian Mixture Model approach to differentiate modified and unmodified bases. SegPore uses Nanopolish to define a first segmentation, which is then refined into base and transition blocks. SegPore also includes a modification prediction model that is included in the output. The authors evaluate the segmentation in comparison to Nanopolish and Tombo, and they evaluate the impact on m6A RNA modification detection using data with known m6A sites. In comparison to existing methods, SegPore appears to improve the ability to detect m6A, suggesting that this approach could be used to improve the analysis of direct RNA-Seq data.

      Strengths:

      SegPore addresses an important problem (signal data segmentation). By refining the signal into transition and base blocks, noise appears to be reduced, leading to improved m6A identification at the site level as well as for single-read predictions. The authors provide a fully documented implementation, including a GPU version that reduces run time. The authors provide a detailed methods description, and the approach to refine segments appears to be new.

      Weaknesses:

      In addition to Nanopolish and Tombo, f5c and Uncalled4 can also be used for segmentation; however, the comparison to these methods is not shown.

      The method was only applied to data from the RNA002 direct RNA-Sequencing version, which is no longer available. It remains unclear whether the method still works on RNA004.

      Thank you for your comments.

      To clarify the background, there are two kits for Nanopore direct RNA sequencing: RNA002 (the older version) and RNA004 (the newer version). Oxford Nanopore Technologies (ONT) introduced the RNA004 kit in early 2024 and has since discontinued RNA002. Consequently, most public datasets are based on RNA002, with relatively few available for RNA004 (as of 30 June 2025).

      Nanopolish and Tombo were developed for raw signal segmentation and alignment using RNA002 data, whereas f5c and Uncalled4 are the only two tools that support RNA004 data. Since the development of SegPore began in January 2022, we initially focused on RNA002 due to its data availability. Accordingly, our original comparisons were made against Nanopolish and Tombo using RNA002 data.

      We have now updated SegPore to support RNA004 and compared its performance against f5c and Uncalled4 on three public RNA004 datasets.

      As shown in Table 2 of the revised manuscript, SegPore outperforms both f5c and Uncalled4 in raw signal segmentation. Moreover, the jiggling translocation hypothesis underlying SegPore is further supported, as shown in Supplementary Figure S4.

      The overall improvement in accuracy appears to be relatively small.

      Thank you for the comment.

      We understand that the improvements shown in Tables 1 and 2 may appear modest at first glance due to the small differences in the reported standard deviation (std) values. However, even small absolute changes in std can correspond to substantial relative reductions in noise, especially when the total variance is low.

      To better quantify the improvement, we assume that approximately 20% of the std for Nanopolish, Tombo, f5c, and Uncalled4 arises from noise. Using this assumption, we calculate the relative noise reduction rate of SegPore as follows:

      Noise reduction rate = (baseline std − SegPore std) / (0.2 × baseline std)

      Based on this formula, the average noise reduction rates across all datasets are:

      - SegPore vs Nanopolish: 49.52%

      - SegPore vs Tombo: 167.80%

      - SegPore vs f5c: 9.44%

      - SegPore vs Uncalled4: 136.70%

      These results demonstrate that SegPore can reduce the noise level by at least 9% given a noise level of 20%, which we consider a meaningful improvement for downstream tasks, such as base modification detection and signal interpretation. The high noise reduction rates observed relative to Tombo and Uncalled4 (over 100%) suggest that their actual noise proportion may be higher than our 20% assumption.

      We acknowledge that this 20% noise level assumption is an approximation. Our intention is to illustrate that SegPore provides measurable improvements in relative terms, even when absolute differences appear small.
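
      The formula above can be sketched in a few lines. The function name and the std values are hypothetical and were chosen only to illustrate the arithmetic; they are not taken from Tables 1 or 2.

```python
def noise_reduction_rate(baseline_std, segpore_std, noise_fraction=0.2):
    """Relative noise reduction, following the formula in the response.

    noise_fraction is the assumed share of the baseline std that is due to
    noise (0.2 in the authors' response).
    """
    if baseline_std <= 0 or noise_fraction <= 0:
        raise ValueError("baseline_std and noise_fraction must be positive")
    return (baseline_std - segpore_std) / (noise_fraction * baseline_std)


# Hypothetical std values, for illustration only.
baseline_std = 3.0
segpore_std = 2.7

rate = noise_reduction_rate(baseline_std, segpore_std)
print(f"noise reduction rate: {rate:.1%}")

# A reduction larger than the assumed noise share gives a rate above 100%.
print(f"larger reduction: {noise_reduction_rate(3.0, 2.0):.1%}")
```

      In this illustration, a 10% drop in std corresponds to a rate of about 50% under the 20% noise assumption. Any drop larger than 20% of the baseline std yields a rate above 100%, which is why such values point to a noise share higher than the one assumed.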

      The run time and resources that are required to run SegPore are not shown, however, it appears that the GPU version is essential, which could limit the application of this method in practice.

      Thank you for your comment.

      Detailed instructions for running SegPore are provided on GitHub (https://github.com/guangzhaocs/SegPore). Regarding computational resources, SegPore currently requires one CPU core and one Nvidia GPU to perform the segmentation task efficiently.

      We present SegPore’s runtime for typical datasets in Supplementary Figure S6 in the revised manuscript. For a typical 1 GB fast5 file, the segmentation takes approximately 9.4 hours using a single NVIDIA DGX‑1 V100 GPU and one CPU core.

      Currently, GPU acceleration is essential to achieve practical runtimes with SegPore. We acknowledge that this requirement may limit accessibility in some environments. To address this, we are actively working on a full C++ implementation of SegPore that will support CPU-only execution. While development is ongoing, we aim to release this version in a future update.

      Reviewer #2 (Public review):

      Summary:

      The work seeks to improve the detection of RNA m6A modifications using Nanopore sequencing through improvements in raw data analysis. These improvements are said to be in the segmentation of the raw data, although the work appears to position the alignment of raw data to the reference sequence and some further processing as part of the segmentation, and result statistics are mostly shown on the 'data-assigned-to-kmer' level.

      As such, the title, abstract, and introduction stating the improvement of just the 'segmentation' does not seem to match the work the manuscript actually presents, as the wording seems a bit too limited for the work involved.

      The work itself shows minor improvements in m6Anet when replacing Nanopolish eventalign with this new approach, but clear improvements in the distributions of data assigned per kmer. However, these assignments were improved well enough to enable m6A calling from them directly, both at site-level and at read-level.

      Strengths:

      A large part of the improvements shown appear to stem from the addition of extra, non-base/kmer specific, states in the segmentation/assignment of the raw data, removing a significant portion of what can be considered technical noise for further analysis. Previous methods enforced the assignment of all raw data, forcing a technically optimal alignment that may lead to suboptimal results in downstream processing as data points could be assigned to neighbouring kmers instead, while random noise that is assigned to the correct kmer may also lead to errors in modification detection.

      For an optimal alignment between the raw signal and the reference sequence, this approach may yield improvements for downstream processing using other tools.

      Additionally, the GMM used for calling the m6A modifications provides a useful, simple, and understandable logic to explain the reason a modification was called, as opposed to the black-box models that are nowadays often employed for these types of tasks.
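
      To illustrate why a Gaussian mixture gives an explainable call, here is a minimal sketch of a generic two-component, one-dimensional GMM fitted by EM in plain Python. It is not SegPore's implementation: the function names, the initialisation, the synthetic current values, and the choice of which component counts as "modified" are all assumptions made for this example.

```python
import math


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_component_gmm(values, mean_a, mean_b, n_iter=50, min_std=1e-3):
    """Fit a 1-D two-component Gaussian mixture with EM.

    Returns (weights, means, stds), each a list of length 2.
    """
    weights = [0.5, 0.5]
    means = [mean_a, mean_b]
    spread = (max(values) - min(values)) or 1.0
    stds = [spread / 4.0, spread / 4.0]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], stds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean, and std of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            stds[k] = max(math.sqrt(var), min_std)
    return weights, means, stds


def posterior_second_component(x, weights, means, stds):
    """Posterior probability that x was drawn from the second component."""
    pa = weights[0] * normal_pdf(x, means[0], stds[0])
    pb = weights[1] * normal_pdf(x, means[1], stds[1])
    total = pa + pb
    return pb / total if total > 0 else 0.5


# Synthetic event means for one k-mer (hypothetical values, arbitrary units).
unmodified = [98.5, 99.0, 99.5, 100.0, 100.5, 101.0, 101.5]
modified = [109.0, 109.5, 110.0, 110.5, 111.0]
values = unmodified + modified

weights, means, stds = fit_two_component_gmm(values, min(values), max(values))
print("means:", [round(m, 2) for m in means])
print("P(second component | 110.2):",
      round(posterior_second_component(110.2, weights, means, stds), 3))
```

      Each call can then be traced back to two fitted means, two standard deviations, and a posterior probability, which is the kind of inspectable reasoning the reviewer contrasts with neural-network callers.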

      Weaknesses:

      The work seems limited in applicability largely due to the focus on the R9's 5mer models. The R9 flow cells are phased out and not available to buy anymore. Instead, the R10 flow cells with larger kmer models are the new standard, and the applicability of this tool on such data is not shown. We may expect similar behaviour from the raw sequencing data where the noise and transition states are still helpful, but the increased kmer size introduces a large amount of extra computing required to process data and without knowledge of how SegPore scales, it is difficult to tell how useful it will really be. The discussion suggests possible accuracy improvements moving to 7mers or 9mers, but no reason why this was not attempted.

      Thank you for pointing out this important limitation. Please refer to our response to Point 1 of Reviewer 1 for SegPore’s performance on RNA004 data. Notably, the jiggling behavior is also observed in RNA004 data, and SegPore achieves better performance than both f5c and Uncalled4.

      The increased k-mer size in RNA004 affects only the training phase of SegPore (refer to Supplementary Note 1, Figure 5 for details on the training and testing phases). Once the baseline means and standard deviations for each k-mer are established, applying SegPore to RNA004 data proceeds similarly to RNA002. This is because each k-mer in the reference sequence has, at most, two states (modified and unmodified). While the larger k-mer size increases the size of the parameter table, it does not increase the computational complexity during segmentation. Although estimating the initial k-mer parameter table requires significant time and effort on our part, it does not affect the runtime for end users applying SegPore to RNA004 data.
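
      The scaling argument above can be sketched as follows: a larger k enlarges the parameter table, but the number of table lookups for a read depends on the reference length, not on the table size. The table contents below are placeholders rather than real ONT parameters, and k = 9 is used only as an example of a larger k.

```python
from itertools import product

BASES = "ACGT"


def kmer_table_size(k):
    """Number of distinct k-mers over a four-letter alphabet."""
    return len(BASES) ** k


def build_table(k, mean=0.0, std=1.0):
    """Build a k-mer -> (mean, std) table filled with placeholder values."""
    return {"".join(p): (mean, std) for p in product(BASES, repeat=k)}


def kmers(reference, k):
    """All overlapping k-mers of the reference, in order."""
    return [reference[i:i + k] for i in range(len(reference) - k + 1)]


print("5-mer table entries:", kmer_table_size(5))
print("9-mer table entries:", kmer_table_size(9))

# One dictionary lookup per reference k-mer, regardless of table size.
table5 = build_table(5)
reference = "AACTGGTTTC"
lookups = [table5[kmer] for kmer in kmers(reference, 5)]
print("lookups for a 10-base reference with k=5:", len(lookups))
```

      The table grows as 4^k, which raises the cost of estimating it once during training, while each lookup during segmentation remains a constant-time dictionary access.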

      Extending SegPore from 5-mers to 7-mers or 9-mers for RNA002 data would require substantial effort to retrain the model and generate sufficient training data. Additionally, such an extension would make SegPore’s output incompatible with widely used upstream and downstream tools such as Nanopolish and m6Anet, complicating integration and comparison. For these reasons, we leave this extension for future work.

      The manuscript suggests the eventalign results are improved compared to Nanopolish. While this is believably shown to be true (Table 1), the effect on the use case presented, downstream differentiation between modified and unmodified status on a base/kmer, is likely limited as during actual modification calling the noisy distributions are usually 'good enough', and not skewed significantly in one direction to really affect the results too terribly.

      Thank you for your comment. While current state-of-the-art (SOTA) methods perform well on benchmark datasets, there remains significant room for improvement. Most SOTA evaluations are based on limited datasets, primarily covering DRACH motifs in human and mouse transcriptomes. However, m6A modifications can also occur in non-DRACH motifs, where current models may underperform. Additionally, other RNA modifications—such as pseudouridine, inosine, and m5C—are less studied, and their detection may benefit from improved signal modeling.

      We would also like to emphasize that raw signal segmentation and RNA modification detection are distinct tasks. SegPore focuses on the former, providing a cleaner, more interpretable signal that can serve as a foundation for downstream tasks. Improved segmentation may facilitate the development of more accurate RNA modification detection algorithms by the community.

      Scientific progress often builds incrementally through targeted improvements to foundational components. We believe that enhancing signal segmentation, as SegPore does, contributes meaningfully to the broader field—the full impact will become clearer as the tool is adopted into more complex workflows.

      Furthermore, looking at alternative approaches where this kind of segmentation could be applied, Nanopolish uses the main segmentation+alignment for a first alignment and follows up with a form of targeted local realignment/HMM test for modification calling (and for training too), decreasing the need for the near-perfect segmentation+alignment this work attempts to provide. Any tool applying a similar strategy probably largely negates the problems this manuscript aims to improve upon.

      We thank the reviewer for this insightful comment.

      To clarify, Nanopolish provides three independent commands: polya, eventalign, and call-methylation.

      - The polya command identifies the adapter, poly(A) tail, and transcript region in the raw signal.

      - The eventalign command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference.

      - The call-methylation command detects methylated bases from DNA sequencing data.

      The eventalign command corresponds to “the main segmentation+alignment for a first alignment,” while call-methylation corresponds to “a form of targeted local realignment/HMM test for modification calling,” as mentioned in the reviewer’s comment. SegPore’s segmentation is similar in purpose to Nanopolish’s eventalign, while its RNA modification estimation component is similar in concept to Nanopolish’s call-methylation.

      We agree the general idea may appear similar, but the implementations are entirely different. Importantly, Nanopolish’s call-methylation is designed for DNA sequencing data, and its models are not trained to recognize RNA modifications. This means they address distinct research questions and cannot be directly compared on the same RNA modification estimation task. However, it is valid to compare them on the segmentation task, where SegPore exhibits better performance (Table 1).

      We infer the reviewer may suggest that because m6Anet is a deep neural network capable of learning from noisy input, the benefit of more accurate segmentation (such as that provided by SegPore) might be limited. This concern may arise from the limited improvement of SegPore+m6Anet over Nanopolish+m6Anet in bulk analysis (Figure 3). Several factors may contribute to this observation:

      (i) For reads aligned to the same gene in the in vivo data, alignment may be inaccurate due to pseudogenes or transcript isoforms.

      (ii) The in vivo benchmark data are inherently more complex than in vitro datasets and may contain additional modifications (e.g., m5C, m7G), which can confound m6A calling by altering the signal baselines of k-mers.

      (iii) m6Anet is trained on events produced by Nanopolish and may not be optimal for SegPore-derived events.

      (iv) The benchmark dataset lacks a modification-free (IVT) control sample, making it difficult to establish a true baseline for each k-mer.

      In the IVT data (Figure 4), SegPore shows a clear improvement in single-molecule m6A identification, with a 3~4% gain in both ROC-AUC and PR-AUC. This demonstrates SegPore’s practical benefit for applications requiring higher sensitivity at the molecule level.

      As noted earlier, SegPore’s contribution lies in denoising and improving the accuracy of raw signal segmentation, which is a foundational step in many downstream analyses. While it may not yet lead to a dramatic improvement in all applications, it already provides valuable insights into the sequencing process (e.g., cleaner signal profiles in Figure 4) and enables measurable gains in modification detection at the single-read level. We believe SegPore lays the groundwork for developing more accurate and generalizable RNA modification detection tools beyond m6A.

      We have also added the following sentence in the discussion to highlight SegPore’s limited performance in bulk analysis:

      “The limited improvement of SegPore combined with m6Anet over Nanopolish+m6Anet in bulk in vivo analysis (Figure 3) may be explained by several factors: potential alignment inaccuracies due to pseudogenes or transcript isoforms, the complexity of in vivo datasets containing additional RNA modifications (e.g., m5C, m7G) affecting signal baselines, and the fact that m6Anet is specifically trained on events produced by Nanopolish rather than SegPore. Additionally, the lack of a modification-free control (in vitro transcribed) sample in the benchmark dataset makes it difficult to establish true baselines for each k-mer. Despite these limitations, SegPore demonstrates clear improvement in single-molecule m6A identification in IVT data (Figure 4), suggesting it is particularly well suited for in vitro transcription data analysis.”

      Finally, in the segmentation/alignment comparison to Nanopolish, the latter was not fitted(/trained) on the same data but appears to use the pre-trained model it comes with. For the sake of comparing segmentation/alignment quality directly, fitting Nanopolish on the same data used for SegPore could remove the influences of using different training datasets and focus on differences stemming from the algorithm itself.

      In the segmentation benchmark (Table 1), SegPore uses the fixed 5-mer parameter table provided by ONT. The hyperparameters of the HHMM are also fixed and not estimated from the raw signal data being segmented. Only in the m6A modification task does SegPore re-estimate the baselines for the modified and unmodified states of k-mers. Therefore, the comparison with Nanopolish is fair, as both tools rely on pre-defined models during segmentation.

      Appraisal:

      The authors have shown their method's ability to identify noise in the raw signal and remove it from the segmentation and alignment, reducing its influence on further analyses. Figures directly comparing the values per kmer do show a visibly improved assignment of raw data per kmer. As a replacement for Nanopolish eventalign, it seems to have a rather limited, but positive, effect on m6Anet results. For modification calling at the single-read level, this work does appear to improve upon CHEUI.

      Impact:

      With the current developments for Nanopore-based modification largely focusing on Artificial Intelligence, Neural Networks, and the like, improvements made in interpretable approaches provide an important alternative that enables a deeper understanding of the data rather than providing a tool that plainly answers the question of whether a base is modified or not, without further explanation. The work presented is best viewed in the context of a workflow where one aims to get an optimal alignment between raw signal data and the reference base sequence for further processing. For example, as presented, as a possible replacement for Nanopolish eventalign. Here it might enable data exploration and downstream modification calling without the need for local realignments or other approaches that re-consider the distribution of raw data around the target motif, such as a 'local' Hidden Markov Model or Neural Networks. These possibilities are useful for a deeper understanding of the data and further tool development for modification detection works beyond m6A calling.

      Reviewer #3 (Public review):

      Summary:

      Nucleotide modifications are important regulators of biological function; however, until recently, their study has been limited by the availability of appropriate analytical methods. Oxford Nanopore direct RNA sequencing preserves nucleotide modifications, permitting their study; however, many nucleotide modifications lack an available base-caller to accurately identify them. Furthermore, existing tools are computationally intensive, and their results can be difficult to interpret.

      Cheng et al. present SegPore, a method designed to improve the segmentation of direct RNA sequencing data and boost the accuracy of modified base detection.

      Strengths:

      This method is well-described and has been benchmarked against a range of publicly available base callers that have been designed to detect modified nucleotides.

      Weaknesses:

      However, the manuscript has a significant drawback in its current version. The most recent nanopore RNA base callers can distinguish between different ribonucleotide modifications; however, SegPore has not been benchmarked against these models.

      I recommend re-submission of the manuscript with benchmarking against the rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0 dorado models, which are reported to detect m5C, m6A_DRACH, inosine_m6A and PseU.

      A clear demonstration that SegPore also outperforms the newer RNA base caller models will confirm the utility of this method.

      Thank you for highlighting this important limitation. While Dorado, the new ONT basecaller, is publicly available and supports modification-aware basecalling, suitable public datasets for benchmarking m5C, inosine, m6A, and PseU detection on RNA004 are currently lacking. Dorado’s modification-aware models are trained on ONT’s internal data, which is not publicly released. Therefore, it is not currently feasible to evaluate or directly compare SegPore’s performance against Dorado for m5C, inosine, m6A, and PseU detection.

      We would also like to emphasize that SegPore’s main contribution lies in raw signal segmentation, which is an upstream task in the RNA modification detection pipeline. To assess its performance in this context, we benchmarked SegPore against f5c and Uncalled4 on public RNA004 datasets for segmentation quality. Please refer to our response to Point 1 of Reviewer 1 for details.

      Our results show that the characteristic “jiggling” behavior is also observed in RNA004 data (Supplementary Figure S4), and SegPore achieves better segmentation performance than both f5c and Uncalled4 (Table 2).

      Recommendations for the authors:

      Reviewing Editor:

      Please note that we also received the following comments on the submission, which we encourage you to take into account:

      I took a look at the work and, from what I saw, it only mentions/uses RNA002 chemistry, which is deprecated, effectively making this software unusable by anyone any more, as RNA002 is not commercially available. While the results seem promising, the authors need to show that it would work for RNA004. Notably, there is an alternative software for resquiggling for RNA004 (not Tombo or Nanopolish, but the GPU-accelerated version of Nanopolish, f5c), which does support RNA004. Therefore, they need to show that SegPore works for RNA004, because otherwise it is pointless to see that this method works better than others if it does not support current sequencing chemistries and only works for deprecated chemistries, and people will keep using f5c because it's the only one that currently works for RNA004. Alternatively, if there were biological insights won from the method, one could justify not implementing it in RNA004, but in this case, RNA002 has been deprecated since March 2024, and the paper is purely methodological.

      Thank you for the comment. We agree that support for current sequencing chemistries is essential for practical utility. While SegPore was initially developed and benchmarked on RNA002 due to the availability of public data, we have now extended SegPore to support RNA004 chemistry.

      To address this concern, we performed a benchmark comparison using public RNA004 datasets against tools specifically designed for RNA004, including f5c and Uncalled4. Please refer to our response to Point 1 of Reviewer 1 for details. The results show that SegPore consistently outperforms f5c and Uncalled4 in segmentation accuracy on RNA004 data.

      Reviewer #2 (Recommendations for the authors):

      Various statements are made throughout the text that require further explanation, which might actually be defined in more detail elsewhere sometimes but are simply hard to find in the current form.

      (1) Page 2, “In this technique, five nucleotides (5mers) reside in the nanopore at a time, and each 5mer generates a characteristic current signal based on its unique sequence and chemical properties (16).”

      5mer? Still on R9 or just ignoring longer range influences, relevant? It is indeed a R9.4 model from ONT.

      Thank you for the observation. We apologize for the confusion and have clarified the relevant paragraph to indicate that the method is developed for RNA002 data by default. Specifically, we have added the following sentence:

      “Two versions of the direct RNA sequencing (DRS) kits are available: RNA002 and RNA004. Unless otherwise specified, this study focuses on RNA002 data.”

      (2) Page 3, “Employ models like Hidden Markov Models (HMM) to segment the signal, but they are prone to noise and inaccuracies.”

      That's the alignment/calling part, not the segmentation?

      Thank you for the comment. We apologize for the confusion. To clarify the distinction between segmentation and alignment, we added a new paragraph before the one in question to explain the general workflow of Nanopore DRS data analysis and to clearly define the task of segmentation. The added text reads:

      “The general workflow of Nanopore direct RNA sequencing (DRS) data analysis is as follows. First, the raw electrical signal from a read is basecalled using tools such as Guppy or Dorado, which produce the nucleotide sequence of the RNA molecule. However, these basecalled sequences do not include the precise start and end positions of each ribonucleotide (or k-mer) in the signal. Because basecalling errors are common, the sequences are typically mapped to a reference genome or transcriptome using minimap2 to recover the correct reference sequence. Next, tools such as Nanopolish and Tombo align the raw signal to the reference sequence to determine which portion of the signal corresponds to each k-mer. We define this process as the segmentation task, referred to as "eventalign" in Nanopolish. Based on this alignment, Nanopolish extracts various features—such as the start and end positions, mean, and standard deviation of the signal segment corresponding to a k-mer. This signal segment or its derived features is referred to as an "event" in Nanopolish.”

      We also revised the following paragraph describing SegPore to more clearly contrast its approach:

      “In SegPore, we first segment the raw signal into small fragments using a Hierarchical Hidden Markov Model (HHMM), where each fragment corresponds to a sub-state of a k-mer. Unlike Nanopolish and Tombo, which directly align the raw signal to the reference sequence, SegPore aligns the mean values of these small fragments to the reference. After alignment, we concatenate all fragments that map to the same k-mer into a larger segment, analogous to the "eventalign" output in Nanopolish. For RNA modification estimation, we use only the mean signal value of each reconstructed event.”

      We hope this revision clarifies the difference between segmentation and alignment in the context of our method and resolves the reviewer’s concern.
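
      The concatenation step described above can be sketched as follows. This is an illustrative reading of the text, not SegPore's code: the `Fragment` and `Event` classes, the field names, and the toy signal are assumptions, and the event mean is computed here over all samples of the fragments assigned to a k-mer.

```python
from dataclasses import dataclass
from itertools import groupby
from statistics import fmean


@dataclass
class Fragment:
    start: int       # index of the first raw sample (inclusive)
    end: int         # index one past the last raw sample (exclusive)
    kmer_index: int  # position of the aligned k-mer on the reference


@dataclass
class Event:
    kmer_index: int
    start: int
    end: int
    mean: float


def fragments_to_events(signal, fragments):
    """Merge fragments aligned to the same k-mer into one event per k-mer."""
    ordered = sorted(fragments, key=lambda f: (f.kmer_index, f.start))
    events = []
    for kmer_index, group in groupby(ordered, key=lambda f: f.kmer_index):
        group = list(group)
        # Pool the samples of every fragment assigned to this k-mer.
        samples = [x for f in group for x in signal[f.start:f.end]]
        events.append(
            Event(kmer_index, group[0].start, group[-1].end, fmean(samples))
        )
    return events


# Toy signal: two fragments map to k-mer 0, one fragment maps to k-mer 1.
signal = [100.0, 101.0, 99.0, 100.0, 80.0, 81.0, 79.0]
fragments = [Fragment(0, 2, 0), Fragment(2, 4, 0), Fragment(4, 7, 1)]

events = fragments_to_events(signal, fragments)
for event in events:
    print(event)
```

      Here the first two fragments collapse into a single event for k-mer 0 and the third becomes the event for k-mer 1, so downstream steps see one mean value per k-mer.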

      (3) Page 4, Figure 1, “These segments are then aligned with the 5mer list of the reference sequence fragment using a full/partial alignment algorithm, based on a 5mer parameter table. For example, 𝐴𝑗 denotes the base "A" at the j-th position on the reference.”

      I think I do understand the meaning, but I do not understand the relevance of the Aj bit in the last sentence. What is it used for?

      When aligning the segments (output from Step 2) to the reference sequence in Step 3, it is possible for multiple segments to align to the same k-mer. This can occur particularly when the reference contains consecutive identical bases, such as multiple adenines (A). For example, as shown in Fig. 1A, Step 3, the first two segments (μ₁ and μ₂) are aligned to the first 'A' in the reference sequence, while the third segment is aligned to the second 'A'. In this case, the reference sequence is AACTGGTTTC...GTC, which contains exactly two consecutive 'A's at the start. This notation helps to disambiguate segment alignment in regions with repeated bases.

      Additionally, this figure and its subscript include mapping with Guppy and Minimap2 but do not mention Nanopolish at all, while that seems an equally important step in the preprocessing (pg5). As such it is difficult to understand the role Nanopolish exactly plays. It's also not mentioned explicitly in the SegPore Workflow on pg15, perhaps it's part of step 1 there?

      We thank the reviewer for pointing this out. We apologize for the confusion. As mentioned in the public response to point 3 of Reviewer 2, SegPore uses Nanopolish to identify the poly(A) tail and transcript regions from the raw signal. SegPore then performs segmentation and alignment on the transcript portion only. This step is indeed part of Step 1 in the preprocessing workflow, as described in Supplementary Note 1, Section 3.

      To clarify this in the main text, we have updated the preprocessing paragraph on page 6 to explicitly describe the role of Nanopolish:

      “We begin by performing basecalling on the input fast5 file using Guppy, which converts the raw signal data into ribonucleotide sequences. Next, we align the basecalled sequences to the reference genome using Minimap2, generating a mapping between the reads and the reference sequences. Nanopolish provides two independent commands: "polya" and "eventalign". The "polya" command identifies the adapter, poly(A) tail, and transcript region in the raw signal, which we refer to as the poly(A) detection results. The raw signal segment corresponding to the poly(A) tail is used to standardize the raw signal for each read. The "eventalign" command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference. It also computes summary statistics (e.g., mean, standard deviation) from the signal segment for each k-mer. Each k-mer together with its corresponding signal features is termed an event. These event features are then passed into downstream tools such as m6Anet and CHEUI for RNA modification detection. For full transcriptome analysis (Figure 3), we extract the aligned raw signal segment and reference sequence segment from Nanopolish's events for each read by using the first and last events as start and end points. For in vitro transcription (IVT) data with a known reference sequence (Figure 4), we extract the raw signal segment corresponding to the transcript region for each input read based on Nanopolish’s poly(A) detection results.”

      Additionally, we revised the legend of Figure 1A to explicitly include Nanopolish in step 1 as follows:

      “The raw current signal fragments are paired with the corresponding reference RNA sequence fragments using Nanopolish.”

      (4) Page 5, “The output of Step 3 is the "eventalign," which is analogous to the output generated by the Nanopolish "eventalign" command.”

      Naming the function of Nanopolish, the output file, and later on (pg9) the alignment of the newly introduced methods the exact same "eventalign" is very confusing.

      Thank you for the helpful comment. We acknowledge the potential confusion caused by using the term “eventalign” in multiple contexts. To improve clarity, we now consistently use the term “events” to refer to the output of both Nanopolish and SegPore, rather than using "eventalign" as a noun. We also added the following sentence to Step 3 (page 6) to clearly define what an “event” refers to in our manuscript:

      “An "event" refers to a segment of the raw signal that is aligned to a specific k-mer on a read, along with its associated features such as start and end positions, mean current, standard deviation, and other relevant statistics.”

      We have revised the text throughout the manuscript accordingly to reduce ambiguity and ensure consistent terminology.
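      As a minimal illustration of the definition above, an event can be represented as a small record built from a slice of the raw signal. This is a sketch rather than SegPore's actual data structure: the field names, the half-open `[start, end)` indexing, and the use of the population standard deviation are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev


@dataclass
class Event:
    kmer: str    # reference k-mer the signal segment is aligned to
    start: int   # index of the first signal point (inclusive)
    end: int     # index one past the last signal point (exclusive)
    mean: float  # mean current of the segment
    std: float   # standard deviation of the segment


def make_event(signal, kmer, start, end):
    """Build an event from the signal points in [start, end)."""
    segment = signal[start:end]
    return Event(kmer, start, end, fmean(segment), pstdev(segment))


# Toy signal: the first three points are assigned to the k-mer GGACT.
signal = [100.0, 102.0, 98.0, 120.0, 121.0, 119.0]
event = make_event(signal, "GGACT", 0, 3)
```

      In this form the same record type can describe events from either Nanopolish or SegPore, which is the sense in which the term is used consistently.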

      (5) Page 5, “Once aligned, we use Nanopolish's eventalign to obtain paired raw current signal segments and the corresponding fragments of the reference sequence, providing a precise association between the raw signals and the nucleotide sequence.”

      I thought the new method's HHMM was supposed to output an 'eventalign' formatted file. As this is not clearly mentioned elsewhere, is this a mistake in writing? Is this workflow dependent on Nanopolish 'eventalign' function and output or not?

      We apologize for the confusion. To clarify, SegPore is not dependent on Nanopolish’s eventalign function for generating the final segmentation results. As described in our response to your comment point 2 and elaborated in the revised text on page 4, SegPore uses its own HHMM-based segmentation model to divide the raw signal into small fragments, each corresponding to a sub-state of a k-mer. These fragments are then aligned to the reference sequence based on their mean current values.

      As explained in the revised manuscript:

      “In SegPore, we first segment the raw signal into small fragments using a Hierarchical Hidden Markov Model (HHMM), where each fragment corresponds to a sub-state of a k-mer. Unlike Nanopolish and Tombo, which directly align the raw signal to the reference sequence, SegPore aligns the mean values of these small fragments to the reference. After alignment, we concatenate all fragments that map to the same k-mer into a larger segment, analogous to the "eventalign" output in Nanopolish. For RNA modification estimation, we use only the mean signal value of each reconstructed event.”

      To avoid ambiguity, we have also revised the sentence on page 5 to more clearly distinguish the roles of Nanopolish and SegPore in the workflow. The updated sentence now reads:

      “Nanopolish provides two independent commands: "polya" and "eventalign".
The "polya" command identifies the adapter, poly(A) tail, and transcript region in the raw signal, which we refer to as the poly(A) detection results. The raw signal segment corresponding to the poly(A) tail is used to standardize the raw signal for each read. The "eventalign" command aligns the raw signal to a reference sequence, assigning a signal segment to individual k-mers in the reference. It also computes summary statistics (e.g., mean, standard deviation) from the signal segment for each k-mer. Each k-mer together with its corresponding signal features is termed an event. These event features are then passed into downstream tools such as m6Anet and CHEUI for RNA modification detection. For full transcriptome analysis (Figure 3), we extract the aligned raw signal segment and reference sequence segment from Nanopolish's events for each read by using the first and last events as start and end points. For in vitro transcription (IVT) data with a known reference sequence (Figure 4), we extract the raw signal segment corresponding to the transcript region for each input read based on Nanopolish’s poly(A) detection results.”

      (6) Page 5, “Since the polyA tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the polyA tail are consistent across all reads.”

      Perhaps I misread this statement: I interpret it as using the PolyA tail to do the normalization, rather than using the rest of the signal to do the normalization, and that results in consistent PolyA tails across all reads.

      If it's the latter, this should be clarified, and a little detail on how the normalization is done should be added, but if my first interpretation is correct:

      I'm not sure if its standard deviation is consistent across reads. The (true) value spread in this section of a read should be fairly limited compared to the rest of the signal in the read, so the noise would influence the scale quite quickly, and such noise might be introduced to pores wearing down and other technical influences. Is this really better than using the non-PolyA tail part of the reads signal, using Median Absolute Deviation to scale for a first alignment round, then re-fitting the signal scaling using Theil Sen on the resulting alignments (assigned read signal vs reference expected signal), as Tombo/Nanopolish (can) do?

      Additionally, this kind of normalization should have been part of the Nanopolish eventalign already, can this not be re-used? If it's done differently it may result in different distributions than the ONT kmer table obtained for the next step.

      Thank you for this detailed and thoughtful comment. We apologize for the confusion. The poly(A) tail–based normalization is indeed explained in Supplementary Note 1, Section 3, but we agree that the motivation needed to be clarified in the main text.

      We have now added the following sentences in the revised manuscript (before the original statement on page 5) to provide clearer context:

      “Due to inherent variability between nanopores in the sequencing device, the baseline levels and standard deviations of k-mer signals can differ across reads, even for the same transcript. To standardize the signal for downstream analyses, we extract the raw current signal segments corresponding to the poly(A) tail of each read. Since the poly(A) tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the poly(A) tail are consistent across all reads. This step is crucial for reducing…..”

      We chose to use the poly(A) tail for normalization because it is sequence-invariant—i.e., all poly(A) tails consist of identical k-mers, unlike transcript sequences which vary in composition. In contrast, using the transcript region for normalization can introduce biases: for instance, reads with more diverse k-mers (having inherently broader signal distributions) would be forced to match the variance of reads with more uniform k-mers, potentially distorting the baseline across k-mers.
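      The idea can be sketched as a per-read linear rescaling, shown here under stated assumptions: the poly(A) boundaries are taken as given (for example from Nanopolish's poly(A) detection results), the target mean and standard deviation are arbitrary placeholders, and the function name is hypothetical. The exact procedure is the one described in Supplementary Note 1, Section 3, and may differ in detail.

```python
from statistics import fmean, pstdev


def normalize_by_polya(signal, polya_start, polya_end, target_mean, target_std):
    """Linearly rescale a read so that its poly(A) segment has the target mean and std.

    polya_start/polya_end delimit the poly(A) tail in the raw signal (half-open range).
    """
    polya = signal[polya_start:polya_end]
    mu = fmean(polya)
    sd = pstdev(polya)
    if sd == 0:
        raise ValueError("poly(A) segment has zero variance; cannot rescale")
    scale = target_std / sd
    # The same shift and scale are applied to the whole read, transcript included.
    return [(x - mu) * scale + target_mean for x in signal]


# Toy read: the first four points are the poly(A) tail, the rest is transcript.
signal = [90.0, 92.0, 88.0, 90.0, 110.0, 130.0, 120.0]
normalized = normalize_by_polya(signal, 0, 4, target_mean=100.0, target_std=2.0)
```

      Because the shift and scale are estimated only from the poly(A) segment, the k-mer composition of the transcript does not influence them.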

      In our newly added RNA004 benchmark experiment, we used the default normalization provided by f5c, which does not include poly(A) tail normalization. Despite this, SegPore was still able to mask out noise and outperform both f5c and Uncalled4, demonstrating that our segmentation method is robust to different normalization strategies.

      (7) Page 7, “The initialization of the 5mer parameter table is a critical step in SegPore's workflow. By leveraging ONT's established kmer models, we ensure that the initial estimates for unmodified 5mers are grounded in empirical data.”

      It looks like the method uses Nanopolish for a first alignment, then improves the segmentation matching the reference sequence/expected 5mer values. I thought the Nanopolish model/tables are based on the same data, or similarly obtained. If they are different, then why the switch of kmer model? Now the original alignment may have been based on other values, and thus the alignment may seem off with the expected kmer values of this table.

      Thank you for this insightful question. To clarify, SegPore uses Nanopolish only to identify the poly(A) tail and transcript regions from the raw signal. In the bulk in vivo data analysis, we use Nanopolish’s first event as the start and the last event as the end to extract the aligned raw signal chunk and its corresponding reference sequence. Since SegPore relies on Nanopolish solely to delineate the transcript region for each read, it independently aligns the raw signals to the reference sequence without refining or adjusting Nanopolish’s segmentation results.

      While SegPore's 5-mer parameter table is initially seeded using ONT’s published unmodified k-mer models, we acknowledge that empirical signal values may deviate from these reference models due to run-specific technical variation and the presence of RNA modifications. For this reason, SegPore includes a parameter re-estimation step to refine the mean and standard deviation values of each k-mer based on the current dataset.

      The re-estimation process consists of two layers. In the outer layer, we select a set of 5mers that exhibit both modified and unmodified states based on the GMM results (Section 6 of Supplementary Note 1), while the remaining 5mers are assumed to have only unmodified states. In the inner layer, we align the raw signals to the reference sequences using the 5mer parameter table estimated in the outer layer (Section 5 of Supplementary Note 1). Based on the alignment results, we update the 5mer parameter table in the outer layer. This two-layer process is generally repeated for 3 to 5 iterations until the 5mer parameter table converges. This re-estimation ensures that:

      (1) The adjusted 5mer signal baselines remain close to the ONT reference (for consistency);

      (2) The alignment score between the observed signal and the reference sequence is optimized (as detailed in Equation 11, Section 5 of Supplementary Note 1);

      (3) Only 5mers that show a clear difference between the modified and unmodified components in the GMM are considered subject to modification.

      By doing so, SegPore achieves more accurate signal alignment independent of Nanopolish’s models, and the alignment is directly tuned to the data under analysis.

      (8) Page 9, “The output of the alignment algorithm is an eventalign, which pairs the base blocks with the 5mers from the reference sequence for each read (Fig. 1C).”

      “Modification prediction

      After obtaining the eventalign results, we estimate the modification state of each motif using the 5mer parameter table.”

      This wording seems to have been introduced on page 5 but (also there) reads a bit confusingly as the name of the output format, file, and function are now named the exact same "eventalign". I assume the obtained eventalign results now refer to the output of your HHMM, and not the original Nanopolish eventalign results, based on context only, but I'd rather have a clear naming that enables more differentiation.

      We apologize for the confusion. We have revised the sentence as follows for clarity:

      “A detailed description of both alignment algorithms is provided in Supplementary Note 1. The output of the alignment algorithm is an alignment that pairs the base blocks with the 5mers from the reference sequence for each read (Fig. 1C). Base blocks aligned to the same 5-mer are concatenated into a single raw signal segment (referred to as an “event”), from which various features—such as start and end positions, mean current, and standard deviation—are extracted. Detailed derivation of the mean and standard deviation is provided in Section 5.3 in Supplementary Note 1. In the remainder of this paper, we refer to these resulting events as the output of eventalign analysis or the segmentation task. ”

      (9) Page 9, “Since a single 5mer can be aligned with multiple base blocks, we merge all aligned base blocks by calculating a weighted mean. This weighted mean represents the single base block mean aligned with the given 5mer, allowing us to estimate the modification state for each site of a read.”

      I assume the weights depend on the length of the segment but I don't think it is explicitly stated while it should be.

      Thank you for the helpful observation. To improve clarity, we have moved this explanation to the last paragraph of the previous section (see response to point 8), where we describe the segmentation process in more detail.

      Additionally, a complete explanation of how the weighted mean is computed is provided in Section 5.3 of Supplementary Note 1. It is derived from signal points that are assigned to a given 5mer.
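      For intuition only, the sketch below assumes the simplest case, in which every signal point in every base block counts equally. Under that assumption, weighting each block mean by the block length gives the same value as the mean over all pooled signal points. The function name is hypothetical, and the actual derivation in Section 5.3 may treat some points differently.

```python
from statistics import fmean


def merge_base_blocks(blocks):
    """Length-weighted mean of the base blocks aligned to one 5mer.

    blocks: list of lists of signal points, one inner list per base block.
    """
    lengths = [len(block) for block in blocks]
    means = [fmean(block) for block in blocks]
    return sum(n * m for n, m in zip(lengths, means)) / sum(lengths)


# Two base blocks of different lengths aligned to the same 5mer.
blocks = [[100.0, 102.0], [104.0, 106.0, 108.0, 110.0]]
weighted = merge_base_blocks(blocks)

# Mean over all signal points pooled together, for comparison.
pooled = fmean([x for block in blocks for x in block])
```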

      (10) Page 10, “Afterward, we manually adjust the 5mer parameter table using heuristics to ensure that the modified 5mer distribution is significantly distinct from the unmodified distribution.”

      Using what heuristics? If this is explained in the supplementary notes then please refer to the exact section.

      Thank you for pointing this out. The heuristics used to manually adjust the 5mer parameter table are indeed explained in detail in Section 7 of Supplementary Note 1.

      To clarify this in the manuscript, we have revised the sentence as follows:

      “Afterward, we manually adjust the 5mer parameter table using heuristics to ensure that the modified 5mer distribution is significantly distinct from the unmodified distribution (see details in Section 7 of Supplementary Note 1).”

      (11) Page 10, “Once the table is fixed, it is used for RNA modification estimation in the test data without further updates.”

      By what tool/algorithm? Perhaps it is your own implementation, but with the next section going into segmentation benchmarking and using Nanopolish before this seems undefined.

      Thank you for pointing this out. We use our own implementation. See Algorithm 3 in Section 6 of Supplementary Note 1.

      We have revised the sentence for clarity:

      “Once a stabilized 5mer parameter table is estimated from the training data, it is used for RNA modification estimation in the test data without further updates. A more detailed description of the GMM re-estimation process is provided in Section 6 of Supplementary Note 1.”

      (12) Page 11, “A 5mer was considered significantly modified if its read coverage exceeded 1,500 and the distance between the means of the two Gaussian components in the GMM was greater than 5.”

      Considering the scaling done before also not being very detailed in what range to expect, this cutoff doesn't provide any useful information. Is this a pA value?

      Thank you for the observation. Yes, the value refers to the current difference measured in picoamperes (pA). To clarify this, we have revised the sentence in the manuscript to include the unit explicitly:

      “A 5mer was considered significantly modified if its read coverage exceeded 1,500 and the distance between the means of the two Gaussian components in the GMM was greater than 5 picoamperes (pA).”
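      Expressed as code, the criterion is a pair of strict thresholds. The function and parameter names below are illustrative; only the two cutoffs (coverage above 1,500 and a distance above 5 pA between the GMM component means) come from the manuscript.

```python
def is_significantly_modified(coverage, mean_a_pa, mean_b_pa,
                              min_coverage=1500, min_distance_pa=5.0):
    """Return True if a 5mer passes both cutoffs.

    mean_a_pa and mean_b_pa are the means (in pA) of the two Gaussian components.
    """
    return coverage > min_coverage and abs(mean_a_pa - mean_b_pa) > min_distance_pa


passes = is_significantly_modified(2000, 125.0, 118.0)          # 7 pA apart
low_coverage = is_significantly_modified(1500, 125.0, 118.0)    # coverage not above 1,500
too_close = is_significantly_modified(2000, 123.0, 120.0)       # only 3 pA apart
```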

      (13) Page 13, “The raw current signals, as shown in Figure 1B.”

      Wrong figure? Figure 2B seems logical.

      Thank you for catching this. You are correct—the reference should be to Figure 2B, not Figure 1B. We have corrected this in the revised manuscript.

      (14) Page 14, Figure 2A, these figures supposedly support the jiggle hypothesis but the examples seem to match only half the explanation. Any of these jiggles seem to be followed shortly by another in the opposite direction, and the amplitude seems to match better within each such pair than the next or previous segments. Perhaps there is a better explanation still, and this behaviour can be modelled as such instead.

      Thank you for your comment. We acknowledge that the observed signal patterns may appear ambiguous and could potentially suggest alternative explanations. However, as shown in Figure 2A, the red dots tend to align closely with the baseline of the previous state, while the blue dots align more closely with the baseline of the next state. We interpret this as evidence for the "jiggling" hypothesis, where a k-mer temporarily oscillates between adjacent states during translocation.

      That said, we agree that more sophisticated models could be explored to better capture this behavior, and we welcome suggestions or references to alternative models. We will consider this direction in future work.

      (15) Page 15, “This occurs because subtle transitions within a base block may be mistaken for transitions between blocks, leading to inflated transition counts.”

      Is it really a "subtle transition" if it happens within a base block? It seems this is not a transition and thus shouldn't be named as such.

      Thank you for pointing this out. We agree that the term “subtle transition” may be misleading in this context. We revised the sentence to clarify the potential underlying cause of the inflated transition counts:

      “This may be due to a base block actually corresponding to a sub-state of a single 5mer, rather than each base block corresponding to a full 5mer, leading to inflated transition counts. To address this issue, SegPore’s alignment algorithm was refined to merge multiple base blocks (which may represent sub-states of the same 5mer) into a single 5mer, thereby facilitating further analysis.”

      (16) Page 15, “The SegPore "eventalign" output is similar to Nanopolish's "eventalign" command.”

      To the output of that command, I presume, not to the command itself.

      Thank you for pointing out the ambiguity. We have revised the sentence for clarity:

      “The final outputs of SegPore are the events and modification state predictions. SegPore’s events are similar to the outputs of Nanopolish’s "eventalign" command, in that they pair raw current signal segments with the corresponding RNA reference 5-mers. Each 5-mer is associated with various features — such as start and end positions, mean current, and standard deviation — derived from the paired signal segment.”

      (17) Page 15, “For selected 5mers, SegPore also provides the modification rate for each site and the modification state of that site on individual reads.”

      What selection? Just all kmers with a possible modified base or a more specific subset?

      We revised the sentence to clarify the selection criteria:

      “For selected 5mers that exhibit both a clearly unmodified and a clearly modified signal component, SegPore reports the modification rate at each site, as well as the modification state of that site on individual reads.”

      (18) Page 16, “A key component of SegPore is the 5mer parameter table, which specifies the mean and standard deviation for each 5mer in both modified and unmodified states (Figure 2A).”

      Wrong figure?

      Thank you for pointing this out. You are correct—it should be Figure 1A, not Figure 2A. We intended to visually illustrate the structure of the 5mer parameter table in Figure 1A, and we have corrected this reference in the revised manuscript.

      (19) Page 16, Table 1, I can't quite tell but I assume this is based on all kmers in the table, not just a m6A modified subset. A short added statement to make this clearer would help.

      Yes, you are right—it is averaged over all 5mers. We have revised the sentence for clarity as follows:

      “As shown in Table 1, SegPore consistently achieved the best performance averaged on all 5mers across all datasets…”

      (20) Page 16, “Since the peaks (representing modified and unmodified states) are separable for only a subset of 5mers, SegPore can provide modification parameters for these specific 5mers. For other 5mers, modification state predictions are unavailable.”

      Can this be improved using some heuristics rather than the 'distance of 5' cutoff as described before? How small or big is this subset, compared to how many there should be to cover all cases?

      We agree that more sophisticated strategies could potentially improve performance. In this study, we adopted a relatively conservative approach to minimize false positives by using a heuristic cutoff of 5 picoamperes. This value was selected empirically and we did not explore alternative cutoffs. Future work could investigate more refined or data-driven thresholding strategies.

      (21) Page 16, “Tombo used the "resquiggle" method to segment the raw signals, and we standardized the segments using the polyA tail to ensure a fair comparison.”

      I don't know what or how something is "standardized" here.

      ‘Standardized’ refers to the poly(A) tail–based signal normalization described in our response to point 6. We applied this normalization to Tombo’s output to ensure a fair comparison across methods. Without this standardization, Tombo’s performance was notably worse. We revised the sentence as follows:

      “Tombo used the "resquiggle" method to segment the raw signals, and we standardized the segments using the poly(A) tail to ensure a fair comparison (See preprocessing section in Materials and Methods).”

      (22) Page 16, “To benchmark segmentation performance, we used two key metrics: (1) the log-likelihood of the segment mean, which measures how closely the segment matches ONT's 5mer parameter table (used as ground truth), and (2) the standard deviation (std) of the segment, where a lower std indicates reduced noise and better segmentation quality. If the raw signal segment aligns correctly with the corresponding 5mer, its mean should closely match ONT's reference, yielding a high log-likelihood. A lower std of the segment reflects less noise and better performance overall.”

      Here the segmentation part becomes a bit odd:

      A: Low std can be/is achieved by dropping any noisy bits, making segments really small (partly what happens here with the transition segments). This may be 'true' here, in the sense that the transition is not really part of the segment, but the comparison table is a bit meaningless as the other tools forcibly assign all data to kmers, instead of ignoring parts as transition states. In other words, it is a benchmark that is easy to cheat by assigning more data to noise/transition states.

      B: The values shown are influenced by the alignment made between the read and expected reference signal. Especially Tombo tends to forcibly assign data to whatever looks the most similar nearby rather than providing the correct alignment. So the "benchmark of the segmentation performance" is more of an "overall benchmark of the raw signal alignment". Which is still a good, useful thing, but the text seems to suggest something else.

      Thank you for raising these important concerns regarding the segmentation benchmarking.

      Regarding point A, the base blocks aligned to the same 5mer are concatenated into a single segment, including the short transition blocks between them. These transition blocks are typically very short (4~10 signal points, average 6 points), while a typical 5mer segment contains around 20~60 signal points. To assess whether SegPore’s performance is inflated by excluding transition segments, we conducted an additional comparison: we removed 6 boundary signal points (3 from the start and 3 from the end) from each 5mer segment in Nanopolish and Tombo’s results to reduce potential noise. The new comparison table is shown in the following:

      SegPore consistently demonstrates superior performance. Its key contribution lies in its ability to recognize structured noise in the raw signal and to derive more accurate mean and standard deviation values that more faithfully represent the true state of the k-mer in the pore. The improved mean estimates are evidenced by the clearly separated peaks of modified and unmodified 5mers in Figures 3A and 4B, while the improved standard deviation is reflected in the segmentation benchmark experiments.
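      The two metrics and the boundary-trimming control can be sketched as follows. This is an illustration under assumptions, not the benchmark code: the log-likelihood is taken to be a Gaussian log-density of the segment mean under the reference mean and standard deviation, the reference standard deviation of 2.0 pA is a placeholder, and the function names are hypothetical. The reference mean of 123.83 pA is the unmodified GGACT value cited in this response.

```python
import math
from statistics import fmean, pstdev


def gaussian_loglik(x, mu, sigma):
    """Log-density of x under a normal distribution N(mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def segment_metrics(segment, ref_mean, ref_std, trim=0):
    """Return (log-likelihood of the segment mean, segment std).

    trim signal points are dropped from each end before computing the metrics.
    """
    core = segment[trim:len(segment) - trim]
    if len(core) < 2:
        raise ValueError("segment too short after trimming")
    return gaussian_loglik(fmean(core), ref_mean, ref_std), pstdev(core)


# Toy segment with noisy boundary points at both ends.
segment = [130.0, 128.0, 126.0, 124.0, 123.0, 124.0, 125.0, 124.0, 118.0, 116.0, 114.0]
ll_raw, std_raw = segment_metrics(segment, ref_mean=123.83, ref_std=2.0)
ll_trim, std_trim = segment_metrics(segment, ref_mean=123.83, ref_std=2.0, trim=3)
```

      In this toy case, trimming three points from each end raises the log-likelihood and lowers the standard deviation, which is why the trimmed comparison is the fairer control for Nanopolish and Tombo.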

      Regarding point B, we apologize for the confusion. We have added a new paragraph to the introduction to clarify that the segmentation task indeed includes the alignment step.

      “The general workflow of Nanopore direct RNA sequencing (DRS) data analysis is as follows. First, the raw electrical signal from a read is basecalled using tools such as Guppy or Dorado, which produce the nucleotide sequence of the RNA molecule. However, these basecalled sequences do not include the precise start and end positions of each ribonucleotide (or k-mer) in the signal. Because basecalling errors are common, the sequences are typically mapped to a reference genome or transcriptome using minimap2 to recover the correct reference sequence. Next, tools such as Nanopolish and Tombo align the raw signal to the reference sequence to determine which portion of the signal corresponds to each k-mer. We define this process as the segmentation task, referred to as "eventalign" in Nanopolish. Based on this alignment, Nanopolish extracts various features—such as the start and end positions, mean, and standard deviation of the signal segment corresponding to a k-mer. This signal segment or its derived features is referred to as an "event" in Nanopolish. The resulting events serve as input for downstream RNA modification detection tools such as m6Anet and CHEUI.”

      (23) Page 17 “Given the comparable methods and input data requirements, we benchmarked SegPore against several baseline tools, including Tombo, MINES (26), Nanom6A (27), m6Anet, Epinano (28), and CHEUI (29).”

      It seems m6Anet is actually Nanopolish+m6Anet in Figure 3C, this needs a minor clarification here.

      m6Anet uses Nanopolish’s estimated events as input by default.

      (24) Page 18, Figure 3, A and B are figures without any indication of what is on the axis and from the text I believe the position next to each other on the x-axis rather than overlapping is meaningless, while their spread is relevant, as we're looking at the distribution of raw values for this 5mer. The figure as is is rather confusing.

      Thanks for pointing out the confusion. We have added concrete values to the axes in Figures 3A and 3B and revised the figure legend as follows in the manuscript:

      “(A) Histogram of the estimated mean from current signals mapped to an example m6A-modified genomic location (chr10:128548315, GGACT) across all reads in the training data, comparing Nanopolish (left) and SegPore (right). The x-axis represents current in picoamperes (pA).

      (B) Histogram of the estimated mean from current signals mapped to the GGACT motif at all annotated m6A-modified genomic locations in the training data, again comparing Nanopolish (left) and SegPore (right). The x-axis represents current in picoamperes (pA).”

      (25) Page 18 “SegPore's results show a more pronounced bimodal distribution in the raw signal segment mean, indicating clearer separation of modified and unmodified signals.”

      Without knowing the correct values around the target kmer (like Figure 4B), just the more defined bimodal distribution could also indicate the (wrongful) assignment of neighbouring kmer values to this kmer instead, hence this statement lacks some needed support, this is just one interpretation of the possible reasons.

      Thank you for the comment. We have added concrete values to Figures 3A and 3B to support this point. Both peaks fall within a reasonable range: the unmodified peak (125 pA) is approximately 1.17 pA away from its reference value of 123.83 pA, and the modified peak (118 pA) is around 7 pA away from the unmodified peak. This shift is consistent with expected signal changes due to RNA modifications (usually less than 10 pA), and the magnitude of the difference suggests that the observed bimodality is more likely caused by true modification events rather than misalignment.

      (26) Page 18 “Furthermore, when pooling all reads mapped to m6A-modified locations at the GGACT motif, SegPore showed prominent peaks (Fig. 3B), suggesting reduced noise and improved modification detection.”

      I don't think the prominent peaks directly suggest improved detection, this statement is a tad overreaching.

      We revised the sentence to the following:

      “SegPore exhibited more distinct peaks (Fig. 3B), indicating reduced noise and potentially enabling more reliable modification detection”.

      (27) Page18 “(2) direct m6A predictions from SegPore's Gaussian Mixture Model (GMM), which is limited to the six selected 5mers.”

      The 'six selected' refers to what exactly? Also, 'why' this is limited to them is also unclear as it is, and it probably would become clearer if it is clearly defined what this refers to.

      This is explained on page 16 of the original manuscript, in the description of SegPore’s workflow, as follows:

      “A key component of SegPore is the 5mer parameter table, which specifies the mean and standard deviation for each 5mer in both modified and unmodified states (Fig. 1A). Since the peaks (representing modified and unmodified states) are separable for only a subset of 5mers, SegPore can provide modification parameters for these specific 5mers. For other 5mers, modification state predictions are unavailable.”

      We select a small set of 5mers that show clear peaks (modified and unmodified 5mers) in the GMM in the m6A site-level data analysis. These 5mers are provided in Supplementary Fig. S2C, as explained in the section “m6A site level benchmark” in the Materials and Methods (page 12 in the original manuscript).

      “…transcript locations into genomic coordinates. It is important to note that the 5mer parameter table was not re-estimated for the test data. Instead, modification states for each read were directly estimated using the fixed 5mer parameter table. Due to the differences between human (Supplementary Fig. S2A) and mouse (Supplementary Fig. S2B), only six 5mers were found to have m6A annotations in the test data’s ground truth (Supplementary Fig. S2C). For a genomic location to be identified as a true m6A modification site, it had to correspond to one of these six common 5mers and have a read coverage of greater than 20. SegPore derived the ROC and PR curves for benchmarking based on the modification rate at each genomic location….”

      We have updated the sentence as follows to increase clarity:

      “which is limited to the six selected 5mers that exhibit clearly separable modified and unmodified components in the GMM (see Materials and Methods for details).”
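      The site-level filter described in the quoted Methods passage can be sketched as below. The sketch assumes that coverage is the number of reads at a site and that the modification rate is the fraction of reads called modified. The set of allowed 5mers is passed in by the caller, because the six 5mers are listed only in Supplementary Fig. S2C; `GGACT` is used here purely as an example, and all names are hypothetical.

```python
def modification_rate(read_states):
    """Fraction of reads called modified at one site (True = modified)."""
    return sum(read_states) / len(read_states)


def benchmark_sites(sites, allowed_kmers, min_coverage=20):
    """Keep sites whose 5mer is allowed and whose read coverage exceeds min_coverage.

    sites maps a location label to (5mer, per-read modification states).
    Returns a dict from location label to modification rate.
    """
    rates = {}
    for location, (kmer, read_states) in sites.items():
        if kmer in allowed_kmers and len(read_states) > min_coverage:
            rates[location] = modification_rate(read_states)
    return rates


sites = {
    "site_a": ("GGACT", [True] * 10 + [False] * 15),  # 25 reads: kept
    "site_b": ("GGACT", [True] * 5 + [False] * 15),   # 20 reads: coverage not above 20
    "site_c": ("AAAAA", [True] * 30),                 # 5mer not in the allowed set
}
rates = benchmark_sites(sites, allowed_kmers={"GGACT"})
```

      The resulting per-site rates are the scores from which ROC and PR curves would then be derived.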

      (28) Page 19, Figure 4C, the blue 'Unmapped' needs further explanation. If this means the segmentation+alignment resulted in simply not assigning any segment to a kmer, this would indicate issues in the resulting mapping between raw data and kmers as the data that probably belonged to this kmer is likely mapped to a neighbouring kmer, possibly introducing a bimodal distribution there.

      This is due to a deletion event in the full alignment algorithm. See page 8 of Supplementary Note 1:

      During the traceback step of the dynamic programming matrix, not every 5mer in the reference sequence is assigned a corresponding raw signal fragment—particularly when the signal’s mean deviates substantially from the expected mean of that 5mer. In such cases, the algorithm considers the segment to be generated by an unknown 5mer, and the corresponding reference 5mer is marked as unmapped.

      (29) Page 19, “For six selected m6A motifs, SegPore achieved an ROC AUC of 82.7% and a PR AUC of 38.7%, earning the third-best performance compared with deep leaning methods m6Anet and CHEUI (Fig. 3D).”

      How was this selection of motifs made, are these related to the six 5mers in the middle of Supplementary Figure S2? Are these the same six as on page 18? This is not clear to me.

      Yes, these are the same six 5mers; see the response to point 27.

      (30) Page 21 “Biclustering reveals that modifications at the 6th, 7th, and 8th genomic locations are specific to certain clusters of reads (clusters 4, 5, and 6), while the first five genomic locations show similar modification patterns across all reads.”

      This reads rather confusingly. Both the '6th, 7th, and 8th genomic locations' and 'clusters 4,5,6' should be referred to in clearer terms. Either mark them in the figure as such or name them in the text by something that directly matches the text in the figure.

      We have added labels to the clusters and genomic locations in Figure 4C, and revised the sentence as follows:

      “Biclustering reveals that modifications at g6 are specific to cluster C4, g7 to cluster C5, and g8 to cluster C6, while the first five genomic locations (g1 to g5) show similar modification patterns across all reads.”

      (31) Page 21, “We developed a segmentation algorithm that leverages the jiggling property in the physical process of DRS, resulting in cleaner current signals for m6A identification at both the site and single-molecule levels.”

      Leverages, or just 'takes into account'?

      We designed our HHMM specifically based on the jiggling hypothesis, so we believe that using the term “leverage” is appropriate.

      (32) Page 21, “Our results show that m6Anet achieves superior performance, driven by SegPore's enhanced segmentation.”

      Superior in what way? It barely improves over Nanopolish in Figure 3C and is outperformed by other methods in Figure 3D. The segmentation may have improved but this statement says something is 'superior' driven by that 'enhanced segmentation', so that cannot refer to the segmentation itself.

      We have revised it as follows in the revised manuscript:

      “Our results demonstrate that SegPore’s segmentation enables clear differentiation between m6A-modified and unmodified adenosines.”

      (33) Page 21, “In SegPore, we assume a drastic change between two consecutive 5mers, which may hold for 5mers with large difference in their current baselines but may not hold for those with small difference.”

      The implications of this assumption don't seem highlighted enough in the work itself and may be cause for falsely discovering bi-modal distributions. What happens if such a 5mer isn't properly split, is there no recovery algorithm later on to resolve these cases?

      We agree that there is a risk of misalignment, which can result in a falsely observed bimodal distribution. This is a known and largely unavoidable issue across all methods, including deep neural network–based methods. For example, many of these models rely on a CTC (Connectionist Temporal Classification) layer, which implicitly performs alignment and may also suffer from similar issues.

      Misalignment is more likely when the current baselines of neighboring k-mers are close. In such cases, the model may struggle to confidently distinguish between adjacent k-mers, increasing the chance that signals from neighboring k-mers are incorrectly assigned. Accurate baseline estimation for each k-mer is therefore critical—when baselines are accurate, the correct alignment typically corresponds to the maximum likelihood.

      We have added the following sentence to the discussion to acknowledge this limitation:

      “As with other RNA modification estimation methods, SegPore can be affected by misalignment errors, particularly when the baseline signals of adjacent k-mers are similar. These cases may lead to spurious bimodal signal distributions and require careful interpretation.”

      (34) Page 21, “Currently, SegPore models only the modification state of the central nucleotide within the 5mer. However, modifications at other positions may also affect the signal, as shown in Figure 4B. Therefore, introducing multiple states to the 5mer could help to improve the performance of the model.”

      The meaning of this statement is unclear to me. Is SegPore unable to combine the information of overlapping kmers around a possibly modified base (central nucleotide), or is this referring to having multiple possible modifications in a single kmer (multiple states)?

      We mean there can be modifications at multiple positions of a single 5mer, e.g. C m5C m6A m7G T. We have revised the sentence to:

      “Therefore, introducing multiple states for a 5mer to account for modifications at multiple positions within the same 5mer could help to improve the performance of the model.”

      (35) Page 22, “This causes a problem when apply DNN-based methods to new dataset without short read sequencing-based ground truth. Human could not confidently judge if a predicted m6A modification is a real m6A modification.”

      Grammatical errors in both these sentences. For the 'Human could not' part, is this referring to a single person's attempt or more extensively tested?

      Thanks for the comment. We have revised the sentence as follows:

      “This poses a challenge when applying DNN-based methods to new datasets without short-read sequencing-based ground truth. In such cases, it is difficult for researchers to confidently determine whether a predicted m6A modification is genuine (see Supplementary Figure S5).”

      (36) Page 22, “…which is easier for human to interpret if a predicted m6A site is real.”

      "a" human, but also this probably meant to say 'whether' instead of 'if', or 'makes it easier'.

      Thanks for the advice. We have revised the sentence as follows:

      “One can generally observe a clear difference in the intensity levels between 5mers with an m6A and those with a normal adenosine, which makes it easier for a researcher to interpret whether a predicted m6A site is genuine.”

      (37) Page 22, “…and noise reduction through its GMM-based approach…”

      Is the GMM providing noise reduction or segmentation?

      We agree that this statement is not relevant. We have removed the following sentence from the revised manuscript:

      “Although SegPore provides clear interpretability and noise reduction through its GMM-based approach, there is potential to explore DNN-based models that can directly leverage SegPore's segmentation results.”

      (38) Page 23, “SegPore effectively reduces noise in the raw signal, leading to improved m6A identification at both site and single-molecule levels…”

      Without further explanation in what sense this is meant, 'reduces noise' seems to overreach the abilities, and looks more like 'masking out'.

      Following the reviewer’s suggestion, we have changed it to “masks out” in the revised manuscript:

      “SegPore effectively masks out noise in the raw signal, leading to improved m6A identification at both site and single-molecule levels.”

      Reviewer #3 (Recommendations for the authors):

      I recommend the publication of this manuscript, provided that the following comments (and the comments above) are addressed.

      In general, the authors state that SegPore represents an improvement on existing software. These statements are largely unquantified, which erodes their credibility. I have specified several of these in the Minor comments section.

      Page 5, Preprocessing: The authors comment that the poly(A) tail provides a stable reference that is crucial for the normalisation of all reads. How would this step handle reads that have variable poly(A) tail lengths? Or have interrupted poly(A) tails (e.g. in the case of mRNA vaccines that employ a linker sequence)?

      We apologize for the confusion. The poly(A) tail–based normalization is explained in Supplementary Note 1, Section 3.

      As shown in Author response image 1 below, the poly(A) tail produces a characteristic signal pattern—a relatively flat, squiggly horizontal line. Due to variability between nanopores, raw current signals often exhibit baseline shifts and scaling of standard deviations. This means that the signal may be shifted up or down along the y-axis and stretched or compressed in scale.

      Author response image 1.

      The normalization remains robust with variable poly(A) tail lengths, as long as the poly(A) region is sufficiently long. The linker sequence will be assigned to the adapter part rather than the poly(A) part.

      To improve clarity in the revised manuscript, we have added the following explanation:

      “Due to inherent variability between nanopores in the sequencing device, the baseline levels and standard deviations of k-mer signals can differ across reads, even for the same transcript. To standardize the signal for downstream analyses, we extract the raw current signal segments corresponding to the poly(A) tail of each read. Since the poly(A) tail provides a stable reference, we normalize the raw current signals across reads, ensuring that the mean and standard deviation of the poly(A) tail are consistent across all reads. This step is crucial for reducing…..”

      We chose to use the poly(A) tail for normalization because it is sequence-invariant—i.e., all poly(A) tails consist of identical k-mers, unlike transcript sequences which vary in composition. In contrast, using the transcript region for normalization can introduce biases: for instance, reads with more diverse k-mers (having inherently broader signal distributions) would be forced to match the variance of reads with more uniform k-mers, potentially distorting the baseline across k-mers.
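
      The normalization described above can be sketched as follows. This is a minimal illustration rather than SegPore's actual implementation: the function name `normalize_read`, the poly(A) index arguments, and the target mean and standard deviation (0 and 1) are assumptions made for the example.

```python
import numpy as np

def normalize_read(signal, polya_start, polya_end, target_mean=0.0, target_std=1.0):
    """Shift and scale one read so that its poly(A) segment has the target
    mean and standard deviation (hypothetical targets, chosen for illustration)."""
    signal = np.asarray(signal, dtype=float)
    polya = signal[polya_start:polya_end]  # samples assigned to the poly(A) tail
    mu = polya.mean()
    sigma = polya.std()
    if sigma == 0:
        raise ValueError("poly(A) segment has zero variance; cannot rescale")
    # The same affine transform is applied to the whole read, not only the tail.
    return (signal - mu) / sigma * target_std + target_mean

# Two reads of the same molecule seen through pores with different offset and scale.
base = np.array([1.0, 1.2, 0.9, 1.1, 3.0, 2.5, 4.0])  # first four samples: poly(A)
read_a = normalize_read(2.0 * base + 10.0, 0, 4)
read_b = normalize_read(0.5 * base - 3.0, 0, 4)
```

      Because each read is corrected by its own poly(A) statistics, a per-pore shift and positive rescaling cancel out, so `read_a` and `read_b` coincide after normalization.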

      Page 7, 5mer parameter table: r9.4_180mv_70bps_5mer_RNA is an older kmer model (>2 years). How does your method perform with the newer RNA kmer models that do permit the detection of multiple ribonucleotide modifications? Addressing this comment is crucial because it is feasible that SegPore will underperform in comparison to the newer RNA base caller models (requiring the use of RNA004 datasets).

      Thank you for highlighting this important point. For RNA004, we have updated SegPore to ensure compatibility with the latest kit. In our revised manuscript, we demonstrate that the translocation-based segmentation hypothesis remains valid for RNA004, as supported by new analyses presented in the supplementary Figure S4.

      Additionally, we performed a new benchmark against f5c and Uncalled4 on RNA004 data in the revised manuscript (Table 2), where SegPore exhibits better performance than both tools.

      We agree that benchmarking against the latest Dorado models—specifically rna004_130bps_hac@v5.1.0 and rna004_130bps_sup@v5.1.0, which include built-in modification detection capabilities—would provide valuable context for evaluating the utility of SegPore. However, generating a comprehensive k-mer parameter table for RNA004 requires a large, well-characterized dataset. At present, such data are limited in the public domain. Additionally, Dorado is developed by ONT and its internal training data have not been released, making direct comparisons difficult.

      Our current focus is on improving raw signal segmentation quality, an upstream task critical to many downstream analyses, including RNA modification detection. Future work may include benchmarking SegPore against models like Dorado once appropriate data become available.

      The Methods and Results sections contain redundant information - please streamline the information in these sections and reduce the redundancy. For example, the benchmarking section may be better situated in the Results section.

      Following your advice, we have removed redundant texts about the Segmentation benchmark from Materials and Methods in the revised manuscript.

      Minor comments

      (1) Introduction

      Page 3: "By incorporating these dynamics into its segmentation algorithm...". Please provide an example of how motor protein dynamics can impact RNA translocation. In particular, please elaborate on why motor protein dynamics would impact the translocation of modified ribonucleotides differently to canonical ribonucleotides. This is provided in the results, but please also include details in the Introduction.

      Following your advice, we added one sentence to the revised manuscript to explain how the motor protein affects the translocation of the DNA/RNA molecule.

      “This observation is also supported by previous reports, in which the helicase (the motor protein) translocates the DNA strand through the nanopore in a back-and-forth manner. Depending on ATP or ADP binding, the motor protein may translocate the DNA/RNA forward or backward by 0.5-1 nucleotides.”

      As far as we understand, this translocation mechanism is not specific to modified or unmodified nucleotides. For further details, we refer the reviewer to the original studies cited.

      Page 3: "This lack of interpretability can be problematic when applying these methods to new datasets, as researchers may struggle to trust the predictions without a clear understanding of how the results were generated." Please provide details and citations as to why researchers would struggle to trust the predictions of m6Anet. Is it due to a lack of understanding of how the method works, or an empirically demonstrated lack of reliability?

      Thank you for pointing this out. The lack of interpretability in deep learning models such as m6Anet stems primarily from their “black-box” nature—they provide binary predictions (modified or unmodified) without offering clear reasoning or evidence for each call.

      When we examined the corresponding raw signals, we found it difficult to visually distinguish whether a signal segment originated from a modified or unmodified ribonucleotide. The difference is often too subtle to be judged reliably by a human observer. This is illustrated in the newly added Supplementary Figure S5, which shows Nanopolish-aligned raw signals for the central 5mer GGACT in Figure 4B, displayed both uncolored and colored by modification state (according to the ground truth).

      Although deep neural networks can learn subtle, high-dimensional patterns in the signal that may not be readily interpretable, this opacity makes it difficult for researchers to trust the predictions—especially in new datasets where no ground truth is available. The issue is not necessarily an empirically demonstrated lack of reliability, but rather a lack of transparency and interpretability.

      We have updated the manuscript accordingly and included Supplementary Figure S5 to illustrate the difficulty in interpreting signal differences between modified and unmodified states.

      Page 3: "Instead of relying on complex, opaque features...". Please provide evidence that the research community finds the figures generated by m6Anet to be difficult to interpret, or delete the sections relating to its perceived lack of usability.

      See the figure provided in the response to the previous point. We added a reference to this figure in the revised manuscript.

      “Instead of relying on complex, opaque features (see Supplementary Figure S5), SegPore leverages baseline current levels to distinguish between…..”

      (2) Materials and Methods

      Page 5, Preprocessing: "We begin by performing basecalling on the input fast5 file using Guppy, which converts the raw signal data into base sequences.". Please change "base" to ribonucleotide.

      Revised as requested.

      Page 5 and throughout, please refer to poly(A) tail, rather than polyA tail throughout.

      Revised as requested.

      Page 5, Signal segmentation via hierarchical Hidden Markov model: "...providing more precise estimates of the mean and variance for each base block, which are crucial for downstream analyses such as RNA modification prediction." Please specify which method your HHMM method improves upon.

      Thank you for the suggestion. Since this section does not include a direct comparison, we revised the sentence to avoid unsupported claims. The updated sentence now reads:

      "...providing more precise estimates of the mean and variance for each base block, which are crucial for downstream analyses such as RNA modification prediction."

      Page 10, GMM for 5mer parameter table re-estimation: "Typically, the process is repeated three to five times until the 5mer parameter table stabilizes." How is the stabilisation of the 5mer parameter table quantified? What is a reasonable cut-off that would demonstrate adequate stabilisation of the 5mer parameter table?

      Thank you for the comment. We assess the stabilization of the 5mer parameter table by monitoring the change in baseline values across iterations. If the absolute change in baseline values for all 5mers is less than 1e-5 between two consecutive iterations, we consider the estimation to have stabilized.
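
      The stopping rule above can be written as a short check. This is a sketch under stated assumptions: the 5mer parameter table is represented as a plain dictionary from 5mer to baseline value, and the name `has_stabilized` is hypothetical.

```python
def has_stabilized(prev_table, new_table, tol=1e-5):
    """Return True if every 5mer baseline changed by less than `tol`
    between two consecutive iterations (tables: dict of 5mer -> baseline)."""
    return all(abs(new_table[kmer] - prev_table[kmer]) < tol for kmer in prev_table)

# Toy tables with made-up baseline values, for illustration only.
converged = has_stabilized({"GGACT": 100.0}, {"GGACT": 100.000001})
not_converged = has_stabilized(
    {"GGACT": 100.0, "AAACA": 90.0},
    {"GGACT": 100.0, "AAACA": 90.1},
)
```

      The criterion takes the maximum change over all 5mers, so a single 5mer that is still moving is enough to trigger another iteration.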

      Page 11, M6A site level benchmark: why were these datasets selected? Specifically, why compare human and mouse ribonucleotide modification profiles? Please provide a justification and a brief description of the experiments that these data were derived from, and why they are appropriate for benchmarking SegPore.

      Thank you for the comment. The datasets used were taken from a previous benchmark study on m6A estimation using RNA002 data (https://doi.org/10.1038/s41467-023-37596-5). These datasets include human and mouse transcriptomes and have been widely used to evaluate the performance of RNA modification detection tools. We selected them because (i) they are based on RNA002 chemistry, which matches the primary focus of our study, and (ii) they provide a well-characterized and consistent benchmark for assessing m6A detection performance. Therefore, we believe they are appropriate for validating SegPore.

      (3) Results

      Page 13, RNA translocation hypothesis: "The raw current signals, as shown in Fig. 1B...". Please check/correct figure reference - Figure 1B does not show raw current signals.

      Thank you for pointing this out. The correct reference should be Figure 2B. We have updated the figure citation accordingly in the revised manuscript.

      Page 19, m6A identification at the site level: "For six selected m6A motifs, SegPore achieved an ROC AUC of 82.7% and a PR AUC of 38.7%, earning the third best performance compared with deep leaning methods m6Anet and CHEUI (Fig. 3D)." SegPore performs third best of all deep learning methods. Do the authors recommend its use in conjunction with m6Anet for m6A detection? Please clarify in the text.

      This sentence aims to convey that SegPore alone can already achieve good performance. If interpretability is the primary goal, we recommend using SegPore on its own. However, if the objective is to identify more potential m6A sites, we suggest using the combined approach of SegPore and m6Anet. That said, we have chosen not to make explicit recommendations in the main text to avoid oversimplifying the decision or potentially misleading readers.

      Page 19, m6A identification at the single molecule level: "one transcribed with m6A and the other with normal adenosine". I assume that this should be adenine? Please replace adenosine with adenine throughout.

      Thank you for pointing this out. We have revised the sentence to use "adenine" where appropriate. In other instances, we retain "adenosine" when referring specifically to adenine bound to a ribose sugar, which we believe is suitable in those contexts.

      Page 19, m6A identification at the single molecule level: "We used 60% of the data for training and 40% for testing". How many reads were used for training and how many for testing? Please comment on why these are appropriate sizes for training and testing datasets.

      In total, there are 1.9 million reads, with 1.14 million used for training and 0.76 million for testing (60% and 40%, respectively). We chose this split to ensure that the training set is sufficiently large to reliably estimate model parameters, while the test set remains substantial enough to robustly evaluate model performance. Although the ratio was selected somewhat arbitrarily, it balances the need for effective training with rigorous validation.

      (4) Discussion

      Page 21: "We believe that the de-noised current signals will be beneficial for other downstream tasks." Which tasks? Please list an example.

      We have revised the text for clarity as follows:

      “We believe that the de-noised current signals will be beneficial for other downstream tasks, such as the estimation of m5C, pseudouridine, and other RNA modifications.”

      Page 22: "One can generally observe a clear difference in the intensity levels between 5mers with a m6A and normal adenosine, which is easier for human to interpret if a predicted m6A site is real." This statement is vague and requires qualification. Please reference a study that demonstrates the human ability to interpret two similar graphs, and demonstrate how it relates to the differences observed in your data.

      We apologize for the confusion. We have revised the sentence as follows:

      “One can generally observe a clear difference in the intensity levels between 5mers with an m6A and those with a normal adenosine, which makes it easier for a researcher to interpret whether a predicted m6A site is genuine.”

      We believe that Figures 3A, 3B, and 4B effectively illustrate this concept.

      Page 23: How long does SegPore take for its analyses compared to other similar tools? How long would it take to analyse a typical dataset?

      We have added run-time statistics for datasets of varying sizes in the revised manuscript (see Supplementary Figure S6). This figure illustrates SegPore’s performance across different data volumes to help estimate typical processing times.

      (5) Figures

      Figure 4C. Please number the hierarchical clusters and genomic locations in this figure. They are referenced in the text.

      Following your suggestion, we have labeled the hierarchical clusters and genomic locations in Figure 4C in the revised manuscript.

      In addition, we revised the corresponding sentence in the main text as follows: “Biclustering reveals that modifications at g6 are specific to cluster C4, g7 to cluster C5, and g8 to cluster C6, while the first five genomic locations (g1 to g5) show similar modification patterns across all reads.”

    1. Author response:

      The following is the authors’ response to the original reviews

      Recommendations for the Authors:

      Reviewer #1:

      We think that this manuscript makes an important contribution that will be of interest in the areas of statistical physics, (microbiota) ecology, and (biological) data science. The evidence for the results is solid and the work improves the state of the art in terms of methods. We have a few concerns that, in our opinion, the authors should address.

      Major concerns:

      (1) While the paper could be of interest for the broad audience of e-Life, the way it is written is accessible mainly to physicists. We encourage the authors to take the broad audience into account by i) explaining better the essence of what is being done at each step, ii) highlighting the relevance of the method compared to other methods, iii) discussing the ecological implications of the results.

      Examples on how to approach i) include: Modify or expand Figure 1 so that non-familiar readers can understand the summary of the work (e.g. with cartoons representing communities, diseased states and bacterial interactions and their relationship with the inference method); in each section, summarize at the beginning the purpose of what is going to be addressed in this section, and summarize at the end what the section has achieved; in Figure 2, replace symbols by their meaning as much as possible-the same for Figure 1, at the very least in the figure caption.

      Example on how to approach ii): Since the authors aim to establish a bridge between disordered systems and microbiome ecology, it could be useful to expand a bit the introduction on disordered systems for biologists/biophysicists. This could be done with an additional text box, which could also highlight the advantages of this approach in comparison to other techniques (e.g. model-free approaches can also classify healthy and diseased states).

      Example on how to approach iii): The authors could discuss with more depth the ecological implications of their results. For example, do they have a hypothesis on why demographic and neutral effects could dominate in healthy patients?

      We thank the reviewer for the observations. Following the suggestion, each section of the revised version now opens by outlining what it will address and closes by summarizing what has been achieved. We also updated Figure 1 and Figure 2.

      (i) For Figure 1, we expanded and hopefully made clearer how we conceptualize the problem, use the data, and establish our method. In Figure 2, we enriched the y labels of each panel with the name associated with the order parameter.

      (ii) We thank the reviewer for helping us improve the readability of the introductory part, thus providing more insights into disordered systems techniques for a broader audience. We have added a few explanations at the end of page 2 to explain the advantages of this methodology compared to other strategies and models.

      (iii) We thank the reviewer for raising the need for a more in-depth ecological discussion of our results. A simple way to understand why neutral effects may dominate in healthy patients is the following. Neutrality implies that species differences are mainly shaped by stochastic processes such as demographic noise, with species treated as different realizations of the same underlying stochastic ecological dynamics. In our analysis, we observe that healthy individuals tend to exhibit highly similar microbial communities, suggesting that the compositional variability among their microbiomes is compatible, at least in part, with the fluctuations expected from demographic stochasticity alone. In contrast, patients with the disease display significantly more heterogeneous microbial compositions. The diversity and structure of their gut communities cannot be satisfactorily explained by neutral demographic fluctuations alone.

      This discrepancy implies that additional deterministic forces—such as altered ecological interactions—are driving the divergence observed in dysbiotic states. In diseased individuals, the breakdown of such interactions leads to a structurally distinct regime that may correspond to a phase of marginal stability, as indicated by our theoretical modeling. This shift marks a transition from a community governed by neutrality and demographic noise to one dominated by non-neutral ecological forces (as depicted in Figure 4). We added these comments in the discussion section of the revised manuscript.

      (2) Taking into account the broader audience, we invite the authors to edit the abstract, as it seems to jump from one ecological concept to another without explicitly communicating what is the link between these concepts. From the first two sentences, the motivation seems to be species diversity, but no mention of diversity comes after the second sentence. There is no proper introduction/definition of what macroecological states are. After that, the authors switch to healthy and unhealthy states, without previously introducing any link between gut microbiota states and the host’s health (which perhaps could be good in the first or second sentence, although other framings can be as valid). After that, interactions appear in the text and are related to instability, but the reader might not know whether this is surprising or if healthy/unhealthy states are generally related to stability.

      We pointed out a few examples, but the authors could extend their revision on i), ii) and iii) beyond such specific comments. In our opinion, this would really benefit the paper.

      In response to the reviewer’s concern about conceptual clarity and structure, we substantially revised the abstract to improve its accessibility and logical flow. In the revised abstract, we now clearly link species diversity to microbiome structure and function from the outset, addressing initial confusion. We provide a concise definition of “macroecological states,” framing them as reproducible statistical patterns reflecting community-level properties. Additionally, the revised version explicitly connects gut microbiome states to host health earlier, resolving the previous abrupt shift in focus. Finally, we conclude by highlighting how disordered systems theory advances our understanding of microbiome stability and functioning, reinforcing the novelty and broader significance of our approach. Overall, the revised abstract better serves a broad interdisciplinary audience, including readers unfamiliar with the technicalities of disordered systems or microbial ecology, while preserving the scientific depth and accuracy of our work.

      (3) The connection with consumer-resource (CR) models is quite unusual. In Equation (12), why do the authors assume that the consumption term does not depend on R? This should be addressed, since this term is usually dependent on R in microbial ecology models.

      In case this is helpful, it is known that the symmetric Lotka-Volterra model emerges from time-scale separation in the MacArthur model, where resources reproduce logistically and are consumed by other species (e.g., plants eaten by herbivores). Consumer-resource models form a broad category, while the MacArthur model is a specific case featuring logistic resource growth. For microbes, a more meaningful justification of the generalized Lotka-Volterra (GLV) model from a consumer-resource perspective involves the consumer-resource dynamics in a chemostat, where time-scale separation is assumed and higher-order interactions are neglected. See, for example: a) The classic paper by MacArthur: R. MacArthur. Species packing and competitive equilibrium for many species. Theoretical Population Biology, 1(1):1-11, 1970. b) Recent works on time-scale separation in chemostat consumer-resource models: Anna Posfai et al., PRL, 2017 Sireci et al., PNAS, 2023 Akshit Goyal et al., PRX-Life, 2025

      We thank the reviewer for the observation. We apologize for the typo in the main text, which we have now corrected. The consumer-resource model we had in mind is the classical case proposed by MacArthur, where resources are self-regulated according to a logistic growth mechanism, which leads to the generalized Lotka-Volterra model we employ in our work.
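
      For readers unfamiliar with this reduction, the standard textbook argument can be sketched as follows. The notation is illustrative and is not taken from the manuscript's Equation (12): N_i are consumer abundances, R_a resource abundances, c_{ia} consumption rates, w_a resource values, m_i maintenance costs, and r_a, K_a the logistic growth rate and carrying capacity of resource a.

```latex
% MacArthur consumer-resource dynamics (illustrative notation)
\begin{align}
\frac{dN_i}{dt} &= N_i\Big(\sum_a w_a c_{ia} R_a - m_i\Big), \\
\frac{dR_a}{dt} &= r_a R_a\Big(1 - \frac{R_a}{K_a}\Big) - R_a \sum_j c_{ja} N_j .
\end{align}
% Time-scale separation: resources relax quickly, so set dR_a/dt = 0 with R_a > 0
\begin{equation}
R_a^{*} = K_a\Big(1 - \frac{1}{r_a}\sum_j c_{ja} N_j\Big).
\end{equation}
% Substituting R_a^* into the consumer equation gives a Lotka-Volterra form
\begin{equation}
\frac{dN_i}{dt} = N_i\Big(g_i - \sum_j A_{ij} N_j\Big), \qquad
g_i = \sum_a w_a c_{ia} K_a - m_i, \qquad
A_{ij} = \sum_a \frac{w_a K_a}{r_a}\, c_{ia} c_{ja} = A_{ji}.
\end{equation}
```

      The resulting interaction matrix is symmetric by construction, and the reduction holds only while all R_a^{*} remain positive.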

      Minor concerns:

      (1) The title has a nice pun for statistical physicists, but we wonder if it can be a bit confusing for the broader audience of e-Life. Although we leave this to the author’s decision, we’d recommend considering changing the title, making it more explicit in communicating the main contribution/result of the work.

      Following the reviewer’s suggestion, we have introduced an explanatory subtitle: “Linking Species Interactions to Dysbiosis through a Disordered Lotka-Volterra Framework”.

      (2) Review the references - some preprints might have already been published: Pasqualini J. 2023, Sireci 2022, Wu 2021.

      We thank the reviewer for drawing our attention to this inaccuracy. We updated the references to the Pasqualini and Sireci papers. To our knowledge, Wu’s paper has appeared as an arXiv preprint only.

      (3) Species do not generally exhibit identical carrying capacities (see Grilli, Nat. Commun., 2020); some taxa are generally more abundant than others. The authors could discuss whether the model, with the inferred parameters, can accurately reproduce the distribution of species’ mean abundances.

      We thank the reviewer for this insightful comment. As discussed in the revised manuscript (lines 294–299), our current model does not accurately reproduce the empirical species abundance distribution (SAD). This limitation stems from the assumption of constant carrying capacities across species. Empirical observations (e.g., Grilli, Nat. Commun., 2020 [1]) show heterogeneous mean abundances, often following power-law or log-normal distributions, whereas our model yields SADs devoid of fat tails, which diverge from empirical data.

      This simplification is made to maintain the analytical tractability of the disordered generalized Lotka-Volterra (dGLV) framework, a common approach also found in prior works such as Bunin (2017) and Barbier et al. (2018) [2, 3]. Introducing heterogeneity in carrying capacities, such as drawing them from a log-normal distribution, or switching to multiplicative (rather than demographic) noise, could indeed produce SADs that better align with empirical data. Nevertheless, implementing these changes would significantly complicate the analytical treatment.

      We acknowledge these directions as promising avenues for future research. They could help enhance the empirical realism of the model and its capacity to capture observed macroecological patterns, while posing new theoretical challenges for the analysis of disordered systems.

      (4) A substantial number of cited works (Grilli, Nat. Commun., 2020; Zaoli & Grilli, Science Advances, 2021; Sireci et al., PNAS, 2023; Po-Yi Ho et al., eLife, 2022) suggest that environmental fluctuations play a crucial role in shaping microbiome composition and dynamics. Is the authors’ analysis consistent with this perspective? Do they expect their conclusions to remain robust if environmental fluctuations are introduced?

      We thank the reviewer for stressing this point. The introduction of environmental fluctuations in the model formally violates detailed balance, thereby preventing the definition of an energy function. To date, no study has integrated random interactions together with both demographic and environmental noise within a unified analytical framework. This is certainly a highly promising direction that some of the authors are already exploring. However, given the inherently out-of-equilibrium nature of the system and the absence of a free energy, we would need to adopt a Dynamical Mean-Field Theory formalism and analyze the corresponding stationary equations, which must be solved self-consistently. We have nonetheless added a brief note on this point in the Discussion section.

      (5) The term “order parameters“ may not be intuitive for a biological audience. In any case, the authors should explicitly define each order parameter when first introduced.

      We thank the reviewer for the comment. We now name each order parameter when it is first introduced and give a brief explanation of its meaning that should be accessible to an audience with a biological background.

      (6) Line 242: Should ψU be ψD?

      We thank the reviewer for the observation. We corrected the typo.

      (7) Given that the authors are discussing healthy and diseased states and to avoid confusion, the authors could perhaps use another word for ’pathological’ when they refer to dynamical regimes (e.g., in Appendix 2: ’letting the system enter the pathological regime of unbounded growth’).

      We thank the reviewer for the helpful comment. As suggested, we used the term “unphysical” instead of “pathological” where needed.

      Reviewer #2:

      (1) A technical point that I could not understand is how the authors deal with compositional data. One reason for my confusion is that the order parameters h and q0 are fixed in data to 1/S and 1/S<sup>2</sup>, and thus I do not see how they can be informative. Same for carrying capacity, why is it not 1 if considering relative abundance?

      We thank the reviewer for raising this point. We acknowledge that the treatment of compositional data and the interpretation of order parameters h and q0 were not sufficiently clarified in the manuscript. Additionally, there was an imprecision in the text regarding the interpretation of these parameters.

      As defined in revised Eq. (4) of the manuscript, h and q0 are to be averaged over the entire dataset, summing across samples α. Specifically, h = ⟨1/S<sub>α</sub>⟩<sub>α</sub> and q0 = ⟨1/S<sub>α</sub><sup>2</sup>⟩<sub>α</sub>, where S<sub>α</sub> is the number of species present in sample α and ⟨·⟩<sub>α</sub> is the average over samples. These parameters are therefore informative, as they encapsulate sample-level ecological diversity, and their variation reflects biological differences between healthy and diseased states. For instance, Pasqualini et al., 2024 [4] reported significant differences in these metrics between health conditions, thereby supporting their ecological relevance.

      Regarding carrying capacities, we clarify that although we work with relative abundance data (i.e., compositional data), we do not fix the carrying capacity K to 1. Instead, we set K to the maximum value of x<sub>i</sub> (relative abundance) within each sample, to preserve compatibility with empirical data and allow for coexistence. While this remains a modeling assumption, it ensures better ecological realism within the constraints of the disordered GLV framework.
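
      As a minimal illustration of why these quantities remain informative for compositional data, the sketch below computes them from a toy table of relative abundances. It assumes that each sample contributes 1/S<sub>α</sub> to h and 1/S<sub>α</sub><sup>2</sup> to q0 before averaging over samples, and that K is taken as the largest relative abundance in each sample. The function name and the toy data are hypothetical.

```python
def order_parameters(samples):
    """Compute (h, q0, K) from a list of samples.

    samples: list of lists of relative abundances, each summing to 1.
    Assumption (labelled as such): h and q0 are sample averages of
    1/S_alpha and 1/S_alpha**2, with S_alpha the number of species present.
    K is returned per sample as the maximum relative abundance.
    """
    inv_s, inv_s2, k_per_sample = [], [], []
    for x in samples:
        present = [v for v in x if v > 0]   # species detected in this sample
        s_alpha = len(present)
        inv_s.append(1.0 / s_alpha)
        inv_s2.append(1.0 / s_alpha ** 2)
        k_per_sample.append(max(present))
    n = len(samples)
    return sum(inv_s) / n, sum(inv_s2) / n, k_per_sample


# Two toy samples with different richness (2 and 4 species present).
toy = [[0.5, 0.5, 0.0, 0.0], [0.25, 0.25, 0.25, 0.25]]
h, q0, K = order_parameters(toy)
```

      Because richness differs between samples, the averages depend on the dataset rather than being fixed by normalization alone.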

      (2) Obviously I’m missing something, so it would be nice to clarify in simple terms the logic of the argument. I understand that Lagrange multipliers are going to be used in the model analysis, and there are a lot of technical arguments presented in the paper, but I would like a much more intuitive explanation about the way the data can be used to infer order parameters if those are fixed by definition in compositional data.

      We thank the reviewer for the observation. The order parameters can be measured directly from the data, even in the presence of compositionality, as explained above. We can connect those parameters with the theory even for compositional data, because the only effect of adding the compositionality constraint is to shift the linear coefficient in the Hamiltonian, which corresponds to shifting the average interaction µ. However, the resulting phase diagram is mostly affected by the variance of the interactions σ<sup>2</sup> (as µ is such that we are in the bounded phase).

      (3) Another point that I did not understand comes from the fact that the authors claim that interaction variance is smaller in unhealthy microbiomes. Yet they also find that those are closer to instability, and are more driven by niche processes. I would have expected the opposite to be true, more variance in the interactions leading to instability (as in May’s original paper for instance). Is this apparent paradox explained by covariations in demographic stochasticity (T) and immigration rate (lambda)? If so, I think it would be very useful to comment on that.

      As Altieri and coworkers showed in their PRL (2021) [5], the phase diagram of our model differs fundamentally from that of Biroli et al. (2018) [6]. In the latter, the intuitive rule – greater interaction variance yields greater instability – indeed holds. For the sake of clarity, we have attached below the resulting phase diagram obtained by Altieri et al.

      The apparent paradox arises because the two phase diagrams are tuned by different parameters. Consequently, even at low temperature and with weak interaction variance, our system may sit nearer to the replica-symmetry-breaking (RSB) line.

      Fig. 3 in the main text is not a (σ,T) phase diagram in which all other parameters are kept constant. Rather, it is a plot of the σ and T parameters inferred from the data (without showing the corresponding µ).

      To capture the full, non-trivial influence of all parameters on stability, we studied the so-called “replicon eigenvalue” in the RS (i.e. single equilibrium) approximation. This leading eigenvalue measures how close a given set of inferred parameters – and hence a microbiome – is to the RSB threshold. For a visual representation of these findings, refer to Figure 4.

      Author response image 1.

      (4) What do the empirical SAD look like? It would be nice to see the actual data and how the theoretical SADs compare.

      The empirical species abundance distributions (SADs) analyzed in our study are presented and discussed in detail in Pasqualini et al., 2024 [4]. Given the overlap in content, we chose not to reproduce these figures in the current manuscript to avoid redundancy.

      As we also clarify in the revised text, the theoretical SADs derived from the disordered generalized Lotka-Volterra (dGLV) model in the unique-fixed-point phase typically exhibit exponential tails. These distributions do not match the heavier-tailed patterns (e.g., log-normal or power-law-like) observed in empirical microbiome data. This discrepancy stems from the simplifying assumptions of the dGLV framework, including the use of constant carrying capacities and demographic noise.

      In the revised manuscript, we have added a brief discussion to explicitly acknowledge this limitation and emphasize it as a direction for future refinement of the model, such as incorporating heterogeneous carrying capacities or exploring alternative noise structures.

      (5) Some typos: often “niche” is written “nice”.

      We thank the reviewer for this suggestion. After inspecting the text, we corrected the reported typos.

      Reviewer #3:

      Major comments:

      (1) In the S3 text, the authors say that filtered metagenomic reads were processed using the software Kaiju. The description of the pipeline does not mention how core genes were selected, which is often a crucial step in determining the abundance of a species in a metagenomic sample. In addition, the senior author of this manuscript has published a version of Kaiju that leverages marker genes classification methods (deemed Core-Kaiju), but it was not used for either this manuscript or Pasqualini et al. (2014; Tovo et al., 2020). I am not suggesting that the data necessarily needs to be reprocessed, but it would be useful to know how core genes were chosen in Pasqualini et al. and why Core-Kaiju was not used (2014).

      Prior to the current manuscript and the PLOS Computational Biology paper by Pasqualini et al. [4], we applied the Core-Kaiju protocol to the same dataset used in both studies. However, this tool was originally developed and validated using general catalogs of culturable organisms and was not specifically tuned for gut microbiomes. As a result, we found that in many samples Core-Kaiju retained only very few species (in some samples, the number of identified species was as low as 5–10), undermining the reliability of the analysis. Due to these limitations, we opted to use the standard Kaiju version in our work. We are actively developing an improved version of the Core-Kaiju protocol that will overcome these limitations, and preliminary results (not shown here) indicate that the patterns we report are robust in this case as well.

      (2) My understanding of Pasqualini et al. was that diseased patients experienced larger fluctuations in abundance, while in this study, they had smaller fluctuations (Figure 3a; 2024). Is this a discrepancy between the two models or is there a more nuanced interpretation?

      We thank the reviewer for the observation. This is only an apparent discrepancy, as the term fluctuation has different meanings in the two contexts. The fluctuations referred to by the reviewer correspond to a parameter of our theory, namely the noise in the interactions. Conversely, in Pasqualini et al., σ indicates environmental fluctuations. There is therefore no conceptual discrepancy between the results: in both studies, unhealthy microbiomes were found to be less stable. Indeed, Fig. 4 of the present study shows that unhealthy microbiomes lie closer to the RSB line, a phenomenon that is also associated with enhanced fluctuations.

      (3) Line 38-41: It would be helpful to explicitly state what “interaction patterns” are being referenced here. The final sentence could also be clarified. Do microbiomes “host“ interactions or are they better described as a property (“have”, “harbor”). The word “host” may confuse some readers since it is often used to refer to the human host. I am also not sure what point is being made by “expected to govern natural ones”. There are interactions between members of a microbiome; experimental studies have characterized some of these interactions, which we expect to relate in some way to interactions in nature. Is this what the authors are saying?

      Thanks. We agree that this sentence was not clear. Indeed, we are referring to pairwise species interactions and not to host-microbiome interactions. We have rewritten this part in the following way: In fact, recent work shows that the network-level properties of species-species interactions —for example, the sign balance, average strength, and connectivity of the inferred interaction matrix— shift systematically between healthy and dysbiotic gut communities (see for instance, [7, 8]). Pairwise species interactions have been quantified in simplified in-vitro consortia [9, 10]; we assume that the same classes of interactions also operate—albeit in a more complex form—in the native gut microbiome.

      (4) Line 43: I appreciate that the authors separated neutral vs. logistic models here.

      (5) Lines 51-75: The framing here is well-written and convincing. Network inference is an ongoing, active subject in ecology, and there is an unfortunate focus on inferring every individual interaction because ecologists with biology backgrounds are not trained to think about the problem in the language of statistical physics.

      We thank the reviewer for these positive comments.

      (6) Line 87: Perhaps I’m missing something obvious, but I don’t see how ρi sets the intrinsic timescale of the dynamics when its units are 1/(time*individuals), assuming the dimensions of ri are inverse time.

      We thank the reviewer for the observation. We corrected this phrase in the main text.

      (7) Lines 189-190: “as close as possible to the data” it would aid the reader if you specified the criteria meant by this statement.

      We thank the reviewer for the observation. We removed the sentence, as it introduced some redundancy in our argument. The proposed method is described in detail in the subsequent text.

      (8) Line 198: It would aid the reader if you provided some context for what the T - σ plane represents.

      We thank the referee for the helpful suggestion. We have clarified the interplay between the demographic noise amplitude and the strength of the random interaction matrix, as theoretically predicted in the PRL (2021) by Altieri and coworkers [5]. Please find an additional paragraph on page 6 of the resubmitted version.

      (9) Line 217: Specifying what is meant by “internal modes“ would aid the typical life science reader.

      We thank the reviewer for the suggestion. Recognizing that referring to “internal modes” to describe the SAD shape in that context might cause confusion, we replaced “internal modes“ with “peaks”.

      (10) Line 219: Some additional justification and clarification are needed here, as some may think of “m“ as being biomass.

      We added a sentence to better explain this concept. “In classical and quantum field theory, the particle-particle interaction embedded in the quadratic term is typically referred to as a mass source. In the context of this study, m captures quadratic fluctuations of species abundances, as also appearing in the expression of the leading eigenvalue of the stability matrix.”

      Minor comments:

      (1) I commend the authors for removing metagenomic reads that mapped to the human genome in the preprocessing stage of their pipeline. This may seem like an obvious pre-processing step, but it is unfortunately not always implemented.

      We thank the referee for pointing out this potential issue. The data used in this work, as well as the bioinformatic workflow used to generate them, have been described in detail in Pasqualini et al., 2024 [4]. As one of the main preprocessing steps, we remove reads mapping to the human genome.

      (2) Line 13: “Bacterial“ excludes archaea, and while you may not have many high-abundance archaea in your human gut data, this sentence does not specify the human gut. Usually, this exclusion is averted via the term “microbial“, though sometimes researchers raise objections to the term when the data does not include fungal members (e.g., all 16S studies).

      We thank the reviewer for this suggestion. To include archaeal organisms, we now use the term “microbial” instead of “bacterial”.

      (3) Line 18: This manuscript is being submitted under the “Physics of Living Systems“ tract, but it may be useful to explicitly state in the Abstract that disordered systems are a useful approach for understanding large, complex communities for the benefit of life science researchers coming from a biology background.

      Thank you. We have modified the abstract following this suggestion.

      (4) Line 68: Consider using “adapted“ or something similar instead of “mutated“ if there is no specific reason for that word choice.

      We thank the reviewer for this suggestion, which was implemented in the text.

      (5) Line 111: It would be useful to define annealed and quenched for a general life science audience.

      We thank the reviewer for this suggestion. In the “Results” section, we have opted for “time-dependent disordered interactions” to reach a broader audience and avoid any jargon. Moreover, in the Discussion we added a detailed footnote: “In contrast to the quenched approximation, the annealed version assumes that the random couplings are not fixed but instead fluctuate over time, with their covariance governed by independent Ornstein–Uhlenbeck processes.”
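
      To give a concrete picture of a time-dependent coupling of this kind, the following minimal sketch simulates a single coupling that follows an Ornstein-Uhlenbeck process using an Euler-Maruyama step. The parameter names (mu, sigma, tau) and values are assumptions made for illustration; the construction used in the manuscript may differ in detail.

```python
import math
import random


def simulate_ou_coupling(mu, sigma, tau, dt, n_steps, x0, seed=0):
    """Euler-Maruyama path of one Ornstein-Uhlenbeck coupling.

    mu: long-time mean; sigma: stationary standard deviation;
    tau: correlation time. All names are illustrative assumptions.
    SDE: dx = (mu - x) / tau * dt + sigma * sqrt(2 / tau) * dW.
    """
    rng = random.Random(seed)
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        drift = (mu - x) / tau * dt                      # relaxation toward mu
        noise = sigma * math.sqrt(2.0 * dt / tau) * rng.gauss(0.0, 1.0)
        path.append(x + drift + noise)
    return path


# One coupling fluctuating around mu = 0 with correlation time tau = 1.
alpha_t = simulate_ou_coupling(mu=0.0, sigma=0.5, tau=1.0, dt=0.01, n_steps=1000, x0=0.0)
```

      In the quenched case the same coupling would instead be drawn once and held fixed for the whole trajectory.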

      (6) Line 124: Likewise for the replicon sector.

      We thank the reviewer for the suggestion. We added a footnote on page 4, after the formula, to highlight the physical intuition behind the introduction of the replicon mode.

      “The replicon eigenvalue refers to a particular type of fluctuation around the saddle-point (mean-field) solution within the replica framework. When the Hessian matrix of the replicated free energy is diagonalized, fluctuations are divided into three sectors: longitudinal, anomalous, and replicon. The replicon mode is the most sensitive to criticality signaling – by its vanishing trend – the emergence of many nearly-degenerate states. It essentially describes how ‘soft’ the system is to microscopic rearrangements in configuration space.”

      (7) Figure 2: It would be helpful to include y-axis labels for each order parameter alongside the mathematical notation.

      We thank the reviewer for this suggestion. The y-axis of Figure 2 now includes, alongside the mathematical symbol, a label for each represented quantity.

      (8) Line 242: Subscript “U” is used to denote “Unhealthy” microbiomes, but “D” is used to denote “Diseased” in Figs. 2 and 3 (perhaps elsewhere as well).

      We thank the reviewer for this observation. After checking the subscripts throughout the text, we standardized our notation for consistency with Figures 2 and 3, adopting the subscript “D” for symbols related to the diseased/unhealthy condition.

      (9) Line 283: “not to“ should be “not due to“

      We thank the reviewer for this suggestion. After inspecting the text, we corrected the reported error.

      (10) Equations 23, 34: Extra “=“ on the RHS of the first line.

      We consistently follow the same formatting across all the line breaks in the equations throughout the text.

      We are thus resubmitting our paper, hoping to have satisfactorily addressed all referees’ concerns.

      References

      (1) Jacopo Grilli. Macroecological laws describe variation and diversity in microbial communities. Nature Communications, 11(1):4743, 2020.

      (2) Guy Bunin. Ecological communities with Lotka-Volterra dynamics. Physical Review E, 95(4):042414, 2017.

      (3) Matthieu Barbier, Jean-François Arnoldi, Guy Bunin, and Michel Loreau. Generic assembly patterns in complex ecological communities. Proceedings of the National Academy of Sciences, 115(9):2156–2161, 2018.

      (4) Jacopo Pasqualini, Sonia Facchin, Andrea Rinaldo, Amos Maritan, Edoardo Savarino, and Samir Suweis. Emergent ecological patterns and modelling of gut microbiomes in health and in disease. PLOS Computational Biology, 20(9):e1012482, 2024.

      (5) Ada Altieri, Felix Roy, Chiara Cammarota, and Giulio Biroli. Properties of equilibria and glassy phases of the random Lotka-Volterra model with demographic noise. Physical Review Letters, 126(25):258301, 2021.

      (6) Giulio Biroli, Guy Bunin, and Chiara Cammarota. Marginally stable equilibria in critical ecosystems. New Journal of Physics, 20(8):083051, 2018.

      (7) Amir Bashan, Travis E Gibson, Jonathan Friedman, Vincent J Carey, Scott T Weiss, Elizabeth L Hohmann, and Yang-Yu Liu. Universality of human microbial dynamics. Nature, 534(7606):259–262, 2016.

      (8) Marcello Seppi, Jacopo Pasqualini, Sonia Facchin, Edoardo Vincenzo Savarino, and Samir Suweis. Emergent functional organization of gut microbiomes in health and diseases. Biomolecules, 14(1):5, 2023.

      (9) Jared Kehe, Anthony Ortiz, Anthony Kulesa, Jeff Gore, Paul C Blainey, and Jonathan Friedman. Positive interactions are common among culturable bacteria. Science Advances, 7(45):eabi7159, 2021.

      (10) Ophelia S Venturelli, Alex V Carr, Garth Fisher, Ryan H Hsu, Rebecca Lau, Benjamin P Bowen, Susan Hromada, Trent Northen, and Adam P Arkin. Deciphering microbial interactions in synthetic human gut microbiome communities. Molecular Systems Biology, 14(6):e8157, 2018.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary:

      The study explored the biomechanics of kangaroo hopping across both speed and animal size to try and explain the unique and remarkable energetics of kangaroo locomotion.

      Strengths:

      The study brings kangaroo locomotion biomechanics into the 21st century. It is a remarkably difficult project to accomplish. There is excellent attention to detail, supported by clear writing and figures.

      Weaknesses:

      The authors oversell their findings, but the mystery still persists. 

      The manuscript lacks a big-picture summary with pointers to how one might resolve the big question.

      General Comments

      This is a very impressive tour de force by an all-star collaborative team of researchers. The study represents a tremendous leap forward (pun intended) in terms of our understanding of kangaroo locomotion. Some might wonder why such an unusual species is of much interest. But, in my opinion, the classic study by Dawson and Taylor in 1973 of kangaroos launched the modern era of running biomechanics/energetics and applies to varying degrees to all animals that use bouncing gaits (running, trotting, galloping and of course hopping). The puzzling metabolic energetics findings of Dawson & Taylor (little if any increase in metabolic power despite increasing forward speed) remain a giant unsolved problem in comparative locomotor biomechanics and energetics. It is our "dark matter problem".

      Thank you for the kind words.

      This study is certainly a hop towards solving the problem. But, the title of the paper overpromises and the authors present little attempt to provide an overview of the remaining big issues. 

      We have modified the title to reflect this comment: “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos”.

      The study clearly shows that the ankle and to a lesser extent the mtp joint are where the action is. They clearly show in great detail by how much and by what means the ankle joint tendons experience increased stress at faster forward speeds.

      Since these were zoo animals, direct measures were not feasible, but the conclusion that the tendons are storing and returning more elastic energy per hop at faster speeds is solid. The conclusion that net muscle work per hop changes little from slow to fast forward speeds is also solid. 

      Doing less muscle work can only be good if one is trying to minimize metabolic energy consumption. However, to achieve greater tendon stresses, there must be greater muscle forces. Unless one is willing to reject the premise of the cost of generating force hypothesis, that is an important issue to confront. Further, the present data support the Kram & Dawson finding of decreased contact times at faster forward speeds. Kram & Taylor and subsequent applications of (and challenges to) their approach supports the idea that shorter contact times (tc) require recruiting more expensive muscle fibers and hence greater metabolic costs. Therefore, I think that it is incumbent on the present authors to clarify that this study has still not tied up the metabolic energetics across speed problems and placed a bow atop the package. 

      Fortunately, I am confident that the impressive collective brain power that comprises this author list can craft a paragraph or two that summarizes these ideas and points out how the group is now uniquely and enviably poised to explore the problem more using a dynamic SIMM model that incorporates muscle energetics (perhaps ala' Umberger et al.). Or perhaps they have other ideas about how they can really solve the problem.

      You have raised important points; thank you for this feedback. We have added a limitations and considerations section to the discussion, which highlights that there are still unanswered questions (Lines 311-328).

      Considerations and limitations

      “First, we believe it is more likely that the changes in moment arms and EMA can be attributed to speed rather than body mass, given the marked changes in joint angles and ankle height observed at faster hopping speeds. However, our sample included a relatively narrow range of body masses (13.7 to 26.6 kg) compared to the potential range (up to 80 kg), limiting our ability to entirely isolate the effects of speed from those of mass. Future work should examine a broader range of body sizes. Second, kangaroos studied here only hopped at relatively slow speeds, which bounds our estimates of EMA and tendon stress to a less critical region. As such, we were unable to assess tendon stress at fast speeds, where increased forces would reduce tendon safety factors closer to failure. A different experimental or modelling approach may be needed, as kangaroos in enclosures seem unwilling to hop faster over force plates. Finally, we did not determine whether the EMA of proximal hindlimb joints (which are more difficult to track via surface motion capture markers) remained constant with speed. Although the hip and knee contribute substantially less work than the ankle joint (Fig. 4), the majority of kangaroo skeletal muscle is located around these proximal joints. A change in EMA at the hip or knee could influence a larger muscle mass than at the ankle, potentially counteracting or enhancing energy savings in the ankle extensor muscle-tendon units. Further research is needed to understand how posture and muscles throughout the whole body contribute to kangaroo energetics.”

      Additionally, we added a line “Peak GRF also naturally increased with speed together with shorter ground contact durations (Fig. 2b, Suppl. Fig 1b)” (line 238) to highlight that we are not proposing that changes in EMA alone explain the full increase in tendon stress. Both GRF and EMA contribute substantially (almost equally) to stress, and we now give more equal discussion to both. For instance, we now also evaluate how much each contributes: “If peak GRF were constant but EMA changed from the average value of a slow hop to a fast hop, then stress would increase 18%, whereas if EMA remained constant and GRF varied by the same principles, then stress would only increase by 12%. Thus, changing posture and decreasing ground contact duration both appear to influence tendon stress for kangaroos, at least for the range of speeds we examined” (Line 245-249)
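
      The one-factor-at-a-time comparison quoted above can be sketched numerically. The snippet assumes the simple moment balance in which tendon force equals GRF divided by EMA, with stress equal to tendon force over cross-sectional area. The normalised slow-hop values are hypothetical, and the fast-hop ratios are chosen only so that the sketch reproduces the quoted 18% and 12% figures.

```python
def tendon_stress(grf, ema, area):
    """Tendon stress from moment balance.

    F_tendon * r = GRF * R and EMA = r / R, so F_tendon = GRF / EMA;
    stress is tendon force divided by tendon cross-sectional area.
    """
    return grf / ema / area


def percent_change(new, old):
    return 100.0 * (new / old - 1.0)


# Hypothetical normalised slow-hop values; only the ratios matter here.
grf_slow, ema_slow, area = 1.0, 1.0, 1.0
grf_fast = 1.12 * grf_slow   # chosen so that GRF alone raises stress by 12%
ema_fast = ema_slow / 1.18   # chosen so that EMA alone raises stress by 18%

base = tendon_stress(grf_slow, ema_slow, area)
ema_only = percent_change(tendon_stress(grf_slow, ema_fast, area), base)
grf_only = percent_change(tendon_stress(grf_fast, ema_slow, area), base)
both = percent_change(tendon_stress(grf_fast, ema_fast, area), base)
```

      Under this multiplicative model the two effects compound rather than add. The combined value is a property of the sketch, not a figure reported in the manuscript.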

      We have added a paragraph in the discussion acknowledging that the cost of generating force problem is not resolved by our work, concluding that “This mechanism may help explain why hopping macropods do not follow the energetic trends observed in other species (Dawson and Taylor 1973, Baudinette et al. 1992, Kram and Dawson 1998), but it does not fully resolve the cost of generating force conundrum” Line 274-276.

      I have a few issues with the other half of this study (i.e. animal size effects). I would enjoy reading a new paragraph by these authors in the Discussion that considers the evolutionary origins and implications of such small safety factors. Surely, it would need to be speculative, but that's OK.

      We appreciate this comment from the reviewer; however, we could not extend the study to discuss animal size effects because, as we now note in the results: “The range of body masses may not be sufficient to detect an effect of mass on ankle moment in addition to the effect of speed.” Line 193

      Reviewer #2 (Public Review):

      Summary

      This is a fascinating topic that has intrigued scientists for decades. I applaud the authors for trying to tackle this enigma. In this manuscript, the authors primarily measured hopping biomechanics data from kangaroos and performed inverse dynamics. 

      While these biomechanical analyses were thorough and impressively incorporated collected anatomical data and an Opensim model, I'm afraid that they did not satisfactorily address how kangaroos can hop faster and not consume more metabolic energy, unique from other animals.  Noticeably, the authors did not collect metabolic data nor did they model metabolic rates using their modelling framework. Instead, they performed a somewhat traditional inverse dynamics analysis from multiple animals hopping at a self-selected speed.

      In the current study, we aimed to provide a joint-level explanation for the increases of tendon stress that are likely linked to metabolic energy consumption.

      We have now included a limitations section in the manuscript (See response to Rev 1). We plan to expand upon muscle level energetics in the future with a more detailed musculoskeletal model.

      Within these analyses, the authors largely focused on ankle EMA, discussing its potential importance (because it affects tendon stress, which affects tendon strain energy, which affects muscle mechanics) on the metabolic cost of hopping. However, EMA was roughly estimated (CoP was fixed to the foot, not measured) and did not detectably associate with hopping speed (see results). Yet, the authors interpret their EMA findings as though it systematically related with speed to explain their theory on how metabolic cost is unique in kangaroos vs. other animals.

      As noted in our methods, EMA was not calculated from a fixed centre of pressure (CoP). We did fix the medial-lateral position, owing to the fact that both feet contacted the force plate together, but the anterior-posterior movement of the CoP was recorded by the force plate and thus allowed to move. We report the movement (or lack of movement) in our results. The anterior-posterior axis is the most relevant to lengthening or shortening the distance of the ‘out-lever’ R, and thereby EMA. It is necessary to assume a fixed medial-lateral position because a single force trace and CoP is recorded when two feet land on the force plate. The medial-lateral forces on each foot cancel out, so there is no overall medial-lateral movement if the forces are symmetrical (e.g. if the kangaroo is hopping in a straight path and one foot is not in front of the other). We only used symmetrical trials so that the anterior-posterior movement of the CoP would be reliable. We have now added additional details into the text to clarify this.

      Indeed, the relationship between R and speed (and therefore EMA and speed) was not significant. However, the significant change in ankle height with speed, combined with no systematic change in COP at midstance, demonstrates that R would be greater at faster speeds. If we consider the nonsignificant relationship between R and speed to indicate that there is no change in R, then these two results conflict. We could not find a flaw in our methods, so instead concluded that the nonsignificant relationship between R and speed may be due to a small change in R being undetectable in our data. Taking both results into account, we believe it is more likely that there is a non-detectable change in R, rather than no change in R with speed, but we presented both results for transparency. We have added an additional section into the results to make this clearer (Line 177-185) “If we consider the nonsignificant relationship between R (and EMA) and speed to indicate that there is no change in R, then it conflicts with the ankle height and CoP result. Taking both into account, we think it is more likely that there is a small, but important, change in R, rather than no change in R with speed. It may be undetectable because we expect small effect sizes compared to the measurement range and measurement error (Suppl. Fig. 3h), or be obscured by a similar change in R with body mass. R is highly dependent on the length of the metatarsal segment, which is longer in larger kangaroos (1 kg BM corresponded to ~1% longer segment, P<0.001, R<sup>2</sup>=0.449). If R does indeed increase with speed, both R and r will tend to decrease EMA at faster speeds.”

      These speed vs. biomechanics relationships were limited by comparisons across different animals hopping at different speeds and could have been strengthened using a repeated measures design.

      There is significant variation in speed within individuals, not just between individuals. The preferred speed of kangaroos is 2-4.5 m/s, but most individuals showed a wide speed range within this. Eight of our 16 kangaroos had a maximum speed that was 1-2 m/s faster than their slowest trial. Repeated measures of these eight individuals comprise 78 out of the 100 trials. It would be ideal to collect data across the full range of speeds for all individuals, but it is not feasible in this type of experimental setting. Interference with animals, such as chasing, is dangerous to kangaroos as they are prone to adverse reactions to stress. We have now added additional information about the chosen hopping speeds into the results and methods sections to clarify this: “The kangaroos elected to hop between 1.99 and 4.48 m s<sup>-1</sup>, with a range of speeds and number of trials for each individual (Suppl. Fig. 9).” (Line 381-382)

      There are also multiple inconsistencies between the authors' theory on how mechanics affect energetics and the cited literature, which leaves me somewhat confused and wanting more clarification and information on how mechanics and energetics relate.

      We thank the reviewer for this comment. Upon rereading we now understand the reviewer's position, and have made substantial revisions to the introduction and discussion (see comments below).

      My apologies for the less-than-favorable review, I think that this is a neat biomechanics study - but am unsure if it adds much to the literature on the topic of kangaroo hopping energetics in its current form.

      Again we thank the reviewer for their time and appreciate their efforts to strengthen our manuscript.

      Reviewer #3 (Public Review):

      Summary:

      The goal of this study is to understand how, unlike other mammals, kangaroos are able to increase hopping speed without a concomitant increase in metabolic cost. They use a biomechanical analysis of kangaroo hopping data across a range of speeds to investigate how posture, effective mechanical advantage, and tendon stress vary with speed and mass. The main finding is that a change in posture leads to increasing effective mechanical advantage with speed, which ultimately increases tendon elastic energy storage and returns via greater tendon strain. Thus kangaroos may be able to conserve energy with increasing speed by flexing more, which increases tendon strain.

      Strengths:

      The approach and effort invested into collecting this valuable dataset of kangaroo locomotion is impressive. The dataset alone is a valuable contribution.

      Thank you!

      Weaknesses:

      Despite these strengths, I have concerns regarding the strength of the results and the overall clarity of the paper and methods used (which likely influences how convincingly the main results come across).

      (1) The paper seems to hinge on the finding that EMA decreases with increasing speed and that this contributes significantly to greater tendon strain estimated with increasing speed. It is very difficult to be convinced by this result for a number of reasons:

      It appears that kangaroos hopped at their preferred speed. Thus the variability observed is across individuals not within. Is this large enough of a range (either within or across subjects) to make conclusions about the effect of speed, without results being susceptible to differences between subjects? 

      Apologies, this was not clear in the manuscript. Kangaroos hopping at their preferred speed means we did not chase or startle them into high speeds, to comply with ethics and enclosure limitations. Thus we did not record a wide range of speeds within the bounds of what kangaroos are capable of in the wild (up to 12 m/s), but for the range we did measure (~2-4.5 m/s), there is a large amount of variation in hopping speed within each individual kangaroo. Out of 16 individuals, eight individuals had a difference of 1-2 m/s between their slowest and fastest trials, and these kangaroos accounted for 78 out of 100 trials. Of the remainder, six individuals had three or fewer trials each, and two individuals had highly repeatable speeds (3 out of 4, and 6 out of 7 trials were within 0.5 m/s). We have now removed the terminology “preferred speed”, e.g. line 115. We have added additional information about the chosen hopping speeds into the results and methods, including an appendix figure: “The kangaroos elected to hop between 1.99 and 4.48 m s<sup>-1</sup>, with a range of speeds and number of trials for each individual (Suppl. Fig. 9).” (Line 381-382)

      In the literature cited, what was the range of speeds measured, and was it within or between subjects?

      For other literature, to our knowledge the highest speed measured is ~9.5 m/s (see Suppl. Fig. 1b) and there were multiple measures for several individuals (see methods in Kram & Dawson 1998).

      Assuming that there is a compelling relationship between EMA and velocity, how reasonable is it to extrapolate to the conclusion that this increases tendon strain and ultimately saves metabolic cost?  They correlate EMA with tendon strain, but this would still not suggest a causal relationship (incidentally the p-value for the correlation is not reported). 

      The functions that underpin these results (e.g. moment = GRF*R) come from physical mechanics and geometry, rather than statistical correlations. Additionally, a p-value is not appropriate in the relationship between EMA and stress (rather than strain) because the relationship does not appear to be linear. We have made it clearer in the discussion that we are not proposing that the entire change in stress is caused by changes in EMA, but that the increase in GRF that naturally occurs with speed will also explain some of the increase in stress, along with other potential mechanisms. The discussion has been extensively revised to reflect this.

      Tendon strain could be increasing with ground reaction force, independent of EMA. Even if there is a correlation between strain and EMA, is it not a mathematical necessity in their model that all else being equal, tendon stress will increase as EMA decreases? I may be missing something, but nonetheless, it would be helpful for the authors to clarify the strength of the evidence supporting their conclusions.

      Yes, GRF also contributes to the increase in tendon stress in the mechanism we propose (Suppl. Fig. 8), see the formulas in Fig 6, and we have made this clearer in the revised discussion (see above comment).  You are correct that mathematically stress is inversely proportional to EMA, which can be observed in Fig. 7a, and we did find that EMA decreases. 
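
The inverse relationship between stress and EMA noted above can be sketched as follows. This is a simplified illustration, assuming a quasi-static moment balance at the ankle (tendon force × r = GRF × R), EMA = r/R, and stress = tendon force / tendon cross-sectional area. The function name and all numbers are hypothetical, and the manuscript's own formulas (Fig. 6) and inverse dynamics may include terms omitted here.

```python
def tendon_stress(grf_n, r_m, R_m, tendon_area_m2):
    """Tendon stress (Pa) from a quasi-static ankle moment balance.

    grf_n:           ground reaction force magnitude on the limb (N)
    r_m:             muscle-tendon moment arm about the ankle (m)
    R_m:             GRF moment arm ("out-lever") about the ankle (m)
    tendon_area_m2:  tendon cross-sectional area (m^2)
    """
    if min(grf_n, r_m, R_m, tendon_area_m2) <= 0:
        raise ValueError("all inputs must be positive")
    ema = r_m / R_m                 # effective mechanical advantage
    tendon_force = grf_n / ema      # equivalent to grf_n * R_m / r_m
    return tendon_force / tendon_area_m2


# Hypothetical values: the same GRF with a halved EMA doubles the stress.
upright = tendon_stress(1000.0, 0.05, 0.10, 1e-4)    # EMA = 0.50
crouched = tendon_stress(1000.0, 0.05, 0.20, 1e-4)   # EMA = 0.25
```

In this form, GRF and EMA enter as separate factors, so an increase in stress with speed can come from greater GRF, smaller EMA, or both.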

      The statistical approach is not well-described. It is not clear what the form of the statistical model used was and whether the analysis treated each trial individually or grouped trials by the kangaroo. There is also no mention of how many trials per kangaroo, or the range of speeds (or masses) tested. 

      The methods include the statistical model with the variables that we used, as well as the kangaroo masses (13.7 to 26.6 kg, mean: 20.9 ± 3.4 kg). We did not have a sufficient within-individual sample size to use a linear mixed-effects model including subject as a random factor, thus all trials were treated individually. We have included this information in the results section.
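
To make "all trials were treated individually" concrete, the sketch below fits an ordinary least-squares model with one row per trial and no random effect for individual. The model form (response = b0 + b1·speed + b2·mass) is an assumption made for illustration; the authors' exact model is described in their methods and is not reproduced here. The function name and data are hypothetical.

```python
def fit_trial_level_model(speeds, masses, responses):
    """Ordinary least squares for: response = b0 + b1*speed + b2*mass.

    Each trial contributes one row. Subject identity is ignored, so repeated
    trials from one animal are treated as independent observations.
    Returns [intercept, speed_slope, mass_slope].
    """
    rows = [(1.0, s, m) for s, m in zip(speeds, masses)]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(3)]
    # Gaussian elimination with partial pivoting on the augmented 3x4 matrix
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(aug[k][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (e.g. speed and mass are collinear)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, 3):
            factor = aug[k][col] / aug[col][col]
            for j in range(col, 4):
                aug[k][j] -= factor * aug[col][j]
    # Back substitution
    coeffs = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, 3))
        coeffs[i] = (aug[i][3] - tail) / aug[i][i]
    return coeffs


# Synthetic example: the response is generated exactly from known coefficients.
speeds = [2.0, 3.0, 4.0, 2.5, 3.5]
masses = [15.0, 20.0, 25.0, 22.0, 18.0]
responses = [1.0 + 2.0 * s + 0.5 * m for s, m in zip(speeds, masses)]
intercept, speed_slope, mass_slope = fit_trial_level_model(speeds, masses, responses)
```

Including mass alongside speed lets the speed slope be read as the effect of speed at a given body mass, which is the purpose the authors describe for the mass term.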

      We have now moved the range of speeds from the supplementary material to the results and figure captions. We have added information on the number of trials per kangaroo to the methods, and added Suppl. Fig. 9 showing the distribution of speeds per kangaroo.

      We did not group the data e.g. by using an average speed per individual for all their trials, or by comparing fast to slow groups for statistical analysis (the latter was only for display purposes in our figures, which we have now made clearer in the methods statistics section). 

      Related to this, there is no mention of how different speeds were obtained. It seems that kangaroos hopped at a self-selected pace, thus it appears that not much variation was observed. I appreciate the difficulty of conducting these experiments in a controlled manner, but this doesn’t exempt the authors from providing the details of their approach.

      Apologies, this was not clear in the manuscript. Kangaroos hopping at their preferred speed means we did not chase or startle them into high speeds to comply with ethics and enclosure limitations. Thus we did not record a wide range of speeds within the bounds of what kangaroos are capable of in the wild (up to 12 m/s). We have now removed the terminology “preferred speed” e.g. line 115. We have added additional information about the chosen hopping speeds into the results and methods, including an appendix figure (see above comment). (Line 381-382)

      Some figures (Figure 2 for example) present means for one of three speeds, yet the speeds are not reported (except in the legend) nor how these bins were determined, nor how many trials or kangaroos fit in each bin. A similar comment applies to the mass categories. It would be more convincing if the authors plotted the main metrics vs. speed to illustrate the significant trends they are reporting.

      Thank you for this comment. The bins are used only for display purposes and not within the statistical analysis. We have clarified this in the revised manuscript: “The data was grouped into body mass (small 17.6±2.96 kg, medium 21.5±0.74 kg, large 24.0±1.46 kg) and speed (slow 2.52±0.25 m s<sup>-1</sup>, medium 3.11±0.16 m s<sup>-1</sup>, fast 3.79±0.27 m s<sup>-1</sup>) subsets for display purposes only”. (Line 495-497)

      (2) The significance of the effects of mass is not clear. The introduction and abstract suggest that the paper is focused on the effect of speed, yet the effects of mass are reported throughout as well, without a clear understanding of the significance. This weakness is further exaggerated by the fact that the details of the subject masses are not reported.

      Indeed, the primary aim of our study was to explore the influence of speed, given the uncoupling of energy from hopping speed in kangaroos. We included mass to ensure that the effects of speed were not driven by body mass (i.e.: that larger kangaroos hopped faster). Subject masses were reported in the first paragraph of the methods, albeit some were estimated as outlined in the same paragraph.

      (3) The paper needs to be significantly re-written to better incorporate the methods into the results section. Since the results come before the methods, some of the methods must necessarily be described such that the study can be understood at some level without turning to the dedicated methods section. As written, it is very difficult to understand the basis of the approach, analysis, and metrics without turning to the methods.

      Placing the methods after the discussion is a requirement of the journal. We have incorporated some methods into the results where necessary, without being too repetitive or disruptive, e.g. in the Fig. 1 caption and by specifying that we are only analysing EMA for the ankle joint.

      Reviewing Editor (Recommendations For The Authors):

      Below is a list of specific recommendations that the authors could address to improve the eLife assessment:

      (1) Based on the data presented and the fact that metabolic energy was not measured, the authors should temper their conclusions and statements throughout the manuscript regarding the link between speed and metabolic energy savings. We recommend adding text to the discussion summarizing the strengths and limitations of the evidence provided and suggesting future steps to more conclusively answer this mystery.

      There is a significant body of work linking metabolic energy savings to measured increases in tendon stress in macropods. However, the purpose of this paper was to address the unanswered questions about why tendon stress increases. We found that stress did not only increase due to GRF increasing with speed as expected, but also due to novel postural changes which decreased EMA. In the revised manuscript, we have tempered our conclusions to make it clearer that it is not just EMA affecting stress, and added limitations throughout the manuscript (see response to Rev 1). 

      (2) To provide stronger evidence of a link between speed, mechanics, and metabolic savings the authors can consider estimating metabolic energy expenditure from their OpenSim model. This is one suggestion, but the authors likely have other, possibly better ideas. Such a model should also be able to explain why the metabolic rate increases with speed during uphill hopping.

      Extending the model to provide direct metabolic cost estimates will be the goal of a future paper; however, the model does not have the detailed muscle characteristics needed to do this in the formulation presented here. It would be a very large undertaking which is beyond the scope of the current manuscript. As per the comment above, the results of this paper are not reliant on metabolic performance.

      (3) The authors attempt to relate the newly quantified hopping biomechanics to previously published metabolic data. However, all reviewers agree that the logic in many instances is not clear or contradictory. Could one potential explanation be that at slow speeds, forces and tendon strain are small, and thus muscle fascicle work is high? Then, with faster speeds, even though the cost of generating isometric force increases, this is offset by the reduction in the metabolic cost of muscular work. The paper could provide stronger support for their hypotheses with a much clearer explanation of how the kinematics relate to the mechanics and ultimately energy savings.

      In response to the reviewers' comments, we have substantially modified the discussion to provide a clearer rationale.

      (4) The methods and the effort expended to collect these data are impressive, but there are a number of underlying assumptions made that undermine the conclusions. This is due partly to the methods used, but also the paper's incomplete description of their methods. We provide a few examples below:

      It would be helpful if the authors could speak to the effect of the limited speeds tested and between-animal comparisons on the ability to draw strong conclusions from the present dataset.

      Throughout the discussion, the authors highlight the relationship between EMA and speed. However, this is misleading since there was no significant effect of speed on EMA. Speed only affected the muscle moment arm, r. At minimum, this should be clarified and the effect on EMA not be overstated. Additionally, the resulting implications on their ability to confidently say something about the effect of speed on muscle stress should be discussed. 

      We have now provided additional details addressing these concerns (see responses above). For instance, we added a supplementary figure showing the speed distribution per individual. The primary reviewer concern (that each kangaroo travelled at a single speed) was due to a miscommunication around the terminology “preferred”, which has now been corrected.

      We now elaborate in the results on why we are not very concerned that the relationship between EMA and speed is not statistically significant. This non-significance is ultimately due to the non-significance of the direct measurement of R; however, we now better explain in the results why we believe it reflects measurement error/noise that is relatively large compared to the effect size. Indirect indications of how R may increase with speed (via ankle height from the ground) are statistically significant. Lines 177-185.

      We consider this worth reporting because, for instance, an 18% change in EMA will be undetectable by measurement, but corresponds to an 18% change in tendon stress which is measurable and physiologically significant (safety factor would decrease from 2 to 1.67).  We presented both significant and insignificant results for transparency. 

      We have also discussed this within a revised limitations section of the manuscript (Line 311-328).

      Reviewer #1 (Recommendations For The Authors):

      Title: I would cut the first half of the title. At least hedge it a bit. "Clues" instead of "Unlocking the secrets".

      We have revised the title to: “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos”

      In my comments, ... typically indicates a stylistic change suggested to the text.

      Overall, the paper covers speed and size. Unfortunately, the authors were not 100% consistent in the order of presenting size then speed, or speed then size. Just choose one and stick with it.

      We have attempted to keep the order of presenting size and speed consistent, however there are several cases where this would reduce the readability of the manuscript and so in some cases this may vary. 

      One must admit that there is a lot of vertical scatter in almost all of the plots. I understand that these animals were not in a lab on a treadmill at a controlled speed and the animals wear fur coats so marker placements vary/move etc. But the spread is quite striking, e.g. Figure 5a the span at one speed is almost 10x. Can the authors address this somewhere? Limitations section?

      The variation seen likely results from attempting to display data in a 2D format, when it is in fact the result of multiple variables, including speed, mass, stride frequency and subject-specific lengths. Slight variations in these would be expected to produce some noise around the mean, and we think it is important to consider this while showing the more dominant effects.

      In many locations in the manuscript, the term "work" is used, but rarely if ever specified that this is the work "per hop". The big question revolves around the rate of metabolic energy consumption (i.e. energy per time or average metabolic power), one must not forget that hop frequency changes somewhat across speed, so work per hop is not the final calculation.

      Thank you for this comment. We have now explicitly stated work per hop in figure captions and in the results (line 208). The change in stride frequency at this range of speeds is very small, particularly compared to the variance in stride frequency (Suppl. Fig. 1d), which is consistent with other researchers who found that stride frequency was constant or near constant in macropods at analogous speeds (e.g. Dawson and Taylor 1973, Baudinette et al. 1987). 
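
The distinction between work per hop and a rate can be written compactly. As a sketch, with symbols introduced here for illustration rather than taken from the manuscript, the average mechanical power is the work per hop multiplied by the hop (stride) frequency:

```latex
\bar{P}_{\text{mech}} = W_{\text{hop}} \, f_{\text{hop}}
```

If $f_{\text{hop}}$ is constant or near constant across the measured speeds, as stated above, then trends in $W_{\text{hop}}$ with speed carry over almost directly to $\bar{P}_{\text{mech}}$. This concerns mechanical power only and does not by itself give metabolic power.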

      Line 61 ....is likely related.

      Added “likely” (line 59)

      Line 86 I think the Allen reference is incomplete. Wasn't it in J Exp Biology?

      Thank you. Changed. 

      Line 122 ... at faster speeds and in larger individuals.

      Changed: “We hypothesised that (i) the hindlimb would be more crouched at faster speeds, primarily due to the distal hindlimb joints (ankle and metatarsophalangeal), independent of changes with body mass” (Line 121-122).

      Line 124 I found this confusing. Try to re-word so that you explain you mean more work done by the tendons and less by the ankle musculature.

      Amended: “changes in moment arms resulting from the change in posture would contribute to the increase in tendon stress with speed, and may thereby contribute to energetic savings by increasing the amount of positive and negative work done by the ankle without requiring additional muscle work” (Line 123)

      Line 129 hopefully "braking" not "breaking"!

      Thank you. Fixed. (Line 130)

      Line 129 specify fore-aft horizontal force.

      Added "fore-aft" to "negative fore-aft horizontal component" (Line 130-131)

      Line 130 add something like "of course" or "naturally" since if there is zero fore-aft force, the GRF vector of course must be vertical. 

      Added "naturally" (Line 132)

      Line 138 clarify that this section is all stance phase. I don't recall reading any swing phase data.

      Changed to: "Kangaroo hindlimb stance phase kinematics varied…" (Line 141)

      Line 143 and elsewhere. I found the use of dorsiflexion and plantarflexion confusing. In Figure 3, I see the ankle never flexing more than 90 degrees. So, the ankle joint is always in something of a flexed position, though of course it flexes and extends during contact. I urge the authors to simplify to flexion/extension and drop the plantar/dorsi.

      We have edited this section to describe both movements as greater extension (plantarflexion). (Line 147). We have further clarified this in the figure caption for figure 3.  

      Line 147 ...changes were…

      Fixed, line 150

      Line 155 I'm a bit confused here. Are the authors calculating some sort of overall EMA or are they saying all of the individual joint EMAs all decreased?

      Thank you, we clarified that it is at the ankle. Line 158

      Line 158 since kangaroos hop and are thus positioned high and low throughout the stance phase, try to avoid using "high" and "low" for describing variables, e.g. GRF or other variables. Just use "greater/greatest" etc.

      Thanks for this suggestion. We have changed "higher" into "greater" where appropriate throughout the manuscript e.g. line 161

      Lines 162 and 168 same comment here about "r" and "R". Do you mean ankle or all joints?

      Clarified that it is the gastrocnemius and plantaris r, and the R to the ankle. (Lines 164-165)

      Line 173 really, ankle height?

      Added: ankle height is "vertical distance from the ground". Line 177

      Line 177 is this just the ankle r?

      Added "of the ankle" line 158 and “Achilles” line 187 

      Line 183 same idea, which tendon/tendons are you talking about here?

      Added "Achilles" to be more clear (Line 187)

      Line 195 substitute "converted" for "transferred".

      Done (Line 210)

      Line 223 why so vague? i.e. why use "may"? Believe in your data. ...stress was also modulated by changes....

      Changed "may" to "is"

      Line 229 smaller ankle EMA (especially since you earlier talked about ankle "height").

      Changed “lower” to “smaller” Line 254

      Line 236 ...and return elastic energy…

      Added "elastic" line 262

      Line 244 IMPORTANT: Need to explain this better! I think you are saying that the net work at the ankle is staying the same across speed, BUT it is the tendons that are storing and returning that work, it's not that the muscles are doing a lot of negative/positive work.

      Changed: “The consistent net work observed among all speeds suggests the ankle extensor muscle-tendon units are performing similar amounts of ankle work independent of speed, which would predominantly be done by the tendon.” (Line 270-272)

      Line 258-261 I think here is where you are over-selling the data/story. Although you do say "a" mechanism (and not "the" mechanism), you still need to deal with the cost of generating more force and generating that force faster.

      We removed this sentence and replaced it with a discussion of the cost of generating force hypothesis, and alternative scenarios for how force and metabolics could be uncoupled.

      Line 278 "the" tendon? Which tendon?

      Added "Achilles"

      Line 289. I don't think one can project into the past.

      Changed “projected” to "estimated"

      Line 303 no problem, but I've never seen a paper in biology where the authors admit they don't know what species they were studying!

      Can’t be helped unfortunately. It is an old dataset and there aren’t photos of every kangaroo. Fortunately, from the grey and red kangaroos we can distinguish between, we know there are no discernible species effects on the data. 

      Lines 304-306 I'm not clear here. Did you use vertical impulse (and aerial time) to calculate body weight? Or did you somehow use the braking/propulsive impulse to calculate mass? I would have just put some apples on the force plate and waited for them to stop for a snack.

      Stationary weights were recorded for some kangaroos which did stand on the force plate long enough, but unfortunately not all of them were willing to do so. In those cases, yes, we used impulse from steady-speed trials to estimate mass. We cross-checked by estimating mass from segment lengths (as size and mass are correlated). This is outlined in the first paragraph of the methods.
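
One way such an impulse-based estimate can work is sketched below. It assumes steady-speed hopping with no net change in vertical velocity over a full hop cycle, so that the vertical impulse over the cycle equals body weight multiplied by the cycle duration. Whether the authors used exactly this formulation is not stated here; the function name, arguments, and numbers are hypothetical.

```python
def mass_from_vertical_impulse(vertical_grf_n, sample_rate_hz, cycle_time_s, g=9.81):
    """Estimate body mass (kg) from one steady-speed hop cycle.

    vertical_grf_n:  total vertical force samples during stance (N); the force
                     is zero during the aerial phase, so stance samples suffice
    sample_rate_hz:  force plate sampling rate (Hz)
    cycle_time_s:    duration of the full hop cycle, stance plus aerial (s)
    """
    if len(vertical_grf_n) < 2 or sample_rate_hz <= 0 or cycle_time_s <= 0:
        raise ValueError("need at least two samples and positive timing values")
    dt = 1.0 / sample_rate_hz
    # Trapezoidal integration of vertical force over stance gives the impulse (N s)
    impulse = sum(0.5 * (a + b) * dt for a, b in zip(vertical_grf_n, vertical_grf_n[1:]))
    # Steady state: impulse = mass * g * cycle_time
    return impulse / (g * cycle_time_s)


# Hypothetical trial: constant 400 N for 0.2 s of stance within a 0.4 s hop cycle.
stance_force = [400.0] * 201          # 201 samples at 1000 Hz span 0.2 s
estimated_mass = mass_from_vertical_impulse(stance_force, 1000.0, 0.4)
```

The estimate is only as good as the steady-speed assumption, which is presumably why a cross-check against segment lengths is useful.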

      Lines 367 & 401 When you use the word "scaled" do you mean you assumed geometric similarity?

      No, rather than geometric scaling, we allowed scaling to individual dimensions by using the markers at midstance for measurements. We have amended the paragraph to clarify that the shape of the kangaroo changes and that mass distribution was preserved during the shape change (line 441-446) 

      Lines 381-82 specify "joint work"

      Added "joint work"  (Line 457)

      Figure 1 is gorgeous. Why not add the CF equation to the left panel of the caption?

      We decided to keep the information in the figure caption. “Total leg length was calculated as the sum of the segment lengths (solid black lines) in the hindlimb and compared to the pelvis-to-toe distance (dashed line) to calculate the crouch factor”
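
The caption's description of the crouch factor can be sketched in code. The caption says only that the two lengths are "compared", so the ratio used below (pelvis-to-toe distance divided by summed segment length) is an assumption made for illustration, as are the function name and the marker coordinates.

```python
import math


def crouch_factor(joint_positions):
    """Ratio of pelvis-to-toe distance to total hindlimb segment length.

    joint_positions: ordered (x, y) points from pelvis to toe, in metres.
    Under this assumed definition, 1.0 is a fully straight limb and smaller
    values indicate a more crouched posture.
    """
    if len(joint_positions) < 2:
        raise ValueError("need at least two points")
    total_leg_length = sum(
        math.dist(a, b) for a, b in zip(joint_positions, joint_positions[1:])
    )
    pelvis_to_toe = math.dist(joint_positions[0], joint_positions[-1])
    return pelvis_to_toe / total_leg_length


# Hypothetical marker positions
straight = crouch_factor([(0.0, 0.0), (0.0, -1.0), (0.0, -2.0)])   # straight limb
bent = crouch_factor([(0.0, 0.0), (1.0, 0.0), (1.0, -1.0)])        # right-angle bend
```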

      Figure 2 specify Horizontal fore-aft.

      Done

      Figure 3g I'd prefer the same Min. Max Flexion vertical axis labels as you use for hip & knee.

      While we appreciate the reviewer trying to increase the clarity of this figure, we have left it as plantar/dorsi flexion since these are recognised biomechanical terms. To avoid confusion, we have further defined these in the figure caption “For (f-g), increased plantarflexion represents a decrease in joint flexion, while increased dorsiflexion represents increased flexion of the joint.”

      Figure 4. I like it and I think that you scaled all panels the same, i.e. 400 W is represented by the same vertical distance in all panels. But if that's true, please state so in the Caption. It's remarkable how little work occurs at the hip and knee despite the relatively huge muscles there.

      It is true that the y-axes are all at the same scale. We have added this to the caption.

      Figure 5 Caption should specify "work per hop".

      Added

      Figure 7 is another beauty.

      Thank you!

      Supplementary Figure 3 is this all ANKLE? Please specify.

      Clarified that it is the gastrocnemius and plantaris r, and the R to the ankle.

      Reviewer #2 (Recommendations For The Authors):

      To 'unlock the secrets of kangaroo locomotor energetics' I expected the authors to measure the secretive outcome variable, metabolic rate using laboratory measures. Rather, the authors relied on reviewing historic metabolic data and collecting biomechanics data across different animals, which limits the conclusions of this manuscript.

      We have revised the title to make it clearer that we are investigating a subset of the energetics problem, specifically posture: “Postural adaptations may contribute to the unique locomotor energetics seen in hopping kangaroos.” We have also substantially modified the discussion to temper the conclusions of the paper.

      After reading the hypothesis, why do the authors hypothesize about joint flexion and not EMA? Because the following hypothesis discusses the implications of moment arms on tendon stress, EMA predictions are more relevant (and much more discussed throughout the manuscript).

      Ankle and MTP angles are the primary drivers of changes in r, R and thus EMA. We used a two-part hypothesis to capture this. We have rephrased the hypotheses: “We hypothesised that (i) the hindlimb would be more crouched at faster speeds, primarily due to the distal hindlimb joints (ankle and metatarsophalangeal), independent of changes with body mass, and (ii) changes in moment arms resulting from the change in posture would contribute to the increase in tendon stress with speed, and may thereby contribute to energetic savings by increasing the amount of positive and negative work done by the ankle without requiring additional muscle work.”

      If there were no detectable effects of speed on EMA, are kangaroos mechanically like other animals (Biewener Science 89 & JAP 04) who don't vary EMA across speeds? Despite no detectable effects, the authors state [lines 228-229] "we found larger and faster kangaroos were more crouched, leading to lower ankle EMA". Can the authors explain this inconsistency? Line 236 "Kangaroos appear to use changes in posture and EMA". I interpret the paper as EMA does not change across speed.

      Apologies, we did not sufficiently explain this originally. We now explain in the results our reasoning behind our belief that EMA and R may change with speed. “If we consider the nonsignificant relationship between R (and EMA) and speed to indicate that there is no change in R, then it conflicts with the ankle height and CoP result. Taking both into account, we think it is more likely that there is a small, but important, change in R, rather than no change in R with speed. It may be undetectable because we expect small effect sizes compared to the measurement range and measurement error (Suppl. Fig. 3h), or be obscured by a similar change in R with body mass. R is highly dependent on the length of the metatarsal segment, which is longer in larger kangaroos (1 kg BM corresponded to ~1% longer segment, P<0.001, R<sup>2</sup>=0.449). If R does indeed increase with speed, both R and r will tend to decrease EMA at faster speeds.” (Line 177-185)

      Lines 335-339: "We assumed the force was applied along phalanx IV and that there was no medial or lateral movement of the centre of pressure (CoP)". I'm confused, did the authors not measure CoP location with respect to the kangaroo limb? If not, this simple estimation undermines primary results (EMA analyses).

      We have changed "The anterior or posterior movement of the CoP was recorded by the force plate" to read: "The fore-aft movement of the CoP was recorded by the force plate within the motion capture coordinate system" (Line 406-407) and added more justification for fixing the CoP movement in the other axis: “It was necessary to assume the CoP was fixed in the medial-lateral axis because when two feet land on the force plate, the lateral forces on each foot are not recorded, and indeed cancel if the forces are symmetrical (i.e. if the kangaroo is hopping in a straight path and one foot is not in front of the other). We only used symmetrical trials to ensure reliable measures of the anterior-posterior movement of the CoP.” (Line 408-413)

      The introduction makes many assertions about the generalities of locomotion and the relationship between mechanics and energetics. I'm afraid that the authors are selectively choosing references without thoroughly evaluating alternative theories. For example, Taylor, Kram, & others have multiple papers suggesting that decreasing EMA and increasing muscle force (and active muscle volume) increase metabolic costs during terrestrial locomotion. Rather, the authors suggest that decreasing EMA and increasingly high muscle force at faster speeds don't affect energetics unless muscle work increases substantially (paragraph 2)? If I am following correctly, does this theory conflict with active muscle volume ideas that are peppered throughout this manuscript?

      Yes, as you point out, the same mechanism does lead to different results in kangaroos vs humans, for instance, but this is not a contradiction. In all species, decreasing EMA will result in an increase in muscle force due to less efficient leverage (i.e. lower EMA) of the muscles, and the muscle-tendon unit will be required to produce more force to balance the joint moment. As a consequence, human muscles activate a greater volume in order for the muscle-tendon unit to increase muscle work and produce enough force. We are proposing that in kangaroos, the increase in work is done by the Achilles tendon rather than the muscles. Previous research suggests that macropod ankle muscles contract isometrically or that the fibres do not shorten more at faster speeds, i.e. muscle work does not increase with speed. Instead, the additional force seems to come from the tendon storing and subsequently returning more strain energy (indicated by higher stress). We found that the increase in tendon stress comes from higher ground force at faster speeds, and from the animal adopting a more crouched posture, which increases tendon stress compared to an upright posture for a given speed (think of this as increasing the tendon’s stress capacity). We have substantially revised the discussion to highlight this.

      Similarly, does increased gross or net tendon mechanical energy storage & return improve hopping energetics? Would more tendon stress and strain energy storage with a given hysteresis value also dissipate more mechanical energy, requiring leg muscles to produce more net work? Does net or gross muscle work drive metabolic energy consumption?

      Based on the cost of generating force hypothesis, we think that gross muscle work would be linked to driving metabolic energy consumption. Our idea here is that the total body work is a product of the work done by the tendon and the muscle combined. If the tendon has the potential to do more work, then the total work can increase without muscle work needing to increase.

      The results interpret speed effects on biomechanics, but each kangaroo was only collected at 1 speed. Are inter-animal comparisons enough to satisfy this investigation?

      We have added a figure (Suppl Fig 9) to demonstrate the distribution of speed and number of trials per kangaroo. We have also removed "preferred" from the manuscript as this seems to cause confusion. Most kangaroos travelled at a range of “casual” speeds.

      Abstract: Can the authors more fully connect the concept of tendon stress and low metabolic rates during hopping across speeds? Surely, tendon mechanics don't directly drive the metabolic cost of hopping, but they affect muscle mechanics to affect energetics.

      Amended to: "This phenomenon may be related to greater elastic energy savings due to increasing tendon stress; however, the mechanisms which enable the rise in stress, without additional muscle work remain poorly understood." (Lines 25-27).

      The topic sentence in lines 61-63 may be misleading. The ensuing paragraph does not substantiate the topic sentence stating that ankle MTUs decouple speeds and energetics.

      We added "likely" to soften the statement. (Line 59)

      Lines 84-86: In humans, does more limb flexion and worse EMA necessitate greater active muscle volume? What about muscle contractile dynamics - See recent papers by Sawicki & colleagues that include Hill-type muscle mechanics in active muscle volume estimates.

      Added: “Smaller EMA requires greater muscle force to produce a given force on the ground, thereby demanding a greater volume of active muscle, and presumably greater metabolic rates than larger EMA for the same physiology”. (Line 80-82)

      Lines 106: can you give the context of what normal tendon safety factors are?

      Good idea. Added: "far lower than the typical safety factor of four to eight for mammalian tendons (Ker et al. 1988)." Line 106-107

      I thought EMA was relatively stable across speeds as per Biewener [Science & JAP '04]. However the authors gave an example of an elephant to suggest that it is typically inversely related to speed. Can the authors please explain the disconnect and the most appropriate explanation in this paragraph?

      Knee EMA in particular changed with speed in Biewener 2004. What is “typical” probably depends on the group of animals studied; e.g., cursorial quadrupedal mammals generally seem to maintain constant EMA, but other groups do not.

      These cases are presented to show a range of consequences for changing EMA (usually with mass, but sometimes with speed). We have made several adjustments to the paragraph to make this clearer. Lines 85-93.

      The results depend on the modeled internal moment arm (r). How confident are the authors in their little r prediction? Considering complications of joint mechanics in vivo including muscle bulging. Holzer et al. '20 Sci Rep demonstrated that different models of the human Achilles tendon moment arm predict vastly different relationships between the moment arm and joint angle.

      Our values for r and EMA closely align with previous papers which measured or calculated these values in kangaroos, such as Kram 1998, and thus we are confident in our interpretation.

      This is a misleading results sentence: Small decreases in EMA correspond to a nontrivial increase in tendon stress, for instance, reducing EMA from 0.242 (mean minimum EMA of the slow group) to 0.206 (mean minimum EMA of the fast group) was associated with an ~18% increase in tendon stress. The authors could alternatively say that a ~15% decrease in EMA was associated with an ~18% increase in tendon stress, which seems pretty comparable.

      Thank you for pointing this out, it is important that it is made clearer. Although the change in relative magnitude is approximately the same (as it should be), this does not detract from the importance. The "small decrease in EMA" is referring to the absolute values, particularly in respect to the measurement error/noise. The difference is small enough to have been undetectable with other methods used in previous studies. We have amended the sentence to clarify this.

      It now reads: “Subtle decreases in EMA which may have been undetected in previous studies correspond to discernible increases in tendon stress. For instance, reducing EMA from 0.242 (mean minimum EMA of the slow group) to 0.206 (mean minimum EMA of the fast group) was associated with an increase in tendon stress from ~50 MPa to ~60 MPa, decreasing safety factor from 2 to 1.67 (where 1 indicates failure), which is both measurable and physiologically significant.” (Line 195-200)
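      The figures quoted in this exchange can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes that, for a fixed ground force and tendon cross-sectional area, tendon stress scales with 1/EMA, and it assumes a tendon failure stress of about 100 MPa, a value inferred here from the quoted pairing of a safety factor of 2 with ~50 MPa rather than stated in the response. The function and constant names are hypothetical.

```python
def relative_change(old, new):
    """Fractional change from old to new (negative means a decrease)."""
    return (new - old) / old


def safety_factor(strength_mpa, stress_mpa):
    """Ratio of failure stress to working stress; 1 indicates failure."""
    return strength_mpa / stress_mpa


# Values quoted in the response.
EMA_SLOW, EMA_FAST = 0.242, 0.206
STRESS_SLOW_MPA, STRESS_FAST_MPA = 50.0, 60.0  # approximate, as quoted

# Assumption: failure stress inferred from "safety factor 2 at ~50 MPa".
ASSUMED_STRENGTH_MPA = 100.0

ema_change = relative_change(EMA_SLOW, EMA_FAST)

# Assumption: with ground force held fixed, muscle-tendon force (and so
# tendon stress) scales with 1/EMA.
stress_change_fixed_load = EMA_SLOW / EMA_FAST - 1.0

# Change implied by the rounded stress values quoted in the response.
stress_change_quoted = relative_change(STRESS_SLOW_MPA, STRESS_FAST_MPA)

print(f"EMA change: {ema_change:.1%}")
print(f"Stress change at fixed load (1/EMA): {stress_change_fixed_load:.1%}")
print(f"Stress change from quoted MPa values: {stress_change_quoted:.1%}")
print(f"Safety factor, slow group: {safety_factor(ASSUMED_STRENGTH_MPA, STRESS_SLOW_MPA):.2f}")
print(f"Safety factor, fast group: {safety_factor(ASSUMED_STRENGTH_MPA, STRESS_FAST_MPA):.2f}")
```

      Under these assumptions the relative decrease in EMA (about 15%) and the relative increase in stress at fixed load (about 17%) are of similar size, which matches the reviewer's observation; the safety factors of 2 and 1.67 follow directly from the quoted stresses.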

      Lines 243-245: "The consistent net work observed among all speeds suggests the ankle extensors are performing similar amounts of ankle work independent of speed." If this is true, and presumably there is greater limb work performed on the center of mass at faster speeds (Donelan, Kram, Kuo), do more proximal leg joints increase work and energy consumption at faster speeds?

      The skin over the proximal leg joints (knee and hip) moves too much to get reliable measures of EMA from the ratio of moment arms. This will be pursued in future work when all muscles are incorporated in the model so knee and hip EMA can be determined from muscle force.

      We have added a limitations and considerations paragraph to the manuscript: “Finally, we did not determine whether the EMA of proximal hindlimb joints (which are more difficult to track via surface motion capture markers) remained constant with speed. Although the hip and knee contribute substantially less work than the ankle joint (Fig. 4), the majority of kangaroo skeletal muscle is located around these proximal joints. A change in EMA at the hip or knee could influence a larger muscle mass than at the ankle, potentially counteracting or enhancing energy savings in the ankle extensor muscle-tendon units. Further research is needed to understand how posture and muscles throughout the whole body contribute to kangaroo energetics.” (Line 321-328)

      Lines 245-246: "Previous studies using sonomicrometry have shown that the muscles of tammar wallabies do not shorten considerably during hops, but rather act near-isometrically as a strut" Which muscles? All muscles? Extensors at a single joint?

      Added "gastrocnemius and plantaris" Line 164-165

      Lines 249-254: "The cost of generating force hypothesis suggests that faster movement speeds require greater rates of muscle force development, and in turn greater cross-bridge cycling rates, driving up metabolic costs (Taylor et al. 1980, Kram and Taylor 1990). The ability for the ankle extensor muscle fibres to remain isometric and produce similar amounts of work at all speeds may help explain why hopping macropods do not follow the energetic trends observed in quadrupedal species." These sentences confuse me. Kram & Taylor's cost of force-generating hypothesis assumes that producing the same average force over shorter contact times increases metabolic rate. How does 'similar muscle work' across all speeds explain the ability of macropods to use unique energetic trends in the cost of force-generating hypothesis context?

      Thank you for highlighting this confusion. We have substantially revised the discussion to clarify where the mechanisms presented deviate from the cost of generating force hypothesis (Lines 270-309).

      Reviewer #3 (Recommendations For The Authors):

      In addition to the points described in the public review, I have additional, related, specific comments:

      (1) Results: Please refer to the hypotheses in the results, and relate the findings back to the hypotheses.

      We now relate the findings back to the hypotheses:

      Line 142 “In partial support of hypothesis (i), greater masses and faster speeds were associated with more crouched hindlimb postures (Fig. 3a,c).”.

      Lines 205-206: “The increase in tendon stress with speed, facilitated in part by the change in moment arms by the shift in posture, may explain changes in ankle work (c.f. Hypothesis (ii)).” 

      (2) Results: please provide the main statistical results either in-line or in a table in the main text.

      We (the co-authors) have discussed this at length, and have agreed that the manuscript is far more readable in the format whereby most statistics lie within the supplementary tables, otherwise a reader is met with a wall of statistics. We only include values in the main text when the magnitude is relevant to the arguments presented in the results and discussion.

      (3) Line 140: Describe how 'crouched' was defined.

      We have now added a brief definition of ‘crouch factor’ after the figure caption (Line 143): “(Fig. 3a,c; where crouch factor is the ratio of total limb length to pelvis-to-toe distance)”.
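      To make the definition concrete, here is a minimal sketch of how such a ratio could be computed from joint positions. It assumes that "total limb length" means the sum of the segment lengths between consecutive joint markers from pelvis to toe; the function name, marker ordering, and coordinates are hypothetical and are not taken from the manuscript.

```python
import math


def crouch_factor(joint_positions):
    """Total limb length divided by the straight-line pelvis-to-toe distance.

    joint_positions: ordered (x, y) or (x, y, z) tuples, pelvis first, toe last.
    Assumption: total limb length is the sum of consecutive segment lengths.
    """
    if len(joint_positions) < 2:
        raise ValueError("need at least a pelvis and a toe position")
    segment_lengths = [
        math.dist(a, b) for a, b in zip(joint_positions, joint_positions[1:])
    ]
    pelvis_to_toe = math.dist(joint_positions[0], joint_positions[-1])
    return sum(segment_lengths) / pelvis_to_toe


# Hypothetical marker coordinates in metres.
straight_limb = [(0.0, 0.0), (0.0, -1.0), (0.0, -2.0)]
bent_limb = [(0.0, 0.0), (1.0, 0.0), (1.0, -1.0)]

print(crouch_factor(straight_limb))
print(crouch_factor(bent_limb))
```

      With this reading, a fully extended limb gives a value of 1 and any flexion raises the value above 1, so a larger crouch factor corresponds to a more crouched posture.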

      (4) Line 162: This seems to be a main finding and should be a figure in the main text not supplemental. Additionally, Supplementary Figures 3a and b do not show this finding convincingly. There should be a figure plotting r vs speed and r vs mass.

      The combination of r and R are represented in the EMA plot in the main text. The r and R plots are relegated to the supplementary because the main text is already very crowded. Thank you for the suggestion for the figure plotting r and R versus speed, this is now included as Suppl. Fig. 3h.

      (5) Line 166: Supplementary Figure 3g does not show the range of dorsiflexion angles as a function of speed. It shows r vs dorsiflexion angle. Please correct.

      Thanks for noticing this, it was supposed to reference Fig 3g rather than Suppl Fig 3g in the sentence regarding speed. We have fixed this, Line 170. 

      We have added a reference to Suppl Fig 3 on Line 169 as this shows where the peak in r with ankle angle occurs (114.4 degrees).

      (6) Line 184: Where are the statistical results for this statement?

      The relationship between stress and EMA does not appear to be linear, thus we only present R<sup>2</sup> for the power relationship rather than a p-value.

      (7) Line 192: The authors should explain how joint work and power relate/support the overall hypotheses. This section also refers to Figures 4 and 5 even though Figures 6 and 7 have already been described. Please reorganize.

      We have added a sentence at the end of the work and power section to mention hypothesis (ii) and lead into the discussion where it is elaborated upon. 

      “The increase in positive and negative ankle work may be due to the increase in tendon stress rather than additional muscle work.” Line 219-220 We have rearranged the figure order.

      (8) The statistics are not reported in the main text, but in the supplementary tables. If a result is reported in the main text, please report either in-line or with a table in the main text.

      We leave most statistics in the supplementary tables to preserve the readability of the manuscript. We only include values in the main text when the magnitude is relevant to the arguments raised in the results and discussion.

    1. But Socrates was right about one thing: writing did atrophy our memory. Not everyone's, of course, but it undoubtedly relegated the act of remembering to the background, both individually and collectively.

      This sentence strikes me as significant because it clearly alludes to how AI, to some extent, makes us enslaved and lazy. By this I do not mean that AI is bad (as the text notes); rather, what catches my attention is how, now as before, people prefer a simpler method over a complicated one to achieve the goal they have in mind. Before, it was about remembering, which writing made easier; now, by contrast, we are talking about more everyday tasks: writing, finding the solution to a problem, asking for advice, making decisions, and so on.

      As the sentence notes, it is not a question of good or bad, but of the fact that, as human beings, we tend to seek ease in our lives. Inevitably, this has happened, is happening, and will keep happening as we continue to evolve as a society.

    2. But, in exchange, writing opened up the possibility of knowing far more than an individual human memory can hold.

      This is another sentence that really caught my attention, because it reflects well the reality we currently face with AI. Yes, it is criticized by many, but at the same time it has opened doors and paths that we could not previously envision.

      AI is a tool that must be used responsibly and thoughtfully so that we do not become dependent on it, but I repeat, it is not a bad thing; it should simply be taken for what it is: a tool one can lean on.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This paper presents results from four independent experiments, each of which tests for rhythmicity in auditory perception. The authors report rhythmic fluctuations in discrimination performance at frequencies between 2 and 6 Hz. The exact frequency depends on the ear and experimental paradigm, although some frequencies seem to be more common than others.

      Strengths:

      The first sentence in the abstract describes the state of the art perfectly: "Numerous studies advocate for a rhythmic mode of perception; however, the evidence in the context of auditory perception remains inconsistent". This is precisely why the data from the present study is so valuable. This is probably the study with the highest sample size (total of > 100 in 4 experiments) in the field. The analysis is very thorough and transparent, due to the comparison of several statistical approaches and simulations of their sensitivity. Each of the experiments differs from the others in a clearly defined experimental parameter, and the authors test how this impacts auditory rhythmicity, measured in pitch discrimination performance (accuracy, sensitivity, bias) of a target presented at various delays after noise onset.

      Weaknesses:

      (1) The authors find that the frequency of auditory perception changes between experiments. I think they could exploit differences between experiments better to interpret and understand the obtained results. These differences are very well described in the Introduction, but don't seem to be used for the interpretation of results. For instance, what does it mean if perceptual frequency changes from between- to within-trial pitch discrimination? Why did the authors choose this experimental manipulation? Based on differences between experiments, is there any systematic pattern in the results that allows conclusions about the roles of different frequencies? I think the Discussion would benefit from an extension to cover this aspect.

      We believe that interpreting these differences remains difficult and a precise, detailed (and possibly mechanistic) interpretation is beyond the goal of the present study. The main goal of this study was to explore the consistency and variability of effects across variations of the experimental design and samples of participants. Interpreting specific effects, e.g. at particular frequencies, would make sense mostly if differences between experiments have been confirmed in a separate reproduction. Still, we do provide specific arguments for why differences in the outcome between different experiments, e.g. with and without explicit trial initialization by the participants, could be expected. See lines 91ff in the introduction and 786ff in the discussion.

      (2) The Results give the impression of clear-cut differences in relevant frequencies between experiments (e.g., 2 Hz in Experiment 1, 6 Hz in Exp 2, etc), but they might not be so different. For instance, a 6 Hz effect is also visible in Experiment 1, but it just does not reach conventional significance. The average across the three experiments is therefore very useful, and also seems to suggest that differences between experiments are not very pronounced (otherwise the average would not produce clear peaks in the spectrum). I suggest making this point clearer in the text.

      We have revised the conclusions to note that the present data do not support clear-cut differences between experiments. For this reason we also refrain from detailed interpretations of specific effects, as suggested by this reviewer in point 1 above.

      (3) I struggle to understand the hypothesis that rhythmic sampling differs between ears. In most everyday scenarios, the same sounds arrive at both ears, and the time difference between the two is too small to play a role for the frequencies tested. If both ears operate at different frequencies, the effects of the rhythm on overall perception would then often cancel out. But if this is the case, why would the two ears have different rhythms to begin with? This could be described in more detail.

      This hypothesis was not invented by us, but was in essence put forward in previous work. The study by Ho et al. CurrBiol 2017 reported rhythmic effects at different frequencies in the left and right ears, and we here tried to reproduce these effects. One could speculate about an ear difference based on studies reporting a right-ear advantage in specific listening tasks, and the idea that different time scales of rhythmic brain activity may specifically prevail in the left and right cortical hemispheres; hence it does not seem improbable that there could be rhythmic effects in both ears at different frequencies. We note this in the introduction, l. 65ff.

      Reviewer #2 (Public review):

      Summary:

      The current study aims to shed light on why previous work on perceptual rhythmicity has led to inconsistent results. They propose that the differences may stem from conceptual and methodological issues. In a series of experiments, the current study reports perceptual rhythmicity in different frequency bands that differ between different ear stimulations and behavioral measures.

      The study suggests challenges regarding the idea of universal perceptual rhythmicity in hearing.

      Strengths:

      The study aims to address differences observed in previous studies about perceptual rhythmicity. This is important and timely because the existing literature provides quite inconsistent findings. Several experiments were conducted to assess perceptual rhythmicity in hearing from different angles. The authors use sophisticated approaches to address the research questions.

      Weaknesses:

      (1) Conceptional concerns:

      The authors place their research in the context of a rhythmic mode of perception. They also discuss continuous vs rhythmic mode processing. Their study further follows a design that seems to be based on paradigms that assume a recent phase in neural oscillations that subsequently influence perception (e.g., Fiebelkorn et al.; Landau & Fries). In my view, these are different facets in the neural oscillation research space that require a bit more nuanced separation. Continuous mode processing is associated with vigilance tasks (work by Schroeder and Lakatos; reduction of low frequency oscillations and sustained gamma activity), whereas the authors of this study seem to link it to hearing tasks specifically (e.g., line 694). Rhythmic mode processing is associated with rhythmic stimulation by which neural oscillations entrain and influence perception (also, Schroeder and Lakatos; greater low-frequency fluctuations and more rhythmic gamma activity). The current study mirrors the continuous rather than the rhythmic mode (i.e., there was no rhythmic stimulation), but even the former seems not fully fitting, because trials are 1.8 s short and do not really reflect a vigilance task. Finally, previous paradigms on phase-resetting reflect more closely the design of the current study (i.e., different times of a target stimulus relative to the reset of an oscillation). This is the work by Fiebelkorn et al., Landau & Fries, and others, which do not seem to be cited here, which I find surprising. Moreover, the authors would want to discuss the role of the background noise in resetting the phase of an oscillation, and the role of the fixation cross also possibly resetting the phase of an oscillation. Regardless, the conceptional mixture of all these facets makes interpretations really challenging. The phase-reset nature of the paradigm is not (or not well) explained, and the discussion mixes the different concepts and approaches. 
      I recommend that the authors frame their work more clearly in the context of these different concepts (affecting large portions of the manuscript).

      Indeed, the paradigms used here and in many similar previous studies incorporate an aspect of phase-resetting, as the presentation of a background noise may effectively reset ongoing auditory cortical processes. Studies trying to probe for rhythmicity in auditory perception in the absence of any background noise have not shown any effect (Zoefel and Heil, 2013), perhaps because the necessary rhythmic processes along auditory pathways are only engaged when some sound is present. We now discuss these points, and also acknowledge the mentioned studies in the visual system; l. 57.

      (2) Methodological concerns:

      The authors use a relatively unorthodox approach to statistical testing. I understand that they try to capture and characterize the sensitivity of the different analysis approaches to rhythmic behavioral effects. However, it is a bit unclear what meaningful effects are in the study. For example, the bootstrapping approach that identifies the percentage of significant variations of sample selections is rather descriptive (Figures 5-7). The authors seem to suggest that 50% of the samples are meaningful (given the dashed line in the figure), even though this is rarely reached in any of the analyses. Perhaps >80% of samples should show a significant effect to be meaningful (at least to my subjective mind). To me, the low percentage rather suggests that there is not too much meaningful rhythmicity present. 

      We note that there is no clear consensus on what fraction of experiments should be expected or how this way of quantifying effects should be precisely valued (l. 441ff). However, we now also clearly acknowledge in the discussion that the effective prevalence is not very high (l. 663).

      I suggest that the authors also present more traditional, perhaps multi-level, analyses: Calculation of spectra, binning, or single-trial analysis for each participant and condition, and the respective calculation of the surrogate data analysis, and then comparison of the surrogate data to the original data on the second (participant) level using t-tests. I also thought the statistical approach undertaken here could have been a bit more clearly/didactically described as well.

      We realize that our description of the methods was possibly not fully clear. We do follow the strategy suggested by this reviewer, but rather than comparing actual and surrogate data based on a parametric t-test, we compare these based on a non-parametric percentile-based approach. This has the advantage of not making specific (and possibly unwarranted) assumptions about the distribution of the data. We have revised the methods to clarify this, l. 332ff.
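      As an illustration of what a non-parametric, percentile-based comparison against surrogate data can look like, the following is a minimal sketch and not the authors' pipeline. It assumes surrogates are generated by shuffling the time points of a performance time course and that the test statistic is the spectral amplitude at a single frequency; the simulated data, sampling rate, and function names are all hypothetical.

```python
import numpy as np


def spectral_amplitude(series, sample_rate_hz, freq_hz):
    """Amplitude (FFT magnitude / N) of the mean-removed series at the bin nearest freq_hz."""
    series = np.asarray(series, dtype=float)
    centered = series - series.mean()
    spectrum = np.abs(np.fft.rfft(centered)) / series.size
    freqs = np.fft.rfftfreq(series.size, d=1.0 / sample_rate_hz)
    return spectrum[np.argmin(np.abs(freqs - freq_hz))]


def percentile_p_value(observed, surrogates):
    """Share of surrogate values at least as large as the observed value.

    The add-one correction keeps the estimate from being exactly zero.
    """
    surrogates = np.asarray(surrogates, dtype=float)
    return (np.sum(surrogates >= observed) + 1) / (surrogates.size + 1)


rng = np.random.default_rng(0)
sample_rate_hz = 20.0
delays_s = np.arange(40) / sample_rate_hz  # 2 s of hypothetical target delays

# Simulated accuracy with a 4 Hz modulation plus noise (hypothetical data).
accuracy = (
    0.75
    + 0.05 * np.sin(2 * np.pi * 4.0 * delays_s)
    + rng.normal(0.0, 0.01, delays_s.size)
)

observed = spectral_amplitude(accuracy, sample_rate_hz, 4.0)

# Assumption: shuffling time points destroys temporal structure while
# preserving the distribution of values.
surrogates = [
    spectral_amplitude(rng.permutation(accuracy), sample_rate_hz, 4.0)
    for _ in range(999)
]

p_value = percentile_p_value(observed, surrogates)
print(f"observed amplitude = {observed:.4f}, p = {p_value:.3f}")
```

      The p-value here is simply the position of the observed statistic within the surrogate distribution, so no assumption about normality is needed; the price is that its resolution is limited by the number of surrogates.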

      The authors used an adaptive procedure during the experimental blocks such that the stimulus intensity was adjusted throughout. In practice, this can be a disadvantage relative to keeping the intensity constant throughout, because, on average, correct trials will be associated with a higher intensity than incorrect trials, potentially making observations of perceptual rhythmicity more challenging. The authors would want to discuss this potential issue. Intensity adjustments could perhaps contribute to the observed rhythmicity effects. Perhaps the rhythmicity of the stimulus intensity could be analyzed as well. In any case, the adaptive procedure may add variance to the data.

      We have added an analysis of task difficulty to the results (new section “Effects of adaptive task difficulty“) to address this. Overall we do not find systematic changes in task difficulty across participants for most of the experiments, but one certainly cannot rule out that this aspect of the design also affects the outcomes. Importantly, we relied on an adaptive task difficulty to actually (or hopefully) reduce variance in the data, by keeping the task difficulty around a certain level. Given the large number of trials collected, not using such an adaptive procedure may result in performance levels around chance or near ceiling, which would make it impossible to detect rhythmic variations in behavior.

      Additional methodological concerns relate to Figure 8. Figures 8A and C seem to indicate that a baseline correction for a very short time window was calculated (I could not find anything about this in the methods section). The data seem very variable and artificially constrained in the baseline time window. It was unclear what the reader might take from Figure 8.

      This figure was intended mostly for illustration of the eye tracking data, but we agree that there is no specific key insight to be taken from this. We removed this. 

      Motivation and discussion of eye-movement/pupillometry and motor activity: The dual task paradigm of Experiment 4 and the reasons for assessing eye metrics in the current study could have been better motivated. The experiment somehow does not fit in very well. There is recent evidence that eye movements decrease during effortful tasks (e.g., Contadini-Wright et al. 2023 J Neurosci; Herrmann & Ryan 2024 J Cog Neurosci), which appears to contradict the results presented in the current study. Moreover, by appealing to active sensing frameworks, the authors suggest that active movements can facilitate listening outcomes (line 677; they should provide a reference for this claim), but it is unclear how this would relate to eye movements. Certainly, a person may move their head closer to a sound source in the presence of competing sound to increase the signal-to-noise ratio, but this is not really the active movements that are measured here. A more detailed discussion may be important. The authors further frame the difference between Experiments 1 and 2 as being related to participants' motor activity. However, there are other factors that could explain differences between experiments. Self-paced trials give participants the opportunity to rest more (inter-trial durations were likely longer in Experiment 2), perhaps affecting attentional engagement. I think a more nuanced discussion may be warranted.

      We expanded the motivation of why self-pacing trials may effectively alter how rhythmic processes affect perception, and now also allude to attention- and expectation-related effects (l. 786ff). Regarding eye movements we now discuss the results in the light of the previously mentioned studies, but again refrain from a very detailed and mechanistic interpretation (l. 782).

      Discussion:

      The main data in Figure 3 showed little rhythmicity. The authors seem to glance over this fact by simply stating that the same phase is not necessary for their statistical analysis. Previous work, however, showed rhythmicity in the across-participant average (e.g., Fiebelkorn's and similar work). Moreover, one would expect that some of the effects in the low-frequency band (e.g., 2-4 Hz) are somewhat similar across participants. Conduction delays in the auditory system are much smaller than the 0.25-0.5 s associated with 2-4 Hz. The authors would want to discuss why different participants would express so vastly different phases that the across-participant average does not show any rhythmicity, and what this would mean neurophysiologically.

      We now discuss the assumptions and implications of similar or distinct phases of rhythmic processes within and between participants (l. 695ff). In particular, we note that different origins of the underlying neurophysiological processes may ultimately suggest that such assumptions are or are not warranted.
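
      The role of phase alignment in this exchange can be illustrated with a small simulation. This is a minimal sketch and not the authors' analysis code: the sampling rate, the 3 Hz rhythm, the number of participants, and the noiseless sinusoidal time courses are all assumptions chosen for illustration. It compares the amplitude spectrum of the across-participant average with the average of the per-participant amplitude spectra.

```python
import numpy as np

FS_HZ = 60.0      # hypothetical sampling rate of the behavioral time course
N_SAMPLES = 120   # 2 s of data, so that 3 Hz falls exactly on an FFT bin
FREQ_HZ = 3.0     # hypothetical rhythm inside the 2-4 Hz band


def simulate_group(n_participants, aligned, seed=0):
    """Return one noiseless sinusoidal time course per participant (rows)."""
    rng = np.random.default_rng(seed)
    t = np.arange(N_SAMPLES) / FS_HZ
    if aligned:
        phases = np.zeros(n_participants)
    else:
        phases = rng.uniform(0.0, 2.0 * np.pi, n_participants)
    return np.sin(2.0 * np.pi * FREQ_HZ * t[None, :] + phases[:, None])


def amplitude_at_rhythm(curves):
    """Amplitude at the FFT bin closest to FREQ_HZ, taken along the last axis."""
    freqs = np.fft.rfftfreq(N_SAMPLES, d=1.0 / FS_HZ)
    spectrum = np.abs(np.fft.rfft(curves, axis=-1)) / N_SAMPLES
    return spectrum[..., np.argmin(np.abs(freqs - FREQ_HZ))]


def compare(aligned):
    """Return (spectrum of the group average, average of individual spectra)."""
    curves = simulate_group(n_participants=20, aligned=aligned)
    spectrum_of_average = amplitude_at_rhythm(curves.mean(axis=0))
    average_of_spectra = amplitude_at_rhythm(curves).mean()
    return spectrum_of_average, average_of_spectra


for aligned in (True, False):
    soa, aos = compare(aligned)
    print(f"aligned={aligned}: spectrum of average={soa:.3f}, "
          f"average of spectra={aos:.3f}")
```

      Under these assumptions the average of the individual spectra is unaffected by phase differences, whereas the spectrum of the averaged time course shrinks when phases differ between participants. This is why an analysis based on per-participant spectra does not require a common phase, while a visible rhythm in the group-average time course does.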

      An additional point that may require more nuanced discussion is related to the rhythmicity of response bias versus sensitivity. The authors could discuss what the rhythmicity of these different measures in different frequency bands means, with respect to underlying neural oscillations.

      We expanded the discussion to interpret what rhythmic changes in each of the behavioral metrics could imply (l. 706ff).

      Figures:

      Much of the text in the figures seems really small. Perhaps the authors would want to ensure it is readable even for those with low vision abilities. Moreover, Figure 1A is not as intuitive as it could be and may perhaps be made clearer. I also suggest the authors discuss a bit more the potential monaural vs binaural issues, because the perceptual rhythmicity is much slower than any conduction delays in the auditory system that could lead to interference.

      We tried to improve the font sizes where possible, and discuss the potential monaural origins as suggested by other reviewers. 

      Reviewer #3 (Public review):

      Summary:

      The finding of rhythmic activity in the brain has, for a long time, engendered the theory of rhythmic modes of perception, that humans might oscillate between improved and worse perception depending on states of our internal systems. However, experiments looking for such modes have resulted in conflicting findings, particularly in those where the stimulus itself is not rhythmic. This paper seeks to take a comprehensive look at the effect and various experimental parameters which might generate these competing findings: in particular, the presentation of the stimulus to one ear or the other, the relevance of motor involvement, attentional demands, and memory: each of which is revealed to affect the consistency of this rhythmicity.

      The need the paper attempts to resolve is a critical one for the field. However, as presented, I remain unconvinced that the data would not be better interpreted as showing no consistent rhythmic mode effect. It lacks a conceptual framework to understand why effects might be consistent in each ear but at different frequencies and only for some tasks with slight variants, some affecting sensitivity and some affecting bias.

      Strengths:

      The paper is strong in its experimental protocol and its comprehensive analysis, which seeks to compare effects across several analysis types and slight experiment changes to investigate which parameters could affect the presence or absence of an effect of rhythmicity. The prescribed nature of its hypotheses and its manner of setting out to test them is very clear, which allows for a straightforward assessment of its results.

      Weaknesses:

      There is a weakness throughout the paper in terms of establishing a conceptual framework both for the source of "rhythmic modes" and for the interpretation of the results. Before understanding the data on this matter, it would be useful to discuss why one would posit such a theory to begin with. From a perceptual side, rhythmic modes of processing in the absence of rhythmic stimuli would not appear to provide any benefit to processing. From a biological or homeostatic argument, it's unclear why we would expect such fluctuations to occur in such a narrow-band way when neither the stimulus nor the neurobiological circuits require it.

      We believe that the framework for why there may be rhythmic activity along auditory pathways that shapes behavioral outcomes has been laid out in many previous studies, prominently here (Schroeder et al., 2008; Schroeder and Lakatos, 2009; Obleser and Kayser, 2019). Many of the relevant studies are cited in the introduction, which is already rather long given the many points covered in this study. 

      Secondly, for the analysis to detect a "rhythmic mode", it must assume that the phase of fluctuations across an experiment (i.e., whether fluctuations are in an up-state or down-state at onset) is constant at stimulus onset, whereas most oscillations do not have such a total phase-reset as a result of input. Therefore, some theoretical positing of what kind of mechanism could generate this fluctuation is critical toward understanding whether the analysis is well-suited to the studied mechanism.

      In line with this and previous comments (by reviewer 2) we have expanded the discussion to consider the issue of phase alignment (l. 695ff). 

      Thirdly, an interpretation of why we should expect left and right ears to have distinct frequency ranges of fluctuations is required. There are a large number of statistical tests in this paper, and it's not clear how multiple comparisons are controlled for, apart from experiment 4 (which specifies B&H false discovery rate). As such, one critical method to identify whether the results are not the result of noise or sample-specific biases is the plausibility of the finding. On its face, maintaining distinct frequencies of perception in each ear does not fit an obvious conceptual framework.

      Again this point was also noted by another reviewer and we expanded the introduction and discussion in this regard (l. 65ff).

      Reviewer #1 (Recommendations for the authors):

      (1) An update of the AR-surrogate method has recently been published (https://doi.org/10.1101/2024.08.22.609278). I appreciate that this is a lot of work, and it is of course up to the authors, but given the higher sensitivity of this method, it might be worth applying it to the four datasets described here.

      Reading this article, we note that our implementation of the AR-surrogate method was essentially as suggested here, and not as implemented by Brookshire. In fact, we had not realized that Brookshire had apparently computed the spectrum based on the group-average data. As explained in the Methods section, and now clarified further, we compute for each participant the actual spectrum of this participant’s data, and a set of surrogate spectra. We then perform a group-average of both to compute the p-value of the actual group-average based on the percentile of the distribution of surrogate averages. This second step differs from Harris & Beale, who used a one-sided t-test. The latter is most likely not appropriate in a strict statistical sense, but possibly more powerful for detecting true results compared to the percentile-based approach that we used (see l. 332ff).
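
      The procedure described in this response (per-participant spectra, per-participant surrogate spectra, group averages of both, and a percentile-based p-value) can be sketched as follows. This is a minimal sketch under stated assumptions and not the authors' implementation: the AR(1) model order, the least-squares fit, the use of plain FFT amplitude spectra, the number of surrogates, and the add-one correction in the p-value are choices made here for illustration, and all function names are hypothetical.

```python
import numpy as np


def amplitude_spectrum(x):
    """Amplitude spectrum of a demeaned 1-D series, with the DC bin dropped."""
    return np.abs(np.fft.rfft(x - np.mean(x)))[1:]


def fit_ar1(x):
    """Least-squares AR(1) coefficient and innovation s.d. (assumed model order)."""
    x = x - np.mean(x)
    phi = np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1])
    phi = float(np.clip(phi, -0.99, 0.99))  # keep the simulated process stationary
    sigma = float(np.std(x[1:] - phi * x[:-1], ddof=1))
    return phi, sigma


def simulate_ar1(phi, sigma, n, rng):
    """Draw one AR(1) series of length n, started from its stationary distribution."""
    noise = rng.normal(0.0, sigma, n)
    y = np.empty(n)
    y[0] = noise[0] / np.sqrt(1.0 - phi ** 2)
    for i in range(1, n):
        y[i] = phi * y[i - 1] + noise[i]
    return y


def group_surrogate_pvalues(data, fs, n_surrogates=200, seed=0):
    """data: (n_participants, n_timepoints). Returns (freqs, p) per frequency bin."""
    rng = np.random.default_rng(seed)
    n_sub, n_t = data.shape
    freqs = np.fft.rfftfreq(n_t, d=1.0 / fs)[1:]
    # Actual spectrum of each participant's own data.
    actual = np.array([amplitude_spectrum(x) for x in data])
    # Surrogate spectra, generated separately for each participant.
    surrogate = np.empty((n_surrogates, n_sub, freqs.size))
    for s, x in enumerate(data):
        phi, sigma = fit_ar1(x)
        for k in range(n_surrogates):
            surrogate[k, s] = amplitude_spectrum(simulate_ar1(phi, sigma, n_t, rng))
    # Group-average both, then locate the actual average within the surrogate averages.
    group_actual = actual.mean(axis=0)
    group_surrogate = surrogate.mean(axis=1)
    exceed = np.sum(group_surrogate >= group_actual, axis=0)
    return freqs, (1.0 + exceed) / (1.0 + n_surrogates)


# Hypothetical demo data: 10 participants with a 5 Hz rhythm at random phases plus noise.
demo_rng = np.random.default_rng(1)
demo_t = np.arange(60) / 50.0
demo_phases = demo_rng.uniform(0.0, 2.0 * np.pi, 10)
demo_data = (2.0 * np.sin(2.0 * np.pi * 5.0 * demo_t[None, :] + demo_phases[:, None])
             + demo_rng.normal(0.0, 1.0, (10, 60)))
demo_freqs, demo_p = group_surrogate_pvalues(demo_data, fs=50.0)
print("p-value at 5 Hz:", demo_p[np.argmin(np.abs(demo_freqs - 5.0))])
```

      Because the spectrum is computed per participant before averaging, the demo rhythm is detected even though its phase differs between participants. Replacing the percentile step with a one-sided t-test across participants would correspond to the alternative mentioned in the response.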

      (2) When results for the four experiments are reported, a reminder for the reader of how these experiments differ from each other would be useful.

      We have added this in the Results section.

      "considerable prevalence of differences around 4Hz, with dual‐task requirements leading to stronger rhythmicity in perceptual sensitivity". There is a striking similarity to recently published data (https://doi.org/10.1101/2024.08.10.607439 ) demonstrating a 4-Hz rhythm in auditory divided attention (rather than between modalities as in the present case). This could be a useful addition to the paragraph.

      We have added a reference to this preprint, and additional previous work pointing in the same direction mentioned in there.  

      (3) There are two typos in the Introduction: "related by different from the question", and below, there is one "presented" too much.

      These have been fixed.

      Reviewer #3 (Recommendations for the authors):

      My major suggestion is that these results must be replicated in a new sample. I understand this is not simple to do and not always possible, but at this point, no effect is replicated from one experiment to the next, despite very small changes in protocol (especially experiment 1 vs 2). It's therefore very difficult to justify explaining the different effects as real as opposed to random effects of this particular sample. While the bootstrapping effects show the level of consistency of the effect within the sample studied, it can not be a substitute for a true replication of the results in a new sample.

      We agree that only an independent replication can demonstrate the robustness of the results. We do consider experiment 1 a replication test of Ho et al. CurrBiol 2017, which yields results different from those reported there. But more importantly, we consider the analysis of ‘reproducibility’ by simulating participant samples a key novelty of the present work, and want to emphasize this over the within-study replication of the same experiment. In fact, in light of the present interpretation of the data, even a within-study replication would most likely not offer a clear-cut answer.

      As I said in the public review, the interpretation of the results, and of why perceptual cycles in arhythmic stimuli could be a plausible theory to begin with, is lacking. A conceptual framework would vastly improve the impact and understanding of the results.

      We tried to strengthen the conceptual framework in the introduction. We believe that this is in large part provided by previous work, and the aim of the present study was to explore the robustness of effects and not to suggest and discover novel effects.

      Minor comments:

      (1) The authors adapt the difficulty as a function of performance, which seems to me a strange choice for an experiment that is analyzing the differences in performance across the experiment. Could you add a sentence to discuss the motivation for this choice?

      We now mention the rationale in the Methods section and in a new section of the Results. There we also provide additional analyses on this parameter.

      (2) The choice to plot the p-values as opposed to the values of the actual analysis feels ill-advised to me. It invites comparison across analyses that isn't necessarily fair. It would be more informative to plot the respective analysis outputs (spectral power, regression, or delta R2) and highlight the windows of significance and their overlap across analyses. In my opinion, this would be a fairer and more accurate depiction of the analyses as they are meant to be used.

      We do disagree. As explained in the Methods (l. 374ff): “(Showing p-values) … allows presenting the results on a scale that can be directly compared between analysis approaches, metrics, frequencies and analyses focusing on individual ears or the combined data. Each approach has a different statistical sensitivity, and the underlying effect sizes (e.g. spectral power) vary with frequency for both the actual data and null distribution. As a result, the effect size reaching statistical significance varies with frequency, metrics and analyses.” 

      The fact that the level of power (or R2 or whatever metric we consider) required to reach significance differs between analyses (one ear, both ears), metrics (d-prime, bias, RT) and between analysis approaches makes showing the results difficult, as we would need a separate panel for each of those. This would multiply the number of panels required e.g. for Figure 4 by 3, making it a figure with 81 axes. Also, neither the original quantities of each analysis (e.g. spectral power) nor the p-values that we show constitute a proper measure of effect size in a statistical sense. In that sense, neither of these is truly ideal for comparing between analyses, metrics etc.

      We do agree, though, that many readers may want to see the original quantification and thresholds for statistical significance. We now show these in an exemplary manner for the Binned analysis of Experiment 1, which provides a positive result and also is an attempt to replicate the findings by Ho et al 2017. This is shown in new Figure 5.

      (3) Typo in line 555 (+ should be plus minus).

      (4) Typo in line 572: "Comparison of 572 blocks with minus dual task those without"

      (5) Typo in line 616: remove "one".

      (6) Line 666 refers to effects in alpha band activity, but it's unclear what the relationship is to the authors' findings, which peak around 6 Hz, lower than alpha (~10 Hz).

      (7) Line 688 typo, remove "amount of".

      These points have been addressed.  

      (8) Oculomotor effect that drives greater rhythmicity at 3-4 Hz. Did the authors analyze the eye movements to see if saccades were also occurring at this rate? It would be useful to know if the 3-4 Hz effect is driven by "internal circuitry" in the auditory system or by the typical rate of eye movement.

      A preliminary analysis of eye movement data was in previous Figure 8, which was removed on the recommendation of another reviewer. This showed that the average saccade rate is about 0.01 saccades per trial per time bin, amounting to on average less than one detected saccade per trial. Hence rhythmicity in saccades is unlikely to explain rhythmicity in behavioral data at the scale of 3-4 Hz. We now note this in the Results.

      Obleser J, Kayser C (2019) Neural Entrainment and Attentional Selection in the Listening Brain. Trends Cogn Sci 23:913-926.

      Schroeder CE, Lakatos P (2009) Low-frequency neuronal oscillations as instruments of sensory selection. Trends Neurosci 32:9-18.

      Schroeder CE, Lakatos P, Kajikawa Y, Partan S, Puce A (2008) Neuronal oscillations and visual amplification of speech. Trends Cogn Sci 12:106-113.

      Zoefel B, Heil P (2013) Detection of Near-Threshold Sounds is Independent of EEG Phase in Common Frequency Bands. Front Psychol 4:262.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      This is an interesting study characterizing and engineering so-called bathy phytochromes, i.e., those that respond to near infrared (NIR) light in the ground state, for optogenetic control of bacterial gene expression. Previously, the authors have developed a structure-guided approach to functionally link several light-responsive protein domains to the signaling domain of the histidine kinase FixL, which ultimately controls gene expression. Here, the authors use the same strategy to link bathy phytochrome light-responsive domains to FixL, resulting in sensors of NIR light. Interestingly, they also link these bathy phytochrome light-sensing domains to signaling domains from the tetrathionate-sensing SHK TtrS and the toluene-sensing SHK TodS, demonstrating the generality of their protein engineering approach more broadly across bacterial two-component systems.

      This is an exciting result that should inspire future bacterial sensor design. They go on to leverage this result to develop what is, to my knowledge, the first system for orthogonally controlling the expression of two separate genes in the same cell with NIR and Red light, a valuable contribution to the field.

      Finally, the authors reveal new details of the pH-dependent photocycle of bathy phytochromes and demonstrate that their sensors work in the gut- and plant-relevant strains E. coli Nissle 1917 and A. tumefaciens.

      Strengths:

      (1) The experiments are well-founded, well-executed, and rigorous.

      (2) The manuscript is clearly written.

      (3) The sensors developed exhibit large responses to light, making them valuable tools for optogenetic applications.

      (4) This study is a valuable contribution to photobiology and optogenetics.

      We thank the reviewer for the positive verdict on our manuscript.

      Weaknesses:

      (1) As the authors note, the sensors are relatively insensitive to NIR light due to the rapid dark reversion process in bathy phytochromes. Though NIR light is generally non-phototoxic, one would expect this characteristic to be a limitation in some downstream applications where light intensities are not high (e.g., in vivo).

      We principally concur with this reviewer’s assessment that delivery of light (of any color) into living tissue can be severely limited by absorption, reflection, and scattering. That notwithstanding, at least two considerations suggest that in-vivo deployment of the pNIRusk setups we presently advance may be feasible.

      First, while the pNIRusk setups are indeed less light-sensitive compared to, e.g., our earlier red-light-responsive pREDusk and pDERusk setups (see Meier et al. Nat Commun 2024), we note that the overall light fluences required for triggering them are in the range of tens of µW per cm². By contrast, optogenetic experiments in vivo, in particular in the neurosciences, often employ light area intensities on the order of mW per cm² and above. Put another way, compared to the optogenetic tools used in these experiments, the pNIRusk setups are actually quite sensitive to light.

      Second, sensitivity to NIR light brings the advantage of superior tissue penetration, see data reported by Weissleder Nat Biotech 2001 and Ash et al. Lasers Med Sci 2017 (both papers are cited in our manuscript). Based on these data, the intensity of blue light (450 nm) falls off 5-10 times more strongly with penetration depth than that of NIR light (800 nm).
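
      The two considerations above can be combined in a back-of-the-envelope calculation. This is a minimal sketch that assumes simple exponential (Beer-Lambert-type) attenuation and reads "5-10 times more strongly" as a 5- to 10-fold larger attenuation coefficient for blue light. The attenuation coefficient for NIR light used below is a hypothetical placeholder and not a value from the cited papers, and the two intensities are only the orders of magnitude quoted in the response.

```python
import math


def depth_for_attenuation(i_surface, i_target, mu_per_mm):
    """Depth (mm) at which I(d) = I0 * exp(-mu * d) drops from i_surface to i_target."""
    return math.log(i_surface / i_target) / mu_per_mm


MU_NIR_PER_MM = 0.5        # hypothetical attenuation coefficient for 800 nm light
I_SURFACE_UW_CM2 = 1000.0  # 1 mW per cm², order of magnitude quoted for in-vivo work
I_TARGET_UW_CM2 = 10.0     # lower end of "tens of µW per cm²" quoted for pNIRusk

depth_nir = depth_for_attenuation(I_SURFACE_UW_CM2, I_TARGET_UW_CM2, MU_NIR_PER_MM)
print(f"NIR: usable down to {depth_nir:.2f} mm under these assumptions")

for ratio in (5.0, 10.0):  # blue light attenuated 5-10 times as strongly as NIR
    depth_blue = depth_for_attenuation(
        I_SURFACE_UW_CM2, I_TARGET_UW_CM2, ratio * MU_NIR_PER_MM
    )
    print(f"blue (ratio {ratio:.0f}): {depth_blue:.2f} mm, "
          f"i.e. {depth_nir / depth_blue:.0f}x shallower than NIR")
```

      In this simplified model the depth at which a given intensity remains scales inversely with the attenuation coefficient, so the absolute depths depend entirely on the placeholder coefficient while the 5- to 10-fold ratio between NIR and blue does not.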

      We have added a brief treatment of these aspects in the Discussion section.

      (2) Though they can be multiplexed with Red light sensors, these bathy phytochrome NIR sensors are more difficult to multiplex with other commonly used light sensors (e.g., blue) due to the broad light responsivity of the Pfr state. This challenge may be overcome by careful dosing of blue light, as the authors discuss, but other bacterial NIR sensing systems with less cross-talk may be preferred in some applications.

      The reviewer is correct in noting that, at least to a certain extent, the pNIRusk systems also respond to blue light owing to their Soret absorbance bands (see Fig. 1). That said, we note two points:

      First, a given photoreceptor that preferentially responds to certain wavelengths, e.g., 700 nm in the case of conventional bacterial phytochromes (BphP), generally absorbs shorter wavelengths to some degree as well. Absorption of these shorter wavelengths suffices for driving electronic and/or vibronic transitions of the chromophore to higher energy levels which often give rise to productive photochemistry and downstream signal transduction. Put another way, a certain response of sensory photoreceptors to shorter wavelengths is hence fully expected and indeed experimentally borne out, as for instance shown by Ochoa-Fernandez et al. in the so-called PULSE setup (Nat Meth 2020, doi: 10.1038/s41592-020-0868-y).

      Second, known BphPs share similar Pr and Pfr absorbance spectra. We therefore expect other BphP-based optogenetic setups to also respond to blue light to some degree. Currently, there are insufficient data to gauge whether individual BphPs systematically differ in their relative sensitivity to blue compared to red or NIR light. Arguably, pertinent experiments may be an interesting subject for future study.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, Meier et al. engineer a new class of light-regulated two-component systems. These systems are built using bathy-bacteriophytochromes that respond to near-infrared (NIR) light. Through a combination of genetic engineering and systematic linker optimization, the authors generate bacterial strains capable of selective and tunable gene expression in response to NIR stimulation. Overall, these results are an interesting expansion of the optogenetic toolkit into the NIR range. The cross-species functionality of the system, modularity, and orthogonality have the potential to make these tools useful for a range of applications.

      Strengths:

      (1) The authors introduce a novel class of near-infrared light-responsive two-component systems in bacteria, expanding the optogenetic toolbox into this spectral range.

      (2) Through engineering and linker optimization, the authors achieve specific and tunable gene expression, with minimal cross-activation from red light in some cases.

      (3) The authors show that the engineered systems function robustly in multiple bacterial strains, including laboratory E. coli, the probiotic E. coli Nissle 1917, and Agrobacterium tumefaciens.

      (4) The combination of orthogonal two-component systems can allow for simultaneous and independent control of multiple gene expression pathways using different wavelengths of light.

      (5) The authors explore the photophysical properties of the photosensors, investigating how environmental factors such as pH influence light sensitivity.

      Weaknesses:

      (1) The expression of multi-gene operons and fluorescent reporters could impose a metabolic burden. The authors should present data comparing optical density for growth curves of engineered strains versus the corresponding empty-vector control to provide insight into the burden and overall impact of the system on host viability and growth.

      In response to this comment, we have recorded growth kinetics of bacteria harboring the pNIRusk-DsRed plasmids or empty vectors under both inducing (i.e., under NIR light) and noninducing conditions (i.e., darkness). We did not observe systematic differences in the growth kinetics between the different cultures, thus suggesting that under the conditions tested there is no adverse effect on cell viability.

      We include the new data in Suppl. Fig. 5c-d and refer to them in the main text.

      (2) The manuscript consistently presents normalized fluorescence values, but the method of normalization is not clear (Figure 2 caption describes normalizing to the maximal fluorescence, but the maximum fluorescence of what?). The authors should provide a more detailed explanation of how the raw fluorescence data were processed. In addition, or potentially in exchange for the current presentation, the authors should include the raw fluorescence values in supplementary materials to help readers assess the actual magnitude of the reported responses.

      We appreciate this valid comment and have altered the representation of the fluorescence data. All values for a given fluorescent protein (i.e., either DsRed or YPet) across all systems are now normalized to a single reference value, thus enabling direct comparison between experiments.

      (3) Related to the prior point, it would be useful to have a positive control for fluorescence that could be used to compare results across different figure panels.

      As all data are now normalized to the same reference value, direct comparison across all figures is enabled.

      (4) Real-time gene expression data are not presented in the current manuscript, but it would be helpful to include a time-course for some of the key designs to help readers assess the speed of response to NIR light.

      In response to this comment, we include in the revised manuscript induction kinetics of bacterial cultures bearing pNIRusk upon transfer to inducing NIR-light conditions. To this end, aliquots were taken at discrete timepoints, transcriptionally and translationally arrested, and analyzed for optical density and DsRed reporter fluorescence after allowing for chromophore maturation.

      We include the new data in Suppl. Fig. 5e and refer to them in the manuscript.

      Moreover, we note that the experiments in Agrobacterium tumefaciens used a luciferase reporter thus enabling the continuous monitoring of the light-induced expression kinetics. These data (unchanged in revision) are to be found in Suppl. Fig. 9.

      Reviewer #3 (Public review):

      Summary:

      This paper by Meier et al introduces a new optogenetic module for the regulation of bacterial gene expression based on "bathy-BphP" proteins. Their paper begins with a careful characterization of kinetics and pH dependence of a few family members, followed by extensive engineering to produce infrared-regulated transcriptional systems based on the authors' previous design of the pDusk and pDERusk systems, and closing with characterization of the systems in bacterial species relevant for biotechnology.

      Strengths:

      The paper is important from the perspective of fundamental protein characterization, since bathy-BphPs are relatively poorly characterized compared to their phytochrome and cyanobacteriochrome cousins. It is also important from a technology development perspective: the optogenetic toolbox currently lacks infrared-stimulated transcriptional systems. Infrared light offers two major advantages: it can be multiplexed with additional tools, and it can penetrate into deep tissues with ease relative to the more widely used blue light-activated systems. The experiments are performed carefully, and the manuscript is well written.

      Weaknesses:

      My major criticism is that some information is difficult to obtain, and some data is presented with limited interpretation, making it difficult to obtain intuition for why certain responses are observed. For example, the changes in red/infrared responses across different figures and cellular contexts are reported but not rationalized. Extensive experiments with variable linker sequences were performed, but the rationale for linker choices was not clearly explained. These are minor weaknesses in an overall very strong paper.

      We are grateful for the positive take on our manuscript.

      Reviewer #1 (Recommendations for the authors):

      (1) As eLife is a broad audience journal, please define the Soret and Q-bands (line 125).

      We concur and have added labels in fig. 1a that designate the Soret and Q bands.

      (2) The initial (0) Ac design in Figure 2b is activated by NIR and Red light, albeit modestly. The authors state that this construct shows "constant reporter fluorescence, largely independent of illumination" (line 167). This language should be changed to reflect the fact that this Ac construct responds to both of these wavelengths.

      Agreed. We have amended the text accordingly.

      (3) pNIRusk Ac 0 appears to show a greater light response than pNIRusk Av -5. However, the authors claim that the former is not light-responsive and the latter is. This conclusion should be explained or changed.

      The assignment of pNIRusk Av-5 as light-responsive is based on the relative difference in reporter fluorescence between darkness and illumination with either red or NIR light. Although the overall fluorescence is much lower in Av-5 than for Av-0, the relative change upon illumination is much more pronounced. We add a statement to this effect to the text.
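
      The distinction drawn here between overall fluorescence and relative change can be made concrete with a short calculation. This is a minimal sketch: the two designs and their fluorescence values are hypothetical placeholders chosen for illustration and are not data from the manuscript.

```python
def fold_change(dark, lit):
    """Relative change in reporter fluorescence upon illumination (lit / dark)."""
    if dark <= 0:
        raise ValueError("dark-state fluorescence must be positive")
    return lit / dark


# Hypothetical normalized fluorescence values, NOT taken from the manuscript.
designs = {
    "bright_but_flat": {"dark": 0.50, "lit": 0.80},
    "dim_but_switchable": {"dark": 0.01, "lit": 0.08},
}

for name, f in designs.items():
    absolute = f["lit"] - f["dark"]
    relative = fold_change(f["dark"], f["lit"])
    print(f"{name}: absolute change {absolute:.2f}, fold change {relative:.1f}")
```

      With these placeholder numbers the dim design shows the smaller absolute change but the larger fold change, which is the criterion the response describes for calling a design light-responsive.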

      (4) The authors state that "when combining DmDERusk-Str-YPet with AvTod+21-DsRed expression rose under red and NIR light, respectively, whereas the joint application of both light colors induced both reporter genes" (lines 258-261). In contrast, Figure 3c shows that application of both wavelengths of light results in exclusive activation of YPet expression. It appears the description of the data is wrong and must be corrected. That said, this error does not impact their conclusion that two separate target genes can be independently activated by NIR and red light.

      We thank the reviewer for catching this error which we have corrected in the revised manuscript.

      (5) Line 278: I don't agree with the authors' blanket statement that the use of upconversion nanoparticles is a "grave" limitation for NIR-light mediated activation of bacterial gene expression in vivo. The authors should either expound on the severity of the limitation or use more moderate language.

      We have replaced the word ‘grave’ by ‘potential’ and thereby toned down our wording.

      Reviewer #2 (Recommendations for the authors):

      (1) Please include a discussion on the expected depth penetration of different light wavelengths. This is most relevant in the context of the discussion about how these NIR systems could be used with living therapeutics.

      Given the heterogeneity of biological tissue, it is challenging to state precise penetration depths for different wavelengths of light. That said, blue light for instance is typically attenuated by biological tissue around 5 to 10 times as strongly as near-infrared light is.

      We have expanded the Discussion chapter to cover these aspects.

      (2) It would be helpful for Figure 2C (or supplementary) to also include the response to blue light stimulation.

      We agree and have acquired pertinent data for the blue-light response. The new data are included in an updated Fig. 2c. Data acquired at varying NIR-light intensities, originally included in Fig. 2c, have been moved to Suppl. Fig. 5a-b.

      (3) In Figure 4A, data on the response of E. coli Nissle to blue and red light are missing. Including this would help identify whether the reduced sensitivity to non-NIR wavelengths observed in the E. coli lab strain is preserved in the probiotic background.

      In response to this comment, we have acquired pertinent data on E. coli Nissle. While the results were overall similar to those in the laboratory strain, the response to blue and NIR light was yet lower in the Nissle bacteria which stands to benefit optogenetic applications.

      We have updated Fig. 4a accordingly. For clarity, we only show the data for AvNIRusk in the main paper but have relegated the data on AcNIRusk to Suppl. Fig. 8. (Note that this has necessitated a renumbering of the subsequent Suppl. Figs.)

      (4) On many of the figures, there are thin gray lines that appear between the panels that it would be nice to eliminate because, in some cases, they cut through words and numbers.

      The grey lines likely arose from embedding the figures into the text document. In the typeset manuscript, which has become available on the eLife webpage in the meantime, there are no such lines. That said, we will carefully check throughout the submission/publishing/proofing process lest these lines reappear.

      (5) Page 7, line 155: "As not least seen" typo or awkward phrasing.

      We have restructured the sentence and thereby hopefully clarified the unclear phrasing.

      (6) Page 7, line 167: It does not appear to be the case that the initial pNIRusk designs show constant fluorescence that is largely independent of illumination. AcNIRusk shows an almost twofold change from dark to NIR. Reword this to avoid confusion.

      We concur with this comment, similar to reviewer #1’s remark, and have adjusted the text accordingly.

      (7) Page 8, line 174: Related to the previous point, AvNIRusk has one design that is very minimally light switchable (-5), so stating that six light switchable designs have been identified is also confusing.

      As stated in our response to reviewer #1 above, the assignment of AvNIRusk-5 as light-switchable is based on the relative fluorescence change upon illumination. We have added an explanation to the text.

      (8) Page 10, line 228-229: I was not able to find the data showing that expression levels were higher for the DmTtr systems than the pREDusk and pNIRusk setups. This may be an issue related to the normalization point. It was not clear to me how to compare these values.

      We apologize for the initially unclear representation of the data. In response to this reviewer’s general comments above, we have now normalized all fluorescence values to a single reference value, thus allowing their direct comparison.

      (9) Page 12, line 264: "finer-grained expression control can be exerted..." Either show data or adjust the language so that it is clear this is a prediction.

      True, we have replaced ‘can’ by ‘could’.

      (10) Page 25, line 590: CmpX13 cells have a reference that is given later, but it should be added where it first appears.

      Agreed, we have added the reference in the indicated place.

      (11) Page 25, line 592: define LB/Kan.

      We had already defined this abbreviation further up but, for clarity, we have added it again in the indicated position.

      (12) Page 40, line 946: "normalized by" rather than "to".

      We have implemented the requested change in the indicated and several other positions of the manuscript.

      (13) Figures 2C, 3C, and similar plots in the supplementary material would benefit from having a legend for the colors.

      We agree and have added pertinent legends to the corresponding main and supplementary figures.

      (14) As a reader, I had some trouble following all the acronyms. This is at the author's discretion, but I would eliminate ones that are not strictly essential (e.g. MTP for microtiter plate; I was unable to identify what "MCS" meant; look for other opportunities to remove acronyms).

      In the revised manuscript, we have defined the abbreviation ‘MCS’ (for ‘multiple-cloning site’) upon first occurrence. We have decided to retain the abbreviation ‘MTP’ in the text.

      (15) Could the authors briefly speculate on why A. tumefaciens activation with red light might occur?

      While we can but speculate as to the underlying reasons for the divergent red-light response in A. tumefaciens, we discuss possible scenarios below.

      Commonly, two-component systems (TCS) exhibit highly cooperative and steep responses to signal. As a consequence, even small differences in the intracellular amounts of phosphorylated and unphosphorylated response regulator (RR) can give rise to significantly changed gene-expression output. Put another way, the gene-expression output need not scale linearly with the extent of RR phosphorylation but, rather, is expected to show nonlinear dependence with pronounced thresholding effects.
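
      The thresholding argument can be illustrated with a Hill-type dose-response curve. This is a minimal sketch, not a model taken from the manuscript: the Hill form itself, the half-maximal point `k_half`, and the coefficient `n` are assumptions chosen purely for illustration.

```python
def hill_output(phospho_fraction, k_half=0.5, n=4.0):
    """Hypothetical gene-expression output (0..1) as a Hill function
    of the phosphorylated response-regulator fraction (0..1).

    k_half and n are illustrative assumptions, not measured values.
    """
    x = phospho_fraction ** n
    return x / (k_half ** n + x)


# A modest shift in RR phosphorylation around the threshold ...
low = hill_output(0.4)
high = hill_output(0.6)

# ... produces a disproportionately larger change in output.
input_fold_change = 0.6 / 0.4
output_fold_change = high / low

print(f"input fold change:  {input_fold_change:.2f}")
print(f"output fold change: {output_fold_change:.2f}")
```

      With these assumed parameters, a 1.5-fold change in the phosphorylated fraction yields a roughly 2.3-fold change in output, so small host-dependent differences in RR phosphorylation could plausibly translate into markedly different reporter levels.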

      Differences in the pertinent RR levels can for instance arise from variations in the expression levels of the pNIRusk system components between E. coli and A. tumefaciens. Moreover, the two bacteria greatly differ in their TCS repertoire. Although TCSs are commonly well insulated from each other, cross-talk with endogenous TCSs, even if limited, may cause changes in the levels of phosphorylated RR and hence gene-expression output. In a similar vein, the RR can also be phosphorylated and dephosphorylated non-enzymatically, e.g., by reaction with high-energy anhydrides (such as acetyl phosphate) and hydrolysis, respectively. Other potential origins for the divergent red-light response include differences in the strength of the promoters driving expression of the pNIRusk system components and the fluorescent/luminescent reporters, respectively.

      (16) It would be helpful for the authors to briefly explain why they needed to switch to luminescence from fluorescence for the A. tumefaciens studies.

      While there was no strict necessity to switch from the fluorescence-based system used in E. coli to a luminescence-based system in A. tumefaciens, we opted for luminescence based on prior experience with other Alphaproteobacteria (e.g., 10.1128/mSystems.00893-21), where luminescence offered significant advantages. Specifically, it provides essentially background-free signal detection and greater sensitivity for monitoring gene expression. In addition, as demonstrated in Suppl. Fig. 9c and d, the luminescence system enables real-time tracking of gene expression dynamics, which further supported its use in our experimental setup (see our response to reviewer #2’s general comments).

      (17) This is a very minor comment that the authors can take or leave, but I got hung up on the word "implement" when it appeared a few times in the manuscript because I tended to read it as "put a plan into place" rather than its other meaning.

      In the abstract, we have replaced one instance of the word ‘implement’ by ‘instrument’.

      (18) The authors should include the relevant constructs on AddGene or another public strain-sharing service.

      We whole-heartedly subscribe to the idea of freely sharing research materials with fellow scientists. Therefore, we had already deposited the most relevant AvNIRusk plasmid with Addgene, even prior to the initial submission of the manuscript (accession number 235084). In the meantime, we have released the deposition, and the plasmid has been available from Addgene since May 15th of this year.

      Reviewer #3 (Recommendations for the authors):

      Suggestion for improvement:

      This paper relies heavily on variations in linker sequences to shift responses. I am familiar with prior work from the Möglich lab in which helical linkers were employed to shift responses in synthetic two-component systems, with interesting periodicity in responses with every 7 residues (as expected for an alpha helix) and inversion of responses at smaller linker shifts. There is no mention in this paper whether their current engineering follows a similar rationale, what types of linkers are employed (e.g. flexible vs helical), and whether there is an interpretation for how linker lengths alter responses. Can you explain what classes of linker sequences are used throughout Figures 2 and 3, and whether length or periodicity affects the outcome? This would be very helpful for readers who are new to this approach, or if the rationale here differs from the authors' prior work.

      The PATCHY approach employed at present followed a rationale closely similar to that of our previous studies. That is, linkers were extended/shortened and varied in their sequence by recombining different fragments of the natural linkers of the parental receptors, i.e., the bacteriophytochrome and the FixL sensor histidine kinase, respectively. We have added a statement to this effect in the text and a reference to Suppl. Fig. 3, which illustrates the principal approach.

      Compared to our earlier studies, we isolated fewer receptor variants supporting light-regulated responses, despite covering a larger sequence space. Owing to the sparsity of the light-regulated variants, an interpretation of the linker properties and their correlation with light-regulated activity is challenging. Although doubtless unsatisfying from a mechanistic viewpoint, we therefore refrain from a pertinent discussion which would be premature and speculative at this point. As the reviewer raises a valid and important point, we have expanded the text by referring to our earlier studies and the observed dependence of functional properties on linker composition.

      It is sometimes difficult to intuit or rationalize the differences in red/IR sensitivity across closely related variants. An important example appears in Figure 3C vs 3B. I think the AvTod+21 in 3B should be the equivalent to the DsRed response in the second column of 3C (AvTod+21 + DmDERusk), except, of course, that the bacteria in 3C carry an additional plasmid for the DERusk system. However, in 3B, the response to red light is substantial - ~50% as strong as that for IR, whereas in 3C, red light elicits no response at all. What is the difference? The reason this is important is that the AvTod+21 and DmDERusk represent the best "orthogonal" red and infrared light responses, but this is not at all obvious from 3B, where AvTod+21 still causes a substantial (and for orthogonality, undesirable) response under red light. Perhaps subtle differences in expression level due to plasmid changes cause these differences in light responses? Could the authors test how the expression level affects these responses? The paper would be greatly improved if observations of the diverse red/IR responses could be rationalized by some design criteria.

      As noted above in our response to reviewer #2, we have now normalized all fluorescence readings to joint reference values, thus allowing a better comparison across experiments.

      The reviewer is correct in noting that upon multiplexing, the individual plasmid systems support lower fluorescence levels than when used in isolation. We speculate that the combination of two plasmids may affect their copy numbers (despite the use of different resistance markers and origins of replication) and hence their performance. Likewise, the cellular metabolism may be affected when multiple plasmids are combined. These aspects may well account for the absent red-light response in AvTod+21 in the multiplexing experiments which is – indeed – unexpected. As, at present, we cannot provide a clear rationalization for this effect, we recommend verifying the performance of the plasmid setups when multiplexing.

      The paper uses "red" and "infrared" to refer to ~624 nm and ~800 nm light, respectively. I wonder whether it might be possible to shift these peak wavelengths to obtain even better separation for the multiplexing experiments. Perhaps shifting the specific red wavelength could result in better separation between DERusk and AvTod systems, for example? Could the authors comment on this (maybe based on action spectra of their previously developed tools) or perhaps test a few additional stimulation wavelengths?

      The choice of illumination wavelengths used in these experiments is dictated by the LED setups available for illumination of microtiter plates. On the one hand, we are using an SMD (surface-mount device) three-color LED with a fixed wavelength of the red channel around 624 nm (see Hennemann et al., 2018). On the other hand, we are deploying a custom-built device with LEDs emitting at around 800 nm (see Stüven et al., 2019 and this work). Adjusting these wavelengths is therefore challenging, although without doubt potentially interesting.

      To address this reviewer comment, we have added a statement to the text that the excitation wavelengths may be varied to improve multiplexed applications.

      Additional minor comments:

      (1) Figure 2C: It would be very helpful to place a legend on the figure panel for what the colors indicate, since they are unique to this panel and non-intuitive.

      This comment coincides with one by reviewer #2, and we have added pertinent legends to this and related supplementary figures.

      (2) Figure 3C: it is not obvious which system uses DsRed and which uses YPet in each combination, since the text indicates that all combinations were cloned, and this is not clearly described in the legend. Is it always the first construct in the figure legend listed for DsRed and the second for YPet?

      For clarification, we have revised the x-axis labels in Fig. 3C. (And yes, it is as this reviewer surmises: the first of the two constructs harbored DsRed and the second one YPet.)

    1. Summary note: The forms of violence and testimony

      This summary note explores the various forms and functions of testimony in the face of violence, drawing on Didier Fassin's analysis in "Les formes de la violence (8)".

      It highlights the importance of attesting to violence, the various figures of the witness, the challenges of representing violence, and the emergence of new technological mediations for revealing the truth.

      I. Attesting to violence: an urgent task in the face of invisibilization

      The most common reason for writing about and representing violence is to attest to it, a task all the more urgent when the reality has been made invisible. The author cites two contemporary examples of such invisibilization and of attempts at attestation:

      French colonial violence in Algeria: Despite a 2005 law that "requires school curricula... to acknowledge the positive role of the French presence overseas", works such as Alain Ruscio's "La première guerre d'Algérie" (2024) recall the "land seizures, population displacements, massacres of villagers, smoke asphyxiations in caves, and hundreds of thousands of deaths, mostly of civilians" perpetrated by the French expeditionary corps.

      The expulsion of the Palestinians (the Nakba): The expulsion of "750,000 Palestinians, roughly half the Arab population of the territory", which led to the "destruction of villages and, in some cases, the murder of their inhabitants", was long ignored.

      The film "Partition" (2025) by Diana Allan, which extends her book "Voices of the Nakba", aims to "restore the experience of the Nakba through the colonial archives of the British Mandate" and the accounts of Palestinians.

      These undertakings aim to attest to what nations have "often buried in the depths of oblivion".

      Although perpetrators of violence may have an interest in displaying it, for reasons ranging from "the enjoyment of exercising force to the production of a regime of terror", they often have "an even greater interest in concealing it, disguising it, denying it" in order to avoid condemnation or sanction.

      In such cases, it is crucial for victims, their relatives, and "justice entrepreneurs" (lawyers, human rights activists, researchers) to provide proof of the violence, its circumstances, and those responsible.

      "To attest to violence is therefore to fight denial, concealment, lies, and historical revisionism. To attest to violence is to bear witness to it, to make oneself its witness."

      II. The figures of the witness: between objectivity and subjectivity

      Drawing on Émile Benveniste, the author distinguishes two conceptions of the witness, chiefly through Latin:

      Testis: "one who attends as a third party an affair in which two persons are concerned, having been present at the moment the events occurred".

      His word "can be used to settle a dispute, provided it is established that he was not himself an interested party". The testis is external to the scene, and his observation is presumed objective.

      Superstes: "describes the witness as the one who subsists beyond, at once witness and survivor".

      His testimony is authorized by the fact of having "lived through the events himself, particularly when they involved a danger or an ordeal, and of having survived that peril".

      The superstes is the victim; his account is necessarily subjective and not beyond suspicion.

      This distinction is put to the test by the literature on the Shoah.

      A. The challenge of testimony in the face of Nazi concealment

      The extermination of the Jews and the Roma was not something the Nazis boasted about; they sought to conceal it, including "from the German people and from themselves".

      Hannah Arendt, in "Eichmann à Jérusalem", emphasizes the use of a "coded language", or "language rules", which were "in ordinary speech... a lie", to euphemize the crimes: "final solution", "special treatment", "evacuation".

      The effect of this language system was not "to keep people from knowing what they were doing, but to keep them from relating their acts to their former, normal notion of murder and lies, in short to make mentally acceptable what might have seemed morally intolerable to them."

      Pierre Vidal-Naquet adds that this coded language facilitated later Holocaust denial.

      The Nazis, aware of what was going to happen, cynically warned the prisoners: "However this war may end, we have already won it against you; none of you will be left to bear witness. But even if a few were to escape, the world will not believe them, there will be no certainty, because we will destroy the evidence by destroying you." (Primo Levi, "Les naufragés et les rescapés").

      This fear of not being believed haunted the survivors, many of whom recounted a recurring nightmare in which their loved ones did not believe them.

      Hence the vital importance of testimony, as Robert Antelme put it: "we wanted to speak, to be heard at last".

      B. The complexity of survivors' testimony (superstes/testis)

      In writing "Si c'est un homme", Primo Levi sought to "attest" to his experience.

      Yet he expresses deep unease, judging that "we, the survivors, are not the true witnesses... for we are those who, through prevarication, skill, or luck, did not touch bottom."

      The "Muselmänner" ("Muslims" in camp slang: those so weakened that they were doomed to die) are the "complete witnesses".

      Levi's reflection puts the testis/superstes distinction to the test:

      • He is an indisputable superstes, having survived the unthinkable and describing the outrage of the "demolition of a man".
      • But he is also a testis, aware that he can never restore the experience of those who were devoured, and for whom he speaks "in their place, by proxy".

      The example of Hurbinek, the paralyzed and mute child at Auschwitz, whose "need to speak burst from his gaze with explosive force", and of whom Primo Levi writes "he bears witness through my words", illustrates this tragic reconciliation of the two figures: "the superstes turned testis rescues the little boy's memory from nothingness."

      C. Diversity of styles and temporalities of testimony

      The accounts of survivors of the genocide adopt varied styles and temporalities:

      • Immediate testimony: David Rousset ("L'univers concentrationnaire", 1946) met with rapid success despite the reluctance of European societies, perhaps thanks to a "form of aesthetic pursuit" that creates a distance "which neutralizes emotions". His writing is "austere and ironic", using "elliptical and cutting formulations, sometimes caustic and disturbing."

      • Deferred testimony: Charlotte Delbo ("Aucun de nous ne reviendra", 1965) wrote a first draft after her release, then returned to it 20 years later. She begins with the collective scene of the trains arriving, using short sentences and strong images to express "the inconceivable".

      • Anti-memoir: Imre Kertész ("Être sans destin", 1985) adopts the "naive, bewildered" gaze of an adolescent, describing the gradual discovery of the horror of the camps, such as "the smell... sweetish, somehow sticky" of the crematorium. He describes "physical deterioration" without pathos, and even a "muted desire" to live at the moment of the "final sorting of the dying".

      • Mistrust and refusal of confinement: Ruth Kluger ("Refus de témoigner. Une jeunesse", 1992) writes to express her mistrust of the proliferation of testimonies and her refusal to be reduced to her condition as a deportee.

      The experience of the victims of Nazism is at once "specific" (starting from an individual lived experience) and "indeterminate" (the need to find the words and the form in the face of "abyssal incommunicability").

      The vast majority of survivors must "accept being neither superstes nor testis, and therefore remain silent."

      III. Other figures of the witness and mediations

      A. Auctor and histor: authority and knowledge

      Auctor (Latin): "the one who increases trust, the guarantor, the source and hence the authority" and "the one who prompts action, the instigator, the creator and hence the author".

      Credit is the foundation of his testimony.

      Histor (Greek): "the one who knows... the historian". Inquiry is the foundation of his testimony.

      These figures did not live through the events but can vouch for them. Contemporary historians "often combine the two dimensions", benefiting from the "credit of their discipline" and relying on "inquiries conducted in archives or through interviews".

      Jean Hatzfeld and his book "Dans le nu de la vie" (2000), on the Rwandan genocide, illustrate the auctor.

      He gathers survivors' accounts, taking it upon himself to persuade them to speak despite their reluctance.

      A journalist and writer, he uses his dual authority to "attest to what was, and still is... the experience of these men, these women, these children who lived through the massacre."

      Although the accounts are written in the first person, they are "entirely written by a third person, the author."

      The histor is illustrated by social scientists who reconstruct and interpret the facts by drawing on "national or foreign archives, judgments handed down by international courts, local newspaper articles, interviews with people occupying different positions, observations of trials".

      The work of Mahmood Mamdani ("When Victims Become Killers", 2001) interprets the Rwandan genocide in light of colonial history, distinguishing the genocide carried out by "settlers" from that carried out by "natives".

      Hélène Dumas ("Le génocide au village", 2014) focuses on the "micro-local mechanics of the violence", showing that the genocide is "an affair of neighbors and relatives" and that the perpetrators "take pleasure in the suffering and humiliation of their victims."

      Beata Umubyeyi Mairesse ("Le convoi", 2024), a survivor of the Rwandan genocide, stands out for her reflexivity and integrity.

      She is both superstes, recounting her survival, and testis, describing what she saw.

      She also becomes the historian of her own story, exploring archives and conducting interviews, but "she is reluctant to assert authority," refusing to be the auctor.

      B. Martus: the witness-martyr

      In ancient Greek, "martus" means the witness, but also, more specifically in the Bible, the "witness of God", that is, the martyr, the one who "agreed to die to attest to his belief".

      Giorgio Agamben ("Ce qui reste d'Auschwitz", 1998) notes that Christian martyrdom had to "justify the scandal of a senseless death".

      The Arabic "shahid" has a similar meaning, designating both the witness and the martyr.

      In Palestine, the figure of the shahid developed as the "cement of national unity".

      The shahid may be a victim killed "without having chosen it" or a fighter who exposed himself "voluntarily for the cause of his people".

      This duality transforms the meaning of martyrdom, extending it from "freely consented sacrifice to death endured", and from the "strictly religious to the political".

      "Every Palestinian shot or executed by the Israelis is a shahid who, by dying in an unequal confrontation, attests to his belonging to his community and bears witness to the brutality of the enemy."

      For Palestinian martyrs, sacrifice or death is a response to an "impossible life to which death would tragically restore meaning".

      The author quotes the photojournalist Fatima Hassouna: "As for death, which is inevitable: if I die, I want a resounding death. I do not want to be a mere news brief or one number among others. I want a death the whole world will hear about, an imprint that will remain forever, emotions, immortal images that neither time nor space can bury."

      IV. Technological mediations of testimony

      Testimony is expressed not only through speech, writing, or the body (in the case of the martyr), but also through "mediations in which technologies can be mobilized".

      The most innovative example is Forensic Architecture (founded in 2010 by Eyal Weizman), an agency that develops "techniques, methods, and concepts for conducting investigations into state violence and corporate violence".

      By combining "satellite imagery, surveillance cameras, audio and video recordings, and individual and collective testimonies", Forensic Architecture reconstructs in 3D violent events that have been concealed.

      The many cases studied include the genocide of the Herero and Nama, Israeli massacres during the Nakba, the killing of hostages in Colombia, the killing of Mark Duggan in the United Kingdom, the use of European weapons in Yemen, and events in France (Adama Traoré, Zineb Redouane).

      These technologies make it possible to "reveal numerous acts of violence, war crimes identified, culprits recognized, official versions refuted, certain truths told, and justice sometimes rendered".

      They "reinforce, enrich, and sometimes even replace human testimony".

      V. Conclusion: The complexity of testimony in bringing truth into existence

      In summary, the author has sketched five ideal-typical figures of the witness:

      • The testis: present at the time of the events, which he can recount.
      • The superstes: a survivor, who can convey what he lived through.
      • The auctor: an external agent, who brings credibility.
      • The histor: a legitimate expert, who conducts an inquiry.
      • The martus: a sacrificial victim, who affirms the justness of his cause through his renunciation.

      Each of these figures "engages political and moral forms: the veracity of the testis, the authenticity of the superstes, the authority of the auctor, the neutrality of the histor, the commitment of the martus."

      These figures are not watertight and "mingle, combine, shift, and grow more complex" in reality.

      Beyond these distinctions, "what is at stake in testimony is bringing a truth into existence, and in particular... bringing it into existence against concealment, invisibilization, and denial".

      Therein lies the importance of "those whose project is to reveal the truth, or at least a part of the truth to which they have had access."

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The authors make fairly strong claims that "arousal-related fluctuations are isolated from neurons in the deep layers of the SC" (emphasis added). This conclusion is based on comparisons between a "slow drift axis", a low-dimensional representation of neuronal drift, and other measures of arousal (Figures 2C, 3) and motor output sensitivity (Figures 2B, 3B). However, the metrics used to compare the slow-drift axis and motor activity were computed during separate task epochs: the delay period (600-1100 ms) and a perisaccade epoch (25 ms before and after saccade initiation), respectively. As the authors reference, deep-layer SC neurons are typically active only around the time of a saccade. Therefore, it is not clear if the lack of arousal-related modulations reported for deep-layer SC neurons is because those neurons are truly insensitive to those modulations, or if the modulations were not apparent because they were assessed in an epoch in which the neurons were not active. A potentially more valuable comparison would be to calculate a slow-drift axis aligned to saccade onset. 

      The reviewer makes an important point that the calculation of an axis can depend critically on the time window of neuronal response. On considering this, we find that the slow drift axis is less sensitive to this issue because it is calculated on time-averaged activity over multiple trials. In previous work we found that slow drift calculated on the stimulus-evoked response in V4 was very well aligned to slow drift calculated on pre-stimulus spontaneous activity (Cowley et al, Neuron, 2020, Supplemental Figure 3A and 3B). To address this issue in the present data, we compared the axis computed for an example session for neural activity during the delay period and neural activity aligned to saccade onset. As shown in the new Figure 2 – figure supplement 1 in the revised manuscript, we found a similar lack of arousal-related modulations for deep-layer SC neurons when slow drift was computed using the saccade epoch (25ms before to 25ms after the onset of the saccade). Figure 2 – figure supplement 1A shows loadings for the SC slow drift axis when it was computed using spiking responses during the delay period (as in the main manuscript analysis). In contrast, Figure 2 – figure supplement 1B shows loadings from the same session when the SC slow drift axis was computed using spiking responses during the saccade epoch. The plots are highly similar and in both cases the loadings were weaker for neurons recorded from channels at the bottom of the probe which have a higher motor index. Finally, we found that projections onto the SC slow drift axis for this session were strongly correlated when the slow drift axis was computed using spiking responses during the delay period and the saccade epoch (r = 0.66, p < 0.001, Figure 1C). Taken together, these results suggest that arousal-related modulations are less evident in deep-layer SC neurons irrespective of whether slow drift was computed during the delay or saccade epoch (see also Public Reviews, Reviewer 1, Point 2).

      (2) More generally, arousal-related signals may persist throughout multiple different epochs of the task. It would be worthwhile to determine whether similar "slow-drift" dynamics are observed for baseline, sensory-evoked, and saccade-related activity. Although it may not be possible to examine pupil responses during a saccade, there may be systematic relationships between baseline and evoked responses. 

      Similar to the point above, slow drift dynamics tend to be similar across different response epochs because they are averaged across many trials and seem to tap into responsivity trends that are robust across epochs. As shown in Author response image 1 below, and in Figure 2 – figure supplement 1 in the revised manuscript, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. We did not investigate differences between baseline and evoked pupil responses in the current paper. However, these effects were characterized in one of our previous papers that focused exclusively on the relationship between slow drift and eye-related metrics (Johnston et al., 2022, Cereb. Cortex, Figure 6). In this previous work, we found a negative correlation between baseline and evoked pupil size. Both variables were significantly correlated with slow drift, the only difference being the sign of the correlation.

      Author response image 1.

      (A-C) Dynamics of slow drift for three example sessions when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade epochs. Baseline = 100ms before the onset of the target stimulus; Delay = 600 to 1100ms after the offset of the target stimulus; Stim = 25ms to 125ms after the onset of the target stimulus; Sac = 25ms before to 25ms after the onset of the saccade.

      Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.

      (3) The relationships between changes in SC activity and pupil size are quite small (Figures 2C & 5C). Although the distribution across sessions (Figure 2C) is greater than chance, they are nearly 1/4 of the size compared to the PFC-SC axis comparisons. Likewise, the distribution of r² values relating pupil size and spiking activity directly (Figure 5) is quite low. We remain skeptical that these drifts are truly due to arousal and cannot be accounted for by other factors. For example, does the relationship persist if accounting for a very simple, monotonic (e.g., linear) drift in pupil size and overall firing rate over the course of an individual session? 

      Firstly, it is important to note that the strength of the relationship between projections onto the SC slow drift axis and pupil size (r² = 0.06) is within the range reported by Joshi et al. (2016, Neuron, Figure 3). They investigated the median variance explained between the spiking responses of individual SC neurons and pupil size and found it to be approximately 0.02 across sessions. Secondly, our statistical approach of testing the actual distribution of r² values against a shuffled distribution was specifically designed to rule out the possibility that the relationship between SC spiking responses and pupil size occurred due to linear drifts. The shuffled distribution in Figure 2C of the main manuscript represents the variance that can be explained by one session’s slow drift correlated with another session’s pupil, which would contain effects that occurred due to linear drifts alone. That the actual proportion of variance explained was significantly greater than this distribution suggests that the relationship between projections onto the SC slow drift axis and pupil size reflects changes in arousal rather than other factors related to linear drifts.

      Joshi S, Li Y, Kalwani RM, Gold JI (2016) Relationships between Pupil Diameter and Neuronal Activity in the Locus Coeruleus, Colliculi, and Cingulate Cortex. Neuron 89:221–234.
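      The across-session shuffle described in the response to point (3) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of the squared Pearson correlation, and the truncation of mismatched sessions to the shorter length are all assumptions made for the example. The idea is that pairing one session's slow drift with another session's pupil trace keeps any generic within-session trend (such as a linear drift) but breaks the session-specific relationship.

```python
import numpy as np

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length 1-D arrays."""
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

def across_session_shuffle(drift_by_session, pupil_by_session):
    """Return (actual, shuffled) arrays of r^2 values.

    actual:   slow drift and pupil size taken from the same session.
    shuffled: slow drift from session i paired with pupil size from
              session j != i. Sessions of unequal length are truncated
              to the shorter one (an assumption of this sketch).
    """
    actual, shuffled = [], []
    for i, drift in enumerate(drift_by_session):
        for j, pupil in enumerate(pupil_by_session):
            n = min(len(drift), len(pupil))
            value = r_squared(drift[:n], pupil[:n])
            (actual if i == j else shuffled).append(value)
    return np.array(actual), np.array(shuffled)

if __name__ == "__main__":
    # Synthetic data: every session shares a linear trend, but the slow
    # "arousal" fluctuation is specific to each session.
    rng = np.random.default_rng(0)
    t = np.linspace(0, 1, 200)
    drifts, pupils = [], []
    for _ in range(6):
        arousal = np.sin(2 * np.pi * rng.uniform(1, 3) * t + rng.uniform(0, 2 * np.pi))
        drifts.append(t + arousal + 0.2 * rng.standard_normal(t.size))
        pupils.append(t + arousal + 0.2 * rng.standard_normal(t.size))
    actual, shuffled = across_session_shuffle(drifts, pupils)
    print("median actual r^2:", np.median(actual))
    print("median shuffled r^2:", np.median(shuffled))
```

      In this toy setting the shuffled values retain whatever the shared linear trend contributes, so an actual distribution that sits above them reflects the session-specific component.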

      (4) It is not clear how the final analysis (Figure 6) contributes to the authors' conclusions. The authors perform PCA on: (i) residual spiking responses during the delay period binned according to pupil size, and (ii) spiking responses in the saccade epoch binned according to target location (i.e., the saccade tuning curve). The corresponding PCs are the spike-pupil axis and the saccade tuning axis, respectively. Unsurprisingly, the spike-pupil axis that captures variance associated with arousal (and removes variance associated with saccade direction) was not correlated with a saccade-tuning axis that captures variance associated with saccade direction and omits arousal. Had these measures been related, it would imply a unique association between a neuron's preferred saccade direction and pupil control, which seems unlikely. The separation of these axes thus seems trivial and does not provide evidence of a "mechanism...in the SC to prevent arousal-related signals interfering with the motor output." It remains unknown whether, for example, arousal-related signals may impact trial-by-trial changes in neuronal gain near the time of a saccade, or alter saccade dynamics such as acceleration, precision, and reaction time.

      The reviewer makes a good point, and we agree that more evidence is needed to determine if the separation of the pupil size axis and saccade tuning axis is the mechanism through which cognitive and arousal-related signals can be intermixed in the SC. In the revised manuscript (lines 679-682), we have raised this as a possible explanation that necessitates further study rather than stating definitively that it is the exact mechanism through which these signals are kept separate. Our analysis here is similar to the one from Smoulder et al (2024, Neuron, Fig. 2F), in which the interactions between reward signals and target tuning in M1 were examined (and found to be orthogonal). While we agree with the reviewer that it may seem “trivial” for these axes to be orthogonal, it does not have to be so. If, for example, neural tuning curves shifted with changes in pupil size through gain changes that revealed tuning or affected tuning curve shape, there could be projections of the pupil axis onto the target tuning axis. Thus, while we agree with the reviewer that it appears sensible for these two axes to be orthogonal, our result is nonetheless a novel finding. We have edited the text in our revised manuscript, however, to make sure the nuance of this point is conveyed to the reader.

      Smoulder AL, Marino PJ, Oby ER, Snyder SE, Miyata H, Pavlovsky NP, Bishop WE, Yu BM, Chase SM, Batista AP. A neural basis of choking under pressure. Neuron. 2024 Oct 23;112(20):3424-33.
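      One way to quantify how closely two population axes are aligned, in the spirit of the comparison discussed in point (4), is sketched below. This is an assumption-laden illustration rather than the analysis used in the manuscript: `first_pc` and `axis_alignment` are hypothetical helper names, the inputs are taken to be (conditions x neurons) matrices of binned mean responses, and alignment is measured as the squared cosine of the angle between the two first principal axes.

```python
import numpy as np

def first_pc(responses):
    """Unit-length first principal axis of a (conditions x neurons) matrix."""
    centered = responses - responses.mean(axis=0, keepdims=True)
    # Rows of vt are the principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def axis_alignment(axis_a, axis_b):
    """Squared cosine of the angle between two axes.

    0 means the axes are orthogonal, 1 means they are identical.
    The sign of each axis is irrelevant because the value is squared.
    """
    a = axis_a / np.linalg.norm(axis_a)
    b = axis_b / np.linalg.norm(axis_b)
    return float(np.dot(a, b) ** 2)
```

      Under these assumptions, `axis_alignment(first_pc(pupil_binned), first_pc(target_binned))` would be near zero if pupil-related and tuning-related variance fall along orthogonal directions, and would grow if pupil-linked changes reshaped the tuning curves.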

      Reviewer #2 (Public Review):

      (1) The greatest weakness in the present research is the fact that arousal is a functionally less important non-motoric variable. The authors themselves introduce the problem with a discussion of attention, which is without any doubt the most important cognitive process that needs to be functionally isolated from oculomotor processes. Given this introduction, one cannot help but wonder, why the authors did not design an experiment, in which spatial attention and oculomotor control are differentiated. Absent such an experiment, the authors should spend more time explaining the importance of arousal and how it could interfere with oculomotor behavior. 

      Although attention does represent an important cognitive process, we did not design an experiment in which attention and oculomotor control are differentiated because attention does not appear to be related to slow drift. In our first paper that reported on this phenomenon, we investigated the effects of spatial attention on slow fluctuations in neural activity by cueing the monkeys to attend to a stimulus in the left or right visual field in a block-wise manner. Each block lasted ~20 minutes and we found that slow drift did not covary with the timing of cued blocks (see Figure 4A, Cowley et al., 2020, Neuron). Furthermore, there is a large body of work showing that arousal also impacts motor behavior leading to changes in a range of eye-related metrics (e.g., pupil size, microsaccade rate and saccadic reaction time - for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). We also note that the terms attention and arousal are often used in nonspecific and overlapping ways in the literature, adding to some potential confusion here. Nonetheless, pupil-linked arousal is an important variable that impacts motor performance. This has now been stated clearly in the Introduction of the revised manuscript (lines 108-114) to address the reviewer’s concerns and highlight the importance of studying how precise fixation and eye movements are maintained even in the presence of signals related to ongoing changes in brain state. 

      Cowley BR, Snyder AC, Acar K, Williamson RC, Yu BM, Smith MA (2020) Slow Drift of Neural Activity as a Signature of Impulsivity in Macaque Visual and Prefrontal Cortex. Neuron 108:551-567.e8.

      (2) In this context, it is particularly puzzling that one actually would expect effects of arousal on oculomotor behavior. Specifically, saccade reaction time, accuracy, and speed could be influenced by arousal. The authors should include an analysis of such effects. They should also discuss the absence or presence of such effects and how they affect their other results. 

      As described above, several studies across species have demonstrated that arousal impacts motor behavior e.g., saccade reaction time, saccade velocity and microsaccade rate (for review, see Di Stasi et al. 2013, Neurosci. Biobehav. Rev.). This has been clarified in the Introduction of the revised manuscript to address the reviewer's concerns (lines 108-114). Our prior work (Johnston et al, Cerebral Cortex, 2022) shows that slow drift impacts several types of oculomotor behavior. Overall, these studies highlight the impact of arousal on eye movements as a robust effect, and support the present investigation into arousal and oculomotor control signals. While we agree reaction time, accuracy, and speed all can be influenced by arousal depending on task demands, the present study is focused on the connection between slow fluctuations in neural activity, linked to arousal, and different subpopulations of SC neurons. 

      Di Stasi LL, Catena A, Cañas JJ, Macknik SL, Martinez-Conde S (2013) Saccadic velocity as an arousal index in naturalistic tasks. Neurosci Biobehav Rev 37:968–975.

      Johnston R, Snyder AC, Khanna SB, Issar D, Smith MA (2022) The eyes reflect an internal cognitive state hidden in the population activity of cortical neurons. Cereb Cortex 32:3331–3346.

      (3) The authors use the analysis shown in Figure 6D to argue that across recording sessions the activity components capturing variance in pupil size and saccade tuning are uncorrelated. However, the distribution (green) seems to be non-uniform with a peak at very low and very high correlation specifically. The authors should test if such an interpretation is correct. If yes, where are the low and high correlations respectively? Are there potentially two functional areas in SC?

      We agree with the reviewer that our actual data distribution was non-uniform. We examined individual sessions with high and low variance explained and did not find notable differences. One source of this variation has to do with session length. Longer sessions in principle should have a chance distribution of variance explained closer to zero because they contained more time bins. Given that we had no specific hypothesis for a non-uniform distribution, we have simply displayed the full distribution of values in our figure and the statistical result of a comparison to a shuffled distribution.
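      The dependence of chance-level variance explained on the number of time bins can be illustrated with a short simulation. This sketch assumes independent Gaussian samples, for which the expected r² between two unrelated series of n bins is 1/(n − 1); real slow-drift and pupil traces are autocorrelated, so their effective number of independent bins is smaller and the chance level correspondingly higher. The function name `chance_r2` is hypothetical.

```python
import numpy as np

def chance_r2(n_bins, n_repeats=2000, seed=0):
    """r^2 between pairs of unrelated white-noise series of length n_bins."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_repeats, n_bins))
    y = rng.standard_normal((n_repeats, n_bins))
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    # Row-wise Pearson correlation between each pair of series.
    r = (xc * yc).sum(axis=1) / np.sqrt((xc ** 2).sum(axis=1) * (yc ** 2).sum(axis=1))
    return r ** 2

if __name__ == "__main__":
    for n_bins in (20, 200):
        print(n_bins, "bins: mean chance r^2 =", chance_r2(n_bins).mean(),
              "theory =", 1 / (n_bins - 1))
```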

      Reviewer #3 (Public Review):

      (1) However, I am concerned about two main points: First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC? In other words, it seems important to show distributions of encountered neurons (regardless of the motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of Figure Supplement 1. I elaborate more on these points in the detailed comments below. 

      The reviewer makes a good point about the efferent signals from SC. It is true that electrical thresholds are often lowest in intermediate layers, though deep layers do project to the oculomotor nuclei (Sparks, 1986; Sparks & Hartwich-Young, 1989) and often intermediate and deep layers are considered to function together to control eye movements (Wurtz & Albano, 1980). As suggested by the reviewer, we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index, as well as included the above references and points about the intermediate and deep layers (Lines 73-81). Aside from the question of which layers of the SC function as the “motor output”, the reviewer raises a separate and important question – are our deep recordings still in SC. Here, we can say definitively that they are. We removed neurons if they did not exhibit elevated (above baseline) firing rates during the visual or saccade epochs of the MGS task (see Methods section on “Exclusion criteria”). All included neurons possessed a visual, visuomotor or motor response, consistent with the response properties of neurons in the SC. In addition, we found a number of neurons well above the bottom of the probe with strong motor responses and minimal loadings onto the slow drift axis (see Figure 2 – figure supplement 1A), consistent with the reviewer’s comment that intermediate layer neurons are tuned for movement and play a role in saccade production.

      Mohler CW, Wurtz RH. Organization of monkey superior colliculus: intermediate layer cells discharging before eye movements. Journal of neurophysiology. 1976 Jul 1;39(4):722-44.

      Sparks DL. Translation of sensory signals into commands for control of saccadic eye movements: role of primate superior colliculus. Physiol Rev. 1986 Jan;66(1):118-71. doi: 10.1152/physrev.1986.66.1.118. PMID: 3511480.

      Sparks DL, Hartwich-Young R. The deep layers of the superior colliculus. Reviews of oculomotor research. 1989 Jan 1;3:213-55.

      Wurtz RH, Albano JE. Visual-motor function of the primate superior colliculus. Annu Rev Neurosci. 1980;3:189-226. doi: 10.1146/annurev.ne.03.030180.001201. PMID: 6774653.

      (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC. 

      The reviewer makes an important point about the SC’s visual responses. Neurons with a low motor index are, conversely, likely to have a stronger visual response index. However, we do not believe that changes in luminance can explain why the correlation between SC spiking response and pupil size is weaker for neurons with a higher motor index. Firstly, the changes in pupil size observed in the current paper and our previous work are slow and occur on a timescale of minutes (Cowley et al., 2020, Neuron) and are correlated with eye movement measures such as reaction time and microsaccade rate (Johnston et al., 2022, Cerebral Cortex). This is in stark contrast to luminance-evoked changes in pupil size that occur on a timescale of less than a second. Secondly, as shown in the new Figure 5 – figure supplement 1 in the revised manuscript, very similar results were found when SC spiking responses were correlated with pupil size during the baseline period, when only the fixation point was on the screen. Although the luminance of the small peripheral target stimulus can result in small luminance-evoked changes in pupil size, no changes in luminance occurred during the baseline period, which was defined as 100ms before the onset of the target stimulus. In Figure 2 – figure supplement 1 and Author response image 1 above, we show that slow drift is the same whether calculated on the baseline response, delay period, or peri-saccadic epoch. Thus, the measurement of slow drift is insensitive to the precise timing of the selection of both the window for the spiking response and the window for the pupil measurement. If luminance were the explanation for the slow changes in firing observed in visually responsive SC neurons, it would require those neurons to exhibit robust, sustained tuned responses to the small changes in retinal illuminance induced by the relatively small fluctuations in pupil size we observed from minute to minute. We are aware of no reports of such behavior in visually responsive neurons in SC. We have included these analyses and this reasoning in the revised manuscript on lines 478-495.

      Reviewer#1 (Recommendations for the author):

      (1) It would be useful to provide line numbers in subsequent manuscripts for reviewers.

      Line numbers have been added in the revised version of the manuscript.

      (2) Page #6; last sentence: "...even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in motor output." I do not believe the authors have provided evidence that arousal levels were not associated with changes in motor output.

      As suggested by Reviewer 3 (see Public Reviews, Reviewer 3, Point 2), we have edited the text throughout the manuscript to say that slow drift was less evident in SC neurons with a higher motor index. This sentence in the revised manuscript now reads:

      “This provides a potential mechanism through which signals related to cognition and arousal can exist in the SC, and even impact processing at the early to mid stages of the visuomotor transformation, without leading to unwanted changes in SC neurons that are linked to saccade execution.”

      (3) Page #8; last paragraph: Although deep-layer SC neurons may not have been obtained during every recording session, a summary of the motor index scores observed along the probe across sessions would be useful to confirm their assumptions. 

      See Author response image 2 below, which shows the motor index of each recorded SC neuron on the x-axis and session number on the y-axis. The points are colored by the squared factor loading, which represents the variance explained between the response of a neuron and the slow drift axis (see Figure 3B of the main manuscript). The plot shows that neurons with a stronger component loading (shown in teal to yellow) typically have a lower motor index, whereas the opposite is true for neurons with a weaker component loading (shown in dark blue).

      Author response image 2.

      Scatter plot showing the motor index of each recorded neuron along with the session number in which it was recorded. The points are colored by the squared factor loading for each neuron along the slow drift axis. Note that loadings above 0.5 (33 data points in total) have been thresholded at 0.5 so that we could effectively use the color range to show all of the slow drift axis loadings.

      (4) Page #10; first paragraph: The authors should state the time window of the delay period used, since it may be distinct from the pupil analysis (first 200ms of delay). 

      This has been stated in the revised version of the manuscript. The sentence now reads:

      “We first asked if arousal-related fluctuations are present in the SC. As in previous studies that recorded from neurons in the cortex (Cowley et al., 2020), we found that the mean spiking responses of individual SC neurons during the delay period (chosen at random on each trial from a uniform distribution spanning 600-1100ms, see Methods) fluctuated over the course of a session while the monkeys performed the MGS task (Figure 2A, left).”

      (5) Page #10; second paragraph: Extra period at the end of a sentence: " most variance in the data..". 

      Fixed in the revised version of the manuscript.

      (6) Page #12: "between projections onto the SC slow drift axis and mean pupil size during the first 200ms of the delay period when a task-related pupil response could be observed." What criteria was used to determine whether a task-related pupil response was observed? 

      This was chosen based on the results of a previous study in our lab that used the same memory-guided saccade task to investigate the relationship between slow drift and changes in baseline and evoked pupil size (see Johnston et al., 2022, Cereb. Cortex, Figure 6B). The period was chosen based on plotting the average pupil size aligned on different trial epochs. As we show in Figure 5-figure supplement 3 above, the pupil interactions with slow drift did not depend on the particular time window of the pupil we chose.

      (7) Page #14; Figure 2A: The axes for the individual channels are strangely floating and quite different from all other figures. Please label the channel in the figure legend that was used as an example of the projected values onto the slow drift axis.

      The figure has been changed in the revised version of the manuscript so that the tick mark denoting zero residual spikes per second is on the top layer of each plot. A scale bar was chosen instead of individual axes to reduce clutter in the figure as it was used to demonstrate how slow drift was computed. Residual spiking responses from all neurons were projected on the slow drift axis to generate the scatter plot in the bottom right-hand corner of Figure 2A. There is no single neuron to label.

      (8) Page #16: "These results demonstrate that even though arousal-related fluctuations are present in the SC, they are isolated from deep-layer neurons that elicit a strong saccadic response and presumably reside closer to the motor output." In line with our major comments, lack of arousal-related activity during the delay period is meaningless for deep-layer SC neurons that are generally inactive during this time. It does not imply that there is no arousal signal! 

      Addressed in Public Reviews, Reviewer 1, Point 1 & 2. We found a similar lack of arousal-related modulations reported for deep-layer SC neurons when slow drift was computed using the saccade epoch (Figure 1 above). In addition, similar dynamics were observed when the SC slow drift axis was computed using spiking responses during the baseline, delay, visual and saccade period (Figure 2).

      (9) Page #18: "These findings provide additional support for the hypothesis that arousal-related fluctuations are isolated from neurons in the deep layers of the SC." The same criticism from above applies.

      Addressed in Public Reviews, Reviewer 1, Point 1 & 2.

      (10) Page #20; paragraph 3: "Taken together, the findings outlined above..." Would be useful to be more specific when referring to "activity" ; e.g., "...these neurons did not exhibit large fluctuations in delay-period activity over time".

      This sentence has been changed in the revised manuscript in light of the reviewer’s comments. It now reads:

      “In addition to being more weakly correlated with pupil size, the spiking responses of these neurons did not exhibit large fluctuations over time (Figure 2), and when considering the neuronal population as a whole, explained less variance in the slow drift axis when it was computed using population activity in the SC (Figure 3) and PFC (Figure 4).”

      Reviewer #3 (Recommendations for the author):

      The paper is clear and well-written. However, I am concerned about two main points: 

      (1) First, the authors repeatedly say that the "output" layers of the SC are the ones with the highest motor indices. This might not necessarily be accurate. For example, current thresholds for evoking saccades are lowest in the intermediate layers, and Mohler & Wurtz 1972 suggested that the output of the SC might be in the intermediate layers. Also, even if it were true that the high motor index neurons are the output, they are very few in the authors' data (this is also true in a lot of other labs, where it is less likely to see purely motor neurons in the SC). So, this makes one wonder if the electrode channels were simply too deep and already out of the SC. In other words, it seems important to show distributions of encountered neurons (regardless of motor index) across depth, in order to better know how to interpret the tails of the distributions in the motor index histogram and in the other panels of the figure supplement 1. I elaborate more on these points in the detailed comments below. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (2) Second, the authors find that the SC cells with a low motor index are modulated by pupil diameter. However, this could be completely independent of an "arousal signal". These cells have substantial visual responses. If the pupil diameter changes, then their activity should be influenced since the monkey is watching a luminous display. So, in this regard, the fact that they do not see "an arousal signal" in most motor neurons (through the pupil diameter analyses) is not evidence that the arousal signal is filtered out from the motor neurons. It could simply be that these neurons simply do not get affected by the pupil diameter because they do not have visual sensitivity. So, even with the pupil data, it is still a bit tricky for me to interpret that arousal signals are excluded from the "output layers" of the SC. 

      Addressed in Public Reviews, Reviewer 3, Point 2.

      (3) I think that a remedy to the first point above is to change the text to make it a bit more descriptive and less interpretive. For example, just say that the slow drifts were less evident among the neurons with high motor index. 

      We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 1).

      (4) For the second point, I think that it is important to consider the alternative caveat of different amounts of light entering the system. Changes in light level caused by pupil diameter variations can be quite large. 

      We thank the reviewer for this suggestion (see Public Reviews, Reviewer 3, Point 2).

      (5) Line 31: I'm a bit underwhelmed by this kind of statement. i.e. we already know that cognitive processes and brain states do alter eye movements, so why is it "critical" that high precision fixation and eye movements are maintained? And, isn't the next sentence already nulling this idea of criticality because it does show that the brain state alters the SC neurons? In fact, cognitive processes are already known to be most prevalent in the intermediate and deep layers of the SC. 

      It seems clear that while cognitive state does affect eye movements, it is desirable to have some separation between cognitive state and eye movement control. Covert attention, for instance, is precisely a situation where eye movement control is maintained to avoid overt saccades to the attended stimulus, and yet there are clear indications of attention’s impact on microsaccades and fixation. We stand by our statement that an important goal of vision is to have precise fixation and movements of the eye, and yet at the same time the eyes are subject to numerous influences by cognitive state.

      (6) Line 65: it is better to clarify that these are "functional layers" because there are actually more anatomical layers. 

      We have edited this sentence in the revised version of the manuscript so that it now reads:

      “The role of these projections in the visuomotor transformation depends on the functional layer of the SC in which they terminate”.

      (7) Line 73: this makes it sound like only the deepest layers are topographically organized, which is not true. Also, as early as Mohler & Wurtz, 1972, it was suggested that the intermediate layers have the biggest impacts downstream of the SC. This is also consistent with electrical microstimulation current thresholds for evoking saccades from the SC. 

      We have addressed the reviewers’ comments about the intermediate layers having the biggest impact downstream of the SC in Public Reviews, Reviewer 3, Point 1. Furthermore, line 73 has been changed in the revised manuscript so that it now reads:

      “As is the case for neurons in the superficial and intermediate layers, they [SC motor neurons] form a topographically organized map of visual space (White et al. 2017; Robinson 1972; Katnani and Gandhi 2011)”.  

      (8) Line 100: there is an analogous literature regarding the question of why unwanted muscle contractions do not happen. Specifically, in the context of why SC visual bursts do not automatically cause saccades (which is a similar problem to the ones you mention about cognitive signals interfering by generating unwanted eye movements), both Jagadisan & Gandhi, Curr Bio, 2022 and Baumann et al, PNAS, 2023 also showed that SC population activity not only has different temporal structure (Jagadisan & Gandhi) but also occupy different subspaces (Baumann et al) under these two different conditions (visual burst versus saccade burst). This is conceptually similar to the idea that you are mentioning here with respect to arousal. So, it is worth it to mention these studies here and again in the discussion. 

      We are grateful to the reviewer for these suggestions and have included text in the Introduction (Lines 125-128) and Discussion (Lines 678-682) of the revised manuscript along with the references cited above.

      (9) Line 147: as mentioned above, it is now generally accepted that there are quite few "pure" motor neurons in the SC. This is consistent with what you find. E.g. Baumann et al., 2023. And, again see Mohler and Wurtz in the 1970's. So, I wonder how useful it is to go too much into this idea of the deeper motor neurons (e.g. the correlations in the other panels of the Figure 1 supplement).

      This is related to the reviewer’s comment that the output of the SC might be in the intermediate layers. This concern has been addressed in Public Reviews, Reviewer 3, Point 1.

      (10) Figure 1 should say where the RF was for the shown spike rasters. i.e. were these the same saccade target across trials? And where was that location relative to the RF? It would help also in the text to say whether the saccade was always to the RF center or whether you were randomizing the target location. 

      We centered the array of saccade targets using the microstimulation-evoked eye movement for SC (see Methods section “Memory-guided saccade task”) to find the evoked eccentricity, and then used saccade targets with equal spacing of 45 degrees starting at zero (rightward saccade target). We did not do extensive RF mapping beyond this microstimulation centering. In Figure 1, the spike rasters are shown for a target that was visually identified to be within the neuron’s RF based on assessing responses to all 8 target angles. We have added information about this to the figure caption.
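      The target layout described in this response (eight targets at 45-degree spacing, starting at zero for a rightward saccade) can be written out as a small sketch. The function name, the example eccentricity, and the convention that positive y is upward are assumptions for illustration; in the study the eccentricity came from the microstimulation-evoked eye movement.

```python
import math

def saccade_targets(eccentricity_deg, n_targets=8, start_deg=0.0):
    """Return (x, y) positions, in degrees of visual angle, of equally spaced targets.

    Angles increase counterclockwise from 0 degrees (rightward); with the
    default of 8 targets the spacing is 45 degrees.
    """
    step = 360.0 / n_targets
    targets = []
    for k in range(n_targets):
        angle = math.radians(start_deg + k * step)
        targets.append((eccentricity_deg * math.cos(angle),
                        eccentricity_deg * math.sin(angle)))
    return targets

if __name__ == "__main__":
    # 10 degrees is a placeholder eccentricity, not a value from the paper.
    for x, y in saccade_targets(10.0):
        print(f"{x:6.2f} {y:6.2f}")
```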

      (11) Line 218: but were there changes in the eye movement statistics? For example, the slow drift eye movements during fixation? Or even the microsaccades? 

      Addressed in Public Reviews, Reviewer 2, Point 2.  

      (12) Line 248: shuffling what exactly? I think that more explanation would be needed here. 

      Addressed in Public Reviews, Reviewer 1, Point 3.  

      (13) Line 263: but isn't this reflecting a sensory transient in the pupil diameter, since the target just disappeared? 

      Addressed in Public Reviews, Reviewer 3, Point 2.  

      (14) Line 271: I suspect that slow drift eye movements (in between microsaccades) would show higher correlations. Not sure how well you can analyze those with a video-based eye tracker. 

      We agree that fixational drift would be a worthwhile metric, but it is not one we have focused on here and, to our knowledge, it requires higher-precision eye tracking.

      (15) Line 286: again, see above about similar demonstrations with respect to the visual and motor burst intervals, which clearly cause the same problem (even stronger) as the one studied here. 

      See reply, including Figure 2.

      (16) Line 330: again, I'm not sure deeper necessarily automatically means closer to the output. For example, current thresholds for evoked saccades grow higher as you go deeper. Maybe the authors can ask their colleague Neeraj Gandhi about this point specifically, just to be safe. Maybe the safest would be to remain descriptive about the data, and just say something like: arousal-related fluctuations were absent in our deepest recorded sites. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (17) Line 332: likewise, statements like this one here would be qualified if the output was the intermediate layers......anyway if I understand what I read so far in the paper, the signal will be anyway orthogonal to the motor burst population subspace. So, maybe there's no need to emphasize that it goes away in the very deepest layers. 

      See reply above, Public Reviews, Reviewer 1, Point 4.

      (18) Figure 3A: related to the above, I think one issue could be that the deeper contacts might already be out of the SC. Maybe some cell count distribution from each channel should help in this regard. i.e. were you finding way fewer saccade-related neurons in the deepest channels (even though the few that you found were with high motor index)? If so, then wouldn't this just mean that the channel was too deep? I think there needs to be an analysis like this, to convince readers that the channels were still in the SC. Ideally, electrical stimulation current thresholds for evoking saccades at different depths would be tested, but I understand that this can be difficult at this stage. 

      Addressed in Public Reviews, Reviewer 3, Point 1.

      (19) I keep repeating this because in general, cognitive effects are stronger in the intermediate/deeper layers than in the superficial layers. If these interfere with eye movements like arousal, then why should arousal be different?

      Few studies have investigated the effects of attention on “pure” movement SC neurons that only discharge during a saccade. One study, which we cited in the Introduction (Ignashchenkova et al., 2004, Nat. Neurosci.), found significant differences in spiking responses between trials with and without attentional cueing for visual and visuomotor neurons. No significant difference was found for motor neurons, consistent with our hypothesis that signals related to cognition and arousal are kept separate from saccade-related signals in the SC.

      (20) The problem with Figure 5 and its related text is that the neurons with low motor index are additionally visual. So, of course, they can be modulated if the pupil diameter changes!

      Addressed in Public Reviews, Reviewer 3, Point 2.  

      (21) I had a hard time understanding Figure 6. 

      See reply above, Public Reviews, Reviewer 1, Point 4.

      (22) Line 586: these cells have more visual responses and will be affected by the amount of light entering the eye. 

      Addressed in Public Reviews, Reviewer 3, Point 2.

    1. Summary of the Situation of Homeless Children Housed in Schools in France

      Summary

      Child homelessness is rising alarmingly in France, up 133% since 2020, a trend aggravated by inflation and the housing crisis.

      In response to what the report describes as the "failings of the State", citizen collectives, notably "Jamais sans toi" in Lyon, organize the occupation of school buildings to give families living on the street a place to sleep at night.

      This summary examines the phenomenon through the testimony of a family of Angolan origin, a mother and her children, sheltered in a Lyon school.

      Their story highlights extreme precarity, the trauma of an aborted deportation attempt, and the deep psychological impact on the children.

      The situation reveals a critical tension between citizen solidarity, embodied by teachers and pupils' parents, and the inaction of public authorities, which not only fail to offer lasting housing solutions but also put administrative pressure on those who provide this solidarity.

      1. Child Homelessness and the Citizen Response

      The report highlights a major social crisis: the sharp rise in the number of homeless children in France.

      Scale and Causes:

      ◦ Child homelessness has increased by 133% since 2020.

      ◦ The factors identified are inflation, the growing number of rental evictions, and the shortage of social housing.

      ◦ Emergency solutions, designed to be temporary, "drag on".

      In 2023, families housed in schools stayed there for more than six months on average.

      Occupying Schools as a Stopgap:

      ◦ In response, citizen collectives such as "Jamais sans toi" in Lyon organize the occupation of schools to shelter families.

      ◦ Scale of the phenomenon in Lyon:

      ▪ Currently, 17 schools in the Lyon metropolitan area host 25 families.

      ▪ Since 2014, some sixty schools have served as a refuge for more than 1,000 children.

      ◦ The movement is not limited to Lyon; similar initiatives exist in Strasbourg, Rennes, and Paris.

      ◦ This support relies on "citizen generosity" (pupils' parents, teachers, residents), which compensates for the State's failings.

      2. Case Study: The Journey of an Angolan Family

      The report focuses on the poignant testimony of Lucy (16), Lina (12), and their mother, which illustrates the human reality behind the statistics.

      From Angola to Precarity in France:

      ◦ The family arrived in France when Lucy was 10 and Lina was 5 or 6.

      ◦ First experiences of precarious accommodation: emergency shelter through the 115 service in Dijon, in a shared room, then a hostel in Digoin.

      ◦ During the day, the family had to leave the 115 accommodation and take refuge with aid organizations (Secours Populaire, churches) in order to eat.

      ◦ Lina describes her disappointment with the reality of France, far from the idealized image she had from cartoons:

      "A really good country, where everything went well, where we had a normal life."

      ◦ She was also mocked and subjected to racism at school because of her language and her hair.

      The Trauma of the Failed Deportation (OQTF):

      ◦ Two years ago, the family was issued an Obligation de Quitter le Territoire Français (OQTF), an order to leave French territory.

      ◦ The police came to their apartment in the middle of the night. Lucy, then 14, describes a scene of panic and violence:

      her parents shouting, her father handcuffed, and the children shut in a bedroom with police officers.

      ◦ The family was driven to Paris, a 5-hour journey, and held in a detention center for 4 hours.

      ◦ At the airport, their flight to Angola was canceled. The authorities then "abandoned [them] at the airport", simply ordering them "not to go back to where [they] had been".

      Family Breakup and Wandering:

      ◦ After this episode, the family returned to Lyon.

      As the parents' marriage was not recognized in France, their separation followed. The mother found herself alone with her children.

      ◦ They moved through a series of temporary accommodations:

      a campsite in Trévoux, an apartment in Bellecour, then an association that housed them with other women, before finding refuge in the school.

      3. Daily Life in a Classroom

      The school provides a roof, but it imposes extremely restrictive and precarious living conditions.

      | Aspect | Description |
      | --- | --- |
      | Housing | The family sleeps on air mattresses in a classroom. Clothes are stored in the classroom cupboards and in suitcases. |
      | Routine | Mandatory wake-up between 6:30 and 6:50 a.m. The family must leave the premises before 8:30 a.m. and may return only after 6:00 p.m., once all pupils have left. |
      | Discretion | At night, turning on the lights is forbidden so as not to attract attention. The family uses phone flashlights for light. |
      | Insecurity | Young people playing in the schoolyard have already come upstairs and gone through the family's belongings, taking advantage of a door left open. |
      | Disruptions | The family's life is punctuated by the school bell, which rings "every hour". |
      | The mother's struggle | She is actively looking for work (cleaning, restaurant work) and free training courses, but her situation makes these efforts very difficult. |

      4. Psychological and Social Impacts on the Children

      Precarity and instability have profound consequences for the children's well-being and development.

      The Weight of Secrecy and Shame:

      ◦ Lucy hides her situation from most of her friends for fear of being judged:

      "I get a bit anxious, knowing that a lot of young people my age [...] simply allow themselves to judge."

      ◦ She expresses a deep desire for normality: "Sometimes I tell myself I'd just like to have a normal life like lots of teenagers my age."

      ◦ Lina also voices the fear of being excluded by her classmates because she lives in a school.

      Aspirations and Resilience:

      ◦ Despite the hardships, Lucy is a good student and hopes to become a lawyer.

      Her ambition is directly tied to her experience: "I want to be a lawyer, to defend people, because I tell myself that everyone has the right to a second chance."

      ◦ In the face of distress, she has developed a strategy of emotional control: "When it's hard, well, I just deal with it and tell myself it's going to be okay."

      ◦ Her greatest fear remains both material and existential: "I'm afraid of ending up on the street. It scares me."

      5. Solidarity in the Face of Institutional Inaction

      The report contrasts active solidarity on the ground with the passive, even repressive, response of the institutions.

      Support from the Teaching Staff:

      ◦ One teacher at the school became heavily involved, sleeping on site the first night to reassure the after-school staff.

      ◦ She hosted the family in her home over the Christmas holidays, a particularly symbolic period because the family had spent the previous Christmas outdoors.

      ◦ A collection organized by her colleagues paid for gifts and a festive meal for the family.

      Pressure from the Hierarchy:

      ◦ Following the occupation, the teacher and her colleagues were summoned by the academy inspector (inspectrice d'académie).

      ◦ The meeting is described as "a good dressing-down", in which they got "yelled at".

      The inspector called them "reckless", placing "all the responsibility" on them without acknowledging the family's vulnerability.

      The Absence of Lasting Solutions:

      ◦ Nearly a year after the occupation began, "there is no proposal from the city hall, from the metropolitan authority, no prospect, nothing."

      ◦ The occupation of the school therefore had to continue beyond the school year, but under stricter rules:

      the family is no longer allowed to be in the building during class hours.